
How to calculate sortwork

Posted: Fri May 31, 2013 8:13 pm
by samb01
Hello,

I would like to know if there is a method to calculate the space and the number of sort work data sets (SORTWK) needed for a sort step.

For example:

My input data set has about 712,000,000 records. LRECL: 626, RECFM: VB, BLKSIZE: 27998.

It takes 12 tapes.

The sort is very simple:

SORT FIELDS=(22,91,A),FORMAT=CH



The dataset resulting will be on tape two.

I would like to optimize this sort by using the best dynamic allocation.

Currently, I am using 60 sort work data sets with:

SPACE=(CYL,(100,200))

I get these messages:

ICE054I 0 RECORDS - IN: 712250276, OUT: 712250276
ICE134I 0 NUMBER OF BYTES SORTED: 145940009076
ICE253I 0 RECORDS SORTED - PROCESSED: 712250276, EXPECTED: 467886206
ICE098I 0 AVERAGE RECORD LENGTH - PROCESSED: 204, EXPECTED: 313
ICE165I 0 TOTAL WORK DATA SET TRACKS ALLOCATED: 2662500 , TRACKS USED: 2661120


Thanks for your help.

Re: How to calculate sortwork

Posted: Fri May 31, 2013 8:41 pm
by steve-myers
  • I'm not a sort expert, but I'd guess you did not provide enough sort work space. I suspect you need to give sort a good estimate of the number of records in your data and, since the data contains variable-length records, a good estimate of the average record size. The mechanism to do this is documented in the sort manuals.
  • I do not understand this: the input appears to be on 12 tape volumes, but you appear to be saying you expect the output to require 2 tape volumes. You seem to be saying you want to compact 12 sausages into 2. Or does "The dataset resulting will be on tape two" mean something else?

Re: How to calculate sortwork

Posted: Fri May 31, 2013 10:20 pm
by Akatsukami
steve-myers wrote: I do not understand this: the input appears to be on 12 tape volumes, but you appear to be saying you expect the output to require 2 tape volumes. You seem to be saying you want to compact 12 sausages into 2. Or does "The dataset resulting will be on tape two" mean something else?

"Two" = "too"?

Re: How to calculate sortwork

Posted: Fri May 31, 2013 10:46 pm
by skolusu
samb01 wrote: Currently, I am using 60 sort work data sets with:
SPACE=(CYL,(100,200))


Since you have hard-coded SORTWK DD statements in your JCL, your allocation of 100 cylinders primary and 200 cylinders secondary, spread across 60 sort work data sets, gives a maximum in tracks (15 tracks per cylinder, up to 15 secondary extents per data set) of:

60 x (1500 + 45000) = 2,790,000

Now look at message ICE165I:
ICE165I 0 TOTAL WORK DATA SET TRACKS ALLOCATED: 2662500 , TRACKS USED: 2661120

DFSORT allocated all 60 sort work data sets with a primary of 100 cylinders; the secondary extents only come into the picture as they are needed.
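The arithmetic behind that ceiling can be sketched in Python. This is only an illustration: the function name `max_tracks` is made up here, and it assumes 15 tracks per cylinder and at most 15 secondary extents per sort work data set on a single volume.

```python
TRACKS_PER_CYL = 15          # assumption: 3380/3390 geometry
MAX_SECONDARY_EXTENTS = 15   # assumption: 1 primary + 15 secondary extents per data set


def max_tracks(n_datasets, primary_cyl, secondary_cyl):
    """Largest number of tracks the hard-coded SORTWK allocation can reach."""
    per_dataset_cyl = primary_cyl + MAX_SECONDARY_EXTENTS * secondary_cyl
    return n_datasets * per_dataset_cyl * TRACKS_PER_CYL


ceiling = max_tracks(60, 100, 200)   # SPACE=(CYL,(100,200)) on 60 data sets
allocated = 2_662_500                # from ICE165I, TRACKS ALLOCATED
used = 2_661_120                     # from ICE165I, TRACKS USED

print(ceiling)                                        # 2790000
print(f"allocated / ceiling = {allocated / ceiling:.1%}")
print(f"used / allocated    = {used / allocated:.2%}")
```

On these figures the job was running close to the ceiling of the JCL allocation, which is one reason to let DFSORT size the work data sets itself.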

But as steve-myers pointed out, you can provide the average record length (AVGRLEN) and the estimated number of records (FILSZ).

Apart from that, use DFSORT's dynamic allocation and get rid of the JCL sort work data sets.

Add the following line before your SORT statement, remove your JCL SORTWK DD statements, and re-run your job.

OPTION AVGRLEN=204,DYNALLOC=(,60),FILSZ=E730000000

Re: How to calculate sortwork

Posted: Fri May 31, 2013 11:06 pm
by c62ap90
I have very old notes (10+ years) on SORTWK I got from a systems programmer that you are welcome to play with. Have fun...

- SORTWK Calculation in Tracks
   -- (A * B * 1.3) / C
       |   |          |
       |   |          |> track capacity (47,476 bytes per track for 3380)
       |   |          |>                (56,664 bytes per track for 3390)
       |   |> Average record length
       |> Number of records to sort

   -- notes...---max SORTWK space in tracks should be 2-3 k
              ---50,000 tracks per 3380 disk pack
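
As a rough illustration, the formula in these notes can be applied to the figures from the first post. The function name `sortwk_tracks` and the lookup table are hypothetical; the capacities are taken here as bytes per track, and the 1.3 factor comes straight from the notes.

```python
import math

# Assumption: track capacity in bytes per track, as in the notes above.
BYTES_PER_TRACK = {"3380": 47_476, "3390": 56_664}


def sortwk_tracks(records, avg_record_length, device="3390", factor=1.3):
    """(A * B * factor) / C from the notes, rounded up to whole tracks."""
    total_bytes = records * avg_record_length * factor
    return math.ceil(total_bytes / BYTES_PER_TRACK[device])


# Figures from ICE054I / ICE098I in the first post.
estimate = sortwk_tracks(712_250_276, 204)
tracks_used = 2_661_120              # from ICE165I, TRACKS USED

print(estimate)
print(f"estimate / tracks used = {estimate / tracks_used:.2f}")
print(f"in cylinders: {math.ceil(estimate / 15)}")
```

Comparing the estimate with TRACKS USED from ICE165I shows how much headroom the 1.3 factor builds in for this particular job.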

Re: How to calculate sortwork

Posted: Sat Jun 01, 2013 12:57 am
by BillyBoyo
You may have had it for 10 years, but I think that the advice is at least 30 years old and a little out-of-date.

Compare it to the actual example. If using this to calculate static SORTWK datasets, I'd check on how much "overallocation" there is.

Samb01,

Follow Kolusu's advice.

The reason the average record-length is so different is that the bulk of your records are "small" in comparison to the LRECL. The more skewed the figure is, the more useful the estimate of records and a good average record-length will be.
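
The gap between the expected and processed figures can be checked from the messages in the first post. This is a small sketch, not a description of DFSORT internals: the observation that 313 is half the LRECL of 626 is consistent with a default estimate being used when no AVGRLEN is supplied, but treat that reading as an assumption.

```python
bytes_sorted = 145_940_009_076   # ICE134I
records = 712_250_276            # ICE054I
lrecl = 626                      # from the first post

# Actual average record length, truncated as in ICE098I.
avg_processed = bytes_sorted // records
# Assumed default estimate: half the maximum record length.
avg_expected = lrecl // 2

print(avg_processed)   # 204
print(avg_expected)    # 313

# How FILSZ=E730000000 compares with the real record count.
filsz_estimate = 730_000_000
print(f"FILSZ estimate / actual records = {filsz_estimate / records:.3f}")
```

With AVGRLEN=204 and a FILSZ estimate slightly above the real count, the size estimate handed to DFSORT is close to the number of bytes actually sorted.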

Re: How to calculate sortwork

Posted: Sat Jun 01, 2013 2:53 am
by c62ap90
BillyBoyo wrote: You may have had it for 10 years, but I think that the advice is at least 30 years old and a little out-of-date.

Compare it to the actual example. If using this to calculate static SORTWK datasets, I'd check on how much "overallocation" there is.

Actually, if you substitute your shop's track capacity (i.e. 3390 model 1, 2, 3, 9, etc.) into the calculation, it looks much better. Google for your track capacity (very easy). I assume you did that? To get the cylinder value, just divide by 15.

Re: How to calculate sortwork

Posted: Sat Jun 01, 2013 11:32 am
by BillyBoyo
I used your calculation with your figures. Only the model 9 has a different bytes-per-track. Tracks-on-a-pack is irrelevant to the calculation. Optimum blocksize is relevant but is static across the disk models within a range.

I've got an old SyncSort "Reference Card" dated 1986. By that time they were suggesting 1.2 * file size. From memory, the figure I used in 1979 (but don't know the year of publication) was 1.3.

If you are happy with 1.3, and with your company/client paying for using that, it's OK with me. I just don't like it being suggested to others.

I don't need to know how many cylinders, proportion is same whether calculated in tracks, bytes, cylinders or bananas. It is proportion, that is the way it works - same units throughout calculation, result will be the same whatever unit is used in calculation.

For me it is significantly overallocating, especially since dynamic allocation will get a much better figure without having to calculate much at all.

Re: How to calculate sortwork

Posted: Sun Jun 02, 2013 1:42 am
by samb01
Hello.

Thanks for your answers.

I will try this:

OPTION AVGRLEN=204,DYNALLOC=(,60),FILSZ=E730000000


as skolusu said.

But I would like to know whether the sort will run more quickly.

Currently I have 12 tapes as input and 12 tapes as output.

I think it would run faster if the data set were on disk. What do you think?

The problem is the size of the data set: 712,000,000 records with an LRECL of 624.

I calculate that it would need 50 model 9 disks (our disks are model 9). At the moment we don't have enough disks, but we can ask for more if we can save time by doing the sort on disk. Only if it is worth it.
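
A back-of-the-envelope version of that volume count, as a sketch only. It assumes a 3390 model 9 holds 10,017 cylinders of 15 tracks at 56,664 bytes per track, and it ignores block gaps, free space, and any other overhead, so real numbers would be somewhat higher. The function name `volumes_needed` is hypothetical.

```python
import math

# Assumptions: raw 3390-9 geometry, no allowance for block gaps or overhead.
CYLINDERS_PER_VOLUME = 10_017
TRACKS_PER_CYL = 15
BYTES_PER_TRACK = 56_664
VOLUME_BYTES = CYLINDERS_PER_VOLUME * TRACKS_PER_CYL * BYTES_PER_TRACK


def volumes_needed(total_bytes):
    """Whole volumes required to hold total_bytes at raw track capacity."""
    return math.ceil(total_bytes / VOLUME_BYTES)


# Worst case: every record at the stated LRECL of 624.
worst_case = volumes_needed(712_000_000 * 624)
# Using the byte count DFSORT actually reported in ICE134I.
measured = volumes_needed(145_940_009_076)

print(worst_case)   # 53
print(measured)     # 18
```

Because the average record is far shorter than the LRECL, sizing from the ICE134I byte count gives a much smaller disk requirement than sizing from the maximum record length.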

Re: How to calculate sortwork

Posted: Sun Jun 02, 2013 3:38 am
by dick scherrer
Hello,

Only if it is worth it.

Who decides this "worth"?
Are the tapes physical or virtual?
How long will the "output" be needed?

Let's start with this and see where it leads.