A complex requirement of SORT, MERGE, and also edit records



IBM's flagship sort product DFSORT for sorting, merging, copying, data manipulation and reporting. Includes ICETOOL and ICEGENER

A complex requirement of SORT, MERGE, and also edit records

Postby tilak.tumma » Wed Oct 21, 2009 12:54 pm

Hi ,
I have a real complex requirement. Here it goes.

Input
Dataset 0 (FB):
A list of 62000 Policies

Dataset 1 (VB): ( named in date format, M0208 for Feb 2008, M1198 for November 1998 )
Date record ( specifies when the Dataset is generated )
Over a million policy records, sorted order, Policy Number being the key

Dataset 2(VB):
Date record ( specifies when the Dataset is generated )
Over a million policy records, sorted order, Policy Number being the key

So on to about 250 datasets.

Requirement
I need to extract policies from all the 250 datasets that match the Dataset 0, of 62000 policies.
I need the duplicates. So if a policy is found in all 250 datasets, I should have 250 records for that policy.

I have done this part, in two steps.

Step 1. To merge all the 250 dataset records ( with same filter conditions that match the criteria of Dataset 0 ) ,
Step 2. Used WAAPDSUT to match the Dataset 0 with the output of step 1.

Here comes the complex bit, When I read the final output, I should be able to tell the Data set it has come from ( that can be deduced from the Date field of the Date record ).

Obviously the above 2 steps discards the Date record form all the 250 Datasets., Even if the date records were to be kept they would be sorted and clustered at one place and can not be associated with each record.

The only way I can think of is to append a corresponding Date Identifier at the end of each record of all of the 250 dataset, before I start to merge these 250 datasets.

Here is the Illustration:

Dataset 0: Policies

12345
23456
34567
45678
56789


DataSet 1:
M0188 (Date Record)
12345ZZZZZZZZZZZZXXXXXXXXXXXXXXKKKKKKKAAAAAAAAA
23456AAAAAAAAAAJJJJJJJJJJJJJDDDDDDDDDDLLLLLLLLEEEEEEEEEE
89898SFADSFJA;LDFSKDSLFSDFADSFASDFASDFDS

DataSet 2:
M0288
12345SSSSDDDDDDDDDKKKKKKKKKKKEWWWWWWWWWWWLLLLLLLL
23456FFFFFLLLLLLLLDDDDDDDDDDIIIIIIIIIIIEEEEEEEEEE
34567IIIIIUUUUUTTTTTTTTLLLLLLLLLEEEOOOOOOOOEEEE
22222ASF;LSJDFLDFSADSLFADSFLADSF

DataSet 250:
M0909
12345KKKKKKKKKKKJJJJJJJRRRRRRRRRRRRLLLLLLLLLLEEEEEEEE
23456IIIOOOOOOLLLLLLLLLLLLEEEEEUUUUUUUUUUURRRRRRRRRRRRRRRKKKK
34567SFJAEIRAWE;SLKDFAS;FASDF;ALSDFJASDFA
45678SFASDFSADFJDS;LDFSSDFDSFDSFASDF
56789ASDFDSFAOFSDF;LASDFIEWRWEQ
88888UUUUUUUUUUULLLLLLLLLLLLLLADFERWREQRWE
77777DFSASDFJDSFASDFADSFADSFADSFDS

Output:

12345ZZZZZZZZZZZZXXXXXXXXXXXXXXKKKKKKKAAAAAAAAAM0188
12345SSSSDDDDDDDDDKKKKKKKKKKKEWWWWWWWWWWWLLLLLLLLM0288
12345KKKKKKKKKKKJJJJJJJRRRRRRRRRRRRLLLLLLLLLLEEEEEEEEM0909
23456AAAAAAAAAAJJJJJJJJJJJJJDDDDDDDDDDLLLLLLLLEEEEEEEEEEM0188
23456FFFFFLLLLLLLLDDDDDDDDDDIIIIIIIIIIIEEEEEEEEEEM0288
23456IIIOOOOOOLLLLLLLLLLLLEEEEEUUUUUUUUUUURRRRRRRRRRRRRRRKKKKM0909
34567IIIIIUUUUUTTTTTTTTLLLLLLLLLEEEOOOOOOOOEEEEM0288
34567SFJAEIRAWE;SLKDFAS;FASDF;ALSDFJASDFAM0909
45678SFASDFSADFJDS;LDFSSDFDSFDSFASDFM0909
56789ASDFDSFAOFSDF;LASDFIEWRWEQM0909


The Date field added at the end of each record is for illustration, but to add a field may be only possible way out.

Could somebody help me on this please?

Regards,
Tilak
tilak.tumma
 
Posts: 5
Joined: Wed Oct 21, 2009 11:34 am
Has thanked: 0 time
Been thanked: 0 time

Re: A complex requirement of SORT, MERGE, and also edit records

Postby skolusu » Thu Oct 22, 2009 7:31 am

Tilak,

It can be done but I need the following details.

1. what is the LRECL and RECFM of Dataset 0 file?
2. What is the format and position of the policy number in this file?

3. Are all the 250 search datasets VB RECFM?
4. Do all of them have the policy number at the same position?
5. Does the first record on all these files start with M? can there be any record with M in the 5th byte ?
6. What is the LRECL of these files?

7. What is the lrecl and recfm of the OUTPUT file? FB or VB?

Last but not least run this step and show me the sysout which will determine the level of DFSORT you have

//STEP0100 EXEC PGM=SORT       
//SYSOUT   DD SYSOUT=*         
//SORTIN   DD *               
A                             
//SORTOUT  DD SYSOUT=*         
//SYSIN    DD *               
  SORT FIELDS=COPY             
/*
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
skolusu
 
Posts: 586
Joined: Wed Apr 02, 2008 10:38 pm
Has thanked: 0 time
Been thanked: 39 times

Re: A complex requirement of SORT, MERGE, and also edit records

Postby tilak.tumma » Fri Oct 23, 2009 4:45 pm

Hi Kolusu,

Thanks for the hope that it can be done.

1. what is the LRECL and RECFM of Dataset 0 file?
This actually is a generated Dataset (a SUBSET of the current policy file). It can be generated as (LRECL=3996, RECFM=VB) and then use another utility to extract just the policy numbers (LRECL=10, RECFM=FB). We could use which ever would prove handy for the solution.
Note: since this is a subset of the current policy file, its our choice to include the first record ( Date record ) or exclude.

2. What is the format and position of the policy number in this file?
The policy number is at position 3. ( but including the offset for Record Length of Variable block 4 bytes ) it would be 7. It is stored as PIC X(10).

3. Are all the 250 search datasets VB RECFM?

Yes all the 250 datasets are VB. Infact they are the month end backup/archived datasets. So they are all of the same format.


4. Do all of them have the policy number at the same position?
Yes they have the policy number at the same position.

5. Does the first record on all these files start with M? can there be any record with M in the 5th byte ?
Actually the date record starts with B ( thats the first record of all the files ), and no other record starts with B.

6. What is the LRECL of these files?
It is 3996.

7. What is the lrecl and recfm of the OUTPUT file? FB or VB?
(LRECL=3996, RECFM=VB)

Here is the output of the JCL

1ICE143I 0 BLOCKSET COPY TECHNIQUE SELECTED
ICE250I 0 VISIT http://www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXAMPLES AND MORE
ICE000I 1 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R10 - 05:41 ON FRI OCT 23, 2009 -
0 SORT FIELDS=COPY 00027000
ICE201I F RECORD TYPE IS F - DATA STARTS IN POSITION 1
ICE751I 0 C5-K90014 C6-K90014 C7-BASE C8-K42136 E9-K40168 C9-BASE E5-K38900 E7-BASE
ICE193I 0 ICEAM1 INVOCATION ENVIRONMENT IN EFFECT - ICEAM1 ENVIRONMENT SELECTED
ICE088I 0 BEBTILAA.STEP0100. , INPUT LRECL = 80, BLKSIZE = 80, TYPE = FB
ICE093I 0 MAIN STORAGE = (MAX,20971520,20958302)
ICE156I 0 MAIN STORAGE ABOVE 16MB = (20841438,20841438)
ICE127I 0 OPTIONS: OVFLO=RC0 ,PAD=RC0 ,TRUNC=RC0 ,SPANINC=RC16,VLSCMP=N,SZERO=Y,RESET=Y,VSAMEMT=Y,DYNSPC=256
ICE128I 0 OPTIONS: SIZE=20971520,MAXLIM=6291456,MINLIM=450560,EQUALS=N,LIST=Y,ERET=ABEND,MSGDDN=SYSOUT
ICE129I 0 OPTIONS: VIO=N,RESDNT=NONE,SMF=NO ,WRKSEC=Y,OUTSEC=Y,VERIFY=N,CHALT=N,DYNALOC=N ,ABCODE=016
ICE130I 0 OPTIONS: RESALL=4096,RESINV=0,SVC=109 ,CHECK=Y,WRKREL=Y,OUTREL=Y,CKPT=N,COBEXIT=COB2
ICE131I 0 OPTIONS: TMAXLIM=20971520,ARESALL=0,ARESINV=0,OVERRGN=65536,CINV=Y,CFW=Y,DSA=0
ICE132I 0 OPTIONS: VLSHRT=N,ZDPRINT=N,IEXIT=N,TEXIT=N,LISTX=N,EFS=NONE ,EXITCK=W,PARMDDN=DFSPARM ,FSZEST=N
ICE133I 0 OPTIONS: HIPRMAX=0 ,DSPSIZE=MAX ,ODMAXBF=0,SOLRF=Y,VLLONG=N,VSAMIO=N,MOSIZE=MAX
ICE235I 0 OPTIONS: NULLOUT=RC0
ICE084I 0 BSAM ACCESS METHOD USED FOR SORTOUT
ICE084I 0 BSAM ACCESS METHOD USED FOR SORTIN
ICE751I 1 EF-BASE F0-BASE E8-K38900
ICE090I 0 OUTPUT LRECL = 80, BLKSIZE = 80, TYPE = FB
ICE055I 0 INSERT 0, DELETE 0
ICE054I 0 RECORDS - IN: 1, OUT: 1
ICE052I 0 END OF DFSORT

Regards,
Tilak
tilak.tumma
 
Posts: 5
Joined: Wed Oct 21, 2009 11:34 am
Has thanked: 0 time
Been thanked: 0 time

Re: A complex requirement of SORT, MERGE, and also edit records

Postby skolusu » Sat Oct 24, 2009 12:04 am

Tilak,

The following JCL will give you the desired results. The following are the assumptions I made

1.Create the subset file with the B record as the first record and the policy number starts 7(rdw 4 byte) and the same attributes as the other VB files. This will be the first dataset in the concatenation list (Vb0)

2. The first record of all these files have a B in pos 5 considering the RDW and the date is next 4 bytes.

3. The output will be Bdateval+3996 bytes of original data. The bdateval is got from the individual files and is added at the beginning of every record that found a match with the look up file



//STEP0100 EXEC PGM=SORT                                             
//SYSOUT   DD SYSOUT=*                                               
//SORTIN   DD DSN=VB0,DISP=SHR                                     
//         DD DSN=VB1,DISP=SHR                                     
//         DD DSN=VB2,DISP=SHR                                     
//         DD DSN=VB3,DISP=SHR                                     
....all your 250 VB datasets
//SORTOUT  DD SYSOUT=*                                               
//SYSIN    DD *                                                     
  INREC IFTHEN=(WHEN=INIT,BUILD=(1,4,29X,5)),                         
  IFTHEN=(WHEN=GROUP,BEGIN=(34,1,CH,EQ,C'B'),                         
  PUSH=(5:ID=3,11:SEQ=8,34,5)),                                       
  IFTHEN=(WHEN=(11,8,ZD,GT,1),OVERLAY=(24:36,10))                     
  SORT FIELDS=(24,10,CH,A),EQUALS                                     
  OUTREC IFTHEN=(WHEN=INIT,OVERLAY=(11:SEQNUM,8,ZD,RESTART=(24,10))),
  IFTHEN=(WHEN=GROUP,BEGIN=(11,8,ZD,EQ,1),PUSH=(8:5,3))               
                                                                     
  OUTFIL BUILD=(1,4,19,5,34),                                         
  INCLUDE=(8,3,ZD,EQ,1,AND,5,3,ZD,GT,1,AND,34,1,CH,NE,C'B')           
/*
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
skolusu
 
Posts: 586
Joined: Wed Apr 02, 2008 10:38 pm
Has thanked: 0 time
Been thanked: 39 times

Re: A complex requirement of SORT, MERGE, and also edit records

Postby tilak.tumma » Mon Oct 26, 2009 4:50 pm

Hi skolusu,

I tried it a few datasets in the input and it works like magic. Thank you very much.

However, I need to make a few amendments to this (w.r.t the assumptions you made, they were infact quite close to the requirement).
the Date rocord starts with B but has a filler of length 16 and the date field starts only 18th character until 21 ( 4 bytes )

Being a beginner to the JCL and Mainframe, I find it little difficult to understand the JCL.
It would be greatly helpful if you could just add a few lines of comments between the codes, so that I can understand it and use it for my future requirements as well.

Thanks one again,

Tilak
tilak.tumma
 
Posts: 5
Joined: Wed Oct 21, 2009 11:34 am
Has thanked: 0 time
Been thanked: 0 time

Re: A complex requirement of SORT, MERGE, and also edit records

Postby skolusu » Tue Oct 27, 2009 1:06 am

Tilak,

here is a brief explanation of the control cards

  INREC IFTHEN=(WHEN=INIT,BUILD=(1,4,29X,5)),


The above statement will put 29 spaces after RDW which looks like this
====|=============================|==========================     
RDW |     29 SPACES               |ACTUAL DATA                   
1-4 |                             | 34 - 3992                   
====|=============================|==========================     
    |                             |B                DAT1         
    |                             |  1111111111                   
    |                             |  2222222222                   
    |                             |  3333333333                   
    |                             |  4444444444                   
    |                             |  5555555555                   
    |                             |B                DAT2         
    |                             |  1111111111                   
    |                             |  6666666666                   
                                                                 


Since we added the 29 spaces the actual data now starts at 34 and dat1 is now at pos 51 as u said the date is after 16 bytes after 'b'

IFTHEN=(WHEN=GROUP,BEGIN=(34,1,CH,EQ,C'B'), 
PUSH=(5:ID=3,11:SEQ=8,34,1,51,4)),               


this will populate the data like below. so every time we find a B at pos 34 we add a id starting with 1 and increment and also seqnum within each group

====|===|===|========|============|==========================   
RDW |   |   |        |            |ACTUAL DATA                 
1-4 | ID|   |SEQ     |            |                             
====|===|===|========|============|==========================   
    |001|   |00000001|BDAT1       |B                DAT1       
    |001|   |00000002|BDAT1       |  1111111111                 
    |001|   |00000003|BDAT1       |  2222222222                 
    |001|   |00000004|BDAT1       |  3333333333                 
    |001|   |00000005|BDAT1       |  4444444444                 
    |001|   |00000006|BDAT1       |  5555555555                 

    |002|   |00000001|BDAT2       |B                DAT2       
    |002|   |00000002|BDAT2       |  1111111111                 
    |002|   |00000003|BDAT2       |  6666666666                       
                                                                 


Now only the first record is a b record and the rest are actual key records so we now use the seqnum to denote that and overlay the spaces after the date field and seqnum with the actual key values at pos 24

IFTHEN=(WHEN=(11,8,ZD,GT,1),OVERLAY=(24:36,10)) 


the data looks like this now

====|===|===|========|================|========================== 
RDW |   |   |        |     |          |ACTUAL DATA                 
1-4 | ID|   |SEQ     |DATE |KEY       |                           
====|===|===|========|=====|==========|========================== 
    |001|   |00000001|BDAT1|          |B                DAT1       
    |001|   |00000002|BDAT1|1111111111|  1111111111               
    |001|   |00000003|BDAT1|2222222222|  2222222222               
    |001|   |00000004|BDAT1|3333333333|  3333333333               
    |001|   |00000005|BDAT1|4444444444|  4444444444               
    |001|   |00000006|BDAT1|5555555555|  5555555555               

    |002|   |00000001|BDAT2|          |B                DAT2       
    |002|   |00000002|BDAT2|1111111111|  1111111111               
    |002|   |00000003|BDAT2|6666666666|  6666666666               


now we have all the keys at pos 24 and we sort on it and with equals option we retain the order of duplicate keys.

Once sorted we have all the like keys together and since we concatenated the look up key first it will always be first in the list and this how the data looks like
====|===|===|========|================|========================== 
RDW |   |   |        |     |          |ACTUAL DATA                 
1-4 | ID|   |SEQ     |DATE |KEY       |                           
====|===|===|========|=====|==========|========================== 
    |001|   |00000001|BDAT1|          |B                DAT1       
    |002|   |00000001|BDAT2|          |B                DAT2       

    |001|   |00000002|BDAT1|1111111111|  1111111111               
    |002|   |00000002|BDAT2|1111111111|  1111111111               

    |001|   |00000003|BDAT1|2222222222|  2222222222               
    |001|   |00000004|BDAT1|3333333333|  3333333333               
    |001|   |00000005|BDAT1|4444444444|  4444444444               
    |001|   |00000006|BDAT1|5555555555|  5555555555               

    |002|   |00000003|BDAT2|6666666666|  6666666666



Now once again we write a seqnum at pos 11 using the seqnum parm with restart

OUTREC IFTHEN=(WHEN=INIT,OVERLAY=(11:SEQNUM,8,ZD,RESTART=(24,10))),


this will make the look like this

====|===|===|========|================|========================== 
RDW |   |   |        |     |          |ACTUAL DATA                 
1-4 | ID|   |SEQ     |DATE |KEY       |                           
====|===|===|========|=====|==========|========================== 
    |001|   |00000001|BDAT1|          |B                DAT1       
    |002|   |00000002|BDAT2|          |B                DAT2       

    |001|   |00000001|BDAT1|1111111111|  1111111111               
    |002|   |00000002|BDAT2|1111111111|  1111111111               

    |001|   |00000001|BDAT1|2222222222|  2222222222               
    |001|   |00000002|BDAT1|3333333333|  3333333333               
    |001|   |00000003|BDAT1|4444444444|  4444444444               
    |001|   |00000004|BDAT1|5555555555|  5555555555               

    |002|   |00000001|BDAT2|6666666666|  6666666666



IFTHEN=(WHEN=GROUP,BEGIN=(11,8,ZD,EQ,1),PUSH=(8:5,3))   


now using another GROUP function we push the value at pos 5 on to pos 8

====|===|===|========|================|========================== 
RDW |   |   |        |     |          |ACTUAL DATA                 
1-4 | ID|   |SEQ     |DATE |KEY       |                           
====|===|===|========|=====|==========|========================== 
    |001|001|00000001|BDAT1|          |B                DAT1       
    |002|001|00000002|BDAT2|          |B                DAT2       

    |001|001|00000001|BDAT1|1111111111|  1111111111               
    |002|001|00000002|BDAT2|1111111111|  1111111111               

    |001|001|00000001|BDAT1|2222222222|  2222222222               
    |001|001|00000002|BDAT1|3333333333|  3333333333               
    |001|001|00000003|BDAT1|4444444444|  4444444444               
    |001|001|00000004|BDAT1|5555555555|  5555555555               

    |002|002|00000001|BDAT2|6666666666|  6666666666


So matching record will have 001 in pos 8 and and Id greater than 1 will be all the matching records that we interested and now using OUTFIL INCLUDE we only select those records and Write the Date and actual data discarding the temp variables we created

OUTFIL BUILD=(1,4,19,5,34),                               
INCLUDE=(8,3,ZD,EQ,1,AND,5,3,ZD,GT,1,AND,34,1,CH,NE,C'B')



Hope this helps...
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
skolusu
 
Posts: 586
Joined: Wed Apr 02, 2008 10:38 pm
Has thanked: 0 time
Been thanked: 39 times

Re: A complex requirement of SORT, MERGE, and also edit records

Postby tilak.tumma » Tue Oct 27, 2009 9:06 pm

Hi skolusu,

I can not thank you enough for the detailed decription of the solution. Really appreciate your patience.
I submitted the job, by adding the 250 dataset concatenations for input, however it fails with sort capacity.

ICE752I 0 FSZ=379165237728 BC IGN=0 E AVG=2016 0 WSP=496410237 C DYN=0 0
ICE046A 0 SORT CAPACITY EXCEEDED - RECORD COUNT 45384417
ICE253I 0 RECORDS SORTED - PROCESSED: 45384417, EXPECTED: 189582618
ICE098I 0 AVERAGE RECORD LENGTH - PROCESSED: 849, EXPECTED: 2000


I added sortwk areas to see if that would get me around this issue but no use. Do you think I am missing something ?

Regards,
Tilak
tilak.tumma
 
Posts: 5
Joined: Wed Oct 21, 2009 11:34 am
Has thanked: 0 time
Been thanked: 0 time

Re: A complex requirement of SORT, MERGE, and also edit records

Postby Frank Yaeger » Wed Oct 28, 2009 9:14 pm

Based on the information you sent me offline, it appears you need to set the maximum DYNALLOC n value higher for this very large sort. I sent you a response offline about this.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times

Re: A complex requirement of SORT, MERGE, and also edit records

Postby tilak.tumma » Thu Oct 29, 2009 10:22 pm

Hi Yaeger/skolusu,

I did include the
//DFSPARM DD *
OPTION DYNALLOC(SYSDA,100)
/*

but it ABENDED again with resource unavailable message.

ICE201I F RECORD TYPE IS V - DATA STARTS IN POSITION 5
ICE150I 0 VLSHRT NOT USED FOR SORT, MERGE, INCLUDE, OMIT OR SUM STATEMENT FIELDS
ICE858I 0 LA=204452, DA=65894, AA=138558, BA=1988, CP=1, TA=2037
ICE859I 0 LB=8312, DB=6068, AB=2244, BB=0, CP=0, RS=0, TB=20
ICE860I 0 F=YN, P=2, M=N, B=2048
ICE083A D RESOURCES WERE UNAVAILABLE FOR DYNAMIC ALLOCATION OF WORK DATA SETS (970C)
ICE753I 0 FWK=(41,3715215) SWK=(1,72492) TWK=(0,0) RWK=(0,0) TOTAL=(42,3787707) BLK=56664
ICE751I 0 C5-K90014 C6-K90014 C7-BASE C8-K42136 E9-K40168 C9-BASE E5-K38900 E6-BASE C4-K38900 E7-BASE
ICE750I 0 DC 0 TC 379165237728 CS DSVNN KSZ 14 VSZ 14
ICE752I 0 FSZ=379165237728 BC IGN=0 E AVG=2016 0 WSP=496410237 C DYN=3787707 56664
ICE052I 3 END OF DFSORT


Do you think this is due to some configuration settings that I need to speak to Mainframe admin team ?
(I did send this detail offline as well, I am adding it here for the benefit of others)

Thanks in advance,

Regards,
Tilak
tilak.tumma
 
Posts: 5
Joined: Wed Oct 21, 2009 11:34 am
Has thanked: 0 time
Been thanked: 0 time

Re: A complex requirement of SORT, MERGE, and also edit records

Postby Frank Yaeger » Fri Oct 30, 2009 8:55 pm

It appears that your system does not have enough work space available to do the sort in one pass. Your admin team may be able to help with that. Alternatively, given your work space constraints, you may need to do sequential smaller sorts and a merge.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times


Return to DFSORT/ICETOOL/ICEGENER

 


  • Related topics
    Replies
    Views
    Last post