Splitting the records based on conditions



JES, JES2, JCL utilities, IDCAMS, Compile & Run JCLs, PROCs etc...

Splitting the records based on conditions

Postby vinu78 » Tue Sep 22, 2009 3:49 am

Hi,

I have 1 million records in an input file.
Using SORT, I want to split the records to 3 files.
- If the column 5 - 10 bytes contains the value '00000', it should go to File 1
- 50% of the remaining records should go to File 2
- 50% of the remaining records should go to File 3

Could you please help me in writing JCL using SORT utility. (I can't ICEMAN here in my company).

Thanks
Vinu
vinu78
 
Posts: 29
Joined: Sat Sep 27, 2008 4:47 am
Has thanked: 0 time
Been thanked: 0 time

Re: Splitting the records based on conditions

Postby Frank Yaeger » Tue Sep 22, 2009 4:39 am

If the column 5 - 10 bytes contains the value '00000', it should go to File 1


5-10 bytes is a length is a length of 6 whereas '00000' is a length of 5. Which is it - a length of 6 or a length of 5?

- 50% of the remaining records should go to File 2
- 50% of the remaining records should go to File 3


So if the remaining records are 1-10, do you want 1,3,5,7,9 in file2 and 2,4,6,8,10 in file3, or do you want 1-5 in file2 and 6-10 in file3?
How do you want them split if the remaining records are 1-9?

What is the RECFM and LRECL of the input file?

Could you please help me in writing JCL using SORT utility. (I can't ICEMAN here in my company).


This statement makes no sense. PGM=SORT and PGM=ICEMAN both invoke the SORT utility at your site.

Which SORT product are you using- DFSORT or Syncsort?
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times

Re: Splitting the records based on conditions

Postby vinu78 » Tue Sep 22, 2009 4:51 am

Sorry for the incomplete data.
The RECL = 150bytes, If 5 bytes from 5th byte (5 - 9) of the input file is '00000', then it should go to File 1.
The remaining records should go to File 2 and File 3. (It can either go as 1,3,5,7 records in File 1 and 2,4,6,8 records in File 2 OR 1- 100 records in file1 and 101-200 records in File 2) like that. Actually I am not concerned about how the records are getting split in File 2 and 3.

The reason behind splitting of the records in 2 files is that since 50 million records can come, 9 prod disk packs each can be allocated to each dataset. I will then SORT/Merge this 2 files to TAPE dataset. I can't reserve 18 prod disk packs in one go. that is why I am splitting the file to 2.

Appreciate your help.

Thanks
Chidam
vinu78
 
Posts: 29
Joined: Sat Sep 27, 2008 4:47 am
Has thanked: 0 time
Been thanked: 0 time

Re: Splitting the records based on conditions

Postby vinu78 » Tue Sep 22, 2009 5:24 am

Hi Frank,

Our company uses DFSORT.
Regarding the splitting of files, I came across a command SPLITBY however was not sure how I can use it because I want to filter out 5- 9 byte = '00000' to another file too.

Please help.

Thanks
Vinu
vinu78
 
Posts: 29
Joined: Sat Sep 27, 2008 4:47 am
Has thanked: 0 time
Been thanked: 0 time

Re: Splitting the records based on conditions

Postby Frank Yaeger » Tue Sep 22, 2009 8:58 pm

Here's a DFSORT job that will do what you asked for:

//S1    EXEC  PGM=SORT
//SYSOUT    DD  SYSOUT=*
//SORTIN DD DSN=...  input file (FB/150)
//OUT1 DD DSN=...  output file1 (FB/150)
//OUT2 DD DSN=...  output file2 (FB/150)
//OUT3 DD DSN=...  output file3 (FB/150)
//SYSIN    DD    *
   OPTION COPY
   OUTFIL FNAMES=OUT1,INCLUDE=(5,5,CH,EQ,C'00000')
   OUTFIL FNAMES=(OUT2,OUT3),SAVE,SPLIT
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times

Re: Splitting the records based on conditions

Postby vinu78 » Tue Sep 22, 2009 10:13 pm

Hi Frank,

Thanks a lot for the JCL. It worked fine.
Just out of curiosity can I know whether the particular records can be split like the following:-
The first half of the records which doesn't have '00000' as 5-9 bytes should go to File 2 and the remaining half as bulk should go to File 3. The File 1 will be having the records with '00000' as mentioned earlier.

Eventhough I don't care the way it was split for my requirement, just for my knowledge sake, I thought of having the logic of dividing the records exactly first half to File 2 and second half of remaining records to File 3.

Thanks
Vinu
vinu78
 
Posts: 29
Joined: Sat Sep 27, 2008 4:47 am
Has thanked: 0 time
Been thanked: 0 time

Re: Splitting the records based on conditions

Postby Frank Yaeger » Tue Sep 22, 2009 10:38 pm

You could modify the job I gave you to use the technique discussed in the "Split a file to n output files dynamically" Smart DFSORT Trick at:

http://www.ibm.com/support/docview.wss? ... g3T7000094
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times

Re: Splitting the records based on conditions

Postby vinu78 » Wed Sep 23, 2009 2:13 am

Thanks for the document.
vinu78
 
Posts: 29
Joined: Sat Sep 27, 2008 4:47 am
Has thanked: 0 time
Been thanked: 0 time

Re: Splitting the records based on conditions

Postby vnktrrd » Thu Mar 31, 2011 12:44 am

Hi Frank,

I have got the same situation. Pls help...

I have an input file which may contain any number of records and I want to split it to multiple files based on number of records in each output file.
suppose if the input file contains 124000 records, I want to split this file in to 50000 records in each file.that means 1st file should contain first 50000 records(1 - 49000) of input file, 2nd file should contain the next 50000 records(51000 - 100000) and the third file should contain the last 24000 records(100001 - 124000) of input file .so total of three files in this case.

thanks a lot for the help.
vnktrrd
 
Posts: 7
Joined: Thu Mar 31, 2011 12:32 am
Has thanked: 0 time
Been thanked: 0 time

Re: Splitting the records based on conditions

Postby Frank Yaeger » Thu Mar 31, 2011 1:56 am

I want to split this file in to 50000 records in each file.that means 1st file should contain first 50000 records(1 - 49000) of input file, 2nd file should contain the next 50000 records(51000 - 100000) and the third file should contain the last 24000 records(100001 - 124000)


Your math is a bit off. I assume you mean 1-50000 to output file1, 50001-100000 to output file2 and 100001 to 124000 to output file3.

You can use a DFSORT job like the following for that:

//S1 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=... input file
//OUT1 DD DSN=...  output file1
//OUT2 DD DSN=...  output file2
//OUT3 DD DSN=...  output file3
//SYSIN DD *
   OPTION COPY
   OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT1R=50000
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times

Next

Return to JCL

 


  • Related topics
    Replies
    Views
    Last post