How to add numbers to groups of records...



IBM's flagship sort product DFSORT for sorting, merging, copying, data manipulation and reporting. Includes ICETOOL and ICEGENER

Post by Dmitriy » Wed May 24, 2017 6:25 pm

Hello colleagues!
I need to add a sequence number to each group of records, like unique word IDs.

input:

630774221
630774221
963495850
963495850
963495850
345695561
678609548
678609548
678609548
918367402
279702180
 


output:

630774221 0001
630774221 0001
963495850 0002
963495850 0002
963495850 0002
345695561 0003
678609548 0004
678609548 0004
678609548 0004
918367402 0005
279702180 0006
 


The file contains billions of records, so how can I do this with maximum performance?
Can you help me, please? Thanks in advance!
Dmitriy
 

Re: How to add numbers to groups of records...

Post by Aki88 » Mon May 29, 2017 11:29 am

Hello,

A few questions before we look at the solution:
a. Do you want the records left unsorted while the ID is added? The output you've shown keeps the original record order.
b. Can a group key appear again further down the file? If so, how do you want that handled? For example:

630774221
630774221
963495850
963495850
963495850
345695561
678609548
678609548
678609548
630774221 --> here this appears again
630774221 --> here this appears again  
918367402
279702180
 


c. You mention billions of input records, but the ID you've shown is only 4 bytes long, which can hold at most 9999 distinct group numbers.

The solution is fairly straightforward as long as the complications above don't apply: group the records and PUSH an ID onto each one. DFSORT allows a zoned decimal ID of up to 15 bytes, so the maximum value is 999,999,999,999,999:


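//* Step name S1 is illustrative; add your site's JOB card above it.
//S1       EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
630774221
630774221
963495850
963495850
963495850
345695561
678609548
678609548
678609548
918367402
279702180
/*
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* Copy without sorting. A new group starts whenever the 9-byte key
* in columns 1-9 changes; a 15-digit group ID is pushed to column 11.
 SORT FIELDS=COPY
 INREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,9),
               PUSH=(11:ID=15))
/*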
 


Output:


630774221 000000000000001
630774221 000000000000001
963495850 000000000000002
963495850 000000000000002
963495850 000000000000002
345695561 000000000000003
678609548 000000000000004
678609548 000000000000004
678609548 000000000000004
918367402 000000000000005
279702180 000000000000006
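
On question (b): WHEN=GROUP with KEYBEGIN starts a new group each time the key value changes from one record to the next, so a key that reappears further down the file would get a new ID rather than its earlier one. If all records with the same key must share one ID, and reordering the file is acceptable, one option is to sort on the key first and push the ID in OUTREC, which is processed after the sort. This is only a sketch under those assumptions, and sorting billions of records will cost far more than a plain copy:

```jcl
//SYSIN    DD *
* Assumption: the output may be reordered by key.
* Sorting makes equal keys adjacent, so each key forms one group.
 SORT FIELDS=(1,9,CH,A)
 OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,9),
                PUSH=(11:ID=15))
/*
```

With this variant the IDs follow the sorted key order, not the order in which the keys first appear in the input.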
 
Aki88
 

Re: How to add numbers to groups of records...

Post by enrico-sorichetti » Mon May 29, 2017 11:40 am

The number of records is NOT related to the number of groups/identifiers.
cheers
enrico
enrico-sorichetti

Re: How to add numbers to groups of records...

Post by Aki88 » Mon May 29, 2017 11:53 am

Hello Mr. Sorichetti,

enrico-sorichetti wrote: the number of records is NOT related to the number of groups/identifiers


Yes, I completely agree. But going by the sample data, some keys have only a single record rather than a pair or group, so the number of groups could be large.
Hence the SORT card I wrote allows for the maximum possible number of groups; TS is expected to tweak it to fit his needs.
I'd be very surprised if ONLY 9999 groups were possible in the actual 'billions of records'. :)
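
As one example of such a tweak: if the 4-digit ID from the original post really is enough (that is, at most 9999 groups), only the ID length needs to change. A minimal sketch, assuming the same 9-byte key in columns 1-9 and the ID starting in column 11:

```jcl
//SYSIN    DD *
* Assumption: no more than 9999 groups, so a 4-digit ID is enough.
 SORT FIELDS=COPY
 INREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,9),
               PUSH=(11:ID=4))
/*
```

For the sample input this should give the layout TS asked for, from 630774221 0001 through 279702180 0006. The ID length should be sized for the real number of groups in the file.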

Best regards.
Aki88
 

Re: How to add numbers to groups of records...

Post by prino » Mon May 29, 2017 12:56 pm

Dmitriy wrote: ... and file contains billions of records ...

And if my uncle was a woman he'd be my aunt...

Which PHB has come up with this ludicrous time-wasting requirement?
prino
 

