Removing duplicates from an input file based on record type.



DFSORT is IBM's flagship sort product for sorting, merging, copying, data manipulation and reporting; it includes ICETOOL and ICEGENER.


Postby Hardik » Mon Aug 22, 2011 6:38 pm

Hi,
I have an FB-format file with LRECL 80 containing the following input records:

00 XXX Header Data Record
10 001 Detail Data Record
10 002 Detail Data Record
10 003 Detail Data Record
99 003 Trailer Data Record
00 XXX Header Data Record
10 001 Detail Data Record
99 001 Trailer Data Record
00 XXX Header Data Record
10 001 Detail Data Record
10 002 Detail Data Record
99 002 Trailer Data Record

What I want in my output file is:
1. Remove duplicate header records (if the first 2 bytes = '00') and keep only one header record in the output file.
2. Copy all the detail records as-is (if the first 2 bytes = '10').
3. Remove duplicates for trailer records (if the first 2 bytes = '99') and also sum up the field in columns 3 to 6 into the final trailer record.

So my output file for the above input should look like this:
00 XXX Header Data Record
10 001 Detail Data Record
10 002 Detail Data Record
10 003 Detail Data Record
10 001 Detail Data Record
10 001 Detail Data Record
10 002 Detail Data Record
99 006 Trailer Data Record

Please suggest a sort card for this. Also, it would be great if anybody could tell me whether it would be more efficient than writing a COBOL program for the same task.
I tried to find a solution for this but I didn't find any example where SUM FIELDS can be used along with INREC IFTHEN.

Thanks,
Hardik

Re: Removing duplicates from an input file based on record type

Postby skolusu » Mon Aug 22, 2011 9:23 pm

Use the following DFSORT JCL, which will give you the desired results:

//STEP0100 EXEC PGM=SORT                                   
//SYSOUT   DD SYSOUT=*                                     
//SORTIN   DD *                                             
----+----1----+----2----+----3----+----4----+----5----+----6
00 XXX HEADER DATA RECORD                                   
10 001 DETAIL DATA RECORD                                   
10 002 DETAIL DATA RECORD                                   
10 003 DETAIL DATA RECORD                                   
99 003 TRAILER DATA RECORD                                 
00 XXX HEADER DATA RECORD                                   
10 001 DETAIL DATA RECORD                                   
99 001 TRAILER DATA RECORD                                 
00 XXX HEADER DATA RECORD                                   
10 001 DETAIL DATA RECORD                                   
10 002 DETAIL DATA RECORD                                   
99 002 TRAILER DATA RECORD                                 
//SORTOUT  DD SYSOUT=*                                     
//SYSIN    DD *                                             
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(81:SEQNUM,8,ZD)),       
  IFTHEN=(WHEN=(1,2,CH,EQ,C'00'),OVERLAY=(81:12C'0')),     
  IFTHEN=(WHEN=(1,2,CH,EQ,C'99'),                           
  OVERLAY=(81:8C'9',3,4,UFF,ZD,LENGTH=4))                   
                                                           
  SORT FIELDS=(81,8,CH,A),EQUALS                           
  SUM FIELDS=(89,4,ZD)                                     
                                                           
  OUTREC IFOUTLEN=80,                                       
  IFTHEN=(WHEN=(1,2,CH,EQ,C'99'),OVERLAY=(3:89,4))         
//*
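A rough model of what those control statements do, as hypothetical Python (not part of the original post, just an illustration of the sort-key technique): INREC tags every record with an 8-byte key at position 81 (SEQNUM keeps details unique and in order, all headers get an all-zero key, all trailers get an all-nines key plus the UFF-extracted count at 89); the stable SORT with EQUALS orders by that key; SUM FIELDS collapses equal-key records while adding the count field; OUTREC copies the sum back into the trailer and truncates to 80 bytes.

```python
def dfsort_model(records):
    # INREC: build (key, sum-field, record) for each input record
    tagged = []
    for seq, rec in enumerate(records, start=1):
        if rec[:2] == "00":
            key, val = "00000000", 0              # all headers share one key
        elif rec[:2] == "99":
            key, val = "99999999", int(rec[2:6])  # UFF-style numeric extract
        else:
            key, val = f"{seq:08d}", 0            # unique key per detail record
        tagged.append((key, val, rec))
    # SORT FIELDS=(81,8,CH,A),EQUALS: Python's sort is stable, like EQUALS
    tagged.sort(key=lambda t: t[0])
    # SUM FIELDS=(89,4,ZD): merge consecutive equal keys, summing the field
    merged = []
    for key, val, rec in tagged:
        if merged and merged[-1][0] == key:
            merged[-1] = (key, merged[-1][1] + val, merged[-1][2])
        else:
            merged.append((key, val, rec))
    # OUTREC: put the summed value back into the trailer, drop the tag
    out = []
    for key, val, rec in merged:
        if rec[:2] == "99":
            rec = rec[:2] + f" {val:03d}" + rec[6:]
        out.append(rec)
    return out
```

The all-zero and all-nines keys are what pin the single header to the front and the single summed trailer to the back, while SEQNUM preserves the relative order of the detail records.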
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

Re: Removing duplicates from an input file based on record type

Postby Hardik » Wed Aug 24, 2011 9:14 pm

Thanks Kolusu!! It's working :-) Although I had to make a small change to the sort card, since I also needed a sequence number on the detail records, it worked. Thanks for your help.
Can I use a VB file as input to this sort? I mean, would I need to change anything else in the sort card apart from the 4-byte shift?
Also, from an efficiency point of view, is it better to use SORT rather than a COBOL program for this kind of restructuring/formatting, especially when there is a large number of records in the input file?

Re: Removing duplicates from an input file based on record type

Postby skolusu » Wed Aug 24, 2011 9:53 pm

Hardik wrote: Can I use a VB file as input to this sort? I mean, would I need to change anything else in the sort card apart from the 4-byte shift?

Hardik,

It's not that easy. If you add the seqnum at the end, you will break the variable-length format by making all the records fixed length. You need to add the seqnum after the RDW and adjust the positions accordingly. I just don't understand why people seeking help waste time by not providing these details up front. Good luck.

Hardik wrote: Also, from an efficiency point of view, is it better to use SORT rather than a COBOL program for this kind of restructuring/formatting, especially when there is a large number of records in the input file?


You need to run a test and see for yourself. It depends on how good you are at writing an efficient COBOL program.
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

