Compare 2 files and eliminate duplicates



IBM's flagship sort product DFSORT for sorting, merging, copying, data manipulation and reporting. Includes ICETOOL and ICEGENER

Compare 2 files and eliminate duplicates

Postby alanmdc » Sun Dec 07, 2008 5:44 am

Hello,

Can you please help me on my problem :)
Basically I have 2 files; first one is the master list and the second file is the exclude list.
What I want to do is sort out all the records in the master list exluding all the records found in the exclude list.
Ex.
File1 (unsorted)
012345AAAA11112222
013456BBBB22223333
012345AAAA11114444
013456BBBB11116666
012345AAAA33334444
014567CCCC11115555
013456BBBB11113333
012345AAAA22227777

File2 (sorted)
11113333
11115555
22223333
22227777

I expect to produce an output below
OutFile
compare file1 (col 11-18) against file2 (col 1-8)
(sorted from column 1-6)
012345AAAA11112222
012345AAAA11114444
012345AAAA33334444
013456BBBB11116666

Thanks in advance for the help :)

Regards,
AlanMDC
http://www.alanmdc.hobby-site.com
alanmdc
 
Posts: 2
Joined: Sun Dec 07, 2008 5:27 am
Has thanked: 1 time
Been thanked: 0 time

Re: Compare 2 files and eliminate duplicates

Postby Frank Yaeger » Sun Dec 07, 2008 6:26 am

Here's a DFSORT/ICETOOL job that will do what you asked for. I assumed input file1 has RECFM=FB and LRECL=18 and input file2 has RECFM=FB and LRECL=8, but the job can be changed appropriately if the files have different attributes. I also assumed that input file2 could have keys not found in input file1 (even though your example doesn't show that). If that assumption is incorrect, then the job could be simplified a bit.

//S1   EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/18)
//IN2 DD DSN=...  input file1 (FB/8)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//T2 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=...  output file (FB/18)
//TOOLIN DD *
COPY FROM(IN1) TO(T1) USING(CTL1)
COPY FROM(IN2) TO(T1) USING(CTL2)
SELECT FROM(T1) TO(T2) ON(11,8,CH) NODUPS USING(CTL3)
SORT FROM(T2) TO(OUT) USING(CTL4)
/*
//CTL1CNTL DD *
  INREC OVERLAY=(19:C'1')
/*
//CTL2CNTL DD *
  INREC BUILD=(11:1,8,19:C'2')
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=T2,INCLUDE=(19,1,CH,EQ,C'1')
/*
//CTL4CNTL DD *
  SORT FIELDS=(1,6,CH,A)
  OUTREC BUILD=(1,18)
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

These users thanked the author Frank Yaeger for the post:
alanmdc (Thu May 17, 2012 4:26 pm)
User avatar
Frank Yaeger
Global moderator
 
Posts: 1079
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 15 times


Return to DFSORT/ICETOOL/ICEGENER

 


  • Related topics
    Replies
    Views
    Last post