Duplicates between two files




Postby Brifka » Mon Feb 14, 2011 12:47 am

Hello,
First of all, I would like to thank this forum for saving my life a couple of times ;)
It contains SO MUCH useful information and so many examples, and I have learned a lot from it.

But now I have a task that I don't know how to solve.

I have two files VB/606

Example of file 1 (aaa is the prefix of each record; the key is 4,4,ZD):
aaa1234lalalala
aaa1234mmmm
aaa1235kkkkk
aaa3456ktktkt
aaa3456yyyyy
aaa3567kkkkk
aaa3567kkky

Example of file 2 (bbb is the prefix of each record; the key is 4,4,ZD):
bbb1234nata
bbb2345mmmm
bbb2345jtjtj
bbb3456nnnn
bbb3456hhhh

The output file (VB/606) has to contain only those records whose key appears in both files. Both files have duplicates.
Output file:
aaa1234lalalala
aaa1234mmmm
bbb1234nata
aaa3456ktktkt
aaa3456yyyyy
bbb3456nnnn
bbb3456hhhh

I tried :
SELECT FROM(IN) TO(OUT) ON(10,1,CH) ON(4,4,ZD) -
ALLDUPS

but it didn't give me the desired result. :(
Brifka
 
Posts: 7
Joined: Mon Feb 14, 2011 12:31 am
Has thanked: 0 time
Been thanked: 0 time

Re: Duplicates between two files

Postby dick scherrer » Mon Feb 14, 2011 3:22 am

Hello and welcome with your first post,

Which release of which sort product is being used?

If you are not sure, run the following and post the informational output including all message ids:
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
ABC
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  SORT     FIELDS=COPY


Hint - it is also best to use the Code Tag when posting data as this will preserve alignment :)
Hope this helps,
d.sch.
dick scherrer
Global moderator
 
Posts: 6304
Joined: Sat Jun 09, 2007 8:58 am
Has thanked: 3 times
Been thanked: 91 times

Re: Duplicates between two files

Postby Brifka » Mon Feb 14, 2011 6:25 pm

Hello,
Thank you for your reply and for the hint :)
Unfortunately I cannot edit my opening post, but I will take it into account next time.
I ran the job and found that I am using
Z/OS DFSORT V1R10

Re: Duplicates between two files

Postby skolusu » Mon Feb 14, 2011 10:54 pm

Brifka wrote:I ran the job and found that I am using
Z/OS DFSORT V1R10


Brifka,

We need more than just one line about the version of DFSORT you are running. We need to see message ICE201I, which tells us the level of DFSORT you are running. Check this link for a better explanation:

dfsort-icetool-icegener/topic2894.html
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
skolusu
 
Posts: 586
Joined: Wed Apr 02, 2008 10:38 pm
Has thanked: 0 time
Been thanked: 39 times

Re: Duplicates between two files

Postby Brifka » Mon Feb 14, 2011 11:13 pm

Hi,
Sorry. Here is the message:
ICE201I H RECORD TYPE IS F - DATA STARTS IN POSITION 1

Re: Duplicates between two files

Postby Brifka » Mon Feb 14, 2011 11:19 pm

I tried

//SYSIN    DD    *                               
  JOINKEYS F1=IN1,FIELDS=(4,4,A)                 
  JOINKEYS F2=IN2,FIELDS=(4,4,A)                 
  JOIN UNPAIRED,F1,F2                           
  REFORMAT FIELDS=(?,F1:1,20,F2:1,20)           
  OPTION COPY                                   
  OUTFIL FNAMES=OUT1,INCLUDE=(1,1,CH,EQ,C'B'),   
      BUILD=(2,40)                               
/*                                               


and got a Cartesian join rather than the desired result.
I could split the files, remove duplicates, and then merge and sort, but I'm looking for a smarter solution :roll:
Thank you
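
Editor's note: the Cartesian result follows from how a paired join treats duplicate keys. Every file 1 record is paired with every file 2 record that has the same key, so m duplicates on one side and n on the other produce m x n joined records. The sketch below is a plain Python model of that behaviour using the sample data from the first post. It is not DFSORT code, and the names `inner_join` and `key` are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict

def inner_join(f1, f2, key):
    """Pair every f1 record with every f2 record that has the same key."""
    by_key = defaultdict(list)
    for rec in f2:
        by_key[key(rec)].append(rec)
    # One output pair per (f1 record, matching f2 record) combination.
    return [(a, b) for a in f1 for b in by_key.get(key(a), [])]

# Assumption: the sample records are shown without the RDW, so the
# 4-digit key follows the 3-character prefix (0-based slice 3:7).
key = lambda rec: rec[3:7]

file1 = ["aaa1234lalalala", "aaa1234mmmm", "aaa1235kkkkk",
         "aaa3456ktktkt", "aaa3456yyyyy", "aaa3567kkkkk", "aaa3567kkky"]
file2 = ["bbb1234nata", "bbb2345mmmm", "bbb2345jtjtj",
         "bbb3456nnnn", "bbb3456hhhh"]

pairs = inner_join(file1, file2, key)
pairs_3456 = [p for p in pairs if key(p[0]) == "3456"]
```

Key 3456 appears twice in each file, so it contributes 2 x 2 = 4 joined pairs, whereas the desired output lists each of those four input records exactly once.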

Re: Duplicates between two files

Postby Brifka » Tue Feb 15, 2011 12:51 am

And one more question.
In my real file the key is a combination of CH and PD.
Will it work with join?

Re: Duplicates between two files

Postby skolusu » Tue Feb 15, 2011 2:24 am

Brifka,

The following DFSORT JCL will give you the desired results.

//STEP0100 EXEC PGM=SORT                                        
//SYSOUT   DD SYSOUT=*                                          
//SORTIN   DD *                                                
//SORTOUT  DD DSN=&&HDR,DISP=(,PASS),SPACE=(TRK,(1,0),RLSE)    
//SYSIN    DD *                                                
  SORT FIELDS=COPY                                              
  OUTFIL REMOVECC,FTOV,BUILD=(80X),HEADER1=('$$$',10X)          
//*
//STEP0200 EXEC PGM=ICETOOL                                  
//TOOLMSG  DD SYSOUT=*                                        
//DFSMSG   DD SYSOUT=*                                        
//INP      DD DSN=&&HDR,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT  
//         DD DSN=your input VB 606 file A,DISP=SHR
//         DD DSN=&&HDR,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT  
//         DD DSN=your input VB 606 file B,DISP=SHR
//DUP      DD DSN=&&DUP,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)  
//INA      DD DSN=your input VB 606 file A,DISP=SHR
//         DD DSN=your input VB 606 file B,DISP=SHR
//OUT      DD SYSOUT=*
//TOOLIN   DD *                                                  
  SORT FROM(INP) USING(CTL1)                                    
  SORT JKFROM USING(CTL2)                                        
//CTL1CNTL DD *                                                  
  INREC IFTHEN=(WHEN=INIT,BUILD=(1,4,5,3,X,8,4)),                
  IFTHEN=(WHEN=GROUP,BEGIN=(5,3,CH,EQ,C'$$$'),PUSH=(8:ID=1))    
  SORT FIELDS=(8,5,CH,A),EQUALS                                  
  SUM FIELDS=NONE                                                
  OUTFIL FNAMES=DUP,VTOF,OMIT=(5,3,CH,EQ,C'$$$'),BUILD=(8,5)    
//CTL2CNTL DD *                                                  
  OPTION COPY                                                    
  JOINKEYS F1=INA,FIELDS=(8,4,A)                                
  JOINKEYS F2=DUP,FIELDS=(2,4,A)                                
  JOIN UNPAIRED                                                  
  REFORMAT FIELDS=(F1:1,4,?,F2:1,1,F1:5)                        
  OUTFIL FNAMES=OUT,INCLUDE=(5,2,CH,EQ,C'B3'),BUILD=(1,4,7)      
//JNF2CNTL DD *                                                  
  SUM FIELDS=(1,1,ZD)                                            
//*
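
Editor's note: as I read the control statements above, the job tags each input file with an ID (1 or 2), reduces the tagged records to unique ID-plus-key pairs, sums the IDs per key, and then keeps only the records whose key has a sum of 3 (1 + 2, meaning the key occurs in both files). The Python sketch below models only that logic. The function name `common_key_records` and the `key` helper are hypothetical, and the stable ordering by key is an assumption about the output order rather than something the thread confirms.

```python
def common_key_records(file_a, file_b, key):
    # Step 1 (analogue of CTL1): unique (file id, key) pairs, id 1 for A, 2 for B.
    pairs = {(1, key(r)) for r in file_a} | {(2, key(r)) for r in file_b}

    # Step 2 (analogue of JNF2CNTL): sum the ids per key.
    # A total of 3 means the key was seen in both files.
    totals = {}
    for file_id, k in pairs:
        totals[k] = totals.get(k, 0) + file_id

    # Step 3 (analogue of CTL2): scan both files and keep matching records.
    kept = [r for r in list(file_a) + list(file_b) if totals[key(r)] == 3]

    # Assumption: order by key, keeping A records ahead of B records
    # within a key (Python's sort is stable).
    return sorted(kept, key=key)

# Assumption: sample records are shown without the RDW (0-based slice 3:7).
key = lambda rec: rec[3:7]

file_a = ["aaa1234lalalala", "aaa1234mmmm", "aaa1235kkkkk",
          "aaa3456ktktkt", "aaa3456yyyyy", "aaa3567kkkkk", "aaa3567kkky"]
file_b = ["bbb1234nata", "bbb2345mmmm", "bbb2345jtjtj",
          "bbb3456nnnn", "bbb3456hhhh"]

result = common_key_records(file_a, file_b, key)
```

Because the ID-plus-key pairs are de-duplicated before the join, each input record is matched against at most one summary record, which is what avoids the Cartesian result.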


Brifka wrote:And one more question.
In my real file the key is a combination of CH and PD.


The join is performed treating the keys as binary. If all your PD fields have positive values, then the match will work fine. However, if you have both positive and negative values, then you need to normalize the keys.
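
Editor's note: a small Python sketch of why the sign matters when packed-decimal (PD) keys are compared as raw bytes. In packed decimal the last nibble carries the sign: commonly C for positive, D for negative, and F for unsigned/positive. The helper `pack` is hypothetical and written only for this illustration; it does not show how DFSORT normalizes keys.

```python
def pack(value: int, positive_sign: int = 0xC) -> bytes:
    """Encode an integer as packed decimal: one digit per nibble, sign in the last nibble."""
    sign = 0xD if value < 0 else positive_sign
    digits = str(abs(value))
    if len(digits) % 2 == 0:
        # Digits plus the sign nibble must fill whole bytes, so pad to an odd digit count.
        digits = "0" + digits
    nibbles = [int(d) for d in digits] + [sign]
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))

plus_c = pack(123)          # bytes 12 3C
plus_f = pack(123, 0xF)     # bytes 12 3F: same value, different sign nibble
minus = pack(-5)            # byte 5D
small = pack(3)             # byte 3C
```

Two things follow from this model: the same positive value can have two different byte images (`plus_c` versus `plus_f`), so a byte-wise match would miss it, and a negative value such as -5 compares higher than +3 as bytes, so byte order no longer matches numeric order.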
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

Re: Duplicates between two files

Postby Frank Yaeger » Tue Feb 15, 2011 2:27 am

Brifka,

Something doesn't add up here. You say that your input files are VB and your key is 4,4,ZD. But that doesn't match the example you show - a VB file has an RDW in positions 1-4, so your key would be 8,4,ZD.

If you actually used 4,4,ZD, you wouldn't get any matches so how did you get a "cartesian join"?

Please verify that your input files are actually VB and your key is 8,4,ZD.
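
Editor's note: a minimal Python sketch of the position shift described above. A variable-length record starts with a 4-byte RDW (a 2-byte length followed by 2 bytes that are normally zero), so a field that sits at position 4 of the data sits at position 8 of the record. The helpers `make_vb_record` and `field` are hypothetical, and the sample uses ASCII bytes for readability where a real mainframe file would typically hold EBCDIC.

```python
import struct

def make_vb_record(data: bytes) -> bytes:
    """Prefix data with a 4-byte RDW: 2-byte big-endian total length, then 2 zero bytes."""
    return struct.pack(">HH", len(data) + 4, 0) + data

def field(record: bytes, pos: int, length: int) -> bytes:
    """Extract a field using a 1-based start position, as sort control statements do."""
    return record[pos - 1 : pos - 1 + length]

rec_a = make_vb_record(b"aaa1234lalalala")
rec_b = make_vb_record(b"bbb1234nata")

key_at_8_a = field(rec_a, 8, 4)   # the 4-digit key
key_at_8_b = field(rec_b, 8, 4)
key_at_4_a = field(rec_a, 4, 4)   # last RDW byte plus the 3-byte prefix
key_at_4_b = field(rec_b, 4, 4)
```

With position 8 both records yield the key `1234` and would match. With position 4 the extracted bytes include the differing prefixes `aaa` and `bbb`, so no record from one file could match a record from the other.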

Brifka wrote:In my real file the key is a combination of CH and PD.
Will it work with join?


JOINKEYS works with binary keys. CH is binary. PD can be treated as binary in many cases.
You'd have to show an example of what your PD values actually look like in hex (positive values only? positive and negative values?) before I could answer your question for sure.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
Frank Yaeger
Global moderator
 
Posts: 1080
Joined: Sat Jun 09, 2007 8:44 pm
Has thanked: 0 time
Been thanked: 14 times

Re: Duplicates between two files

Postby Brifka » Tue Feb 15, 2011 2:28 am

Thank you so much..
