
SyncSort: Sorting and Removing Duplicates

Posted: Mon Dec 19, 2011 7:35 pm
by saurabh pune
Hi,
I need help sorting a file on 3 fields and then removing duplicates from the same file based on 2 of the 3 sorted fields, all in a single sort step. Is that possible?

To explain further:

Input:
NAAA01AA
NAAA02AA
NAAA03AA
NAAB01AA
NAAB01AB
NAAB03AA


1st Step Sort currently using :
SORT FIELDS=(1,4,CH,A,5,2,BI,A,7,2,CH,A)                 


2nd Step Sort currently using :
SORT FIELDS=(1,4,CH,A,5,2,BI,A),                 
 EQUALS                                           
SUM FIELDS=NONE


Output is :

NAAA01AA
NAAA02AA
NAAA03AA
NAAB01AA
NAAB03AA


Record eliminated is

NAAB01AB


Its first 6 bytes are the same as those of NAAB01AA, so it was considered a duplicate and removed.
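
The two steps described above can be modelled off the mainframe with a minimal Python sketch. This is only an illustration of the logic, not SyncSort itself: the names `RECORDS`, `ebcdic` and `two_step` are made up for this example, and the `cp037` codec is assumed to be a close enough stand-in for the EBCDIC byte order that the CH and BI compares use.

```python
# Illustrative model of the two sort steps; all names here are hypothetical.
RECORDS = [
    "NAAA01AA",
    "NAAA02AA",
    "NAAA03AA",
    "NAAB01AA",
    "NAAB01AB",
    "NAAB03AA",
]


def ebcdic(text):
    # cp037 is an EBCDIC code page, so comparing the encoded bytes
    # approximates a byte-wise CH or BI compare on the mainframe.
    return text.encode("cp037")


def two_step(records):
    # Step 1: sort on all three fields (bytes 1-4, 5-6 and 7-8).
    step1 = sorted(
        records,
        key=lambda r: (ebcdic(r[0:4]), ebcdic(r[4:6]), ebcdic(r[6:8])),
    )
    # Step 2: sort on the first two fields only. Python's sort is stable,
    # which plays the role of EQUALS here.
    step2 = sorted(step1, key=lambda r: (ebcdic(r[0:4]), ebcdic(r[4:6])))
    # SUM FIELDS=NONE: keep only the first record for each 6-byte key.
    kept = []
    seen = set()
    for rec in step2:
        key = rec[0:6]
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


print(two_step(RECORDS))
# ['NAAA01AA', 'NAAA02AA', 'NAAA03AA', 'NAAB01AA', 'NAAB03AA']
```

NAAB01AB is the only record dropped, which matches the output shown above.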

I would like to do this in a single step. Is that possible?

Thanks a Ton,
Saurabh

Re: SyncSort: Sorting and Removing Duplicates

Posted: Tue Dec 20, 2011 5:15 am
by BillyBoyo
I don't think you can do it in one step with SORT, but you might get what you want with SyncTool's SELECT.

//SYMNAMES DD *
FIRST-SIX-OF-KEY,1,6,CH
FULL-KEY,1,8,CH
//TOOLIN   DD *                                         
  SELECT FROM(IN) TO(OUT) ON(FIRST-SIX-OF-KEY) FIRST USING(CTL1)
//CTL1CNTL DD *                                         
  SORT FIELDS=(FULL-KEY,A) 


I have borrowed the above syntax from a DFSORT solution. I don't have any documentation for SyncSort/SyncTool, but if it is not exact you should be able to find something similar.

I have put your key together, rather than specifying the fields separately. I think that if you look at any recommendations in the SyncSort manual, they would say that if the keys are contiguous it is better to specify them as one key.
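
To see why combining contiguous keys does not change the order, here is a small Python sketch. It is only a model: `split_key` and `full_key` are hypothetical names, the `cp037` codec stands in for EBCDIC, and it assumes both CH and BI are plain ascending byte-wise compares (no alternate collating sequence).

```python
import random

# Sample records from the original post.
records = [
    "NAAA01AA",
    "NAAA02AA",
    "NAAA03AA",
    "NAAB01AA",
    "NAAB01AB",
    "NAAB03AA",
]


def ebcdic(text):
    # Encode to an EBCDIC code page so byte order resembles the mainframe's.
    return text.encode("cp037")


def split_key(rec):
    # Models SORT FIELDS=(1,4,CH,A,5,2,BI,A,7,2,CH,A): three separate keys.
    return (ebcdic(rec[0:4]), ebcdic(rec[4:6]), ebcdic(rec[6:8]))


def full_key(rec):
    # Models SORT FIELDS=(1,8,CH,A): one key over the same 8 bytes.
    return ebcdic(rec[0:8])


# Shuffle with a fixed seed so the input is out of order but repeatable.
shuffled = records[:]
random.Random(0).shuffle(shuffled)

by_fields = sorted(shuffled, key=split_key)
by_full_key = sorted(shuffled, key=full_key)

# Fixed-length, adjacent, ascending byte-wise keys give the same order
# whether they are compared one field at a time or as a single field.
print(by_fields == by_full_key)
```

The equivalence relies on the fields being adjacent, fixed-length, and all sorted ascending; a descending field or a numeric format such as PD in the middle would break it.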

I like the SYMNAMES for flexibility and self-documentation.

I have no way to try this out. Let us know if it is any use.

Re: SyncSort: Sorting and Removing Duplicates

Posted: Tue Dec 20, 2011 2:45 pm
by xknight
Hello,

As Bill has pointed out, you can make use of SyncTool if your site has a recent PTF level installed.

Try the snippet below:

//STEP01 EXEC PGM=ICETOOL                                               
//IN1    DD *                                                           
NAAA01AA                                                               
NAAA02AA                                                               
NAAA03AA                                                               
NAAB01AA                                                               
NAAB01AB                                                               
NAAB03AA                                                               
//OUT1   DD DSN=&TMP1,                                                 
//             RECFM=FB,LRECL=80,BLKSIZE=0,                             
//          DISP=(MOD,PASS,DELETE),                                     
//          SPACE=(CYL,(10,10),RLSE),UNIT=SYSDA                         
//OUT2   DD SYSOUT=*                                                   
//TOOLMSG  DD SYSOUT=*                                                 
//DFSMSG  DD SYSOUT=*                                                   
//TOOLIN DD *                                                           
  SELECT FROM(IN1) TO(OUT1) ON(1,4,CH) ON(5,2,BI) FIRST                 
  COPY FROM(OUT1) TO(OUT2) USING(CTL1)                                 
//CTL1CNTL DD *                                                         
  SORT FIELDS=(1,4,CH,A,5,2,BI,A,7,2,CH,A)                             
/*


Thanks,
Xavier

Re: SyncSort: Sorting and Removing Duplicates

Posted: Tue Dec 20, 2011 3:05 pm
by saurabh pune
Thanks BillyBoyo and Xavier for your quick help.
I will definitely reply to this by tomorrow (I am on leave today).

Thanks again.

Re: SyncSort: Sorting and Removing Duplicates

Posted: Wed Dec 21, 2011 12:41 am
by Alissa Margulies
Hello Saurabh.

Here is a 1-step Syncsort job that should produce the desired output:

//SORT1 EXEC PGM=SORT                                             
//SORTIN  DD *                                                   
NAAA01AA                                                         
NAAA02AA                                                         
NAAA03AA                                                         
NAAB01AA                                                         
NAAB01AB                                                         
NAAB03AA                                                         
//SORTOUT DD SYSOUT=*                                             
//SYSOUT  DD SYSOUT=*                                             
//SYSIN   DD *                                                   
  SORT FIELDS=(1,8,CH,A),EQUALS                                   
  OUTREC IFTHEN=(WHEN=INIT,OVERLAY=(81:SEQNUM,1,ZD,RESTART=(1,6)))
  OUTFIL INCLUDE=(81,1,ZD,EQ,1),BUILD=(1,80)                     
/*                                                               
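
The idea behind the job above (sort on the full 8-byte key, number the records with the count restarting whenever the first 6 bytes change, then keep only the records numbered 1) can be sketched in Python as follows. This is a rough model under stated assumptions: `one_step` and `RECORDS` are hypothetical names, `cp037` stands in for EBCDIC, and the counter here is unbounded, whereas the job uses a 1-byte ZD field at position 81.

```python
# Illustrative model of SEQNUM with RESTART; all names are hypothetical.
RECORDS = [
    "NAAA01AA",
    "NAAA02AA",
    "NAAA03AA",
    "NAAB01AA",
    "NAAB01AB",
    "NAAB03AA",
]


def one_step(records):
    # SORT FIELDS=(1,8,CH,A): order by the full 8-byte key.
    ordered = sorted(records, key=lambda r: r[0:8].encode("cp037"))

    kept = []
    prev_key = None
    seq = 0
    for rec in ordered:
        key = rec[0:6]
        # SEQNUM with RESTART=(1,6): start again at 1 whenever the
        # first 6 bytes differ from those of the previous record.
        if key != prev_key:
            seq = 1
        else:
            seq += 1
        prev_key = key
        # OUTFIL INCLUDE=(81,1,ZD,EQ,1): keep only records numbered 1.
        if seq == 1:
            kept.append(rec)
    return kept


print(one_step(RECORDS))
# ['NAAA01AA', 'NAAA02AA', 'NAAA03AA', 'NAAB01AA', 'NAAB03AA']
```

Because the numbering happens after the sort, the record kept for each 6-byte key is the one with the lowest value in bytes 7-8, which is the same record the original two-step approach keeps.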

Please let us know if you have any further questions or if this does not produce the desired output.

Thank you.

Re: SyncSort: Sorting and Removing Duplicates

Posted: Wed Dec 21, 2011 5:20 pm
by saurabh pune
Thanks a lot Alissa, Xavier and BillyBoyo :)
It worked perfectly.
Thanks again for your help.