
Dupes in one file, rest in another

PostPosted: Thu Feb 03, 2011 3:30 pm
by sharma_deepu
I have a dataset that contains both duplicate and non-duplicate records, so I need two output datasets: all duplicates in one and all non-duplicates in the other.

Re: Dupes in one file, rest in another

PostPosted: Thu Feb 03, 2011 10:42 pm
by skolusu
sharma_deepu,

Use the following DFSORT ICETOOL JCL, which will give you the desired results:

//STEP0100 EXEC PGM=ICETOOL                             
//TOOLMSG  DD SYSOUT=*                                   
//DFSMSG   DD SYSOUT=*                                   
//IN       DD *                                         
A                                                       
A                                                       
B                                                       
C                                                       
C                                                       
C                                                       
//DUPS     DD SYSOUT=*                                   
//UNQ      DD SYSOUT=*                                   
//TOOLIN   DD *                                         
  SELECT FROM(IN) TO(UNQ) DISCARD(DUPS) NODUPS ON(1,1,CH)
//*
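For illustration only, the behavior of SELECT ... NODUPS ... DISCARD(...) on the sample input can be sketched in Python (the function name and structure below are my own, not part of DFSORT): records whose key occurs exactly once go to the TO (UNQ) side, and every occurrence of a duplicated key goes to the DISCARD (DUPS) side.

```python
from collections import Counter

def select_nodups(records, key=lambda r: r[0:1]):
    """Split records the way SELECT ... NODUPS ... DISCARD(...) does:
    keys occurring exactly once -> unique side; every occurrence of a
    duplicated key -> dups side."""
    counts = Counter(key(r) for r in records)
    unique = [r for r in records if counts[key(r)] == 1]
    dups = [r for r in records if counts[key(r)] > 1]
    return unique, dups

# Same sample data as the IN DD above
unq, dups = select_nodups(["A", "A", "B", "C", "C", "C"])
print(unq)   # ['B']
print(dups)  # ['A', 'A', 'C', 'C', 'C']
```

So with the input shown, UNQ gets only B, while DUPS gets both A's and all three C's.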

Re: Dupes in one file, rest in another

PostPosted: Sat Feb 05, 2011 10:21 am
by hailashwin
Hi Kolusu,
It makes me wonder why you went for the ICETOOL step instead of the traditional SORT with XSUM (at least for me :| ) for this particular scenario. I ask because, ever since I started reading about Sort and appreciating it, I have been hearing from people that an ICETOOL step is more performance-intensive than a normal sort. But I haven't had the chance to do the comparison myself.
Please correct my understanding.

Thanks,
Ashwin.

Re: Dupes in one file, rest in another

PostPosted: Sat Feb 05, 2011 10:48 am
by dick scherrer
Hello,

I have been hearing from people that an ICETOOL step is more performance-intensive than a normal sort
One must use care in choosing whom to listen to... Many people have opinions with no basis in reality - but they sometimes appear quite knowledgeable...

The biggest causes of "intensive performance" are the number of records to be processed and how many passes over the data are used to arrive at the solution. Some of the worst-performing "solutions" arise when the developer was determined to use the smallest possible set of control statements rather than put in a bit of effort to build a more efficient process.

Re: Dupes in one file, rest in another

PostPosted: Mon Feb 07, 2011 11:51 pm
by Frank Yaeger
I have been hearing from people that an ICETOOL step is more performance-intensive than a normal sort.


People say a lot of things. That doesn't necessarily make them true. A blanket statement like that is not true.

DFSORT does NOT support XSUM. The job Kolusu showed is DFSORT's equivalent of the XSUM function. For more information, see the "Keep dropped duplicate records (XSUM)" Smart DFSORT Trick at:

http://www.ibm.com/support/docview.wss? ... g3T7000094
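As a side note, the XSUM-style behavior (keep one copy of each record, save the dropped duplicates) is not quite the same split as NODUPS in the job above. If my understanding of the trick is right, it corresponds to a "keep first, discard the rest" selection; sketched in Python for comparison (function name is mine, not DFSORT's):

```python
def select_first(records, key=lambda r: r[0:1]):
    """XSUM-style split: the first occurrence of each key is kept,
    and the remaining (dropped) occurrences go to the discard side."""
    seen = set()
    kept, discarded = [], []
    for r in records:
        k = key(r)
        if k in seen:
            discarded.append(r)
        else:
            seen.add(k)
            kept.append(r)
    return kept, discarded

kept, dropped = select_first(["A", "A", "B", "C", "C", "C"])
print(kept)     # ['A', 'B', 'C']
print(dropped)  # ['A', 'C', 'C']
```

Note the difference: here one copy of each duplicated record stays on the kept side, whereas NODUPS sends every occurrence of a duplicated key to the discard side.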