JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 12:41 pm
by pavan.kanugo
Hi,
I am trying to sort a dataset on one field and remove duplicates on another field in a single step. Can anybody suggest how to do this?
For example, the dataset looks like the following:

Emp-Id Emp-Name Salary Project Employer

I want to sort it by salary and remove the records with duplicate names.

Thanks and regards,
pavan

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 3:09 pm
by arunprasad.k
Here you go!!

//S1     EXEC PGM=ICETOOL                                               
//TOOLMSG  DD SYSOUT=*                                                 
//DFSMSG   DD SYSOUT=*                                                 
//IN1      DD *
123456 KUMAR 300                                                       
123457 SAM   200                                                       
123458 PETER 100                                                       
123457 SAM   200                                                       
/*                                                                     
//OUT      DD SYSOUT=*                                                 
//TOOLIN   DD *                                                         
  SELECT FROM(IN1) TO(OUT) ON(8,5,CH) FIRST  USING(CTL1)               
/*                                                                     
//CTL1CNTL DD *                                                         
  SORT FIELDS=(14,3,ZD,A)                                               
/*                                                                     


Output
123458 PETER 100                                                               
123457 SAM   200                                                               
123456 KUMAR 300                                                               


Refer to the following link to learn more about the ICETOOL SELECT operator in DFSORT.

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA20/6.11?DT=20060615185603

Post if you have any questions.

Arun.

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 5:32 pm
by pavan.kanugo
Thank you, Arun.

I tried the code you suggested, but my output differs from yours. It is as follows:

123457 SAM 200
123458 PETER 100
123456 KUMAR 300

instead of

123458 PETER 100
123457 SAM 200
123456 KUMAR 300

Can you please recheck it?

Thanks and regards,
pavan

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 6:29 pm
by arunprasad.k
It is working fine for me. Check your sysout to see whether the CTL1CNTL statements were executed.

The best way is to open the link I gave and read the manual. It has a detailed explanation with examples.

Arun.

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 7:35 pm
by MrSpock
Worked for me too with DFSORT. Maybe the OP is using SYNCSORT?

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 7:50 pm
by pavan.kanugo
Hi Arun,
I got the right result after going through the link.
Thanks a lot.

Regards,
pavan

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 8:45 pm
by Frank Yaeger
Arun's "solution" happens to work for the example he gave because the two records with SAM have the same amount. It will NOT work if two records with the same name have different amounts. For example, if the input is:

123456 KUMAR 300     
123457 SAM   200     
123458 PETER 100     
123457 SAM   200     
123457 SAM   300     


The output would be:

123458 PETER 100       
123457 SAM   200       
123456 KUMAR 300       
123457 SAM   300       


Notice that SAM appears twice in the output.
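
To see why this happens, here is a small Python model of the behavior described above (a sketch, not DFSORT itself). It assumes that with the USING(CTL1) sort in effect the records are ordered by salary, and that FIRST then keeps the first record of each run of adjacent equal names. The function name `one_pass` and the tuple layout are made up for illustration.

```python
from itertools import groupby

# Hypothetical in-memory copy of the input: (emp_id, name, salary).
RECORDS = [
    ("123456", "KUMAR", 300),
    ("123457", "SAM", 200),
    ("123458", "PETER", 100),
    ("123457", "SAM", 200),
    ("123457", "SAM", 300),
]


def one_pass(records):
    """Sort by salary, then keep the first record of each run of equal names."""
    by_salary = sorted(records, key=lambda r: r[2])  # sorted() is stable
    # groupby only merges *adjacent* equal keys, which is the assumed behavior.
    return [next(group) for _, group in groupby(by_salary, key=lambda r: r[1])]


for emp_id, name, salary in one_pass(RECORDS):
    print(emp_id, f"{name:<5}", salary)
```

Under that assumption the SAM 200 records are adjacent and collapse to one, but SAM 300 sorts after KUMAR 300 and so starts a new run, which is why it survives.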

This DFSORT/ICETOOL job would work for the general case:

//S1     EXEC PGM=ICETOOL                                     
//TOOLMSG  DD SYSOUT=*                                         
//DFSMSG   DD SYSOUT=*                                         
//IN1      DD *                                               
123456 KUMAR 300                                               
123457 SAM   200                                               
123458 PETER 100                                               
123457 SAM   200                                               
123457 SAM   300                                               
/*                                                             
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)     
//OUT      DD SYSOUT=*                                         
//TOOLIN   DD *                                               
SELECT FROM(IN1) TO(T1) ON(8,5,CH) FIRST                       
SORT FROM(T1) TO(OUT) USING(CTL1)                             
/*                                                             
//CTL1CNTL DD *                                               
  SORT FIELDS=(14,3,ZD,A)                                     
/*                                                             


The output would be:

123458 PETER 100 
123457 SAM   200 
123456 KUMAR 300 


Of course, it's not clear which salary you want in the output when a name is associated with two different salaries.
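
That choice can be made explicit. As a rough Python model (a sketch, not DFSORT itself), the two-operator job amounts to: remove duplicate names first, then sort the survivors by salary. The `pick` parameter is a hypothetical knob added here to show how the kept salary depends on the rule; the default assumes the earliest input record for each name is the one kept.

```python
# Hypothetical in-memory copy of the input: (emp_id, name, salary).
RECORDS = [
    ("123456", "KUMAR", 300),
    ("123457", "SAM", 200),
    ("123458", "PETER", 100),
    ("123457", "SAM", 200),
    ("123457", "SAM", 300),
]


def dedupe_then_sort(records, pick=None):
    """Keep one record per name, then sort the kept records by salary.

    pick: optional function choosing one record from a name's list of
          records; by default the earliest input record is kept.
    """
    by_name = {}
    for rec in records:
        by_name.setdefault(rec[1], []).append(rec)  # keeps input order
    choose = pick or (lambda group: group[0])
    kept = [choose(group) for group in by_name.values()]
    return sorted(kept, key=lambda r: r[2])


# Default rule: earliest record per name.
print(dedupe_then_sort(RECORDS))
# Alternative rule: highest salary per name.
print(dedupe_then_sort(RECORDS, pick=lambda g: max(g, key=lambda r: r[2])))
```

With the default rule SAM is kept with 200; with the highest-salary rule SAM is kept with 300. Either way each name appears only once.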

Re: JCL for removing duplicates in a dataset

Posted: Tue Mar 25, 2008 9:48 pm
by arunprasad.k
Thanks, Frank, for correcting me.

It's a fact that you know DFSORT better than I do. :)

Arun.

Re: JCL for removing duplicates in a dataset

Posted: Wed Mar 26, 2008 1:03 am
by Frank Yaeger
arunprasad.k wrote: "It's a fact that you know DFSORT better than I do."

Not surprising given that I develop the DFSORT code and write all of the documentation. ;)

Re: JCL for removing duplicates in a dataset

Posted: Wed Mar 26, 2008 6:19 pm
by pavan.kanugo
Thank you, Frank.

That is a good observation, which I had not noticed.
Thanks a lot for giving the solution.

pavan