Remove duplicates for only specific records




Postby santhoshkumar_sm » Fri Oct 15, 2010 4:50 pm

Hi All,

I have a requirement wherein I need to remove duplicates for only specific records.

Input (dynamic with the below format):

AAAA ***fdsf***sdf**
BBBBB **gfhfghfghfg*
CCCC 1234
CCCC 5895
CCCC 1234
CCCC 7545
CCCC 8877
AAAA ***fdsf***sdf**
BBBBB **gfhfghfghfg*
CCCC 8585

Requirement:

I need to remove duplicates only for the records starting with CCCC; the rest should remain as they are. So my output should look like this:

OUTPUT:

AAAA ***fdsf***sdf**
BBBBB **gfhfghfghfg*
CCCC 1234
CCCC 5895
CCCC 7545
CCCC 8877
AAAA ***fdsf***sdf**
BBBBB **gfhfghfghfg*
CCCC 8585

Many thanks.
santhoshkumar_sm
 

Re: Remove duplicates for only specific records

Postby Frank Yaeger » Fri Oct 15, 2010 10:35 pm

You can use a DFSORT job like the following to do what you asked for. I assumed your input file has RECFM=FB and LRECL=80, but the job can be changed appropriately for other attributes.

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN DD *
AAAA ***fdsf***sdf**
BBBBB **gfhfghfghfg*
CCCC 1234
CCCC 5895
CCCC 1234
CCCC 7545
CCCC 8877
AAAA ***fdsf***sdf**
BBBBB **gfhfghfghfg*
CCCC 8585
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(IN) TO(T1) ON(81,8,ZD) ON(5,76,CH) FIRST USING(CTL1)
SORT FROM(T1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(81:SEQNUM,8,ZD,89:81,8)),
    IFTHEN=(WHEN=(1,4,CH,EQ,C'CCCC'),OVERLAY=(81:8C'0'))
/*
//CTL2CNTL DD *
  SORT FIELDS=(89,8,ZD,A)
  OUTREC BUILD=(1,80)
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
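
For readers who want to check the expected result without access to a mainframe, the net effect of the job above can be sketched in Python. This is not how DFSORT does it: the sketch uses a "seen" set instead of sort keys, and the names `dedupe_prefixed` and `prefix` are made up for illustration. It assumes a duplicate means an identical whole record, which for records sharing the CCCC prefix amounts to comparing positions 5-80.

```python
def dedupe_prefixed(records, prefix="CCCC"):
    """Keep the first copy of each record starting with `prefix`; keep all others."""
    seen = set()
    kept = []
    for rec in records:
        if rec.startswith(prefix):
            if rec in seen:
                continue  # a later copy of a prefixed record: drop it
            seen.add(rec)
        kept.append(rec)  # non-prefixed records are always kept
    return kept


# Sample input copied from the original question.
records = [
    "AAAA ***fdsf***sdf**",
    "BBBBB **gfhfghfghfg*",
    "CCCC 1234",
    "CCCC 5895",
    "CCCC 1234",
    "CCCC 7545",
    "CCCC 8877",
    "AAAA ***fdsf***sdf**",
    "BBBBB **gfhfghfghfg*",
    "CCCC 8585",
]

result = dedupe_prefixed(records)
print("\n".join(result))
```

The printed list should match the OUTPUT shown in the question: the second "CCCC 1234" is dropped, while both AAAA and both BBBBB records stay.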

Re: Remove duplicates for only specific records

Postby santhoshkumar_sm » Fri Oct 15, 2010 10:53 pm

Hi Frank,

Thanks for your help. I can only try this out on Monday.
If you don't mind, could you please explain the sort cards you used? I am new to the mainframe and a beginner.

SELECT FROM(IN) TO(T1) ON(81,8,ZD) ON(5,76,CH) FIRST USING(CTL1)

INREC IFTHEN=(WHEN=INIT,OVERLAY=(81:SEQNUM,8,ZD,89:81,8)),
IFTHEN=(WHEN=(1,4,CH,EQ,C'CCCC'),OVERLAY=(81:8C'0'))

Lots of thanks...
santhoshkumar_sm
 

Re: Remove duplicates for only specific records

Postby santhoshkumar_sm » Fri Oct 15, 2010 11:38 pm

Hi Frank,

I also wanted to know where the duplicates are being removed.
Thanks in advance
santhoshkumar_sm
 

Re: Remove duplicates for only specific records

Postby Frank Yaeger » Sat Oct 16, 2010 12:00 am

If you're not familiar with DFSORT and DFSORT's ICETOOL, I'd suggest reading through "z/OS DFSORT: Getting Started". It's an excellent tutorial, with lots of examples, that will show you how to use DFSORT, DFSORT's ICETOOL and DFSORT Symbols. You can access it online, along with all of the other DFSORT books, from:

http://www.ibm.com/support/docview.wss? ... g3T7000080

SELECT is an ICETOOL operator - see:

http://publibz.boulder.ibm.com/cgi-bin/ ... 0630155256

FIRST keeps the first record of each set of duplicates.

If you comment out the SORT operator (* SORT FROM...), the job will run with just the SELECT operator. You can then look at the output and you'll see what I'm doing with the sequence numbers. Basically I'm giving all of the CCCC records a sequence number of 0 so they will be treated as duplicates, and all of the other records a unique sequence number so they won't be treated as duplicates. The second sequence number is used to get the records back in their original order.
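
The two-sequence-number idea can also be modeled off the mainframe. The following is a rough Python sketch, not DFSORT code: `tag`, `select_first` and `restore_order` are hypothetical names standing in for the INREC in CTL1, the SELECT ... FIRST operator, and the SORT/OUTREC in CTL2. It assumes the first copy in input order is the one kept (which is what the expected output in this thread shows) and uses Python string comparison rather than EBCDIC collating, which could change the intermediate order for other data but not which records survive.

```python
LRECL = 80  # record length assumed in the job (RECFM=FB, LRECL=80)


def tag(records, prefix="CCCC"):
    """Model of the INREC: add a dedup number (81-88) and an order number (89-96)."""
    tagged = []
    for n, rec in enumerate(records, start=1):
        dedup_seq = 0 if rec.startswith(prefix) else n  # 0 for every CCCC record
        tagged.append((dedup_seq, n, rec.ljust(LRECL)))
    return tagged


def on_key(item):
    """Model of ON(81,8,ZD) ON(5,76,CH): dedup number plus positions 5-80."""
    dedup_seq, _order_seq, rec = item
    return (dedup_seq, rec[4:80])


def select_first(tagged):
    """Model of SELECT ... FIRST: sort on the ON fields, keep one record per key."""
    kept, prev = [], None
    for item in sorted(tagged, key=on_key):  # stable sort keeps input order in ties
        if on_key(item) != prev:
            kept.append(item)
            prev = on_key(item)
    return kept


def restore_order(selected):
    """Model of CTL2: sort on the order number, then drop both numbers."""
    return [rec.rstrip() for _d, _o, rec in sorted(selected, key=lambda t: t[1])]


records = [
    "AAAA ***fdsf***sdf**", "BBBBB **gfhfghfghfg*", "CCCC 1234", "CCCC 5895",
    "CCCC 1234", "CCCC 7545", "CCCC 8877", "AAAA ***fdsf***sdf**",
    "BBBBB **gfhfghfghfg*", "CCCC 8585",
]

t1 = select_first(tag(records))  # what the SELECT step alone would leave behind
for dedup_seq, order_seq, rec in t1:
    print(f"{rec.rstrip():<22}{dedup_seq:08d} {order_seq:08d}")

final = restore_order(t1)
print("\n".join(final))
```

In this model the intermediate list has the CCCC records first (all with dedup number 00000000, one copy of "CCCC 1234" gone), followed by the AAAA and BBBBB records with their unique numbers. Sorting on the second number then puts everything back in input order.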

Re: Remove duplicates for only specific records

Postby santhoshkumar_sm » Sat Oct 16, 2010 11:46 pm

Hi Frank,

Could you please explain the statement below?

ON(81,8,ZD) ON(5,76,CH) FIRST USING(CTL1)

I went through the document but could not understand this part.

I would be thankful for your help.
santhoshkumar_sm
 

Re: Remove duplicates for only specific records

Postby dick scherrer » Sun Oct 17, 2010 4:14 am

Hello,

The bit of code you have pasted is part of the SELECT. As Frank explained, the FIRST controls the elimination of the duplicates.

Did you read this suggestion Frank posted?
"If you comment out the SORT operator (* SORT FROM...), the job will run with just the SELECT operator. You can then look at the output and you'll see what I'm doing with the sequence numbers."
I suggest you try this when you are logged on again on Monday. It will make the process clearer to you.

Do you understand this explanation?
"Basically I'm giving all of the CCCC records a sequence number of 0 so they will be treated as duplicates, and all of the other records a unique sequence number so they won't be treated as duplicates. The second sequence number is used to get the records back in their original order."

As I mentioned, this will be clearer when you can see the output from the SELECT.
Hope this helps,
d.sch.
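
As a small aid for reading the ON fields in the SELECT: each ON(p,m,f) names a starting position (counted from 1), a length, and a format. A minimal Python sketch of which bytes the two fields cover is shown below. The helper `field` is hypothetical, and the 96-byte record is built by hand here to mimic what the INREC statement in CTL1 would produce for the third input record.

```python
def field(record, pos, length, lrecl=80):
    """Return the characters covered by a field at 1-based `pos` with `length`."""
    padded = record.ljust(lrecl)  # fixed-length records are blank-padded
    return padded[pos - 1 : pos - 1 + length]


rec = "CCCC 1234"

# ON(5,76,CH): everything after the 4-byte prefix, up to position 80.
data_key = field(rec, 5, 76)
print(repr(data_key.rstrip()))  # ' 1234'
print(len(data_key))            # 76

# After the INREC, positions 81-88 and 89-96 hold the two sequence numbers.
# Hand-built example: third input record, a CCCC record, so 81-88 is all zeros.
extended = rec.ljust(80) + "00000000" + "00000003"

dedup_seq = field(extended, 81, 8, lrecl=96)  # the ON(81,8,ZD) field
order_seq = field(extended, 89, 8, lrecl=96)  # used later by SORT FIELDS=(89,8,ZD,A)
print(dedup_seq, order_seq)     # 00000000 00000003
```

Two CCCC records with the same data in positions 5-80 therefore have identical ON values, and FIRST keeps only one of them. A non-CCCC record carries its own number in 81-88, so its ON values never match another record's.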

Re: Remove duplicates for only specific records

Postby santhoshkumar_sm » Mon Oct 18, 2010 10:28 pm

Hi Frank,

Thank you so much. It worked great, and I now clearly understand what you have done. Brilliant. :)
santhoshkumar_sm
 

