about sort

Posted: Fri Nov 06, 2009 4:08 pm
by pahi
Hi,

I have a file with the layout shown below:
00PLN FROZEN YHN
0509215447
1009215486 0000001.00 20091103000000 20091103235900
1509215486 PALLET 3315401 092 002
2000263281851000100
1009215487 0000001.00 20091103000000 20091103235900
1509215487 PALLET 3458900 092 007
2000425928159000315
1009215488 0000001.00 20091103000000 20091103235900
1509215488 PALLET 6241000 092 016
2001020126183000099
2001283319273000297
2001354662366000108
2001436381561000140
2001446381578000140
2001466386901000098


I want to remove the duplicate records. I used the JCL below, but it does not delete them. Could somebody help me with this?
//SORT0Q   EXEC PGM=SORT
//SYSOUT   DD  SYSOUT=*                                   
//SORTIN   DD  DSN=userid.test.cntl,DISP=OLD 
//SORTOUT  DD  DSN=&&SORT0Q,                             
//             DISP=(,PASS,DELETE),                       
//             SPACE=(CYL,(4,1)),UNIT=TEMP
//SYSIN    DD  *                                         
 SORT FIELDS=COPY                                         
 SUM FIELDS=NONE                                         
/*

Re: about sort

Posted: Fri Nov 06, 2009 5:03 pm
by MrSpock
In order to remove duplicates from the data, you must sort the data in ascending order on your key. You can then eliminate the records with duplicate keys using the SUM FIELDS=NONE statement.

Re: about sort

Posted: Fri Nov 06, 2009 5:11 pm
by pahi
Hi,

But after sorting the file in ascending order, I must still retain the same file layout. I have to eliminate the duplicate records without sorting the file, so I used COPY in my JCL.


Regards,
Pahi

Re: about sort

Posted: Fri Nov 06, 2009 10:07 pm
by Frank Yaeger
Pahi,

You cannot use SUM with COPY, only with SORT or MERGE. In order to eliminate duplicates, you must specify the positions you want to use to identify duplicates (the key). You haven't given us that information or shown what you expect for output, so we don't really know what it is you're trying to do exactly. Please do a better job of describing what you want to do and show your expected output. Also, give the RECFM and LRECL of the input file, and the starting position, length and format of all relevant fields including the key.

Re: about sort

Posted: Mon Nov 09, 2009 12:22 pm
by pahi
[color=#008000]1009215487 0000001.00 20091103000000 20091103235900[/color][color=#008000]1509215487 PALLET 3458900 092 007 [/color]
2000425928159000315
[color=#008000]1009215488 0000001.00 20091103000000 20091103235900
1509215488 PALLET 6241000 092 016 [/color]2001020126183000099
2001283319273000297
2001354662366000108
[color=#FF0000]2001436381578000140
2001446381578000140 [/color]
2001466386901000098


The records beginning with 1 must be kept as they are. Among the records beginning with 2 that follow them, only the duplicate records must be eliminated.

Expected output:
1009215487 0000001.00 20091103000000 20091103235900
1509215487 PALLET 3458900 092 007
2000425928159000315
1009215488 0000001.00 20091103000000 20091103235900
1509215488 PALLET 6241000 092 016
2001020126183000099
2001283319273000297
2001354662366000108
2001446381578000140
2001466386901000098

Re: about sort

Posted: Tue Nov 10, 2009 12:05 am
by Frank Yaeger
Well, your color tags didn't work, so it's very difficult to tell what you were trying to show.

You also didn't give the starting position, length and format of the key you want to use to check for duplicates, or the RECFM and LRECL of the input file.

I don't know how you expect anyone to help you when you can't seem to provide the information requested, or explain clearly what you want to do.

Were these supposed to be the duplicate records?

2001436381578000140
2001446381578000140

If so, they aren't duplicates on the entire record since one has 436 and the other has 446. So you need to say which positions you want to check for duplicates on.

Re: about sort

Posted: Tue Nov 10, 2009 9:34 am
by pahi
Sorry, it was a typo. Yes, those are the duplicate records I want to remove. The file has RECFM=FB and LRECL=60.
The key runs from position 1 to 19, but I want to keep the same file layout, except that the duplicate records beginning with 2 must be removed.

The records beginning with 1 must be kept as they are. Among the records beginning with 2 that follow them, only the duplicate records must be eliminated.

Required output:
1009215487 0000001.00 20091103000000 20091103235900
1509215487 PALLET 3458900 092 007
2000425928159000315
1009215488 0000001.00 20091103000000 20091103235900
1509215488 PALLET 6241000 092 016
2001020126183000099
2001283319273000297
2001354662366000108
2001446381578000140
2001446381578000140 (this duplicate record has to be removed from the original file)
2001466386901000098

Re: about sort

Posted: Wed Nov 11, 2009 2:18 am
by Frank Yaeger
If I understand correctly what you want to do, then this DFSORT job should do it:

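//S1      EXEC PGM=SORT
//SYSOUT  DD SYSOUT=*
//SORTIN  DD DSN=...  input file (FB/60)
//SORTOUT DD DSN=...  output file (FB/60)
//SYSIN   DD *
* Start a new group at each '10' record and push an 8-byte group ID
* into positions 61-68 of every record in that group.
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,2,CH,EQ,C'10'),
    PUSH=(61:ID=8))
* Keep records with equal keys in their original order.
  OPTION EQUALS
* Sort by group ID, then by the 19-byte key; drop duplicate keys.
  SORT FIELDS=(61,8,ZD,A,1,19,CH,A)
  SUM FIELDS=NONE
* Remove the temporary group ID to restore the 60-byte record.
  OUTREC BUILD=(1,60)
/*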
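
For readers who want to trace the logic off the mainframe, the job above can be approximated in Python. This is only a rough sketch, not DFSORT itself: the function name `dedupe_within_groups` is made up for illustration, records that precede the first '10' record are assumed to fall into a group 0, and Python compares characters in Unicode order rather than EBCDIC, so the ordering can differ when keys mix letters and digits.

```python
def dedupe_within_groups(records, start="10", key_len=19):
    """Approximate the job: tag groups, sort by (group, key), drop duplicates."""
    group = 0
    tagged = []
    for rec in records:
        # WHEN=GROUP,BEGIN=...: a record starting with '10' opens a new group.
        if rec[:2] == start:
            group += 1
        # Pad the key with blanks, as a fixed-length record would be.
        tagged.append((group, rec[:key_len].ljust(key_len), rec))

    # Python's sort is stable, so equal keys keep their input order
    # (comparable to OPTION EQUALS).
    tagged.sort(key=lambda t: (t[0], t[1]))

    # SUM FIELDS=NONE: keep only the first record for each (group, key) pair.
    output, previous = [], None
    for grp, key, rec in tagged:
        if (grp, key) != previous:
            output.append(rec)
            previous = (grp, key)
    return output


# Sample data taken from the thread, including the duplicate '20' record.
records = [
    "1009215487 0000001.00 20091103000000 20091103235900",
    "1509215487 PALLET 3458900 092 007",
    "2000425928159000315",
    "1009215488 0000001.00 20091103000000 20091103235900",
    "1509215488 PALLET 6241000 092 016",
    "2001020126183000099",
    "2001283319273000297",
    "2001354662366000108",
    "2001446381578000140",
    "2001446381578000140",
    "2001466386901000098",
]

result = dedupe_within_groups(records)
print(len(records), len(result))  # 11 10
```

Note that records inside each group come out in key order. The sample data is already in that order within each group, which is why the original layout appears unchanged in the output.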

Re: about sort

Posted: Wed Nov 11, 2009 10:37 am
by pahi
Many thanks, Frank. It's working fine.

Can you please explain what this code is doing?
INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,2,CH,EQ,C'10'),
PUSH=(61:ID=8))

Regards,
Pahi

Re: about sort

Posted: Wed Nov 11, 2009 11:01 pm
by Frank Yaeger
It sets up groups of records, each starting with '10' in positions 1-2 and pushes an ID into positions 61-68 of all of the records in the group. The id in the first group of records will be 00000001, the id in the second group of records will be 00000002, etc.
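
The group tagging described above can be illustrated with a small Python sketch. It is an approximation for illustration only: `push_group_ids` is a hypothetical helper, and leaving blanks in the ID field for records that come before the first '10' record is an assumption about the behavior, not something stated in this thread.

```python
def push_group_ids(records, start="10", width=8, lrecl=60):
    """Append a zero-padded group ID after position `lrecl` of each record."""
    group = 0
    tagged = []
    for rec in records:
        # A record starting with '10' begins a new group.
        if rec[:2] == start:
            group += 1
        # Assumption: records before the first group keep blanks in the ID field.
        ident = str(group).zfill(width) if group else " " * width
        # Pad the record to the fixed length, then place the ID in 61-68.
        tagged.append(rec.ljust(lrecl)[:lrecl] + ident)
    return tagged


sample = [
    "1009215487 0000001.00 20091103000000 20091103235900",
    "2000425928159000315",
    "1009215488 0000001.00 20091103000000 20091103235900",
    "2001020126183000099",
]

for line in push_group_ids(sample):
    # Show the record type (positions 1-2) and the pushed ID (positions 61-68).
    print(line[:2], line[60:68])
```

Because every record in a group carries the same ID, sorting on positions 61-68 first keeps each group together and in its original sequence relative to the other groups.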