Matching records within a file




Post by Aadil » Thu Mar 09, 2017 5:49 pm

Hi all,

I have a file (RECFM=FB, LRECL=20) with records that look like the following:

AAAAAAAA 10012345678
BBBBBBBB 23131839048
CCCCCCCC 10012345678
DDDDDDDD 51111111111
EEEEEEEE 31111111111
FFFFFFFF 10012345678
GGGGGGGG 5111111111
HHHHHHHH 10012345678


The first 8 bytes are unique; however, the data that follows may or may not be unique.

I would like to identify the duplicate values that occur past the first 8 bytes, i.e. in bytes 9 through 20, and report on them.

So, in the above instance, the report should look like:

AAAAAAAA 10012345678  
CCCCCCCC
FFFFFFFF
HHHHHHHH

BBBBBBBB 23131839048

DDDDDDDD 51111111111
GGGGGGGG

EEEEEEEE 31111111111
 

I'm sure there must be a way to do this using DFSORT, and I would appreciate your assistance with this.

Many thanks.
Aadil
 

Re: Matching records within a file

Post by Aki88 » Fri Mar 10, 2017 12:08 pm

Hello,

Before anything else: the input that you've shown will not yield the output you're looking for. For example, the values for 'D' and 'G' are not the same and hence cannot be grouped, unless you group on 10 bytes (columns 10 to 19) instead of the 12 bytes from column 9 to 20.


----+----1----+----2
AAAAAAAA 10012345678
BBBBBBBB 23131839048
CCCCCCCC 10012345678
DDDDDDDD 51111111111
EEEEEEEE 31111111111
FFFFFFFF 10012345678
GGGGGGGG 5111111111
HHHHHHHH 10012345678
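
The column arithmetic can be checked outside DFSORT. The Python sketch below is only a model of the comparison, not DFSORT syntax. The helper name `field` is made up for illustration, and the sketch assumes the short 'G' record is padded with a trailing blank to LRECL=20.

```python
def field(record: str, start: int, length: int) -> str:
    """Return a DFSORT-style field: 1-based start column plus length."""
    return record[start - 1:start - 1 + length]

# Records padded to 20 bytes, as they would be in an FB, LRECL=20 file
# (assumption: the short 'G' record ends in a blank).
d = "DDDDDDDD 51111111111".ljust(20)
g = "GGGGGGGG 5111111111".ljust(20)

# Columns 9-20 (12 bytes): the keys differ, because 'G' has one digit fewer.
print(field(d, 9, 12) == field(g, 9, 12))    # False

# Columns 10-19 (10 bytes): the keys match, so 'D' and 'G' can be grouped.
print(field(d, 10, 10) == field(g, 10, 10))  # True
```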
 


Also, please explain the logic behind the sequence below: how did the 'D'/'G' group come before 'E', even though the key value in columns 9 to 20 for 'E' is smaller (I am assuming that key is the driver for this process)? Please also say whether you need a blank line after each group of items:


AAAAAAAA 10012345678
CCCCCCCC
FFFFFFFF
HHHHHHHH

BBBBBBBB 23131839048

DDDDDDDD 51111111111
GGGGGGGG

EEEEEEEE 31111111111
 


The code below yields similar output; you can tweak it further to change the sequence or to add the blank lines between groups that you'd shown:


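//SYSIN    DD *
* Sort on the same 10 bytes used for grouping; EQUALS keeps
* records with equal keys in their original input order.
 SORT FIELDS=(10,10,CH,A),EQUALS
* Tag each record with its sequence number within the key group;
* only the first record of a group keeps the key value.
 OUTFIL IFTHEN=(WHEN=GROUP,KEYBEGIN=(10,10),PUSH=(60:SEQ=2)),
        IFTHEN=(WHEN=(60,02,CH,EQ,C'01'),
                BUILD=(1,20)),
        IFTHEN=(WHEN=NONE,
                BUILD=(1,09)),IFOUTLEN=20
/*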
 


Output:

----+----1----+----2
AAAAAAAA 10012345678
CCCCCCCC
FFFFFFFF
HHHHHHHH
BBBBBBBB 23131839048
EEEEEEEE 31111111111
DDDDDDDD 51111111111
GGGGGGGG
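
To make the logic of the control statements easier to follow, here is a rough Python model of what they do. It is a sketch, not DFSORT. It assumes the sort key and the group key are both columns 10 to 19, and that records with equal keys keep their input order. The function names `key` and `report` are hypothetical.

```python
from itertools import groupby

def key(record: str) -> str:
    # Columns 10-19 (1-based), mirroring KEYBEGIN=(10,10).
    return record[9:19]

def report(records):
    # Pad to LRECL=20 and sort ascending on the key. sorted() is stable,
    # so equal keys keep input order; DFSORT guarantees this only when
    # the EQUALS option is in effect.
    padded = sorted((r.ljust(20) for r in records), key=key)
    out = []
    for _, group in groupby(padded, key=key):
        for seq, rec in enumerate(group, start=1):
            # seq == 1 mirrors the C'01' test: keep all 20 bytes.
            # Otherwise keep bytes 1-9 and blank-pad to 20 (IFOUTLEN=20).
            out.append(rec if seq == 1 else rec[:9].ljust(20))
    return out

sample = [
    "AAAAAAAA 10012345678", "BBBBBBBB 23131839048",
    "CCCCCCCC 10012345678", "DDDDDDDD 51111111111",
    "EEEEEEEE 31111111111", "FFFFFFFF 10012345678",
    "GGGGGGGG 5111111111",  "HHHHHHHH 10012345678",
]
for line in report(sample):
    print(line.rstrip())
```

With the sample input, the model prints the same eight lines, in the same order, as the output shown above.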
 

Aki88
 

Re: Matching records within a file

Post by Aadil » Fri Mar 10, 2017 3:29 pm

Hi Aki88,

This is exactly what I was looking for. I would still like to see a blank line between each set of data, though that is a nice-to-have and not a necessity.
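
For reference, the layout in the original request (a blank line between sets, and sets listed in what appears to be order of first appearance rather than key order) can be modelled with the short Python sketch below. This is an illustration of the target layout only, not a DFSORT solution. The first-appearance ordering is an assumption read off the requested report, and `grouped_report` is a hypothetical name.

```python
def grouped_report(records):
    # Group on columns 10-19 (1-based). A dict keeps keys in insertion
    # order (Python 3.7+), which gives first-appearance ordering.
    groups = {}
    for rec in records:
        rec = rec.ljust(20)
        groups.setdefault(rec[9:19], []).append(rec)

    lines = []
    for members in groups.values():
        if lines:
            lines.append("")                      # blank line between sets
        lines.append(members[0].rstrip())         # first record in full
        lines.extend(m[:8] for m in members[1:])  # the rest: 8-byte id only
    return lines

sample = [
    "AAAAAAAA 10012345678", "BBBBBBBB 23131839048",
    "CCCCCCCC 10012345678", "DDDDDDDD 51111111111",
    "EEEEEEEE 31111111111", "FFFFFFFF 10012345678",
    "GGGGGGGG 5111111111",  "HHHHHHHH 10012345678",
]
print("\n".join(grouped_report(sample)))
```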

Many thanks for your assistance, much appreciated.
Aadil
 

