remove duplicate records by considering date and timestamp



IBM's flagship sort product DFSORT for sorting, merging, copying, data manipulation and reporting. Includes ICETOOL and ICEGENER

remove duplicate records by considering date and timestamp

Postby Namrata_Singri » Mon Jun 20, 2016 6:49 pm

Hi

I have 1 file which consist of duplicate records like

Feild1   feild2    date(yyyymmdd)         timestamp(hhmmss)
123       ABC      20160620                         083938
123       ABC      20160620                         053623
145       LMN      20160619                        123456
145       LMN      20160620                        116712
178       PQR      20160620                        051122
178       PQR      20160620                        051122


i need to remove duplicate records and write into another file but based on feild1+date+timestamp combination. As shown above, it should consider the latest record.
i.e., the output file must be

Feild1   feild2    date(yyyymmdd)         timestamp(hhmmss)
123       ABC      20160620                         083938
145       LMN      20160620                        116712
178       PQR      20160620                        051122


it should write only one record in output file if the date and timestamp is same in input file like record3

(178       PQR      20160620    051122).


Coded
Namrata_Singri
 
Posts: 13
Joined: Mon Jun 20, 2016 6:18 pm
Has thanked: 2 times
Been thanked: 0 time

Re: remove duplicate records by considering date and timest

Postby Namrata_Singri » Mon Jun 20, 2016 8:09 pm

can you please give me the solution using sortcard
Namrata_Singri
 
Posts: 13
Joined: Mon Jun 20, 2016 6:18 pm
Has thanked: 2 times
Been thanked: 0 time

Re: remove duplicate records by considering date and timest

Postby Akatsukami » Mon Jun 20, 2016 9:07 pm

Please remember to use the Code tags when posting anything where alignment may be critical.

Additionally, you should post your questions in a forum relevant to the desired answer. JCL is, of course, unable to do what you have asked. I have moved your query to the DFSORT forum; if your shop uses a different sort product, let me know and I will move the query to the relevant forum.
"You have sat too long for any good you have been doing lately ... Depart, I say; and let us have done with you. In the name of God, go!" -- what I say to a junior programmer at least once a day
User avatar
Akatsukami
Global moderator
 
Posts: 1058
Joined: Sat Oct 16, 2010 2:31 am
Location: Bloomington, IL
Has thanked: 6 times
Been thanked: 51 times

Re: remove duplicate records by considering date and timest

Postby ottoelflaco » Mon Jun 20, 2016 10:07 pm

I m not sure if I understand you. You need two output?:
- Output1: latest record without duplicated
- Output2: duplicated with the same date+time


If it is like I have understood, maybe this work:

//SORT10B  EXEC PGM=SORT          
//SORTIN   DD *                                          
123       ABC      20160620                        083938
123       ABC      20160620                        053623
145       LMN      20160619                        123456
145       LMN      20160620                        116712
178       PQR      20160620                        051122
178       PQR      20160620                        051122
//SORTOUT  DD DSN=dataset1          
//SORTXSUM DD DSN=dataset2                                  
  SORT FIELDS=(1,3,CH,A,11,3,CH,A,20,8,CH,D,52,6,CH,D)  
  SUM FIELDS=NONE,XSUM                                  
//SORT10B  EXEC PGM=SORT
//SORTIN   DD DSN=dataset1
//SORTOUT  DD DSN=dataset3                    
  SORT FIELDS=(1,3,CH,A,11,3,CH,A),EQUALS              
  SUM FIELDS=NONE                                      
 


dataset2:

178       PQR      20160620                        051122
 


dataset3:

123       ABC      20160620                        083938
145       LMN      20160620                        116712
178       PQR      20160620                        051122
 
ottoelflaco
 
Posts: 36
Joined: Fri Sep 21, 2012 2:33 pm
Has thanked: 1 time
Been thanked: 0 time

Re: remove duplicate records by considering date and timest

Postby Namrata_Singri » Tue Jun 21, 2016 12:03 pm

i dont need 2 output, only 1 output file
Namrata_Singri
 
Posts: 13
Joined: Mon Jun 20, 2016 6:18 pm
Has thanked: 2 times
Been thanked: 0 time

Re: remove duplicate records by considering date and timest

Postby Aki88 » Tue Jun 21, 2016 1:04 pm

Hello,

Couple of queries:
a. Which SORT product you have: DFSORT or SYNCSORT; the solution provided by Mr. ottoelflaco is a SYNCSORT solution going by the 'SUM FIELDS=NONE,XSUM' statement and will not work with DFSORT.
b. The time-stamp data start and end-position are not consistent for all records; can you please specify the column numbers where they start; or are your records of variable length?
c. The last line of your post is tad-bit confusing; when you say 'write only one record', do you mean you want only this one record written in output or the other two records with different value in field-1 as well:

Namrata_Singri wrote:it should write only one record in output file if the date and timestamp is same in input file like record3

(178       PQR      20160620    051122).


Coded


d. Do you want the header line preserved in output?
Namrata_Singri wrote:

Feild1   feild2    date(yyyymmdd)         timestamp(hhmmss)
123       ABC      20160620                         083938
123       ABC      20160620                         053623
145       LMN      20160619                        123456
145       LMN      20160620                        116712
178       PQR      20160620                        051122
178       PQR      20160620                        051122


e. Do you want the final output to be sorted or do you want the original order preserved?
f. Lastly, your SORT keys vs your output do not match, for example - if your SORT key is Field-1, date and time-stamp, then both records for field-1 = 123, will be retained, as 'timestamp' is not duplicate there.

If the above queries pertaining to the SORT key are cleared, then this will require a simple SUM FIELDS=NONE; in case this does get a bit complicated, then an ICETOOL with DISCARD operator should give you what is needed.
Aki88
 
Posts: 381
Joined: Tue Jan 28, 2014 1:52 pm
Has thanked: 33 times
Been thanked: 36 times


Return to DFSORT/ICETOOL/ICEGENER

 


  • Related topics
    Replies
    Views
    Last post