Delete duplicate records

Posted: Wed Jul 30, 2008 5:07 pm
by shivendu
How can I delete duplicate records from a file using REXX? Actually I am writing a file using REXX code which also has duplicate entries which I need to delete.

Input file:
M1428888 C37ONL OLD
M1425555 C37ONL NEW
M1428888 C37ONL MOD
M1428888 C37ONL OLD

Output file should look like:
M1428888 C37ONL OLD
M1425555 C37ONL NEW
M1428888 C37ONL MOD


thanks,
Shivendu

Re: Delete duplicate records

Posted: Wed Jul 30, 2008 5:34 pm
by jayind
Hi shivendu,

One solution is to use SORT as the first step of a JCL job and REXX as the second: let the SORT step eliminate the duplicates and create a new file before it is passed to REXX.

Hope this will work...

Regards,
Jayind

Re: Delete duplicate records

Posted: Wed Jul 30, 2008 5:37 pm
by shivendu
Hi Jayind,

I am not using any JCL and want to do everything in REXX if it's possible.

Re: Delete duplicate records

Posted: Wed Jul 30, 2008 5:42 pm
by jayind
To my knowledge it is not possible, because your input file is not sorted, so you can't simply compare each record with the previous one. Why can't you use JCL? Is that a requirement, or do you just not want to use JCL? If the file is small enough to hold in an array, you can probably try that; otherwise you need JCL or a sorted file.

Let us hear from others if they have any solution.

Regards,
jayind
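
The "store it in an array" idea above does not actually need sorted input. A minimal sketch, assuming the file is small enough to hold in storage: a REXX stem is used as a lookup table keyed by the whole record, so each record is checked against everything seen so far and the original order is kept. The sample records are set inline so the logic can be tried anywhere; the stem names `in.`, `out.` and `seen.` are only illustrative.

```rexx
/* REXX - sketch: drop exact duplicates, keeping the first occurrence */
in.1 = "M1428888 C37ONL OLD"
in.2 = "M1425555 C37ONL NEW"
in.3 = "M1428888 C37ONL MOD"
in.4 = "M1428888 C37ONL OLD"
in.0 = 4

seen. = 0                      /* seen.<record> = 1 once written     */
out.0 = 0

do i = 1 to in.0
  rec = in.i
  if seen.rec then iterate     /* already kept an identical record   */
  seen.rec = 1
  n = out.0 + 1
  out.n = rec
  out.0 = n
end

do j = 1 to out.0              /* show the surviving records         */
  say out.j
end
```

With the four sample records this displays the three records from the expected output in the first post. Under TSO the inline assignments would typically be replaced by something like "EXECIO * DISKR INDD (STEM in. FINIS", and the final loop by an EXECIO DISKW of the out. stem, where INDD is a hypothetical DD name allocated beforehand.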

Re: Delete duplicate records

Posted: Wed Jul 30, 2008 6:11 pm
by shivendu
There must be some way of sorting a file and then deleting the duplicate records in Rexx.

Re: Delete duplicate records

Posted: Thu Jul 31, 2008 12:40 am
by dick scherrer
Hello,

"There must be some way of sorting a file and then deleting the duplicate records in Rexx."
Even if you find a way to do this in REXX, it is a bad choice. . . .

Keep in mind that almost anything can be written, but that does not mean that it should be written. . .

What is the business requirement that this be done completely in REXX? It is pretty much nonsense to write code to do something that is already an existing and much better performing feature. It is also not a good idea for a learning exercise, as one should learn to do appropriate things rather than inappropriate things.

Re: Delete duplicate records

Posted: Thu Jul 31, 2008 8:59 am
by shivendu
Actually, what I posted above is a simpler form of what I need to achieve, and I am fully aware that it can be done in SYNCSORT. My intention was to get pointers so that I can tackle my actual requirement, which is this:

Input file has records like this:

M1428888 C37ONL OLD
M1425555 C37ONL NEW
M1428888 C37ONL MOD
M1428888 C37ONL OLD

The requirement is that whenever the first 8 bytes are the same in more than one record, bytes 17-19 should be checked: only the record that has "OLD" in those positions should be kept (and only one occurrence of it), and all the other records with the same first 8 bytes should be deleted.
So the output file would look like:

M1428888 C37ONL OLD
M1425555 C37ONL NEW

Hope I am clearer this time. Any suggestions?
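
The rule described above can be sketched in plain REXX with a stem keyed by the first 8 bytes. This is only an illustration under stated assumptions: the records fit in storage, output follows the order in which each key first appears, and when a key occurs several times with no "OLD" record at all (a case the requirement does not cover) the first record seen is kept. The stem names `in.`, `out.` and `slot.` are hypothetical.

```rexx
/* REXX - sketch: one record per 8-byte key, preferring "OLD" in 17-19 */
in.1 = "M1428888 C37ONL OLD"
in.2 = "M1425555 C37ONL NEW"
in.3 = "M1428888 C37ONL MOD"
in.4 = "M1428888 C37ONL OLD"
in.0 = 4

slot. = 0                 /* slot.<key> = position in out., 0 = unseen */
out.0 = 0

do i = 1 to in.0
  key  = substr(in.i, 1, 8)
  stat = substr(in.i, 17, 3)
  if slot.key = 0 then do            /* first record for this key      */
    n = out.0 + 1
    out.n = in.i
    out.0 = n
    slot.key = n
  end
  else do
    k = slot.key
    /* ASSUMPTION: replace the kept record only when it is not OLD    */
    /* and the current one is; otherwise the first record stays.      */
    if substr(out.k, 17, 3) \== "OLD" & stat == "OLD" then out.k = in.i
  end
end

do j = 1 to out.0                    /* show the surviving records     */
  say out.j
end
```

For the four sample records this displays the M1428888 "OLD" record followed by the M1425555 "NEW" record, matching the output listed above.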

Re: Delete duplicate records

Posted: Thu Jul 31, 2008 10:04 am
by dick scherrer
Hello,

"Hope I am clearer this time."
I believe you were reasonably clear in the first post.

"Any suggestions?"
Yes. It is still not a job for REXX. As I said before:
"It is also not a good idea for a learning exercise, as one should learn to do appropriate things rather than inappropriate things."

"Actually I am writing a file using REXX code which also has duplicate entries which I need to delete."
It might be a good idea not to write the duplicates in the first place.

If you explain the total requirement, we may be better able to offer suggestions.

Something else you might want to consider is that most organizations do not allow REXX to process "business rules" which is what your process does. Even if it is running successfully in a development environment, there is no mechanism to promote it to production.

Re: Delete duplicate records

Posted: Thu Jul 31, 2008 5:29 pm
by shivendu
This is an internal tool and not a business requirement, and thus it will always run in a development environment.
I cannot explain the entire tool end to end here, as it involves multiple entities and people. Also, there are some other constraints due to which duplicates can't be avoided while writing the above input file.
Anyway, thanks for all the input; I will get back once the thing is done. :)

Re: Delete duplicate records

Posted: Thu Jul 31, 2008 10:06 pm
by Pedro
You can call DFSORT from your REXX program. Use ALLOC commands to allocate the same DD statements that are required in batch, then use Address TSO "CALL *(DFSORT)" to invoke it.
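
A rough sketch of that approach for the simple case of removing exact duplicates, not a tested procedure. Everything site-specific here is an assumption: the dataset names are hypothetical, the output dataset is assumed to exist already, the records are assumed to be compared on columns 1-19, and the sketch calls ICEMAN (the program name commonly coded for DFSORT in JCL as PGM=ICEMAN) rather than the name given in the post.

```rexx
/* REXX - sketch: invoke DFSORT under TSO to drop duplicate records   */
/* ASSUMPTIONS: hypothetical dataset names, key = columns 1-19,       */
/* output dataset already exists, sort program reachable as ICEMAN.   */
address TSO

"ALLOC F(SORTIN)  DA('HLQ.DEDUP.INPUT')  SHR REUSE"
"ALLOC F(SORTOUT) DA('HLQ.DEDUP.OUTPUT') OLD REUSE"
"ALLOC F(SYSOUT)  DA(*) REUSE"             /* messages to terminal    */
"ALLOC F(SYSIN)   NEW REUSE UNIT(SYSDA) SPACE(1) TRACKS",
      "RECFM(F B) LRECL(80)"               /* temporary control cards */

ctl.1 = " SORT FIELDS=(1,19,CH,A)"         /* sort on the whole record */
ctl.2 = " SUM FIELDS=NONE"                 /* keep one per equal key   */
"EXECIO 2 DISKW SYSIN (STEM ctl. FINIS"

"CALL *(ICEMAN)"
sortrc = rc

"FREE F(SORTIN SORTOUT SYSOUT SYSIN)"
exit sortrc
```

Unlike the expected output in the first post, the result here comes back in sorted order rather than in the original record order. The "keep only the OLD record" rule would need different control statements and is not covered by this sketch.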