
A Dropping Dupes Challenge

Posted: Thu May 16, 2013 8:44 am
by RockinRiles
I have a large file I'm trying to eliminate dupes in and I'm not having much luck. Obviously, I'd like to get this done via JCL instead of a program.

My file can have multiple records per SSN. Because there are actually people with the same SSN, we use unique internal IDs to separate customers. I need to make sure there are no SSNs that have more than one INT-SSN, just in case.

Record layout:
SSN.......... 1-9
filler......... 10
INT-SSN.... 11-19
filler........ 20-80

Sample records. The file is sorted on SSN, INT-SSN.
These 2 records should remain.
111111111 234567890
111111111 234567890

These 3 records should be deleted, because this SSN has more than one INT-SSN.
222222222 987654321
222222222 987654321
222222222 999999999
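
To pin down the rule before picking a sort utility, here is a minimal Python sketch of the selection logic only; it is not a SyncSort or DFSORT solution. It assumes the records are fixed-width text lines already sorted on SSN, with SSN in columns 1-9 and INT-SSN in columns 11-19 as in the layout above. The function name drop_conflicting_ssns is made up for illustration.

```python
from itertools import groupby


def drop_conflicting_ssns(records):
    """Yield only the records whose SSN maps to exactly one INT-SSN.

    Assumption: records are fixed-width strings sorted on SSN,
    SSN in columns 1-9 and INT-SSN in columns 11-19.
    """
    # groupby relies on the input being sorted on SSN already.
    for _ssn, group in groupby(records, key=lambda rec: rec[0:9]):
        group = list(group)
        internal_ids = {rec[10:19] for rec in group}
        if len(internal_ids) == 1:
            # One internal ID for this SSN: keep every record, duplicates included.
            yield from group


sample = [
    "111111111 234567890",
    "111111111 234567890",
    "222222222 987654321",
    "222222222 987654321",
    "222222222 999999999",
]
kept = list(drop_conflicting_ssns(sample))
for rec in kept:
    print(rec)
```

With the sample above, only the two 111111111 records are printed; all three 222222222 records are dropped because that SSN carries two different INT-SSN values.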

Anyone think this can be done? I was thinking if it could be done, SYNCSORT could do it.

Thank you for your time,

Bob

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 11:26 am
by BillyBoyo
With DFSORT it would be a simple ICETOOL SELECT. Do you want to try that with SyncTool?

Lots of other ways.

What version of SyncSort do you have?

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 5:14 pm
by RockinRiles
Hello BillyBoro,

Yes, I want to use SyncSort, though I don't have the version right now. I believe it's the latest, but I will check.

Thanx a bunch,

Bob

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 5:27 pm
by BillyBoyo
Well, you can do a single-file MERGE, which would allow you to use XSUM.

You could use the reporting functions on OUTFIL, with REMOVECC, NODETAIL and SECTIONS for the key.
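
One way to read both suggestions (this is an interpretation, not something spelled out in the post) is as a two-pass approach: first reduce the file to one entry per distinct SSN/INT-SSN pair, then keep only the SSNs that appear once in that reduced list. The Python sketch below shows that logic only; it does not use any sort-utility syntax, and the function names are hypothetical.

```python
from collections import Counter


def ssns_with_one_internal_id(records):
    """Pass 1: find SSNs that have exactly one distinct INT-SSN.

    Assumption: SSN in columns 1-9, INT-SSN in columns 11-19.
    """
    # Collapse the file to its distinct (SSN, INT-SSN) pairs.
    pairs = {(rec[0:9], rec[10:19]) for rec in records}
    # Count how many distinct pairs each SSN contributes.
    pairs_per_ssn = Counter(ssn for ssn, _internal_id in pairs)
    return {ssn for ssn, count in pairs_per_ssn.items() if count == 1}


def keep_clean(records):
    """Pass 2: keep every record whose SSN survived pass 1."""
    records = list(records)  # the data is read twice, once per pass
    good = ssns_with_one_internal_id(records)
    return [rec for rec in records if rec[0:9] in good]
```

Unlike a single pass over sorted input, this version does not depend on the sort order, at the cost of reading the data twice.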

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 5:43 pm
by NicC
Even doing it by JCL, you will still be doing it by a program: the sort program. Another program written in, e.g., COBOL would also require JCL, so whichever way you do it you will be using a program AND JCL.

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 5:44 pm
by RockinRiles
BillyBoro,

Would you mind giving me an example; whichever way you think is the most efficient?

I'm no expert at JCL.... yet.

Thanx again!

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 7:01 pm
by dick scherrer
Hello,

You need to get to the understanding that Sort Control statements are NOT JCL.

There are various examples in the forum about removing duplicates. Suggest you look at DFSORT solutions as well as Syncsort solutions as many of the DFSORT commands work with Syncsort.

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 7:52 pm
by RockinRiles
I appreciate the lecture, but SORT commands/code are put in a JCL job. Whether it's "technically" JCL or not is missing the forest for the trees, IMHO. When I said I wanted to avoid doing this in a "program", I meant a COBOL type program.

What I'm trying to do isn't a standard "drop dupes" scenario. Any direction or links would be appreciated. I'm going over BillyBoyo's (sorry for the misspelling!) suggestions now.

TIA,

Bob

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 8:27 pm
by Akatsukami
RockinRiles wrote: I appreciate the lecture, but SORT commands/code are put in a JCL job. Whether it's "technically" JCL or not is missing the forest for the trees, IMHO. When I said I wanted to avoid doing this in a "program", I meant a COBOL type program.

Bob, you really need a better understanding of what you're trying to do (i.e., use a mainframe).

  1. There are ways to run sort programs (and every other type of program) that do not involve JCL (silly and non-performant ways, IMPO, but working ones).
  2. JCL can be and is used to run programs that have nothing to do with sort utilities.
  3. There are a number of different sort utilities. DFSORT and Syncsort control cards have very similar, but not identical, syntax. I've never worked with CA-SORT, so I can't say how similar or different its syntax may be. Regardless, you must be a little more specific than a naïve assertion that "I wanna do a JCL sort"; that's likely to get you the wrong answer (or a curt "I don't advise on my competitor's products").
  4. Most importantly, we are intelligent and experienced enough to understand what you actually mean, and can provide assistance even whilst correcting you. On the other hand, if you try to look up *Sort control syntax in the z/OS MVS JCL Reference you will fail; you will be perpetually restricted to begging for information from your peers (who are likely even less knowledgeable) and your senpai (who will continue to correct you).

Re: A Dropping Dupes Challenge

Posted: Thu May 16, 2013 8:27 pm
by dick scherrer
Hello,

RockinRiles wrote: I don't have the version right now.
Every time you execute Syncsort, the release/version is shown at the top of the sysout info.

RockinRiles wrote: SORT commands/code are put in a JCL job. Whether it's "technically" JCL or not is missing the forest for the trees, IMHO.
Suggest you change your opinion. Calling SYSIN data JCL is just showing ignorance or resistance to using proper terminology in favor of some colloquialism. Many here actually believe these are JCL statements and we try to get them on the right track. And it is surely NOT missing the forest. . .

One method that helps helpers help you is to show some sample input and the output you want when that input is processed. I believe your description is rather clear, but I also believe that "seeing" some "real" input and output will help the helpers (well, me at least <g>).