A Dropping Dupes Challenge




Postby RockinRiles » Thu May 16, 2013 8:44 am

I have a large file I'm trying to eliminate dupes in and I'm not having much luck. Obviously, I'd like to get this done via JCL instead of a program.

My file can have multiple records per SSN. Because there are actually people with the same SSN, we use unique internal IDs to separate customers. I need to make sure there are no SSNs that have more than one INT-SSN, just in case.

Record layout:
SSN.......... 1-9
filler......... 10
INT-SSN.... 11-19
filler........ 20-80

Sample records. File is sorted on SSN, INT-SSN
These 2 records should remain.
111111111 234567890
111111111 234567890

These 3 records should be deleted, because this SSN has more than one INT-SSN.
222222222 987654321
222222222 987654321
222222222 999999999
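
To pin down the rule being asked for, here is a minimal sketch of the selection logic in Python. It is not Syncsort or DFSORT control-statement syntax, only an illustration of which records stay and which go. The function names `parse` and `drop_conflicting_ssns` are made up for this example; the column positions come from the record layout above, and the input is assumed to be already sorted on SSN, as stated.

```python
from itertools import groupby


def parse(record):
    """Split a fixed-width record into (ssn, int_ssn).

    Layout from the post: SSN in columns 1-9, filler in column 10,
    INT-SSN in columns 11-19 (1-based), so 0-based slices are used here.
    """
    return record[0:9], record[10:19]


def drop_conflicting_ssns(records):
    """Yield only records whose SSN maps to exactly one INT-SSN.

    Assumes the records are already sorted on SSN, so each SSN forms
    one contiguous group. Every record of a conflicting SSN is dropped;
    repeated records of a clean SSN are all kept.
    """
    for _ssn, group in groupby(records, key=lambda r: parse(r)[0]):
        group = list(group)
        distinct_int_ssns = {parse(r)[1] for r in group}
        if len(distinct_int_ssns) == 1:
            yield from group


# The sample records from the post.
sample = [
    "111111111 234567890",
    "111111111 234567890",
    "222222222 987654321",
    "222222222 987654321",
    "222222222 999999999",
]

kept = list(drop_conflicting_ssns(sample))
for record in kept:
    print(record)
```

With the sample data, only the two 111111111 records are written out. Note that this is different from an ordinary "drop dupes" step: identical records are kept, and whole SSN groups are removed when their INT-SSN values disagree.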

Anyone think this can be done? I was thinking if it could be done, SYNCSORT could do it.

Thank you for your time,

Bob

Re: A Dropping Dupes Challenge

Postby BillyBoyo » Thu May 16, 2013 11:26 am

With DFSORT it would be a simple ICETOOL SELECT. Do you want to try that with SyncTool?

Lots of other ways.

What version of SyncSort do you have?


Re: A Dropping Dupes Challenge

Postby RockinRiles » Thu May 16, 2013 5:14 pm

Hello BillyBoro,

Yes, I want to use SyncSort, though I don't have the version right now. I believe it's the latest, but I will check.

Thanx a bunch,

Bob

Re: A Dropping Dupes Challenge

Postby BillyBoyo » Thu May 16, 2013 5:27 pm

Well, you can do a single-file MERGE, which would allow you to use XSUM.

You could use the reporting functions on OUTFIL, with REMOVECC, NODETAIL and SECTIONS for the key.

Re: A Dropping Dupes Challenge

Postby NicC » Thu May 16, 2013 5:43 pm

Even doing it by JCL, you will still be doing it by a program: the sort program. Another program written in, e.g., COBOL would also require JCL, so whichever way you do it you will be using a program AND JCL.
The problem I have is that people can explain things quickly but I can only comprehend slowly.
Regards
Nic

Re: A Dropping Dupes Challenge

Postby RockinRiles » Thu May 16, 2013 5:44 pm

BillyBoro,

Would you mind giving me an example; whichever way you think is the most efficient?

I'm no expert at JCL.... yet.

Thanx again!

Re: A Dropping Dupes Challenge

Postby dick scherrer » Thu May 16, 2013 7:01 pm

Hello,

You need to get to the understanding that Sort Control statements are NOT JCL.

There are various examples in the forum about removing duplicates. Suggest you look at DFSORT solutions as well as Syncsort solutions as many of the DFSORT commands work with Syncsort.
Hope this helps,
d.sch.


Re: A Dropping Dupes Challenge

Postby RockinRiles » Thu May 16, 2013 7:52 pm

I appreciate the lecture but, SORT commands/code are put in a JCL job. Whether it's "technically" JCL or not is missing the forest for the trees, IMHO. When I said I wanted to avoid doing this in a "program", I meant a COBOL type program.

What I'm trying to do isn't a standard "drop dupes" scenario. Any direction or links would be appreciated. I'm going over BillyBoyo's (sorry for the misspelling!) suggestions now.

TIA,

Bob

Re: A Dropping Dupes Challenge

Postby Akatsukami » Thu May 16, 2013 8:27 pm

RockinRiles wrote:I appreciate the lecture but, SORT commands/code are put in a JCL job. Whether it's "technically" JCL or not is missing the forest for the trees, IMHO. When I said I wanted to avoid doing this in a "program", I meant a COBOL type program.

Bob, you really need a better understanding of what you're trying to do (i.e., use a mainframe).

  1. There are ways to run sort programs (and every other type of program) that do not involve JCL (silly and non-performant ways, IMPO, but working ones).
  2. JCL can be and is used to run programs that have nothing to do with sort utilities.
  3. There are a number of different sort utilities. DFSORT and Syncsort control cards have very similar, but not identical, syntax. I've never worked with CA-SORT, so I can't say how similar or different its syntax may be. Regardless, you must be a little more specific than a naïve assertion that "I wanna do a JCL sort"; that's likely to get you the wrong answer (or a curt "I don't advise on my competitor's products").
  4. Most importantly, we are intelligent and experienced enough to understand what you actually mean, and can provide assistance even whilst correcting you. To the left, if you try to look up *Sort control syntax in the z/OS MVS JCL Reference you will fail; you will be perpetually restricted to begging for information from your peers (who are likely even less knowledgeable) and your senpai (who will continue to correct you).
"You have sat too long for any good you have been doing lately ... Depart, I say; and let us have done with you. In the name of God, go!" -- what I say to a junior programmer at least once a day

Re: A Dropping Dupes Challenge

Postby dick scherrer » Thu May 16, 2013 8:27 pm

Hello,

RockinRiles wrote:I don't have the version right now.
Every time you execute Syncsort, the release/version is shown at the top of the sysout info.

RockinRiles wrote:SORT commands/code are put in a JCL job. Whether it's "technically" JCL or not is missing the forest for the trees, IMHO
Suggest you change your opinion. Calling SYSIN data JCL is just showing ignorance or resistance to using proper terminology in favor of some colloquialism. Many here actually believe these are JCL statements and we try to get them on the right track. And it is surely NOT missing the forest. . .

One method that helps helpers help you is to show some sample input and the output you want when that input is processed. I believe your description is rather clear, but I also believe that "seeing" some "real" input and output will help the helpers (well, me at least <g>).
Hope this helps,
d.sch.