
Have to compare two datasets with Syncsort

Posted: Sat Mar 31, 2012 12:57 pm
by deva_048
I have to compare two datasets that have huge records.
Dataset 1: Original records + duplicate records
Dataset 2: Original records

I want to compare both datasets and store the result, i.e. the duplicate records, in a new dataset. I tried the SuperC utility and SORT with XSUM, but I am not getting a perfect match.

Note: The whole record is to be compared, not any specific column.

Re: Have to compare two datasets

Posted: Sat Mar 31, 2012 5:32 pm
by NicC
I suggest you look for JOINKEYS in the section of the forum that deals with your sort product (DFSORT or SYNCSORT - note that there are different sections for each). There are lots of examples there about comparing files and doing something or other with matched/unmatched records. You should be able to take one and adapt it to your own needs.

Re: Have to compare two datasets

Posted: Sat Mar 31, 2012 6:04 pm
by BillyBoyo
deva_048 wrote: I have to compare two datasets that have huge records.
Dataset 1: Original records + duplicate records
Dataset 2: Original records

I want to compare both datasets and store the result, i.e. the duplicate records, in a new dataset. I tried the SuperC utility and SORT with XSUM, but I am not getting a perfect match.

Note: The whole record is to be compared, not any specific column.


Do you mean that the LRECL is big? Or there are a "huge" number of records?

What do you mean by "duplicate" records? Have records been duplicated in their entirety? If this is not "normal", then the "duplicates" could be identified from the one file alone.

However, if you are saying you want a file of all the records which are on Dataset 1 which are not on Dataset 2, then JOINKEYS should be a good starting point.

You need to tell us the LRECL and RECFM if you want further assistance with the actual code.

Note that if you go ahead with this yourself, it would be a very good idea to list out records from Dataset 2 which don't match Dataset 1, as any such record would indicate a hole in your theory, so it would be nice to confirm that this does not happen.

Re: Have to compare two datasets

Posted: Sun Apr 01, 2012 8:25 pm
by deva_048
Duplicate means Dataset 1 can have the Dataset 2 records along with more records.
I want to separate out only the unmatched records, i.e. the extra records, into a new dataset.

Re: Have to compare two datasets

Posted: Sun Apr 01, 2012 10:35 pm
by BillyBoyo
OK, what about all the other information...?

Re: Have to compare two datasets

Posted: Mon Apr 02, 2012 12:00 am
by dick scherrer
Hello,

When asking this type of question, it is best to show some sample input (only relevant fields) for both input files and the output expected when this sample data is processed. Use the "Code" tag to preserve alignment and improve readability.

Also, you need to answer the other questions asked. . .

Re: Have to compare two datasets

Posted: Mon Apr 02, 2012 12:42 pm
by NicC
Have you looked yet at JOINKEYS for your sort product? What you are doing sounds very simple, and numerous examples exist in the forum that should not be difficult to adapt to your requirements.

Re: Have to compare two datasets

Posted: Mon Apr 02, 2012 7:33 pm
by deva_048
The input files contain:
Dataset 1 (contains the Dataset 2 records plus more records, i.e. original + duplicate)
1aaaaaaaaa,12,12
2bbbbbbbbb,12,12
4cccccccccc,12,12
3cccccccccc,12,12
Dataset 2 (original records)
1aaaaaaaaa,12,12
3cccccccccc,12,12
2bbbbbbbbb,12,12

I want to compare the two datasets and write only the unmatched records to a new dataset.

The output that should be present in the new dataset: 4cccccccccc,12,12
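
For a sample this small, the requirement can be checked outside the sort product. The following is a minimal Python sketch, not Syncsort or DFSORT code: it compares whole records, reports the Dataset 1 records that are missing from Dataset 2, and also runs the reverse check suggested earlier in the thread. The function name `only_in` is made up for illustration.

```python
def only_in(left, right):
    """Return the records of `left` that do not appear anywhere in `right`.

    The whole record is the comparison key, as the original post requests.
    """
    right_records = set(right)
    return [rec for rec in left if rec not in right_records]


# Sample data copied from the post above.
dataset1 = [
    "1aaaaaaaaa,12,12",
    "2bbbbbbbbb,12,12",
    "4cccccccccc,12,12",
    "3cccccccccc,12,12",
]
dataset2 = [
    "1aaaaaaaaa,12,12",
    "3cccccccccc,12,12",
    "2bbbbbbbbb,12,12",
]

# Records on Dataset 1 that are not on Dataset 2 (the wanted output).
extra_records = only_in(dataset1, dataset2)

# Reverse check: anything here would contradict the assumption that
# Dataset 1 contains every Dataset 2 record.
holes = only_in(dataset2, dataset1)

print(extra_records)
print(holes)
```

Neither input needs to be in order for this check, which matters because the sample records are not in the same sequence in the two datasets.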

Re: Have to compare two datasets

Posted: Mon Apr 02, 2012 7:45 pm
by BillyBoyo
  JOINKEYS FILE=F1,FIELDS=(1,x,A)
  JOINKEYS FILE=F2,FIELDS=(1,x,A)
  JOIN UNPAIRED,F2,ONLY
  SORT FIELDS=COPY


This could give you something with a bit of tweaking. Since you're not answering questions, I assume you'll pick it up and run with it yourself. Don't even know which sort product you are using. Consult your documentation, then.
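
As a rough illustration of the matching logic behind statements like these (a Python sketch, not sort-product code), both inputs are put in key order and walked in step, and only the records whose key has no partner on the other side are kept. The names `key_of` and `unpaired_only`, the key length of 80, and the choice of passing Dataset 1 as the first file are all assumptions for the sketch; the thread never states the LRECL or which dataset is F1. Python also orders strings by code point, not by the EBCDIC collating sequence a mainframe sort would use.

```python
def key_of(record, start, length):
    """Extract a fixed-position key; `start` is 1-based, like a control card."""
    return record[start - 1:start - 1 + length]


def unpaired_only(f1, f2, start, length, side):
    """Sort both inputs on the key, merge them, and return the records from
    `side` ("F1" or "F2") whose key has no match in the other input."""
    a = sorted(f1, key=lambda r: key_of(r, start, length))
    b = sorted(f2, key=lambda r: key_of(r, start, length))
    i = j = 0
    only_f1, only_f2 = [], []
    while i < len(a) and j < len(b):
        ka = key_of(a[i], start, length)
        kb = key_of(b[j], start, length)
        if ka < kb:
            only_f1.append(a[i])
            i += 1
        elif ka > kb:
            only_f2.append(b[j])
            j += 1
        else:
            # Key is paired: skip every record on both sides that carries it.
            while i < len(a) and key_of(a[i], start, length) == ka:
                i += 1
            while j < len(b) and key_of(b[j], start, length) == ka:
                j += 1
    only_f1.extend(a[i:])  # leftovers have no partner
    only_f2.extend(b[j:])
    return only_f1 if side == "F1" else only_f2


# Hypothetical run: Dataset 1 as the first file, Dataset 2 as the second,
# key = positions 1 to 80 (a placeholder covering the whole sample record).
file1 = ["1aaaaaaaaa,12,12", "2bbbbbbbbb,12,12",
         "4cccccccccc,12,12", "3cccccccccc,12,12"]
file2 = ["1aaaaaaaaa,12,12", "3cccccccccc,12,12", "2bbbbbbbbb,12,12"]

unmatched_f1 = unpaired_only(file1, file2, 1, 80, "F1")
unmatched_f2 = unpaired_only(file1, file2, 1, 80, "F2")
print(unmatched_f1)
print(unmatched_f2)
```

Which side to keep depends on which dataset is assigned to each file, so the `side` argument is left to the caller.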

Re: Have to compare two datasets

Posted: Mon Apr 02, 2012 8:39 pm
by deva_048
Could you please explain (1,x,A)?
1 - starting position
x - ?
A - ascending

I am using PGM=SORT.
Please help, I am new to this.