OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 12:57 pm
by Leixner
Hello guys,

I added a conditional SEQNUM to my dataset. (When there is "text" in the first 4 bytes of a record, add a SEQNUM.)

text1 00001
data I need
text1 00002

text2 00004
data I need
text2 00005

I now want to OMIT the duplicate records, meaning I only want one line of "text1" & "text2" in my file.
How do I write the OMIT condition to achieve this?

Any help is appreciated :)

Thanks in advance,
Patrick

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 2:16 pm
by BillyBoyo
Is the sequence number added just for this purpose?

What happened to number 00003?

Can there be more than one duplicate?

Are you aware of ICETOOL? ICETOOL's SELECT operator can (most probably) do the task for you without you having to code much at all. Consult the DFSORT Getting Started and DFSORT Application Programming Guide for details, and ask if you have problems.

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 2:30 pm
by Leixner
Yes, the sequence number is added for this purpose only. It is always at byte 10 of the record.
Damn, I missed number 00003, of course the numbers are sequential. --> 00001 00002 00003 00004

Yes, there can be more duplicates: text2 can occur 4 times, 6 times, etc., but the records always come in pairs.
The occurrence rate varies from day to day, hence my attempt with the odd/even SEQNUMs. Omitting the even numbers would do the trick!

I'll look into ICETOOL's SELECT operator as soon as I can, thanks in advance for your help!

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 2:50 pm
by Leixner
I should also note that I always want to keep one record of each pair, even if duplicates remain in the dataset after omitting.
Sample Input:
text1 00001
data I need
text1 00002

text2 00003
data I need
text2 00004

text2 00005
data I need
text2 00006


Sample Output:
text1 00001
data I need


text2 00003
data I need


text2 00005
data I need

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 3:01 pm
by BillyBoyo
Can there be five records with the same text? And you'd want three records on the output, or two? And the records with the same text would always be contiguous (so no need to actually SORT them to get matches together?).

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 3:38 pm
by Leixner
text1/text2/etc. are up to 20 different values, always 8 characters long. They can occur several times in the dataset, maximum is about 100 times.
I can't sort the file because I'll lose vital information if I do so. "data I need" is 5-10 lines of text which can't be sorted and which belong to the "text1"/"text2"/etc. records.

The data I want to manipulate is a Beta92 output, consisting of the daily processed data of our jobs (about 25 different jobs). I want to output all of those into one dataset.
The problem is that some of our jobs can run multiple times (up to 100 times), so there are up to 100 identical job outputs with different numbers, which is the reason I can't sort.

I think the SEQNUM solution would be best suited to my problem, but I do not know how to OMIT lines based on whether chars 10-15 are odd or even.

I hope my English is good enough for you to understand my problem.

Thanks again for your help!

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 4:07 pm
by BillyBoyo
Your English is fine.

I'll assume contiguous keys and where there is an odd one in a pair, you want it.

A SEQNUM has an option to have a RESTART. The RESTART specifies a key.

If you have SEQNUM with RESTART you will get sequence numbers within the key.

If you then have a WHEN=GROUP with BEGIN= being a SEQNUM of three, and PUSH a SEQ for that group, then you'll have, within a group, patterns like 1,2,1,2,1,2,1,2 or 1,2,1,2,1.

Note that this does not require your existing sequence number. What you asked for can be done (INCLUDE COND/OMIT COND to end up with only the odd numbers), but any time there is an odd number of records for a job, it would invalidate your method.

You can then use the second bite at record-selection, which is INCLUDE=/OMIT= on OUTFIL. Just INCLUDE= all the 1s, and you'll have what you want. All in one step, with no need to pre-sequence-number your data (so the data only needs to be read once).
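
For readers who want to see the selection logic outside of DFSORT, here is a minimal Python sketch of the idea. It is not DFSORT syntax and not the code used in this thread. The function name keep_first_of_each_pair is hypothetical, and it assumes that key records start with "text", that the key is the first blank-delimited word, and that records with the same key are contiguous. It restarts a counter whenever the key changes and keeps only the odd-numbered key records, passing every other line through untouched.

```python
def keep_first_of_each_pair(lines, marker_prefix="text"):
    """Keep every odd-numbered key record within a run of equal keys.

    Assumptions (not taken from the real data): key records start with
    marker_prefix, and the key is the first blank-delimited word.
    """
    kept = []
    previous_key = None
    seq_in_key = 0
    for line in lines:
        if not line.startswith(marker_prefix):
            kept.append(line)          # data lines are always kept
            continue
        key = line.split()[0]
        if key != previous_key:        # same idea as SEQNUM with RESTART
            seq_in_key = 0
            previous_key = key
        seq_in_key += 1
        if seq_in_key % 2 == 1:        # keep the 1st, 3rd, 5th, ... of a key
            kept.append(line)
    return kept


sample = [
    "text1 00001", "data I need", "text1 00002",
    "text2 00003", "data I need", "text2 00004",
    "text2 00005", "data I need", "text2 00006",
]

for line in keep_first_of_each_pair(sample):
    print(line)
```

Because the counter restarts on each new key, an odd number of records for one key does not shift the odd/even pattern of the next key, which is the weakness of testing a single running sequence number.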

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 5:36 pm
by Leixner
Got it :)

Thank you very much for your help!

Re: OMIT Even SEQNUMs

Posted: Tue Jun 02, 2015 6:10 pm
by BillyBoyo
Well done. If you can post your code, it may benefit others, and we can suggest any little improvements that may be necessary.