Counting character occurrences based on conditions



Postby wuz » Mon Nov 14, 2011 3:55 pm

Hello everyone,

I am looking for a script that can count the occurrences of "fields" (start position, length) with specific values in a dataset. So far I have developed the following solution.

I use the COPY operator to extract all matching fields plus a counter variable into a temporary dataset. In a second step I sum the counter variable.

COPY FROM(INLOW) TO(TMPLOW) -
            USING(AA10)
COPY FROM(INLOW) TO(TMPLOW) -
            USING(AA11)
//AA10CNTL DD *
  OUTREC BUILD=(1:C'000000000001',
           13:C':FIELD_TEST1:',
           26:1,1,
           37:C' ')
  INCLUDE COND=(1,1,CH,EQ,C' ')
/*
//AA11CNTL DD *
  OUTREC BUILD=(1:C'000000000001',
           13:C':FIELDTEST:',
           24:2,3,
           37:C' ')
  INCLUDE COND=(2,3,CH,EQ,C' ')
/*


This works, but performance is extremely poor because every condition requires its own COPY pass. I would like to evaluate all the conditions in one CNTL. For instance (pseudocode):

//AA10CNTL DD *
  IF INCLUDE COND=(1,1,CH,EQ,C' ') THEN
      OUTREC BUILD=(1:C'000000000001',
           13:C':FIELD_TEST1:',
           26:1,1,
           37:C' ')
  END IF
  IF INCLUDE COND=(2,3,CH,EQ,C' ') THEN
      OUTREC BUILD=(1:C'000000000001',
           13:C':FIELDTEST:',
           24:2,3,
           37:C' ') 
 END IF
/*


Unfortunately I don't know the correct syntax for this pseudocode :(.

Thanks a lot for your help in advance and best regards!
wuz
 

Postby enrico-sorichetti » Mon Nov 14, 2011 4:53 pm

see here for the solution to a similar requirement
http://ibmmainframes.com/about56724.html

working for dfsort, should work also for syncsort
cheers
enrico

Postby dick scherrer » Mon Nov 14, 2011 10:45 pm

Hello and welcome to the forum,

Which sort product is used on your system? Post the informational messages generated by any execution of the sort.

There are separate parts of the forum for DFSORT and Syncsort questions and your topic will be moved to the appropriate part of the forum.
Hope this helps,
d.sch.

Postby Frank Yaeger » Tue Nov 15, 2011 12:35 am

wuz,

Have you looked at the OCCUR operator of DFSORT's ICETOOL? It may do what you want. You can access all of the DFSORT books from:

http://www.ibm.com/support/docview.wss? ... g3T7000080

If not, you need to do a better job of describing what it is you want to do rather than showing how you think you can do it. Show an example of the records in your input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input files.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

Postby wuz » Wed Nov 16, 2011 5:40 pm

Hi,

Thanks for your replies. I am using DFSORT's ICETOOL utility. The OCCUR operator seems pretty useful, but it is a bit of overkill because I don't need the count for each unique value, only for a certain condition.

I tried to merge the several CNTLs into one with the help of IFTHEN ... WHEN. First of all, I want to ensure that all the conditions are evaluated (not only when the preceding one is false); second, I receive the following error messages.

//AA10CNTL DD *                                             
  OUTREC IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(324,1,CH,EQ,C' ')),
               BUILD=(1:C'000000000001',                     
                      13:C':AA_AAA_AAA:',                   
                      25:324,1,                             
                      40:C' '),                             
         IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(450,1,CH,EQ,C' ')),
               BUILD=(1:C'000000000001',                     
                      13:C':BBBBBBBB_BBBB:',                 
                      28:450,1, -                           
                      40:C' ')                               
/*                                                           


  OUTREC IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(324,1,CH,EQ,C' ')),
               BUILD=(1:C'000000000001',                     
                      13:C':AA_AAA_AAA:',                   
                      25:324,1,                             
                      40:C' '),                             
         IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(450,1,CH,EQ,C' ')),
         $                                                   
OPERAND DEFINER ERROR                                       
               BUILD=(1:C'000000000001',                     
               $                                             
BLANK NEEDED IN COLUMN 1 OR OPERATION NOT DEFINED CORRECTLY 
                      13:C':BBBBBBBB_BBBB:',                 
                      $                                     
SYNTAX ERROR                                                 
                      28:450,1, -                           
                      $                                     
SYNTAX ERROR                                                 
                      40:C' ')                               
wuz
 

Postby Frank Yaeger » Thu Nov 17, 2011 12:10 am

Since you didn't supply any actual information about what you're trying to do as I previously requested, all I can do is show you the valid syntax - I can't verify that what you're doing is "correct". Here is the valid DFSORT syntax (right parens added where needed):

  OUTREC IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(324,1,CH,EQ,C' ')),     
               BUILD=(1:C'000000000001',                         
                      13:C':AA_AAA_AAA:',                         
                      25:324,1,                                   
                      40:C' ')),                                 
         IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(450,1,CH,EQ,C' ')),     
               BUILD=(1:C'000000000001',                         
                      13:C':BBBBBBBB_BBBB:',                     
                      28:450,1, -                                 
                      40:C' '))                                   

Postby wuz » Fri Nov 18, 2011 7:53 pm

Thanks a lot for your help. The job now finishes with return code 0, but it seems that when both conditions are false the whole row is copied. Moreover the second condition only seems to be evaluated when the first one is false.

I want to count all the occurrences of specific fields (specified by start position and length) based on a specific condition. I guess I have to figure out how the OCCUR operator works with the specified conditions.

Best regards
wuz
 

Postby Frank Yaeger » Sat Nov 19, 2011 12:00 am

Again and again: Since you didn't supply any actual information about what you're trying to do as I previously requested, all I can do is show you the valid syntax - I can't verify that what you're doing is "correct". I can comment on what you said, however:

wuz wrote:
but it seems that when both conditions are false the whole row is copied.


Are you saying you want to OMIT any records that don't meet the conditions? You would have to use OMIT for that. IFTHEN does NOT omit records - it only changes them.

wuz wrote:
Moreover the second condition only seems to be evaluated when the first one is false.


Are you saying that you want the second IFTHEN clause to be evaluated even if the first is true? In that case, you'd need to use HIT=NEXT on the first IFTHEN clause.
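
The two points above (IFTHEN changes records but never drops them, and HIT=NEXT lets later clauses run after a match) can be illustrated with a small Python model. This is only a sketch of the described behaviour, not DFSORT syntax; the function names, the `[F1]`/`[F2]` tags and the sample records are made up for the illustration.

```python
# Toy model of how a list of IFTHEN WHEN=(condition) clauses treats one record.
# Plain Python for illustration only; all names here are hypothetical.

def apply_ifthen(record, clauses):
    """clauses is a list of (condition, action, hit_next) tuples.

    condition(record) returns a bool, action(record) returns the changed record.
    """
    for condition, action, hit_next in clauses:
        if condition(record):
            record = action(record)
            if not hit_next:
                break  # without HIT=NEXT, the first satisfied clause ends processing
    # A record that satisfies no clause is returned unchanged; it is not dropped.
    return record


def is_blank_at(pos):
    # pos is 1-based, like a DFSORT start position
    return lambda rec: rec[pos - 1] == " "


def tag(label):
    # Stand-in for an OVERLAY that marks the record
    return lambda rec: rec + label


def make_clauses(hit_next_on_first):
    return [
        (is_blank_at(1), tag("[F1]"), hit_next_on_first),
        (is_blank_at(2), tag("[F2]"), False),
    ]


first_match_only = apply_ifthen("  X", make_clauses(False))
with_hit_next = apply_ifthen("  X", make_clauses(True))
no_match = apply_ifthen("ABX", make_clauses(False))

print(repr(first_match_only))  # only the first clause was applied
print(repr(with_hit_next))     # both clauses were applied
print(repr(no_match))          # record passed through unchanged
```

In this model the second clause sees the record as already changed by the first one. If the real job uses BUILD in the first clause, the field tested by the second clause may no longer be at its original position, so an OVERLAY is usually the safer choice when clauses are chained.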

Of course, I have no idea if what you think you need is even close to what you actually need because I don't know exactly what you're trying to do! Is there some reason you can't show a complete picture of what you want to do? In my FIRST post, I asked for the following:

You need to do a better job of describing what it is you want to do rather than showing how you think you can do it. Show an example of the records in your input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input files.

If you had given me that when I asked, you would probably have your solution by now. If you give it to me now, I can help you instead of trying to guess what you're doing.

Postby wuz » Tue Nov 22, 2011 3:35 pm

Sorry, I will try to be more precise.

  • I have an input dataset without delimiters, so individual fields can only be identified by start position and length.
  • I want to check whether a specific field has the intended value.
  • If the field does not have the intended value, I want to generate the following output:
    number_of_occurrences field_name not_intended_value
  • For instance, assume that field1 must have the value "A". A sample output could look like the following:
    3 field1 B
    7 field1 C
    I don't want to see the (let's say more than 100) records where field1 has the correct value A; the output should only tell me that field1 has the wrong value B 3 times and the wrong value C 7 times.

I am really sorry that I wasn't precise before. The code I posted in my first post does exactly what I need, but it does not perform well. In the first step I use a separate COPY with its own CNTL for every field to generate a counter variable, the field name and the value. In the second step I sum the counter variable, grouped by field name and value.
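
To make the requirement concrete, here is a minimal Python sketch of the same rules applied in a single pass over the records. It is not DFSORT code. The field table (names, positions, lengths, expected values) and the sample records are assumptions invented for the illustration, since the real layout is not given in the thread.

```python
from collections import Counter

# Hypothetical field definitions:
# (field name, 1-based start position, length, intended value)
FIELDS = [
    ("field1", 1, 1, "A"),
    ("field2", 2, 3, "XYZ"),
]


def count_unexpected(records, fields=FIELDS):
    """Count (field name, actual value) pairs where the value is not the intended one."""
    counts = Counter()
    for rec in records:
        for name, start, length, expected in fields:
            value = rec[start - 1:start - 1 + length]
            if value != expected:
                counts[(name, value)] += 1
    return counts


def report(counts):
    # One line per (field, wrong value): "number_of_occurrences field_name value"
    return [f"{n} {name} {value}" for (name, value), n in sorted(counts.items())]


# Made-up sample data: field1 is wrong 4 times, field2 is wrong once.
sample = ["AXYZ", "BXYZ", "CXYZ", "BXYZ", "AXY1", "BXYZ"]
lines = report(count_unexpected(sample))
for line in lines:
    print(line)
```

Every record is read once and every field check is applied to it, which is the effect the single-CNTL approach is meant to achieve.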

Thanks a lot for your help
wuz
 

Postby skolusu » Tue Nov 22, 2011 10:56 pm

wuz wrote:
  • For instance, assume that field1 must have the value "A". A sample output could look like the following:
    3 field1 B
    7 field1 C
    I don't want to see the (let's say more than 100) records where field1 has the correct value A; the output should only tell me that field1 has the wrong value B 3 times and the wrong value C 7 times.

I am really sorry that I wasn't precise before. The code I posted in my first post does exactly what I need, but it does not perform well. In the first step I use a separate COPY with its own CNTL for every field to generate a counter variable, the field name and the value. In the second step I sum the counter variable, grouped by field name and value.

Thanks a lot for your help


wuz,

You don't need 2 steps to get the counts of values. Here is a DFSORT job that gives you the count of each character that is not the intended value, e.g. 'A'.

//STEP0100 EXEC PGM=SORT                               
//SYSOUT   DD SYSOUT=*                                 
//SORTIN   DD *                                         
A                                                       
A                                                       
B                                                       
Z                                                       
B                                                       
C                                                       
C                                                       
B                                                       
//SORTOUT  DD SYSOUT=*                                 
//SYSIN    DD *                                         
  OMIT COND=(1,1,CH,EQ,C'A')                           
  INREC OVERLAY=(10:C'1')                               
  SORT FIELDS=(1,1,CH,A)                               
  OUTFIL REMOVECC,NODETAIL,BUILD=(80X),                 
  SECTIONS=(1,1,                                       
  TRAILER3=('COUNT OF CHARACTER ',1,1,' IN POS 1 : ',   
            TOT=(10,1,ZD,M10,LENGTH=8)))               
//*                                                     


The output from this is:
COUNT OF CHARACTER B IN POS 1 :        3
COUNT OF CHARACTER C IN POS 1 :        2
COUNT OF CHARACTER Z IN POS 1 :        1
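
As a cross-check of the logic, the same steps (omit the intended value, sort on position 1, total a constant 1 per section) can be modelled in a few lines of Python. This is only an illustrative model of the job above, not a replacement for it; the function name is made up, and the sort here uses Python's character ordering, which matches the EBCDIC order for the uppercase letters in this sample.

```python
from itertools import groupby


def count_by_first_char(records, intended="A"):
    # Models OMIT COND=(1,1,CH,EQ,C'A'): drop records with the intended value
    kept = [r for r in records if r[0:1] != intended]
    # Models SORT FIELDS=(1,1,CH,A): order by the character in position 1
    kept.sort(key=lambda r: r[0:1])
    # Models SECTIONS=(1,1,...) with TOT of the constant 1 added by INREC
    return [(ch, sum(1 for _ in grp)) for ch, grp in groupby(kept, key=lambda r: r[0:1])]


sortin = ["A", "A", "B", "Z", "B", "C", "C", "B"]
result = count_by_first_char(sortin)
lines = [f"COUNT OF CHARACTER {ch} IN POS 1 : {n:>8}" for ch, n in result]
for line in lines:
    print(line)
```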


If you're not familiar with DFSORT and DFSORT's ICETOOL, I'd suggest reading through "z/OS DFSORT: Getting Started". It's an excellent tutorial, with lots of examples, that will show you how to use DFSORT, DFSORT's ICETOOL and DFSORT Symbols. You can access it online, along with all of the other DFSORT books, from:

http://www.ibm.com/support/docview.wss? ... g3T7000080
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort