Counting character occurrences based on conditions

Posted: Mon Nov 14, 2011 3:55 pm
by wuz
Hello everyone,

I am looking for a script that can count the occurrences of "fields" (identified by start position and length) that hold a specific value in a dataset. So far I have developed the following solution.

I use the COPY operator to extract all matching fields, plus a counter variable, into a temporary dataset. In a second step I sum the counter variable.

COPY FROM(INLOW) TO(TMPLOW) -
            USING(AA10)
COPY FROM(INLOW) TO(TMPLOW) -
            USING(AA11)
//AA10CNTL DD *
  OUTREC BUILD=(1:C'000000000001',
           13:C':FIELD_TEST1:',
           26:1,1,
           37:C' ')
  INCLUDE COND=(1,1,CH,EQ,C' ')
/*
//AA11CNTL DD *
  OUTREC BUILD=(1:C'000000000001',
           13:C':FIELDTEST:',
           24:2,3,
           37:C' ')
  INCLUDE COND=(2,3,CH,EQ,C' ')
/*


This works, but performance is extremely poor because every condition requires its own copy task. I would like to evaluate all the conditions in one CNTL. For instance (pseudocode):

//AA10CNTL DD *
  IF INCLUDE COND=(1,1,CH,EQ,C' ') THEN
      OUTREC BUILD=(1:C'000000000001',
           13:C':FIELD_TEST1:',
           26:1,1,
           37:C' ')
  END IF
  IF INCLUDE COND=(2,3,CH,EQ,C' ') THEN
      OUTREC BUILD=(1:C'000000000001',
           13:C':FIELDTEST:',
           24:2,3,
           37:C' ') 
 END IF
/*


Unfortunately, I don't know the correct syntax for this pseudocode :(.

Thanks a lot for your help in advance and best regards!

Re: Counting character occurrences based on conditions

Posted: Mon Nov 14, 2011 4:53 pm
by enrico-sorichetti
See here for the solution to a similar requirement:
http://ibmmainframes.com/about56724.html

It works for DFSORT and should also work for Syncsort.

Re: Counting character occurrences based on conditions

Posted: Mon Nov 14, 2011 10:45 pm
by dick scherrer
Hello and welcome to the forum,

Which sort product is used on your system? Post the informational messages generated by any execution of the sort.

There are separate parts of the forum for DFSORT and Syncsort questions and your topic will be moved to the appropriate part of the forum.

Re: Counting character occurrences based on conditions

Posted: Tue Nov 15, 2011 12:35 am
by Frank Yaeger
wuz,

Have you looked at the OCCUR operator of DFSORT's ICETOOL? It may do what you want. You can access all of the DFSORT books from:

http://www.ibm.com/support/docview.wss? ... g3T7000080

If not, you need to do a better job of describing what it is you want to do rather than showing how you think you can do it. Show an example of the records in your input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input files.

Re: Counting character occurrences based on conditions

Posted: Wed Nov 16, 2011 5:40 pm
by wuz
Hi,

Thanks for your replies. I am using DFSORT's ICETOOL utility. The OCCUR operator seems pretty useful, but it is overkill for my case because I don't need the count for each unique value, only for values that meet a certain condition.

I tried to merge the separate CNTL members into one using IFTHEN with WHEN. First, I want to make sure that all the conditions are evaluated (not only when the preceding one is false). Second, I receive the following error messages.

//AA10CNTL DD *                                             
  OUTREC IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(324,1,CH,EQ,C' ')),
               BUILD=(1:C'000000000001',                     
                      13:C':AA_AAA_AAA:',                   
                      25:324,1,                             
                      40:C' '),                             
         IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(450,1,CH,EQ,C' ')),
               BUILD=(1:C'000000000001',                     
                      13:C':BBBBBBBB_BBBB:',                 
                      28:450,1, -                           
                      40:C' ')                               
/*                                                           


  OUTREC IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(324,1,CH,EQ,C' ')),
               BUILD=(1:C'000000000001',                     
                      13:C':AA_AAA_AAA:',                   
                      25:324,1,                             
                      40:C' '),                             
         IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(450,1,CH,EQ,C' ')),
         $                                                   
OPERAND DEFINER ERROR                                       
               BUILD=(1:C'000000000001',                     
               $                                             
BLANK NEEDED IN COLUMN 1 OR OPERATION NOT DEFINED CORRECTLY 
                      13:C':BBBBBBBB_BBBB:',                 
                      $                                     
SYNTAX ERROR                                                 
                      28:450,1, -                           
                      $                                     
SYNTAX ERROR                                                 
                      40:C' ')                               

Re: Counting character occurrences based on conditions

Posted: Thu Nov 17, 2011 12:10 am
by Frank Yaeger
Since you didn't supply any actual information about what you're trying to do as I previously requested, all I can do is show you the valid syntax - I can't verify that what you're doing is "correct". Here is the valid DFSORT syntax (right parens added where needed):

  OUTREC IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(324,1,CH,EQ,C' ')),     
               BUILD=(1:C'000000000001',                         
                      13:C':AA_AAA_AAA:',                         
                      25:324,1,                                   
                      40:C' ')),                                 
         IFTHEN=(WHEN=((1,1,ZD,EQ,1),AND,(450,1,CH,EQ,C' ')),     
               BUILD=(1:C'000000000001',                         
                      13:C':BBBBBBBB_BBBB:',                     
                      28:450,1, -                                 
                      40:C' '))                                   

Re: Counting character occurrences based on conditions

Posted: Fri Nov 18, 2011 7:53 pm
by wuz
Thanks a lot for your help. The job now finishes with return code 0, but it seems that when both conditions are false the whole row is copied. Moreover, the second condition only seems to be evaluated when the first one is false.

I want to count all the occurrences of specific fields (identified by start position and length) based on a specific condition. I guess I have to figure out how the OCCUR operator works with the specified conditions.

Best regards

Re: Counting character occurrences based on conditions

Posted: Sat Nov 19, 2011 12:00 am
by Frank Yaeger
Again and again: Since you didn't supply any actual information about what you're trying to do as I previously requested, all I can do is show you the valid syntax - I can't verify that what you're doing is "correct". I can comment on what you said, however:

"but it seems that when both conditions are false the whole row is copied."


Are you saying you want to OMIT any records that don't meet the conditions? You would have to use OMIT for that. IFTHEN does NOT omit records - it only changes them.

"Moreover the second condition only seems to be evaluated when the first one is false."


Are you saying that you want the second IFTHEN clause to be evaluated even if the first is true? In that case, you'd need to use HIT=NEXT on the first IFTHEN clause.
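To make the two points above concrete, here is a rough Python model (not DFSORT itself) of how a chain of WHEN=(logexp) clauses behaves as described in this post: a record that satisfies no clause passes through unchanged, and a later clause is only reached after a hit if the earlier clause carries HIT=NEXT. The names `apply_ifthen`, `overlay`, and the sample clauses are hypothetical, and the model assumes each later clause sees the record as already changed by earlier ones.

```python
from typing import Callable, List, Tuple

# One clause: (condition, edit, hit_next); hit_next stands in for HIT=NEXT.
Clause = Tuple[Callable[[str], bool], Callable[[str], str], bool]


def overlay(record: str, pos: int, text: str) -> str:
    """Replace characters starting at 1-based position pos, like OVERLAY."""
    i = pos - 1
    return record[:i] + text + record[i + len(text):]


def apply_ifthen(record: str, clauses: List[Clause]) -> str:
    """Toy model: test clauses in order and stop at the first hit
    unless that clause is flagged with hit_next."""
    for condition, edit, hit_next in clauses:
        if condition(record):
            record = edit(record)
            if not hit_next:
                break
    # A record that matched nothing is returned unchanged, never dropped.
    return record


def starts_with_1(record: str) -> bool:
    return record[0:1] == "1"


# Two clauses with the same condition, as in the job above.
first_match = [
    (starts_with_1, lambda r: overlay(r, 2, "X"), False),
    (starts_with_1, lambda r: overlay(r, 3, "Y"), False),
]
with_hit_next = [
    (starts_with_1, lambda r: overlay(r, 2, "X"), True),  # HIT=NEXT
    (starts_with_1, lambda r: overlay(r, 3, "Y"), False),
]

print(repr(apply_ifthen("1  ", first_match)))    # only the first edit applies
print(repr(apply_ifthen("1  ", with_hit_next)))  # both edits apply
print(repr(apply_ifthen("2  ", with_hit_next)))  # no clause matches
```

Dropping the records that match no clause is a separate filtering step in this model, which mirrors the advice to use OMIT (or INCLUDE) alongside IFTHEN.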

Of course, I have no idea if what you think you need is even close to what you actually need because I don't know exactly what you're trying to do! Is there some reason you can't show a complete picture of what you want to do? In my FIRST post, I asked for the following:

You need to do a better job of describing what it is you want to do rather than showing how you think you can do it. Show an example of the records in your input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input files.

If you had given me that when I asked, you would probably have your solution by now. If you give it to me now, I can help you instead of trying to guess what you're doing.

Re: Counting character occurrences based on conditions

Posted: Tue Nov 22, 2011 3:35 pm
by wuz
Sorry, I will try to be more precise.

  • I have an input dataset without delimiters, so single fields can only be specified by startpos and length.
  • Now I want to check if a specific field has the intended value.
  • If the field does not have the intended value, I want to generate the following output:
    number_of_occurrences field_name not_intended_value
  • For instance I assume that field1 has to have the value "A". A sample output could look like the following:
    3 field1 B
    7 field1 C
    I don't want to see the (let's say more than 100) records where field1 has the correct value A; the output should only tell me that field1 has the wrong value B 3 times and the wrong value C 7 times.

I am really sorry that I wasn't precise enough. The code I posted in my first post does exactly what I need, but it performs poorly. In a first step I use a separate COPY with its own CNTL for every field to generate a counter variable, the field name and the value. In a second step I sum the counter variable, grouped by field name and value.
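The requirement described above can be sketched in Python as a single pass over the records. This only illustrates the intended input-to-output rules and is not a DFSORT solution; `FieldCheck`, `count_unexpected`, `report`, and the sample data are hypothetical names and values chosen to match the field1 example.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class FieldCheck(NamedTuple):
    name: str      # label used in the report
    start: int     # 1-based start position, as in the sort control cards
    length: int    # field length in characters
    expected: str  # the intended value


def count_unexpected(records: Iterable[str], checks: List[FieldCheck]) -> Counter:
    """Count (field name, value) pairs where the field is not the expected value."""
    counts: Counter = Counter()
    for record in records:
        for check in checks:
            begin = check.start - 1
            value = record[begin:begin + check.length]
            if value != check.expected:
                counts[(check.name, value)] += 1
    return counts


def report(counts: Counter) -> List[str]:
    """Format as: number_of_occurrences field_name not_intended_value."""
    return [f"{n} {name} {value}" for (name, value), n in sorted(counts.items())]


# Sample data: 100 correct records, 3 with B and 7 with C in position 1.
records = ["A"] * 100 + ["B"] * 3 + ["C"] * 7
checks = [FieldCheck(name="field1", start=1, length=1, expected="A")]

for line in report(count_unexpected(records, checks)):
    print(line)
```

Every check is applied to every record here, so adding a field means adding one entry to `checks` rather than one more pass over the data.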

Thanks a lot for your help

Re: Counting character occurrences based on conditions

Posted: Tue Nov 22, 2011 10:56 pm
by skolusu
wuz,

You don't need 2 steps to get the counts of values. Here is a DFSORT job that will give you the count of each character that does not have the intended value (for example, 'A').

//STEP0100 EXEC PGM=SORT                               
//SYSOUT   DD SYSOUT=*                                 
//SORTIN   DD *                                         
A                                                       
A                                                       
B                                                       
Z                                                       
B                                                       
C                                                       
C                                                       
B                                                       
//SORTOUT  DD SYSOUT=*                                 
//SYSIN    DD *                                         
  OMIT COND=(1,1,CH,EQ,C'A')                           
  INREC OVERLAY=(10:C'1')                               
  SORT FIELDS=(1,1,CH,A)                               
  OUTFIL REMOVECC,NODETAIL,BUILD=(80X),                 
  SECTIONS=(1,1,                                       
  TRAILER3=('COUNT OF CHARACTER ',1,1,' IN POS 1 : ',   
            TOT=(10,1,ZD,M10,LENGTH=8)))               
//*                                                     


The output from this is:
COUNT OF CHARACTER B IN POS 1 :        3
COUNT OF CHARACTER C IN POS 1 :        2
COUNT OF CHARACTER Z IN POS 1 :        1
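For readers who want to trace the logic without running the job, the following Python sketch mirrors the same three stages on the same sample input: drop the records that hold the intended value, sort on the character, then emit one total per group. It is an assumption-laden model, not DFSORT: `count_by_position` is a hypothetical name, and right-aligning the total in 8 columns is my reading of the output listed above.

```python
from itertools import groupby
from typing import Iterable, List


def count_by_position(records: Iterable[str], pos: int = 1,
                      expected: str = "A") -> List[str]:
    """Model of the job: omit expected values, sort, total per character."""
    def key(record: str) -> str:
        return record[pos - 1:pos]

    # Stage 1: like OMIT COND, drop records holding the intended value.
    kept = [r for r in records if key(r) != expected]
    # Stage 2: like SORT FIELDS, order by the character being counted.
    kept.sort(key=key)
    # Stage 3: like SECTIONS with a trailer total, one line per character.
    lines = []
    for char, group in groupby(kept, key=key):
        total = sum(1 for _ in group)
        lines.append(f"COUNT OF CHARACTER {char} IN POS {pos} : {total:>8}")
    return lines


sample = ["A", "A", "B", "Z", "B", "C", "C", "B"]
for line in count_by_position(sample):
    print(line)
```

The sort comes before the grouping for the same reason as in the job: totals per section only make sense once equal keys are adjacent.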


If you're not familiar with DFSORT and DFSORT's ICETOOL, I'd suggest reading through "z/OS DFSORT: Getting Started". It's an excellent tutorial, with lots of examples, that will show you how to use DFSORT, DFSORT's ICETOOL and DFSORT Symbols. You can access it online, along with all of the other DFSORT books, from:

http://www.ibm.com/support/docview.wss? ... g3T7000080