
Re: testing empty dataset

Posted: Sat Jun 06, 2020 5:56 pm
by enrico-sorichetti
why all those stems, why not just


that's exactly what I was suggesting ;)

and for the TS

to see for yourself why "EXECIO * DISKW ..." is bad

use


/* rexx */
trace "o"
signal on novalue name novalue
do i = 1 to 100
   stem.i = "line"i
end
stem.10 = ""
Address TSO
"ALLOC FI(F1) DS('<any PS dataset>') SHR REUS"
"EXECIO * DISKW F1 (STEM STEM. FINIS"
exit
novalue:
say  "*********************************"
say  "**                             **"
say  "** novalue trapped at line" || right(sigl,4) || " **"
say  "**                             **"
say  "*********************************"
exit
 

and run it three times:
1) without statement 7 (stem.10 = "")
2) with statement 7
3) with statement 7, using 100 instead of the "*"
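
To make case 2 easier to reason about before running it on TSO, here is a minimal sketch in plain REXX that only emulates the stop rule, as I read the EXECIO documentation: a "*" write from a stem ends at the first null or uninitialized compound variable. It does not call EXECIO and says nothing about the NOVALUE side of the experiment; the names written and k are made up for the illustration.

```rexx
/* rexx - sketch: emulate where "EXECIO * DISKW ... (STEM" would stop */
do i = 1 to 100
   stem.i = "line"i
end
stem.10 = ""                       /* the null line from statement 7 */

written = 0                        /* hypothetical counter, not an EXECIO value */
do k = 1
   if symbol("stem.k") \== "VAR" then leave   /* uninitialized tail ends the scan */
   if stem.k == "" then leave                 /* a null line ends the scan too */
   written = k
end
say "rows a '*' write would pick up:" written
```

With an explicit count (case 3) there is no such scan: the number of rows is whatever the program says it is.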

Re: testing empty dataset

Posted: Fri Jun 19, 2020 2:43 pm
by samb01




/*  process only selected rows of the stem */
<output_stemvar>.0 = 0
do  i = 1 to <input_stemvar>.0

    if  <some condition> then do
        <output_stemvar>.0 = <output_stemvar>.0 + 1
        j = <output_stemvar>.0
        <output_stemvar>.j = ... ... ...
    end
   
end
/* if the condition is never satisfied
    <output_stemvar>.0 will be 0 and just the EOF will be written */  
"EXECIO" <output_stemvar>.0 "DISKW <output_ddname> ( FINIS STEM <output_stemvar>."

return
 

don't you think there's a mistake in


  j = <output_stemvar>.0
        <output_stemvar>.j = ... ... ...
 


maybe it's j instead of i?

Re: testing empty dataset

Posted: Fri Jun 19, 2020 2:52 pm
by NicC
No. i is iterating over the input. j is keeping track of the output. Please read and understand the code given to you. I also suggest reading the Rexx language reference and the EXECIO documentation and understanding it.
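
A minimal sketch of that pattern in plain REXX, with no file I/O, may help: i walks every input row, while j only moves when a row is kept. The stem names inp. and out., the sample rows and the "keep" test are all made up for the illustration.

```rexx
/* rexx - sketch: i walks the input stem, j tracks the output stem */
inp.1 = "keep A"
inp.2 = "drop B"
inp.3 = "keep C"
inp.0 = 3                          /* hypothetical sample input */

out.0 = 0
do i = 1 to inp.0
   if word(inp.i, 1) = "keep" then do
      j = out.0 + 1                /* next free slot in the output stem */
      out.j = inp.i
      out.0 = j                    /* out.0 always holds the output row count */
   end
end
say out.0 "rows kept:" out.1 "/" out.2
```

Because out.0 starts at 0 and only grows when a row is kept, it can be passed as the explicit record count to EXECIO DISKW even when nothing matched.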

Re: testing empty dataset

Posted: Fri Jun 19, 2020 6:26 pm
by samb01
OK, thanks to your advice, my REXX now looks as follows:


/*REXX*/                                                                                                
DO FOREVER
  "EXECIO 1000 DISKR IN (FINIS STEM DTIN."  
  IF DTIN.0=0 THEN LEAVE                    
    DO J = 1 TO DTIN.0                                          
        IF SUBSTR(DTIN.J,22,9) = "PTL A" & SUBSTR(DT.J,34,2) > 10 THEN
      DO
          DTOUT.0 = DTOUT.0 + 1
          J = DTOUT.0
          DTOUT.J = SUBSTR(DTIN.J,1,80)
      END                                    
    END
END
"EXECIO" DTOUT.0 "DISKW OUT ( FINIS STEM DTOUT."

RETURN                                                
EXIT;            
 

Re: testing empty dataset

Posted: Fri Jun 19, 2020 8:14 pm
by sergeyken
samb01 wrote: OK, thanks to your advice, my REXX now looks as follows:


/*REXX*/                                                                                                
DO FOREVER
  "EXECIO 1000 DISKR IN (FINIS STEM DTIN."  
  IF DTIN.0=0 THEN LEAVE                    
    DO J = 1 TO DTIN.0                                          
        IF SUBSTR(DTIN.J,22,9) = "PTL A" & SUBSTR(DT.J,34,2) > 10 THEN
      DO
          DTOUT.0 = DTOUT.0 + 1
          J = DTOUT.0
          DTOUT.J = SUBSTR(DTIN.J,1,80)
      END                                    
    END
END
"EXECIO" DTOUT.0 "DISKW OUT ( FINIS STEM DTOUT."

RETURN                                                
EXIT;            
 

This code will fall into an endless loop.

Since each DISKR ends with the FINIS option (i.e. the file is closed), the next read re-opens the file at its first record, so the whole bunch of records just read is processed again (and again, and again, and again...). DTIN.0 never becomes 0, so the LEAVE is never reached.

BTW, this “double buffering” by reading 1000 records at a time gives no benefit, only disadvantages. (The first level of buffering is provided by the z/OS I/O system itself.) The code would be much simpler either (1) reading one record at a time into the program stack, or (2) reading all records (*) at once, into a stem or into the program stack. The “performance improvement” from reading records in bunches of 1000 is an old myth, which has very little to do with reality.

Re: testing empty dataset

Posted: Fri Jun 19, 2020 8:24 pm
by samb01
Hello sergeyken,
I used willy jensen's method of reading 1000 records at a time,
because he said never to use "*" in an EXECIO in production...

So what is the right way to do it?

It seems nobody agrees on the right way...

Re: testing empty dataset

Posted: Fri Jun 19, 2020 9:06 pm
by sergeyken
samb01 wrote: Hello sergeyken,
I used willy jensen's method of reading 1000 records at a time,
because he said never to use "*" in an EXECIO in production...

So what is the right way to do it?

It seems nobody agrees on the right way...


Reading in groups of 1000 is possible, but your code logic is wrong. It needs to be changed:

/*REXX*/                                                                                                
"EXECIO 0 DISKR IN (OPEN"  /* optional; can be done by default by the first READ */  
DTOUT.0 = 0      /* fool-proof setting, to avoid REXX errors on undefined variable after I/O error etc. */
DO FOREVER
  "EXECIO 1000 DISKR IN (STEM DTIN."  /* no FINIS, to continue next READ operations */
   IF RC > 2 ,                        /* RC=2 only means EOF was hit inside this batch; above 2 is an I/O error */
    | DTIN.0 = 0 THEN LEAVE           /* a short last batch (RC=2, DTIN.0 > 0) must still be processed */
    DO J = 1 TO DTIN.0                                          
        IF SUBSTR(DTIN.J,22,9) = "PTL A" & SUBSTR(DTIN.J,34,2) > 10 THEN
      DO
            . . . . . . . . . . .
      END                                    
    END J
END
"EXECIO 0 DISKR IN (FINIS"     /* explicit CLOSE for correct logic */
"EXECIO" DTOUT.0 "DISKW OUT ( FINIS STEM DTOUT."
. . . . . . .
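
One detail the elided loop body has to get right: the code posted earlier reuses J both as the loop counter and as the output index, so the first selected record changes the loop variable. A minimal sketch of one batch in plain REXX, assuming a separate output index K (my name, not from the thread) and three made-up sample records laid out in the columns used above:

```rexx
/* rexx - sketch: filter one batch with separate input/output indexes */
/* hypothetical sample records: cols 22-30 hold the tag, cols 34-35 the number */
dtin.1 = left("", 21) || left("PTL A", 9) || "   " || "12"
dtin.2 = left("", 21) || left("PTL A", 9) || "   " || "05"
dtin.3 = left("", 21) || left("XYZ B", 9) || "   " || "99"
dtin.0 = 3

dtout.0 = 0
do j = 1 to dtin.0
   if substr(dtin.j, 22, 9) = "PTL A" & substr(dtin.j, 34, 2) > 10 then do
      k = dtout.0 + 1                    /* K, not J: J still drives the loop */
      dtout.k = substr(dtin.j, 1, 80)    /* SUBSTR pads short records to 80 */
      dtout.0 = k
   end
end
say dtout.0 "record(s) selected"
```

In the real exec the three assignments to dtin. would come from the EXECIO DISKR call, and dtout.0 would be set to 0 once, before the read loop, so that it accumulates across batches.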
 


Or, reading records one-by-one

/*REXX*/                                                                                                
"EXECIO 0 DISKR IN (OPEN"  /* optional; can be done by default by the first READ */  
Do I = 1 By 1
  "EXECIO 1 DISKR IN"  /* no FINIS, get one record into program stack */
   IF RC <> 0 THEN LEAVE                       /* recommended to check for I/O errors of any kind */
   Parse Pull NewRecord                                          
   IF SUBSTR( NewRecord, 22, 9) = "PTL A" ,
    & SUBSTR( NewRecord, 34, 2 ) > 10 THEN
      DO
         . . . . . . .
      END                                    
END I
"EXECIO 0 DISKR IN (FINIS"     /* explicit CLOSE for correct logic */
"EXECIO" DTOUT.0 "DISKW OUT (FINIS STEM DTOUT."
. . . . . . .
 


Or, reading all records at once (in your example it may be either possible or not, depending on expected size of your input)

/*REXX*/                                                                                                
"EXECIO * DISKR IN (FINIS"  /* read all records at once into the program stack, and CLOSE file */  
IF RC <> 0 THEN SIGNAL IOERROR       /* recommended to handle I/O somehow */
StackSize = Queued()
Do I = 1 To StackSize
   Parse Pull NewRecord                                          
   IF SUBSTR( NewRecord, 22, 9) = "PTL A" ,
    & SUBSTR( NewRecord, 34, 2 ) > 10 THEN
      DO
          Queue NewRecord            /* keep good record at the end of your stack */
      END                                    
END I
/* At this point all input records have been extracted from the stack,
   and "good" records added at the end of it */
"EXECIO" Queued() "DISKW OUT (FINIS" /* Write exactly the queued records; with "*" the write stops at a null line, or waits for input if there is none */
. . . . . . .
 

Re: testing empty dataset

Posted: Sat Jun 20, 2020 3:15 pm
by willy jensen
reading records in bunches of 1000 is an old myth, which has very little to do with reality.

I beg to differ, I just did a quick test in a z/OS 2.4 system:
Program 1:
count=0                                                          
t1=time('e')                                                    
do forever                                                      
  "Execio 1000 diskr in (stem in.)"                              
  if in.0=0 then leave                                          
  do ini=1 to in.0                                              
    count=count+1                                                
  end                                                            
end                                                              
"Execio 0 diskr in (finis)"                                      
say 'count:' count', time:' time('e')-t1', cpu:' sysvar('syscpu')

Program 2:
count=0                                                          
t1=time('e')                                                      
do forever                                                        
  "Execio 1 diskr in (stem in.)"                                  
  if in.0=0 then leave                                            
  count=count+1                                                  
end                                                              
"Execio 0 diskr in (finis)"                                      
say 'count:' count', time:' time('e')-t1', cpu:' sysvar('syscpu')
 

The results were:
count: 84000, time: 0.129330, cpu: 0.10
and
count: 84000, time: 0.376596, cpu: 0.34
So reading one record at a time is more expensive. And I found that the difference is much larger on an older, smaller system. Whether the added cost is significant of course depends.
But I grant you that the I/O count was the same.
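
For scale, a small sketch that puts the two measurements above side by side. It uses only the figures posted here; the variable names are mine, and the per-record figure is a plain average over this one run, not a general cost.

```rexx
/* rexx - sketch: compare the two elapsed times reported above */
records = 84000
t1000   = 0.129330                 /* elapsed, 1000 records per EXECIO */
t1      = 0.376596                 /* elapsed, 1 record per EXECIO     */

diff   = t1 - t1000                          /* extra elapsed seconds  */
ratio  = t1 / t1000                          /* how many times slower  */
perRec = diff / records * 1000000            /* microseconds per record */

say "elapsed difference:" format(diff,,3) "sec"
say "elapsed ratio     :" format(ratio,,2)
say "extra per record  :" format(perRec,,2) "microseconds"
```

So on this system the one-record loop takes roughly three times as long, which still comes to about a quarter of a second over 84000 records.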

Re: testing empty dataset

Posted: Sat Jun 20, 2020 6:29 pm
by sergeyken
willy jensen wrote:
reading records in bunches of 1000 is an old myth, which has very little to do with reality.

I beg to differ, I just did a quick test
The results were:
count: 84000, time: 0.129330, cpu: 0.10
and
count: 84000, time: 0.376596, cpu: 0.34
So reading one record at a time is more expensive. And I found that the difference is much larger on an older, smaller system. Whether the added cost is significant of course depends.
But I grant you that the I/O count was the same.

1) I doubt that anybody would care about a difference of 0.2 sec for datasets below 100,000-500,000 records

2) I doubt that anyone dealing with datasets >100,000,000,000 records would ask such a question on a beginners' forum

3) I doubt that a system dealing with 100,000,000,000 records would be based on REXX code

4) Any real system dealing with 100,000,000,000 records usually does some kind of record processing besides plain READ/WRITE, and that usually takes much longer. If not (if the processing is simple), then any high-performance standard utility can be used instead of REXX

Etc...

After all, in real life it makes no sense to complicate the logic in order to “save” 0.2 sec... :D

Re: testing empty dataset

Posted: Sun Jun 21, 2020 2:23 am
by willy jensen
in real life it makes no sense to complicate the logic in order to “save” 0.2 sec

No argument there, but as I said, for smaller systems it does make a difference, and I like to program for the lowest common denominator. Ah well, each to her/his own. ;)