by steve-myers » Thu Jan 06, 2011 10:03 pm
Those statistics are not coming from JES2, either. They are coming from SMF data as it is being examined by the IEFACTRT SMF exit routine. One of z/OS's dirty little secrets is that CPU time, especially for quickly running jobs, is not repeatable. There are several reasons for this, and two stand out. One is that I/O, generally for other jobs, steals CPU cycles to access memory. The other, and this is especially true for modern hardware where you have cache memory and CPU instruction "pipelines," is that every time the CPU processes an interrupt it is a disaster for your program. First, the instruction "pipeline," which took storage accesses to fill, is abandoned. Second, by the time the computer completes processing the interrupt, and often running some other process, the storage cache has been filled with data from someone else, and until your data has displaced it your program runs more slowly. When your program restarts, it has to refill the instruction "pipeline" and cache memory before it is running at 100%. All of this is charged to you. SRB time is even more flaky than the CPU time you see in your JESMSGLG and JESYSMSG datasets, especially since most of it has nothing to do with your job.
As Mr. Sample says, elapsed time is even less repeatable than CPU time, as it is very dependent on other workloads in the system.
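You can see the same effect on almost any platform, not just z/OS. Here is a rough sketch in Python that runs an identical piece of work several times and reports how much the CPU time and the elapsed time vary between runs. The names workload, measure and spread are made up for this illustration, and the figures come from the Python runtime's own timers, not from SMF or IEFACTRT, so treat it only as a demonstration of the idea.

```python
import statistics
import time


def workload(n=200_000):
    """Fixed amount of CPU-bound work; identical on every call."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure(runs=10):
    """Run the same workload `runs` times; return (cpu, elapsed) sample lists."""
    cpu, elapsed = [], []
    for _ in range(runs):
        c0 = time.process_time()   # CPU time charged to this process
        e0 = time.perf_counter()   # wall-clock (elapsed) time
        workload()
        cpu.append(time.process_time() - c0)
        elapsed.append(time.perf_counter() - e0)
    return cpu, elapsed


def spread(samples):
    """Relative spread (max - min) / mean; 0.0 when the mean is zero."""
    mean = statistics.mean(samples)
    return (max(samples) - min(samples)) / mean if mean else 0.0


if __name__ == "__main__":
    cpu, elapsed = measure()
    print(f"CPU time spread:     {spread(cpu):.1%}")
    print(f"Elapsed time spread: {spread(elapsed):.1%}")
```

The actual percentages depend entirely on what else the machine is doing, which is the point. On a busy system you would typically expect the elapsed spread to be the larger of the two.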
We mainframe types have always been obsessed with CPU time, even though it has never been very repeatable. As the real cost of CPU has declined over time, I increasingly wonder whether this obsession is still realistic, though using CPU time as a rough measure of program efficiency is a valid use of the statistic.