
Re: Extended Format Datasets

Posted: Mon Aug 12, 2013 6:09 pm
by steve-myers
I think j2422tw's discussion of "large" and "extended format" data sets is absolutely correct. I am much less comfortable with j2422tw's analysis of the CPU consumption when compression is used.

All I/O requires some CPU time, almost certainly more than j2422tw thinks. The only 100% certain way to reduce the CPU cost of I/O is to reduce the number of real I/O requests, either by using larger data blocks or, as j2422tw suggests, by compressing the data.
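To put rough numbers on the block size point, here is a minimal Python sketch. It assumes fixed-length records and, as a simplification, one real I/O request per block; access methods often chain several blocks into one request, so treat the counts as an upper bound. The file size and block sizes are hypothetical.

```python
import math

def blocks_needed(record_count: int, lrecl: int, blksize: int) -> int:
    """Blocks required to hold fixed-length records, whole records per block."""
    records_per_block = blksize // lrecl
    if records_per_block == 0:
        raise ValueError("blksize must hold at least one record")
    return math.ceil(record_count / records_per_block)

# Hypothetical file: 10 million 80-byte records.
small_blocks = blocks_needed(10_000_000, 80, 800)     # 10 records per block
large_blocks = blocks_needed(10_000_000, 80, 27_920)  # 349 records per block

print(small_blocks)  # 1000000
print(large_blocks)  # 28654
```

Under these assumptions the larger block size cuts the block count by a factor of about 35, and the per-request CPU overhead shrinks with it.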

Data compression also requires CPU time. The amount of time depends on many factors; this is not an appropriate forum for this discussion.

Re: Extended Format Datasets

Posted: Mon Aug 12, 2013 7:44 pm
by mfrookie
All,

Thanks for all the comments.

Just to clarify a few points:

1) The current SORT jobs are very well tuned (I would not say that all of them are tuned, but at least those that get reported for high CPU or elapsed time are). The tuning includes filtering out unnecessary data, extracting only the required fields, using OUTFIL wherever applicable, using the SORTED option on JOINKEYS wherever applicable, and appropriate use of SORTWK files. Some of the long SORT jobs were split into multiple smaller sorts, and then we used MERGE to combine the sorted files. We have seen almost 50% savings in some cases (CPU time, elapsed time, or in some cases both were reduced considerably).
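The split-then-merge idea can be illustrated outside of DFSORT with a small Python sketch. This is only an analogy, not our actual sort control statements: the function name and chunk size are made up for illustration, and on the mainframe each chunk would be sorted by a separate job before the final MERGE step.

```python
import heapq

def split_sort_merge(records, chunk_size, key=None):
    """Sort fixed-size chunks independently, then merge the sorted chunks."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Step 1: split the input and sort each chunk on its own.
    chunks = [
        sorted(records[i:i + chunk_size], key=key)
        for i in range(0, len(records), chunk_size)
    ]
    # Step 2: merge the already-sorted chunks in a single pass.
    return list(heapq.merge(*chunks, key=key))

data = [42, 7, 19, 3, 88, 1, 56, 23, 11]
print(split_sort_merge(data, chunk_size=3))
```

The merge step only compares the current head of each sorted chunk, which is why it is cheap compared with sorting the whole input in one go.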

Still, given the way the volume has been growing, we are very concerned about performance, as the batch cycle is expected to finish within a certain SLA window.

2) I did not confuse LARGE and Extended format datasets. I was just trying to say that because of volume growth, we have already started using LARGE datasets. But to gain further in terms of performance, we were just exploring the idea of using Extended format datasets.

Hi Jerry,

1) Did you face any issues when your shop started using Extended format datasets?
2) Are there any points that you think we should consider, or any issues that might crop up afterwards? The manual says that with striping, performance is best when only one program accesses the dataset at a time, but we have multiple programs accessing the datasets.
3) If we do not compress the data, would there still be any performance gains?
4) Do you or others know of any products (IBM File Manager, etc.) that do not support these datasets? I know that DFSORT supports them. Though I did not find any specific reference to Extended format datasets in the SYNCSORT manual, I am almost certain that it supports them.

All these things will become evident when we start testing with sample jobs, but I was asking for information beforehand so that the testing goes smoothly.

Thanks.

Re: Extended Format Datasets

Posted: Tue Aug 13, 2013 7:34 am
by j2422tw
Steve-myers is right: the trade-off between the CPU used for compression and the I/O that compression eliminates is more complex than I can describe.
The compression ratio is different for each compressed dataset, depending on how the data is organized. Compression and decompression increase CPU consumption while the reduced I/O lowers it, so the difference in total CPU time is not large.
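As a back-of-envelope illustration of that trade-off, here is a small Python sketch. It is a deliberately crude linear model, and every number in it (CPU cost per I/O, CPU cost per byte compressed, compression ratio) is a hypothetical placeholder rather than a measurement from any system.

```python
def net_cpu_change(io_requests, cpu_per_io, ratio, data_bytes, cpu_per_byte):
    """Return CPU seconds added minus CPU seconds saved by compression.

    A negative result means compression lowers total CPU in this model.
    ratio is uncompressed size divided by compressed size (>= 1).
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    # I/O requests avoided because less data is physically transferred.
    io_saved = io_requests * (1 - 1 / ratio)
    cpu_saved = io_saved * cpu_per_io
    # CPU spent compressing or decompressing the data.
    cpu_added = data_bytes * cpu_per_byte
    return cpu_added - cpu_saved

# Hypothetical inputs: 1 million I/Os at 20 microseconds of CPU each,
# 1 GB of data at 10 nanoseconds of CPU per byte, 2:1 compression.
change = net_cpu_change(1_000_000, 20e-6, 2.0, 1_000_000_000, 10e-9)
print(f"net CPU change: {change:+.2f} s")
```

With these made-up inputs the two effects roughly cancel; a better compression ratio tips the result toward a saving and a poorer one toward a cost, which is why each dataset has to be measured.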

mfrookie:

1) Did you face any issues when your shop started using Extended format datasets?
--> If you use an application program written in COBOL or C to process an empty extended format dataset, take care not to use random or dynamic access to load the file, or you will get a VSAM open status 30 error.

2) Are there any points that you think we should consider, or any issues that might crop up afterwards? The manual says that with striping, performance is best when only one program accesses the dataset at a time, but we have multiple programs accessing the datasets.
--> Sorry, I don't know about this criterion. A striped dataset is processed as multiple parts, so performance may degrade when multiple programs process the same stripe on the same volume. But I think PAV volumes can eliminate this situation.

3) If we do not compress the data, would there still be any performance gains?
--> For striped datasets, yes; for non-striped extended format datasets, no.

4) Do you or others know of any products (IBM File Manager, etc.) that do not support these datasets? I know that DFSORT supports them. Though I did not find any specific reference to Extended format datasets in the SYNCSORT manual, I am almost certain that it supports them.
--> As far as I know, no. By the way, we use DFSORT. DFSORT has its own performance tuning, but I am not sure about SYNCSORT.
I think you can use extended format datasets like normal datasets, except for the first point I mentioned above.

Hope this helps.

Jerry

Re: Extended Format Datasets

Posted: Tue Aug 13, 2013 7:00 pm
by mfrookie
Hi Jerry,

Thanks a lot for answering those questions.

We do not have any VSAM files. All of our data is stored in Sequential files.

We still haven't started the exercise and are in the process of collecting information. We will be doing a small exercise using Extended format datasets (with and without striping) to see the gain in performance.

You say that without striping there may not be any gains. I could be wrong, but my understanding is that Extended format datasets get allocated on multiple volumes, so all those volumes can be accessed in parallel. That was not possible with BASIC / LARGE datasets because, if a task is accessing a volume, other tasks trying to access the same volume must wait until the first task is finished. So I feel that there should be some gain even if we do not stripe the data.

Re: Extended Format Datasets

Posted: Wed Aug 14, 2013 5:15 pm
by j2422tw
Hi mfrookie,

Sorry, I did not know you only use SAM for extended format datasets. I checked our site, and we only use VSAM for striped datasets. After checking yesterday's batch jobs, I am sure a striped dataset can be used by 4 jobs at the same time.

Because we have no striped SAM datasets, I ran a test: 2 batch jobs and a TSO user could read the striped SAM dataset concurrently without problems.

A striped dataset gives more performance benefit than basic SAM only if you are not CPU bound, because more data throughput needs more CPU to process it.
If you have sufficient CPU resources, you get the benefit; otherwise the job will still be held back by its CPU usage.
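A rough way to picture this is the Python sketch below. It is an idealized model, not a measurement: it assumes CPU work and I/O overlap perfectly and that I/O time divides evenly across stripes, neither of which holds exactly in practice. The function name and the timings are hypothetical.

```python
def estimated_elapsed(cpu_seconds, io_seconds, stripes):
    """Idealized elapsed time: the slower of CPU work and striped I/O."""
    if stripes < 1:
        raise ValueError("stripes must be at least 1")
    # Striping only shortens the I/O side; the CPU side is a fixed floor.
    return max(cpu_seconds, io_seconds / stripes)

# I/O-bound job: striping helps a lot.
print(estimated_elapsed(50, 400, 1), estimated_elapsed(50, 400, 4))

# CPU-bound job: elapsed time stops improving once it hits the CPU floor.
print(estimated_elapsed(300, 400, 1), estimated_elapsed(300, 400, 4))
```

In this model, adding stripes beyond the point where I/O time drops below CPU time gives no further gain.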

best regards,
Jerry

Re: Extended Format Datasets

Posted: Fri Aug 16, 2013 7:00 pm
by mfrookie
Thanks Jerry.

I will keep it in mind.

I will also try to post the results of our exercise, but it might take a while as we are still in the planning stage.

Thanks.