
How to do this, Sort, DB2 or COBOL?

PostPosted: Tue Apr 22, 2008 4:36 pm
by bond_ajay
I am trying to meet the requirement below. I need your help to know whether it can be achieved.

High-level requirement:
To fetch data from a table grouped into 15-minute intervals on a timestamp column.

1. I need to create an output file with the fields: key-1, data-1, metric-1, metric-2, metric-3 ... metric-96.
2. I need to sum the quantity values for every 15-minute interval in the table to arrive at metric-1, metric-2, and so on.

Scenario:
I have product-id (key), dept, qty, and a timestamp in a table. I need to sum the qty field for each unique product-id/dept combination for the first 15 minutes (00:00 hrs to 00:15 hrs) to get sum(qty) as metric-1, then sum(qty) for the next 15-minute interval (00:16 hrs to 00:30 hrs) to get metric-2, and so on for every 15-minute interval (96 intervals in 24 hours). The output then needs to be written with product-id, dept, and the 96 metric values.
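
For example, in the sample data below, the row stamped 00.10.00 falls in the first interval and so contributes to metric-1, while the rows stamped 00.20.00 and 00.25.00 fall in the second interval and contribute to metric-2.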

Question: Is it possible to get this data from the table with just one query, calling a paragraph 96 times to build the 96 interval values and create one output record? Is there a better way to achieve this?

Please let me know if the requirement is not clear or if more inputs are required.

Sample input data (this is what I have in the table):

Key  Dept  Qty  Timestamp
1    X     5    2008-04-01-00.00.00.000000
1    X     5    2008-04-01-00.10.00.000000
1    X     5    2008-04-01-00.20.00.000000
1    X     5    2008-04-01-00.25.00.000000
1    X     5    2008-04-01-01.01.00.000000
1    X     5    2008-04-01-01.05.00.000000
1    X     5    2008-04-01-01.10.00.000000
1    X     5    2008-04-01-01.15.00.000000
1    X     5    2008-04-01-01.20.00.000000
1    X     5    2008-04-01-01.25.00.000000
1    X     5    2008-04-01-01.28.00.000000
1    X     5    2008-04-01-01.30.00.000000


Required output data from the query (this is what I need in the output file; each metric is sum(qty) for that 15-minute interval):

Key  Dept  Metric-1  Metric-2  Metric-3  Metric-4  Metric-5  ...  Metric-96
1    X     10        10        0         0         20        ...

Thanks,
Ajay

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Tue Apr 22, 2008 6:20 pm
by arunprasad.k
Here is a possible solution.

#1 Schedule a job to run every 15 minutes.
#2 Have a COBOL program execute the following query using a fetch (a refinement is sketched below the list):

SELECT PRODUCT_ID, DEPT, SUM(QTY) FROM TABLE_1 GROUP BY PRODUCT_ID, DEPT

#3 Have a count file to note the run number. For the first run this file will contain 1, for the second run 2, and so on.
#4 Have your required output file as a VSAM file keyed on PRODUCT-ID and DEPT, and hold the METRIC field in an array with OCCURS 96. This file will be deleted/defined (del/def) daily.
#5 Update the VSAM file with the same COBOL program on each run, using the count from the count file as the subscript into the METRIC array.
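
One caveat on the query in #2: as written, it would re-total every row in the table on every run. Assuming the timestamp column is called TS (adjust to your actual column name), each run would probably want to restrict itself to its own 15-minute window, something like:

SELECT PRODUCT_ID, DEPT, SUM(QTY)
  FROM TABLE_1
 WHERE TS >= :WS-INTERVAL-START
   AND TS <  :WS-INTERVAL-END
 GROUP BY PRODUCT_ID, DEPT

with the host variables holding the boundaries of the current interval.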

Post back if any of the points above are unclear.

Arun.

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Wed Apr 23, 2008 12:30 am
by dick scherrer
Hello,

If this were my requirement, I'd probably do something very similar to what Arun has suggested.

The difference is that I would use DB2 to store the collected data rather than a VSAM file. I would also not destroy the collected data every day, so that I'd be able to produce a wider range of analysis reports.

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Wed Apr 23, 2008 10:03 am
by bond_ajay
Is it not possible to get the output using one job that runs only once? Our requirement is to run a job/program that produces the output file once a day; i.e., the job has a time window of 3:00 AM to 5:00 AM, so I can't schedule the job to run every 15 minutes. It has to be a daily job that runs in the aforementioned time slot.

Is there any way to achieve this with a DB2/COBOL program, or by using an unload and the DFSORT utility? Please provide your inputs.

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Wed Apr 23, 2008 10:10 am
by bond_ajay
To add to that, the transaction volumes are going to be in the millions, and we need to finish within 2 hours. Can you suggest whether we can create temp tables with the data we want and then use those tables to create the output file?

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Wed Apr 23, 2008 10:20 am
by dick scherrer
Hello,

Yes, you could unload the data into a QSAM file, process that as input, and create the new records you need for that day.

I would expect this would run in considerably less than 2 hours. . .
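
For example (just a sketch of the idea), the unloaded record might carry only PRODUCT-ID, DEPT, QTY, and the timestamp (or an already-derived interval number), and the daily program would simply roll those records up into the 96 buckets.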

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Wed Apr 23, 2008 10:55 am
by bond_ajay
dick scherrer,
The point is, will we have to run 96 unloads, or 96 queries in the program, to get the data? Per our requirement, we need the data grouped by 15-minute interval from a table. So what should the query look like to load the temp tables?

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Wed Apr 23, 2008 5:17 pm
by arunprasad.k
Dick,

I suggested VSAM just for easy random read and update. DB2 would also be fine.

I wanted the VSAM file del/defed because I don't want the previous day's data mixed up with the current day's. We can very well take a backup before the del/def.

Arun.

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Thu Apr 24, 2008 1:39 am
by dick scherrer
Hello,

""The point is, will we have to run 96 unloads or 96 queries in the program to get the data""
Why?

Unless I completely misunderstand the requirement, you could unload the needed data into one QSAM file and then process that data, creating the summaries you need. These results could be put into a VSAM file or a DB2 table (as mentioned above).

Keep in mind that just because the source of the data is a database, not every process is best served by being restricted to real-time database queries. Often, performance is much better when a large process runs against extracted data rather than via SQL queries against the database.
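
To make it concrete, a single unload query could carry a derived interval number. Assuming your timestamp column is called TS and your intervals run 00:00-00:15, 00:15-00:30, and so on (adjust the boundary arithmetic to your exact spec), something along these lines should work:

SELECT PRODUCT_ID, DEPT, INTERVAL_NO, SUM(QTY) AS METRIC
  FROM (SELECT PRODUCT_ID, DEPT, QTY,
               HOUR(TS) * 4 + MINUTE(TS) / 15 + 1 AS INTERVAL_NO
          FROM TABLE_1
         WHERE DATE(TS) = CURRENT DATE - 1 DAY) AS T
 GROUP BY PRODUCT_ID, DEPT, INTERVAL_NO

That yields at most 96 rows per product-id/dept combination, which the daily program (or a sort step) can then pivot into the single 96-metric output record.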

If there's something I've not understood correctly, please advise and we'll have another round :)

Re: How to do this, Sort, DB2 or COBOL?

PostPosted: Fri Apr 25, 2008 3:23 pm
by bond_ajay
Dick/Arun,
I cannot run the job every 15 minutes; I need to run the job only once a day.

""Yes, you could unload the data into a qsam file, process that as input, and create the new records you need for that day.
I would expect this would run in considerably less than 2 hours. . .""

So, per your suggestion, I can unload the data into a QSAM file and use it to create the new records. I can group by the keys (store#), but my question is: how can I group by the 15-minute interval from this file to get the desired output?
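
Or should I derive an interval number from the timestamp itself? For example, 01.28.00 would give (1 x 4) + (28 / 15) + 1 = 6, so that record's qty would be summed into metric-6. I'm not sure I have the boundary handling right, though.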

It would be really great if you could give some samples or examples.

Also, let me know whether processing 9 million records daily with this logic will cause any performance problems.

Thanks,
Ajay