Incoming file, delimited, no line feed, spanning records?

Posted: Wed May 28, 2014 7:59 pm
by Peter J
Hello, I have been searching but could not find out whether this is possible to handle.

The data looks like this:

2014-01-01 12:12:12,data,data,,,,data,data,,,data,,2014-01-01 12:12:22,data,data,,,,data,data,data,data,data,,2014-01-01 12:12:32,data,data,,,,data,data,,,data,,<continues to EOF>da
ta,data,,2014-01-01 12:12:12,data,data,,,,data,data,,,data,,

This continues until the maximum (VB) record length is reached, and then the data wraps onto the next record. So in essence it's one giant record. A 50GB record.
Each logical record begins with a timestamp, followed by data values or nulls separated by the comma delimiter.

Can I read this effectively and create separate records?

2014-01-01 12:12:12,data,data,,data,,data,data,,,data,,<CRLF>
2014-01-01 12:12:22,data,data,,,,data,data,data,data,data,,<CRLF>
2014-01-01 12:12:32,data,data,,,,data,data,,,data,,<CRLF>
...

This is some sort of internet data dump coming in, and we're arguing for line feeds to be added.

Re: Incoming file, delimited, no line feed, spanning records

Posted: Wed May 28, 2014 8:56 pm
by steve-myers
The data you described probably started life as a Windoze text file, and was sent to MVS either as a binary transfer or by the 3270 file transfer service using ASCII without CRLF. This assumes <CRLF> is not a figment of your imagination.

OS/360 data sets do not use record delimiter characters of any sort. IBM made that mistake with the 14xx series of computers and made a deliberate decision not to repeat it with System/360. RECFM V and VB data sets are delimited by physical record boundaries, Block Descriptor Words (BDWs), and Record Descriptor Words (RDWs). RECFM F and FB data sets are delimited by physical record boundaries and the data set's LRECL. RECFM U data sets are delimited by physical record boundaries. The "Selecting Record Formats for Non-VSAM Data Sets" chapter in "DFSMS Using Data Sets" for your z/OS release discusses this matter in more detail.
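
To make the BDW/RDW point concrete, here is a minimal Python sketch (off-host, not a mainframe utility) that walks a RECFM=VB byte image. It assumes a hypothetical image in which both descriptor words survived the transfer, which many transfer methods do not preserve, and it assumes unspanned records with the standard 4-byte descriptor words. The function name `iter_vb_records` is made up for illustration.

```python
import struct
from typing import Iterator


def iter_vb_records(image: bytes) -> Iterator[bytes]:
    """Yield the data portion of each record in a RECFM=VB byte image.

    Assumption: every block starts with a 4-byte BDW and every record
    with a 4-byte RDW, and no record is spanned across blocks.
    """
    pos = 0
    while pos < len(image):
        # BDW: 2-byte big-endian block length (includes the BDW itself),
        # followed by 2 bytes that are zero for ordinary blocks.
        (block_len,) = struct.unpack_from(">H", image, pos)
        if block_len < 4:
            raise ValueError(f"bad BDW at offset {pos}")
        block_end = pos + block_len
        pos += 4
        while pos < block_end:
            # RDW: 2-byte big-endian record length (includes the RDW itself),
            # followed by 2 bytes of segment information (zero if unspanned).
            (rec_len,) = struct.unpack_from(">H", image, pos)
            if rec_len < 4:
                raise ValueError(f"bad RDW at offset {pos}")
            yield image[pos + 4 : pos + rec_len]
            pos += rec_len
```

The record boundaries come entirely from the length fields; nothing in the data itself is inspected, which is why no delimiter character is needed.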

The good news is it should be fairly easy to transform this data to a standard OS/360 data set. Your sort product may be able to do it, though that's out of my field, or it should be possible to write a program for the purpose.

Re: Incoming file, delimited, no line feed, spanning records

Posted: Wed May 28, 2014 10:19 pm
by BillyBoyo
And there's me just about to say it'll be tricky in SORT. Perhaps.

Is there a fixed number of fields? How many? What is the maximum number of fields on a record? What is the LRECL? Which SORT product, and which version of it (from the sysout of any SORT step)?

Re: Incoming file, delimited, no line feed, spanning records

Posted: Wed May 28, 2014 10:44 pm
by Peter J
z/OS DFSORT V1R12

Fixed number of fields: 28, so they say. But when you count the column delimiters, the 28th field may or may not be the timestamp that separates the data.
The current leaning is towards the maximum VB length, 32767.

Which means, I suspect, I would have to examine each data element and determine whether it starts with 2014- or 2015- and so on, to say: ah, OK, this is a new record.
If there were one or two records per row, then no problem. What I can't work out is this: if I come to the end of a line and the record continues on the next one, how would I represent that?
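
The continuation problem can be pictured with a minimal Python sketch. This is not DFSORT and not a tested mainframe solution, only an illustration of the carry-over idea: whatever follows the last comma of a physical record is held back and glued onto the front of the next one. It assumes the timestamp always has the shape shown in the sample data, that every field is followed by a comma, and that the physical records have already been decoded to text. The names `split_records` and `TIMESTAMP` are made up for illustration.

```python
import re
from typing import Iterable, Iterator, List

# Assumed timestamp shape, taken from the sample data: YYYY-MM-DD HH:MM:SS
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def split_records(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one logical record per timestamp from wrapped physical records."""
    carry = ""              # partial field cut off at the end of the last chunk
    fields: List[str] = []  # complete fields of the record being built
    for chunk in chunks:
        parts = (carry + chunk).split(",")
        # The piece after the last comma may be incomplete: hold it back.
        carry = parts.pop()
        for field in parts:
            if TIMESTAMP.fullmatch(field) and fields:
                # A timestamp starts a new logical record: emit the old one.
                yield "".join(f + "," for f in fields)
                fields = []
            fields.append(field)
    # Flush whatever is left at end of input (carry has no trailing comma).
    if fields or carry:
        yield "".join(f + "," for f in fields) + carry


# Example: the second timestamp is cut in half by the wrap.
wrapped = [
    "2014-01-01 12:12:12,data,,data,,2014-01-",
    "01 12:12:22,data,data,,",
]
for record in split_records(wrapped):
    print(record)
```

Because only the text after the last comma is carried forward, memory use stays bounded by one logical record regardless of the total file size. A data value that happens to look like a timestamp would cause a false split, so the stated count of 28 fields could serve as a cross-check.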