
XML parsing using SORT

Posted: Sat Jan 24, 2015 12:12 am
by k singh
Hi Frank

I am a beginner on the mainframe and trying to learn, so I need a little guidance.
I generated an XML file using the XMLGEN utility. Now I need to convert this XML to a flat file (comma-separated, hopefully).
My XML file (containing a lot of records) is in the following format:
<?xml version="1.0"?>
<FILE DSN="data">
<sample RECORD="00001">
<NUMBER>1111</NUMBER>
<DATE>2015</DATE>
<TIME>0000</TIME>
<NO>1</NO>
<N01>1</N01>
<N02>2002</N02>
<N03>0</N03>
<N04>0</N04>
</sample>
<sample RECORD="00002">
<NUMBER>2222</NUMBER>
<DATE>2014</DATE>
<TIME>1111</TIME>
<NO>1</NO>
<N01>1000</N01>
<N02>2002</N02>
<N03>10</N03>
<N04>10</N04>
</sample>

Now I am using this JCL to generate the flat output file:
//STEP01   EXEC PGM=SORT
//*
//SORTIN   DD  DSN=File1,DISP=OLD
//SORTOUT  DD  DSN=File2,DISP=OLD
//*
//SYSIN    DD * 
  OPTION COPY
  OUTREC IFTHEN=(WHEN=INIT,PARSE=(%1=(STARTAFT=C'NUMBER>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=INIT,PARSE=(%2=(STARTAFT=C'DATE>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=INIT,PARSE=(%3=(STARTAFT=C'TIME>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=INIT,PARSE=(%4=(STARTAFT=C'NO>',
  ENDBEFR=C'</>',FIXLEN=1))),
  IFTHEN=(WHEN=INIT,PARSE=(%5=(STARTAFT=C'N01>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=INIT,PARSE=(%6=(STARTAFT=C'N02>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=INIT,PARSE=(%7=(STARTAFT=C'N03>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=INIT,PARSE=(%8=(STARTAFT=C'N04>',
  ENDBEFR=C'</>',FIXLEN=4))),
  IFTHEN=(WHEN=NONE,BUILD=(1,4,%1,15:%2,%3,%4,%5,%6,%7,%8))

But I am getting this output:
1111
2015
0000
1
1</N0
2002
0</N0
0</N0

2222
2014
1111
1
1000
2002
10</N0
10</N0
Now I have two issues. First, I want each record on one line, with the values separated by commas:
1111,2015,0000,1,1,2002,0,0
2222,2014,1111,1,1000,2002,10,10
Second, the values of the last 4 columns can vary from no value to 4 digits, but when I specify FIXLEN=4 it picks up the following characters
if the value is shorter than 4 digits.

Please help me with this one. Any guidance or advice on what changes I should make to my JCL to get the required output will be greatly appreciated.

Thanks a lot

Re: XML parsing using SORT

Posted: Sat Jan 24, 2015 3:04 am
by k singh
I was able to solve the FIXLEN problem.
Any help with getting the values onto one line would be greatly appreciated.
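
For readers hitting the same symptom: the first set of control cards ends each field at C'</>', a string that never occurs in the data, so nothing stops the extraction before the FIXLEN limit and short values spill into the closing tag. The later control cards in this thread use C'</' instead. A minimal sketch for a single field, assuming fixed-length input records; this is not necessarily the exact change that was made here:

```
* SKETCH ONLY. ASSUMES FIXED-LENGTH INPUT (NO RDW TO PRESERVE).
* ENDBEFR=C'</' STOPS THE FIELD AT THE START OF THE CLOSING TAG,
* SO A SHORT VALUE SUCH AS 0 IS PADDED WITH BLANKS TO FIXLEN=4.
  OPTION COPY
  INREC PARSE=(%01=(STARTAFT=C'N03>',ENDBEFR=C'</',FIXLEN=4)),
        BUILD=(%01)
```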

Re: XML parsing using SORT

Posted: Sat Jan 24, 2015 6:48 am
by BillyBoyo
Will there always be the same number of input records from <sample.. to </sample>?

Re: XML parsing using SORT

Posted: Mon Jan 26, 2015 2:34 am
by k singh
Hi

For this particular file, the number of input records will always be the same and in the same order.
For other files it can change, and any guidance in that respect will be appreciated,
but for now the fixed number of records is the main concern.

Re: XML parsing using SORT

Posted: Fri Jan 30, 2015 4:36 am
by BillyBoyo
Please don't PM to ask.

Use OMIT COND= to get rid of the data you don't want.

You then have groups of records of equal number. You can use ICETOOL's RESIZE operator to get these onto a line, then use SQZ to get rid of embedded blanks.

You can do the PARSEing part before or after that, it doesn't matter much.

Where you have variable numbers of elements, and not in a known order, you would have to know at least the maximum number of elements. You'd have to PARSE for each field separately.

Get the first one going first, and then we'll see.
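
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested job: it assumes DFSORT's ICETOOL, fixed-length 80-byte input (RESIZE needs fixed-length records, so a VB file would first have to be converted, for example with OUTFIL VTOF), exactly 8 value elements per <sample> group, and values no longer than 4 bytes. The DD names IN and OUT and the CTL1 prefix are arbitrary choices. It uses INCLUDE rather than OMIT to drop the unwanted records, and it does the PARSE before the RESIZE.

```
//STEP01   EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD DSN=File1,DISP=SHR
//OUT      DD DSN=File2,DISP=OLD
//TOOLIN   DD *
* 8 VALUES X 5 BYTES (4-BYTE VALUE + COMMA) = ONE 40-BYTE RECORD
  RESIZE FROM(IN) TO(OUT) TOLEN(40) USING(CTL1)
/*
//CTL1CNTL DD *
* KEEP ONLY THE VALUE LINES: THEY CONTAIN A CLOSING TAG THAT IS
* NEITHER </sample> NOR </FILE>.
  INCLUDE COND=(1,80,SS,EQ,C'</',AND,
                1,80,SS,NE,C'</sample',AND,
                1,80,SS,NE,C'</FILE')
* REDUCE EACH LINE TO ITS VALUE (BLANK-PADDED TO 4) PLUS A COMMA.
  INREC PARSE=(%01=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=4)),
        BUILD=(%01,C',')
* AFTER RESIZE: DROP THE LAST COMMA (BYTE 40), SQUEEZE OUT BLANKS.
  OUTFIL FNAMES=OUT,BUILD=(1,39,SQZ=(SHIFT=LEFT))
/*
```

With the sample data in this thread, the intent is one output record per <sample> group, such as 1111,2015,0000,1,1,2002,0,0. If the number of elements per group differs, TOLEN and the 1,39 in the OUTFIL BUILD would need to change to match.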

Re: XML parsing using SORT

Posted: Thu Feb 05, 2015 9:30 pm
by k singh
I was not able to get the RESIZE operator working, so I tried using WHEN=GROUP logic:
//STEP01   EXEC PGM=SORT
//*
//SORTIN   DD  DSN=file1,DISP=OLD
//SORTOUT  DD  DSN=file2,DISP=OLD
//*
//SYSIN    DD *
OPTION COPY
INREC IFTHEN(WHEN=GROUP,
BEGIN=(7,8,CH,EQ,C'<NUMBER>'),
END=(7,5,CH,EQ,C'<No4>'),
PUSH=(253:ID=1))
OUTFIL PARSE=(%00=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=9),
%01=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=8),
%02=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=8),
%03=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=8),
%04=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=4),
%05=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=4),
%06=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=1),
%07=(STARTAFT=C'>',ENDBEFR=C'</',FIXLEN=1)),
BUILD=(1,4,%00,%01,%02,%03,%04,%05,%06,%07)


But I still got this output:
1111
2015
0000
1
1
2002
0
0

Desired output:
1111,2015,0000,1,1,2002,0,0

I tried using OVERLAY instead of BUILD, but it kept giving an error.

Re: XML parsing using SORT

Posted: Mon Feb 09, 2015 6:24 am
by BillyBoyo
See if this one is any use to you: syncsort-synctool/topic10428.html