Removing spaces from within quoted fields



IBM's flagship sort product DFSORT for sorting, merging, copying, data manipulation and reporting. Includes ICETOOL and ICEGENER

Removing spaces from within quoted fields

Postby LundyGruffalo » Fri May 06, 2016 2:47 pm

A source file is written in CSV format, as a VB file.

The problem is that they didn't "trim" the strings before putting them in quotes, which means the average record length is over 3,000, when it should be about 200.

e.g.
123,"ABC ",45.67,"DEF ",45,"GHI ",999
123,"ABC ",7689.99,"DEF ",234,"GHI ",999

Therefore there is a lot of wasted space in the file. The numeric values can be different lengths, which means not all fields are lined up.

What's the best way of stripping out the spaces from within the quotes?
LundyGruffalo
 
Posts: 2
Joined: Fri May 06, 2016 2:42 pm
Has thanked: 0 time
Been thanked: 0 time

Re: Removing spaces from within quoted fields

Postby NicC » Fri May 06, 2016 3:31 pm

As you can see, there are no extra spaces in your post. (You should have spotted this and corrected when you posted.)

However, if you had used the code tags then this is what you would have:
123,"ABC      ",45.67,"DEF                   ",45,"GHI      ",999
123,"ABC      ",7689.99,"DEF                   ",234,"GHI      ",999
The problem I have is that people can explain things quickly but I can only comprehend slowly.
Regards
Nic
NicC
Global moderator
 
Posts: 3025
Joined: Sun Jul 04, 2010 12:13 am
Location: Pushing up the daisies (almost)
Has thanked: 4 times
Been thanked: 136 times

Re: Removing spaces from within quoted fields

Postby BillyBoyo » Fri May 06, 2016 5:38 pm

Can there be valid embedded spaces? If no, then you can arrange to SQZ the record, and all spaces will disappear (with the record-length automatically adjusted).

If you can have valid embedded blanks the problem is different, so let us know.
BillyBoyo
Global moderator
 
Posts: 3804
Joined: Tue Jan 25, 2011 12:02 am
Has thanked: 22 times
Been thanked: 265 times

Re: Removing spaces from within quoted fields

Postby LundyGruffalo » Mon May 09, 2016 11:04 am

There can be valid embedded spaces. The fields typically contain strings of four or five words, so the spaces between those words must stay. It is just the spaces between the last "valid" character and the closing quotes that need to be removed.
LundyGruffalo
 
Posts: 2
Joined: Fri May 06, 2016 2:42 pm
Has thanked: 0 time
Been thanked: 0 time

Re: Removing spaces from within quoted fields

Postby BillyBoyo » Mon May 09, 2016 7:49 pm

You'll need SQZ with MID=C' '. Note this will also squeeze multiple contiguous embedded blanks if you have those.

You'll need to extend your records to fixed-length first, with an X in the column after the last of the possible data (LRECL+1). Then you can specify the SQZ on all the data, and that will take care of the lengths. I assume you have variable-length records? You will need a FINDREP if you want to get rid of the single trailing blank, IN=C' "',OUT=C'"'.
BillyBoyo
Global moderator
 
Posts: 3804
Joined: Tue Jan 25, 2011 12:02 am
Has thanked: 22 times
Been thanked: 265 times


Return to DFSORT/ICETOOL/ICEGENER

 


  • Related topics
    Replies
    Views
    Last post