CA Endevor

View Only

Sample Rexx Procedure for Parsing CSV data

Recommend

May 13, 2016 06:45 PM

Eoin OCleirigh

Accurately parsing Endevor CSV data can be challenging when some of the fields can contain free text like descriptions, comments etc.

Using the CSV LIST OPTION DELIMITER '|' can help but to be more accurate you need to look for the delimiters AND the bounding quotes.
The following two procedures work like the builtin WORD and WORDS procedures but instead of looking for space delimited words look for CSV Delimited words with the broken bar.

DWords for "delimited words" is a slightly more complex version of CWords

(comma delimited words) it check to see if the words are also wrapped in

quotes in addition to a delimiter of "¦" and if so, it also stripps them.

This allows it to work more reliably with Endevor's CSV outpu, but it's

still not guaranteed against folks adding double.quote¦double.quote into

a text field. Long term Endevor's CSV should probably more closely match

EXCEL's standard and or allow escape! when a delimiter is found in the data

DWords:procedure

parse arg inword

if length(inword) = 0 then return 0

select /* check for Quote or Apost delimiters */

when left(inword,1) == '"' then TgtStr = '"¦"'

when left(inword,1) == "'" then TgtStr = "'¦'"

othersise then TgtStr = "¦"

end

lenTgt = length(TgtStr) /* if extra delim's used extend comp len */

count = 1

do i = 1 by 1 to length(inword)

if substr(inword,i,lenTgt) == TgtStr then

count = count + 1

end

return count

DWord:procedure

parse arg inword, incount

sa= "inword:'" || inword || "' and count:" incount

select /* check for Quote or Apost delimiters */

when left(inword,1) == '"' then TgtStr = '"¦"'

when left(inword,1) == "'" then TgtStr = "'¦'"

othersise then TgtStr = "¦"

end

lenTgt = length(TgtStr) /* if extra delim's used extend comp len */

if lenTgt > 1 then /* Calculate an offset depending on lenth*/

if incount = 1 then offset = 1 /* the First word will only have one*/

else offset = 2 /* otherwise we have to skip both */

else offset = 0 /* but if no delim's then ignore */

outword = ''

if incount = 0,

| incount > dwords(inword) then return outword

count = 1 /* start by skiping to end of prev word */

do i = 1 by 1 while count < incount

if substr(inword,i,lenTgt) == TgtStr then

count = count + 1

end

do j = I+offset by 1 until j >= length(inword) /* start with offset */

thisChr = substr(inword,j,lenTgt)

if thisChr == TgtStr then leave /* get out if next fnd */

if j = length(inword) then /* or we find trailing */

if substr(inword,j,1) == left(TgtStr,1) then leave /* delimiter */

outword = Outword||left(thischr,1)

end

return strip(outword)

Is this helpful?

Please share some of the things you are doing with CSV data.

Statistics

0 Favorited

5 Views

0 Files

0 Shares

0 Downloads

CA Endevor

Sample Rexx Procedure for Parsing CSV data

Related Entries and Links