Subject: Re: text parsing - A Dog's Breakfast
Date: Thu, 9 May 2019 21:38:48 -0800
From: Peter <peter@removespamwhiteknight.email>
Newsgroups: pnews.paradox-programming

The CSV file is produced by another program, so I have to work with its output.
I cannot use a fixed-length approach: the first and second fields vary in
length, as do other fields.
You can see the embedded " preceded by a \ in sample record 2.
Sample record 1
31:1,9:2,"00001"," ",0,4800,"SAMMY ",2400,"LAB ",5,1,0,0,"Test
Work Order",0
Sample record 2
31:282927,9:61874,"04UBV"," ",0,12000,"JERRY ",0,"LA
",0,1,1,0,"CONCERN: OVERHEATING (AT/PAST \"H\" ON GAUGE), WITH \"ENGINE
TEMPERATURE HIGH",-112
I almost have a solution. There are 15 fields, so when the array has more
than 15 elements, e.g. 17, I concatenate elements 14, 15 and 16 to get the
string I need to put into a single field in the table.
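
A minimal sketch of that rejoin step, written in Python rather than ObjectPAL
so it can be run stand-alone. It assumes the free-text field is always the
14th of 15 and that no other field ever contains a comma. The function name
rejoin_fields, its parameters, and the test record are all made up for the
illustration.

```python
def rejoin_fields(line, expected=15, text_index=13):
    """Split on commas, then glue the surplus pieces back into the text field.

    Assumption: only the field at text_index (0-based) can contain commas.
    """
    parts = line.split(",")
    extra = len(parts) - expected
    if extra > 0:
        # The text field was cut into extra + 1 pieces; rejoin them with the
        # commas that the split removed.
        merged = ",".join(parts[text_index:text_index + extra + 1])
        parts[text_index:text_index + extra + 1] = [merged]
    return parts


# Made-up record shaped like the samples above, with two commas in the text.
record = '31:1,9:2,"00001"," ",0,4800,"SAM",2400,"LAB",5,1,0,0,"Oil, filter, plugs",0'
fields = rejoin_fields(record)
print(len(fields), fields[13])
```

This only holds while the commas stay in that one field. A comma inside any
other quoted field would shift the element numbers.
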
The trick is to account for all the possible placements of special
characters and spacing.
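
One way to cover every placement might be to stop splitting on commas and
instead scan each record one character at a time, tracking whether the scan
is inside quotes. Below is a sketch in Python (not ObjectPAL). It assumes a
backslash always escapes the character after it, and that each record has
already been joined into a single line. The name split_record is hypothetical.
The test string is sample record 2 with its wrapped lines rejoined by a single
space, which is itself an assumption about the real data.

```python
def split_record(line, sep=",", quote='"', escape="\\"):
    """Split one record into fields, honouring quotes and backslash escapes."""
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            # Keep the escaped character literally and drop the backslash.
            current.append(line[i + 1])
            i += 2
            continue
        if ch == quote:
            # An unescaped quote opens or closes a quoted run; it is not kept.
            in_quotes = not in_quotes
        elif ch == sep and not in_quotes:
            # Only a separator outside quotes ends the field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


sample = ('31:282927,9:61874,"04UBV"," ",0,12000,"JERRY ",0,"LA ",0,1,1,0,'
          '"CONCERN: OVERHEATING (AT/PAST \\"H\\" ON GAUGE), WITH \\"ENGINE '
          'TEMPERATURE HIGH",-112')
fields = split_record(sample)
print(len(fields))
print(fields[13])
```

The same loop should carry over to ObjectPAL as a walk over the string with
one logical flag, with no need to guess which array elements to glue together.
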
On 05/09/2019 12:28 PM, Michael Kennedy wrote:
> Peter,
>
> The correct approach to handling these "CSV" files is simple, and handles
> everything that can be dumped into the file...
>
> Decide on:
> - A field-separator character - in your case, it's a comma.
> - A field-delimiter character - in your case, it seems to be a
> double-quote. (Must be non-blank, different from the separator, etc,
> obviously!).
> - How to handle leading and trailing spaces/zeros per field, etc...
>
> When creating the file, use the field-separator between all fields.
> That's it! Then check the contents of each data-field:
> - If the data-field is empty, or only numeric, or maybe a "numeric"
> date, these fields probably do not need the surrounding field-delimiters.
> - All other fields need the field-delimiter around them.
> - If any data-field contains any embedded field-delimiter characters,
> double them (" becomes "") when the CSV file is being created, and
> "halve" them when the fields are being extracted from the CSV file.
>
> Simple! Handles everything!
>
> If you don't have control over the creation of your text, let us know,
> and maybe give us some worst-case examples... there might be NO way to
> distinguish between quotes/commas that belong to the data, and
> quotes/commas that do not!
>
> - Mike
>
>
> On 09/05/2019 06:29, Peter wrote:
>> I am running into a problem parsing a text file.
>>
>> I am reading the entire file into an array, all good.
>> The text file uses a comma (,) to separate fields
>> I use breakApart to split the fields into an array
>> str.breakApart(ary2,",")
>>
>> The problem is that many times there is a literal comma in the text
>> line that is not a field separator.
>> eg: "The quick brown fox, jumped over something I forget"
>> In the sample line I get two fields (two array elements) when it
>> should be one. Further, I have to save the comma because there is
>> another text line to add to it.
>>
>> How can I handle this?
>> Can I wrap the comma with something to make it a character?
>>
>> Thanks for any ideas.
>>
>>
>> Peter