Paradox Community

Items in pnews.paradox-dos

Subject:Re: date grief
Date:Wed, 20 Dec 2017 17:27:52 +1000
From:Bernie van't Hof <berniev@bje.com.au>
Newsgroups:pnews.paradox-dos
I'm officially OVER pdox dos qbe wildcards, so this is absolutely my 
last post in this topic.

Default format MM/DD/YY
3/3/99 => ok
03/03/99 => ok
3/03/99 => ok
03/3/99 => ok

3/03/.. => ok
3/03/9.. => ok
3/03/99.. => ok

3/3/99.. => empty
3/3/9.. => empty

03/3/99.. => empty
03/3/9.. => empty

03/03/99.. => empty
03/03/9.. => empty

3.. => ok
03.. => empty

../3/99 => empty
../03/99 => ok

..3/99 => ok

Wildcard parsing is a pile of crap.

Over and out.

- Bernie

On 20/12/17 10:54 am, Leslie wrote:
> Hi Bernie,
> 
> I am following this more as a mental exercise than any goal because I 
> used to write compilers back in the day for TRS80/Z80 machines, so I 
> find it interesting.
> 
> ============================================
> 
> Anyway it appears that the #6 result with the modified test data shows 
> two things:
> 
> 1. "PP" + "O".. *is not* being treated as ("PP + "O")..
> but likely (unproven) as "PP" + ("O"..)
> 
> and
> 
> 2.  It does not resolve to any of the following combinations
> "PP" + ""
> "PP + "O",
> or "PP" + "OO"
> 
> What this means is that when a pattern match placeholder is present, 
> Paradox does *not* break the expression down into sub-expressions and do 
> a recursive term/factor evaluation.
> 
> In fact, the only conclusion I can make is that (at least for a string 
> field) as soon as there is a pattern match placeholder present, the 
> presence of any arithmetic operators produces an empty result set.
> 
> Which closely follows the results when using patterns in date fields.
> So perhaps it is much simpler than we thought.
> 
> ============================================
> 
> In terms of the rules:
> 
> I am getting a picture in my mind that there are multiple expression 
> evaluators going on inside Paradox and *two sets of evaluation rules 
> exist per field type*.
> 
> One is the "expected" field type specific term/factor processing.
> 
> The other being that as soon as the .. pattern matching placeholder is 
> present, a different evaluation process kicks in.
> 
> My guess is that .. was added later and rather than modify the existing 
> sets of rules and potentially breaking things, they added a new set 
> which gives them better control over the pattern matching *per field type*.
> 
> I also believe that an expression with both .. and arithmetic operators 
> present work only either by accident or specific limited cases were 
> coded to work - with .. always taking priority.
> 
> IMO this locked them into the parsing nightmare you are seeing, with 
> somethings working as expected but mostly producing empty result sets.
> 
> Unless we disassemble Paradox itself, I cannot think of a way to prove 
> this but it makes sense given the results across your various posts.
> 
> 
> ============================================
> 
> The last nagging issue for me is test #5. This should have resulted with 
> "PPOO".
> 
> Instead some sort of syntax check kicked in to produce the error message.
> 
> So in this case, I assume that if one part of the expression is quoted 
> then all parts must be quoted.
> 
> It is interesting (and annoying) that Calc is different.
> 
> ============================================
> 
> In conclusion and without further information to hand.....
> 
> I think that if .. and arithmetic operators are present then returning 
> an empty result set is a valid response. In other words, the table does 
> not even get read.
> 
> And finally as I suggested in a previous post,
> "PP" + "O".. is not the same as 'PP' + 'O%' in SQL.
> Leslie.
> 
> 
> 
> On 20/12/2017 7:29 AM, Bernie van't Hof wrote:
>> Thanks for staying with me on this.
>>
>> Firstly, following your suggestion I added records:
>> O
>> P
>> OO
>> PP
>>
>> Kept:
>> PPOO
>>
>>    Query         Prev     Now
>>    ==========    ======== ========
>> 1. PPOO       =>      PPOO
>> 2. "PP"+"OO"  =>      PPOO
>> 3. "PP"..     => PPOO     PP, PPOO
>> 4. "PP".."O"  =>      PPOO
>> 5. PP+"OO"    =>   "Wrong type" (amended) But CALC PP+"OO" => PPOO (!)
>> 6. "PP"+"O".. =>      Empty
>>
>> So the only difference is #3 which behaves as expected.
>>
>> My Stream/Tokeniser/Parser combination actually works quite well. It is
>> simple left-to-right with effectively one function per token (actually
>> uses switch statements) handles precedence and is recursive. I actually
>> did study a number of docs like you suggest when writing it.
>>
>> I recognise that issues in qbe may require some extra smarts in the
>> parser, but so far the it has handled everything (except some qbe
>> expressions) that has been thrown at it.
>>
>> Summary of difficulties:
>>     a) It is proving very difficult to understand how pal qbe is MEANT
>>        to be interpreted. As seen in these discussions there appears
>>        no simple answer, and we have the best pdox dos minds on the
>>        subject.
>>     b) Operators in dates have context-sensitive meaning.
>>        E.g. '/', '-' can be arith or date sep, '.' decimal or date sep.
>>     c) 3 (or 4) date formats (Except where a wildcard is used) using 3
>>        different separators.
>>     d) qbe doesn't rely on white-space separation
>>        E.g. 23-Sep-2000-1-2 is legal and means 23-Sep-2000 -1 -2
>>     e) Wildcards plus arith operators appears illogical. We can fit an
>>        argument to suit the question, but that's not much help to the
>>        'normal' programmer who reasonably expects consistency with the
>>        way pal works. With respect, that "PP".."O" is interesting, after
>>        all the discussions here, is telling, but that example DOES work
>>        exactly as one might expect.
>>     f) Separators in dates with wildcards must be per CCP default.
>>        Others are not supported.
>>     g) If a wildcard is used with an arith op the answer is empty. Even,
>>        as we have seen most recently, with string literal concatenation.
>>     h) In a date calc the left side must be a date
>>     i) Different behaviours in CALC
>>
>> At the risk of repeating myself, the current problem is not how to parse
>> the qbe, it is precisely defining its rules/grammar.
>>
>> IMHO Results so far indicate we have learnt to make our queries fit what
>> works, not following some simple, consistent, well-documented rules.
>>
>> You can't write the grammar if you don't know what you expect it to do
>> or if what it is to do doesn't make reasonable sense, is ambiguous or
>> inconsistent.
>>
>> The parsing certainly has difficulties as outlined, but that can be
>> handled one way or another later, currently experimenting with a sort of
>> state machine.
>>
>>
>> SUMMARY:
>> My current target rule set looks like:
>>    /**
>>      * =============== Rules=============
>>      *
>>      * Supported date formats:
>>      *     MM/DD/YY
>>      *     DD-Mon-YY
>>      *     DD.MM.YY
>>      *     YY.MM.D
>>      *
>>      * YY can be Y, YY, YYY or YYYY
>>      * 2-Digit years: >=70 => 19xx, <70 => 20xx
>>      *
>>      * Date calc left side must be date (might remove this one)
>>      * Arith only if left side is date or num
>>      * String concat "PP"+OO and PP+"OO" ok
>>      * Recognise only default separator (per CCP)
>>      * No arith allowed in expressions containing wildcards (treat as
>> potential separators)
>>      * Non-Integer numeric components rounded UP in date calcs (makes
>> sense as anything after a date is a later date)
>>      * Space terminates date (but not required)
>>      * ... is DotDot Dot (lexer)
>>      */
>>
>> Note I have decided (for now) to only allow date separator per CCP
>> default, and will allow string concat as shown.
>>
>> - Bernie
>> On 19/12/17 9:07 pm, Leslie wrote:
>>> Bernie,
>>>
>>> You seem to be getting totally confused because it makes sense to me,
>>> and this is why you need to build a proper expression evaluator into
>>> your parser.
>>>
>>> So "PP" + "O"..
>>>
>>> effectively becomes "PP" + ("O"..) and not ("PP" + "O")..
>>> which I think is what you are seeing. Of course you could specify the
>>> brackets in the QBE yourself if you meant the latter.
>>>
>>> Now it depends upon Paradox as to how the expression gets evaluated,
>>> but I would guess that ("O"..) gets evaluated first and fails to
>>> match, and so we now get:
>>>
>>> "PP" + failed match result, which could either immediately fail to
>>> match or gets evaluated as "PP" + "" which also fails to match if you
>>> only have "PPOO" in the table.
>>>
>>>
>>> Even if Paradox evaluates "PP" first, then you get: fails to match +
>>> "OO".. which again could either immediately fail to match, or be seen
>>> as "" + "OO".., which again fails to match.
>>>
>>> So I do not expect a silent fail but rather an empty answer table
>>> based upon the test data.
>>>
>>> If you adjust your test data to have a record with just "PP" and also
>>> just "OO" what gets returned? Give variations like this in your test
>>> data a try to see if you can prove what happens.
>>>
>>> Note that "PP".. + "O" should also return no results (depending upon
>>> your test data, because ("PP"..) + "O" would result in "PPOO" + "O"
>>> which becomes "PPOOO" and does not match.
>>>
>>> Again, create a test record, to try and prove this.
>>>
>>>
>>>
>>>
>>> The one I do find interesting is: "PP".."O"
>>>
>>> Because a correct result gets returned, it means that it truly is
>>> seeing .. as a pattern matching placeholder and not an operator and
>>> thus matches on the single string PP..OO, - there is actually very
>>> clever coding going on inside Paradox to handle this.
>>>
>>>
>>> Finally....
>>>
>>> Bernie I know the Paradox documentation refers to .. as a wildcard,
>>> but we have now proven many times that in fact it is not.
>>>
>>> .. is actually a pattern match placeholder, which matches differently
>>> depending upon the actual field type.
>>>
>>> In other words although @ could be considered to equate to the ?
>>> standard wildcard
>>>
>>> .. cannot be considered to equate to the * standard wildcard or even
>>> the % SQL operator (but it is close in some cases).
>>>
>>> You should compare it to pattern characters in regex expressions to
>>> get a better idea of what it it trying to achieve (albeit more
>>> primitive).
>>>
>>>
>>> Also if you rewrite the expressions on paper to include brackets as I
>>> have above it also may help to clarify things when the results appear
>>> confusing.
>>> Leslie.
>>>
>>>
>>>
>>> On 19/12/2017 1:12 PM, Bernie van't Hof wrote:
>>>> It doesn't end..
>>>>
>>>> Data in column: PPOO
>>>> In corresponding text field query:
>>>> PPOO       => OK
>>>> "PP"+"OO"  => OK
>>>> "PP"..     => OK
>>>> "PP".."O"  => OK
>>>> PP+"OO"    => OK
>>>> "PP"+"O".. => Silent fail
>>>>
>>>> Looks like string concatenation operator is is same bucket as arith 
>>>> when
>>>> wildcard present.
>>>>
>>>> I guess after all we've been through I shouldn't be surprised.
>>>>
>>>> - Bernie
>>>> PS: Merry Christmas!
>>>>
>>>> On 12/12/17 12:07 am, Leslie wrote:
>>>>> I have been thinking about the leading Zero problem as per Dan 
>>>>> Ehrmann.
>>>>>
>>>>> Now this may not need to be said, but I will say it anyway as it may
>>>>> help explain some of the results.
>>>>>
>>>>> The fact that ../9/.. does not match records when it should, but
>>>>> ../09/.. does is actually quite important.
>>>>>
>>>>> What this tells me, and I am ready to stand corrected, is that Paradox
>>>>> is not performing any true evaluations with .. and probably @@ as 
>>>>> well.
>>>>>
>>>>> What I think Paradox is doing is retrieving the value from the record
>>>>> and constructing it into its display format and holding as a string.
>>>>>
>>>>> Then it is doing a string comparison, whilst also catering for
>>>>> wildcards. So ../9/.. will never pattern match 01/09/2016 which is an
>>>>> example of what gets constructed - hence an empty result set.
>>>>>
>>>>> This is completely different from how I would do it, which would be to
>>>>> retrieve the date, extract the Day, Month and Year components and
>>>>> compare their values with those specified in the QBE (which I would
>>>>> also convert to shorts in the process). In other words no pattern
>>>>> matching occurs which removes the need for display formatting
>>>>> information. However, they did not do this as it might have been too
>>>>> expensive in terms of the code required (we only have 640K remember).
>>>>>
>>>>> So, this means that Paradox really is doing pattern matching when a ..
>>>>> operator is present. This may also be true for other types of fields
>>>>> as well, in fact I would put money on it.
>>>>>
>>>>> But do not forget that Dan also states that 5/9/89 does return the
>>>>> correct result set without needing the leading zero. In this scenario,
>>>>> I believe that Paradox is doing binary matching and so the display
>>>>> format never comes into play.
>>>>>
>>>>>
>>>>> Therefore, and I think this is what Bernie shows with his results, QBE
>>>>> expressions get evaluated using an entirely different algorithm when
>>>>> .. is present and that is likely always as a string and possibly for
>>>>> any field types.
>>>>>
>>>>> I also think I can explain the 1+ and +1 before and after the pattern
>>>>> as well.
>>>>>
>>>>> 1 + Date pattern should be able to be evaluated like Date Pattern + 1
>>>>> does. Instead we get the Wrong type error.
>>>>>
>>>>> Upon reflection I think this is due simply how they parse the Date
>>>>> expression to extract out the date pattern itself. So they are parsing
>>>>> left to right in a single pass and then when they hit a non-date
>>>>> operator *after* the *date* expression/pattern they then syntax check
>>>>> it probably to make sure they got a valid pattern.
>>>>>
>>>>> At this point we get the Wrong Type error (but Syntax Error might have
>>>>> been a better message).
>>>>>
>>>>> Operators after the date do get evaluated correctly as they have
>>>>> already processed/evaluated the pattern match and so they can then
>>>>> apply date arithmetic as they continue on their merry way parsing left
>>>>> to right.
>>>>>
>>>>> A bit of a mess and I would need to test this theory out to be sure,
>>>>> but what it comes down to is that some things could be done for cheap
>>>>> or even free in terms of the coding, but others could require
>>>>> something more complex such as multiple passes, which would require
>>>>> more memory.
>>>>>
>>>>> So, fancy stuff got dropped and this was all probably documented. We
>>>>> just don't have that documentation.
>>>>>
>>>>> So to conclude, I think that once the wildcard operators are present
>>>>> Paradox then does string matching and this can cause strange results
>>>>> to be produced as Bernie reports.
>>>>>
>>>>> It makes sense (to me at least).
>>>>> Leslie.
>>>>>
>>>>> ---
>>>>> This email has been checked for viruses by AVG.
>>>>> http://www.avg.com
>>>>>


Copyright © 2004 thedbcommunity.com