|Date:||Thu, 28 Dec 2017 08:38:41 +1000
|From:||Bernie van't Hof <email@example.com>
It certainly appears likely that pdox uses a form of soundex. First
character critical then focusing on next four or so characters. Use sql
A bit of research:
Soundex was originally patented in 1918 (!!). So-called "American
Soundex" was first used in 1930. The accepted modern version is as Knuth
described in 1973. But he also described a modified version in the same
mysql states it uses the original version defined by Knuth, but Wiki
suggests most db's (except PostgreSQL) use the modified version so
things get a bit unclear.
PHP provides soundex per Knuth (but doesn't say which one) as well as
levenshtein and metaphone.
I don't think it really matters which one we use as it seems nobody
knows what pdox was doing anyway.
An interesting trick to solve the first character limitation is to
prepend a common character to each term, but as we're emulating pdox
behavior that is is irrelevant here.
Looks like sql default soundex is an appropriate translation for pdox
On 22/12/17 10:05 pm, Michael Kennedy wrote:
> On 22/12/2017 00:38, Steven Green wrote:
>> don't recall all of the testing that was done, but it was generally
>> concluded that it was skewed towards the first 5 characters of the
>> string, and was less predictable than most dot-dot options.. don't
>> know how you'd want to reproduce it, if you can't predict it
> AFAIR, I NEVER used "like", because I never knew how it worked.
> But, with that 5-char skew that Steve remembered, perhaps it was an
> implementation of (standard?) "Soundex"? If so, it would apply only to
> alphabetic (maybe alphanumeric?) strings:
> - always do an exact match on the very first char,
> - then match on the (english?) "sound" of the next 3/4/5/6/.. chars.
> If QBE has: Like paradox, this would translate to a Soundex code of P632
> Then, DB records with Paradox, Poridex, Piridicks, etc (which also
> translate to P632) would be matches.
> (I've not run the above example).
> - Mike
>> OK Steve, I'll take the bait!
>> What is known of the traps/bugs/features/poo in 'LIKE'?
>> Anyone wanna do some tests?
>> I'll be quiet for a while trying to sensibly parse what we have so far,
>> but will be lurking ...
>> - Bernie
>> This email has been checked for viruses by Avast antivirus software.