Re: Implementing full regex in LiveCode

Posted: Thu Jul 11, 2019 11:29 pm
by mwieder
@Kaveh-

Here you go:

Code:

function convertBackwards pInputString
   local tNumber, tText
   repeat while matchtext(pInputString, " (\d+)([A-Z]+) ", tNumber, tText)
      replace (space & tNumber & tText & space) with (space & tText && tNumber & space) in pInputString
   end repeat
   return pInputString
end convertBackwards

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 12:48 am
by kaveh1000
Thank you so much Mark. This is a great test. So I have produced a test stack, attached.

There are 5000 lines to change in the top field (4998 unique). This takes some 10 seconds in LiveCode. In BBEdit, using

Code:

(\d+)([A-Z]+) > \2 \1
it is instantaneous (less than 1/4 sec I estimate). Regex is fast!!

So the challenge is how to make this faster. Can it be done without native implementation of Regex?

Love to hear expert views.

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 2:23 am
by mwieder
OK - this gets it down to about 250 milliseconds on my computer (I moved tReplaced to a script variable, sReplaced, and I lock the screen before calling the function and unlock it afterwards):

Code:

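function convertBackwards2 pInputString
   local tNumber, tText
   local tReturnString
   
   -- sReplaced is a script local, declared outside this function
   put 0 into sReplaced
   -- split on spaces so matchtext only ever sees one short item at a time
   set the itemdelimiter to space
   repeat for each item tItem in pInputString
      if matchtext(tItem, "(\d+)([A-Z]+)", tNumber, tText) then
         -- the whole item is replaced by "TEXT NUMBER"
         put (tText && tNumber) & space after tReturnString
         add 1 to sReplaced
      else
         put tItem & space after tReturnString
      end if
   end repeat
   -- remove the trailing space added by the last pass through the loop
   delete char -1 of tReturnString
   return tReturnString
end convertBackwards2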
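
For anyone trying to reproduce the timing, here is a minimal sketch of a calling handler with the screen lock and the script variable in place. The field names "input", "output" and "status" are assumptions for illustration and are not taken from the test stack. The handler would sit in the same script as convertBackwards2 so that both see sReplaced.

```livecode
-- script local shared with convertBackwards2
local sReplaced

on mouseUp
   local tStart, tOutput
   lock screen
   put the milliseconds into tStart
   -- field names here are hypothetical
   put convertBackwards2(field "input") into tOutput
   put tOutput into field "output"
   unlock screen
   -- report the count and the elapsed time
   put sReplaced && "replacements in" && (the milliseconds - tStart) && "ms" into field "status"
end mouseUp
```

The elapsed time includes writing the result into the output field, so it is only a rough figure.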

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 8:21 am
by kaveh1000
This is astonishing, Mark. Thank you again. A lot of food for thought. My first reaction:
  • Changing the local variable to a script variable has little or no effect
  • Locking the screen has little or no effect (there is only one screen update), but it is good practice
  • This is a huge improvement, but it is tied to this particular search. E.g. what if there was no space before and after the string? I have a feeling there will be an answer, but I'm just thinking aloud
Very exciting, and I will mull it over as soon as I have a chance.

Kaveh

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 11:27 am
by bogs
I'm curious: as well as locking the screen, wouldn't locking messages while it is running, then unlocking them after it is done, reduce the time? Or do I misunderstand message locking (among other things) :D

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 4:25 pm
by mwieder
Locking messages wouldn't have any effect here because there aren't any side effects that would invoke any other messages. My guess about the speed increase:

The matchtext function is notoriously slow, and is described that way in the dictionary notes. But it does allow you to use regex in the matching string. I'm guessing that pointing it at a single word, even though it's called more often, is faster than pointing it at the entire text. Even so, I'm surprised to end up with a 40x increase in speed.

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 4:29 pm
by mwieder
@Kaveh-
"what if there was no space before and after the string?"
All I have to work with is the text in the sample you posted. I have no idea what your real-world context might be like. The regex might have to be tweaked if you have other restrictions.

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 4:43 pm
by jacque
Could you repeat for each word instead of items? That would ignore extraneous spaces at both ends as well as internally.

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 5:12 pm
by mwieder
That opens up other problems: by-word instead of by-item removes the crlfs so you'd have to add extra code to take care of that.

There's still the problem of the target text being at the end of a sentence, because the period would be dropped from the converted text:

Lorem ipsum dolor sit 1235ABCD feugait nulla facilisi 1235ABCD.
Lorem ipsum dolor sit 1236ABCD feugait nulla facilisi.

==>

Lorem ipsum dolor sit ABCD 1235 feugait nulla facilisi ABCD 1235 ipsum dolor sit ABCD 1236 feugait nulla facilisi.
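
One possible way around both problems, offered as an untimed sketch rather than a tested solution: walk the text line by line so the line breaks are never part of an item, and anchor the pattern so that any trailing punctuation is captured and put back. The function name convertKeepingPunctuation and the variable tPunct are made up for illustration.

```livecode
function convertKeepingPunctuation pInputString
   local tResult, tLine, tItem
   local tNumber, tText, tPunct
   set the itemDelimiter to space
   repeat for each line tLine in pInputString
      repeat for each item tItem in tLine
         put empty into tPunct
         -- anchored: digits, capitals, then optional trailing punctuation
         if matchText(tItem, "^(\d+)([A-Z]+)([[:punct:]]*)$", tNumber, tText, tPunct) then
            put tText && tNumber & tPunct & space after tResult
         else
            put tItem & space after tResult
         end if
      end repeat
      -- swap the trailing space for the line break
      if tLine is not empty then delete char -1 of tResult
      put return after tResult
   end repeat
   -- remove the line break added after the last line
   delete char -1 of tResult
   return tResult
end convertKeepingPunctuation
```

This still assumes the target is separated from its neighbours by spaces, and trailing spaces at the end of a line are not preserved.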

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 5:16 pm
by kaveh1000
Regex is completely general, which is why it is so powerful. And you can do anything with it. But I have a good feeling we can have a good LC solution.

I have a rough idea of first putting a unique delimiter around the strings that need to be changed, then just looking at those. So first convert

Code:

Lorem ipsum dolor sit amet, 1234ABC consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore
zzril delenit 678XYZ augue duis dolore te feugait nulla
facilisi.
to

Code:

Lorem ipsum dolor sit amet, •1234ABC• consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore
zzril delenit •678XYZ• augue duis dolore te feugait nulla
facilisi.
assuming • will not appear in the text, then use • as the itemdelimiter, do the replace, then remove •

This way we have fewer items.

Just a rough thought. I have a project for the next few days but will dive in next week. Wishing all you great folks a happy weekend. :-)

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 5:19 pm
by mwieder
Well, as you so aptly said,
nulla facilisi.

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 6:11 pm
by jacque
Maybe "trueWord" then. It ignores spaces, punctuation, etc. I'm just thinking out loud.

Re: Implementing full regex in LiveCode

Posted: Fri Jul 12, 2019 9:30 pm
by mwieder
Yep - same problem. Using "trueword" ignores punctuation, so the punctuation (period, comma, crlf, etc) never makes the comparison list and gets lost.

Re: Implementing full regex in LiveCode

Posted: Sat Jul 13, 2019 4:57 pm
by jacque
"assuming • will not appear in text, then use • as the itemdelimiter, do the replace, then remove •"
My favorite delimiters are ASCII 3 and 8. They can't be typed, so neither will ever be in the text a user creates.

Re: Implementing full regex in LiveCode

Posted: Sat Jul 13, 2019 6:24 pm
by mwieder
Yeah, mine too. But the problem with this approach for this situation is that if you've already located the text objects you want to delimit with non-printable characters, then you've already solved the problem you're trying to solve by delimiting them, and you might as well convert them in situ.
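
As a rough illustration of converting in situ, matchChunk returns the character positions of each captured group, so the match can be swapped where it is found without relying on surrounding spaces or a delimiter. This is a sketch only and has not been timed against the test stack. The function name convertInSitu is made up for illustration.

```livecode
function convertInSitu pInputString
   local tResult, tLine, tRest
   local tNumStart, tNumEnd, tTextStart, tTextEnd
   repeat for each line tLine in pInputString
      put tLine into tRest
      repeat while matchChunk(tRest, "(\d+)([A-Z]+)", tNumStart, tNumEnd, tTextStart, tTextEnd)
         -- copy everything before the match unchanged
         put char 1 to (tNumStart - 1) of tRest after tResult
         -- then the two groups in swapped order
         put (char tTextStart to tTextEnd of tRest) && (char tNumStart to tNumEnd of tRest) after tResult
         -- keep scanning after the end of the match
         delete char 1 to tTextEnd of tRest
      end repeat
      put tRest & return after tResult
   end repeat
   -- remove the line break added after the last line
   delete char -1 of tResult
   return tResult
end convertInSitu
```

Working one line at a time keeps each string handed to matchChunk short, in the same spirit as the per-item version earlier in the thread. A line break at the very end of the input is not preserved.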