Implementing full regex in LiveCode

kaveh1000
Livecode Opensource Backer
Posts: 508
Joined: Sun Dec 18, 2011 7:23 pm
Location: London

Re: Implementing full regex in LiveCode

Post by kaveh1000 » Sat Jul 13, 2019 6:40 pm

Thanks for the hints on the chars. But sometimes it is good to see the characters for debugging purposes, so I have used very arcane characters that I know will not occur.

Mark, on the delimiter, I admit I was thinking aloud. Here is my "loud" thinking.

Am I right that the speed increase in your code was because you were searching each time just in one item (one word) and in my original code I was looking in the whole code every time?

If so, it is much faster for LiveCode to look in a small item than the whole text, even if that means getting the next item every time. But still that can be a lot of items.

So I thought if we can quickly put a custom delimiter around the items of interest (and that does not take significant time) then we have minimized the number of items to look at.

In fact, it is even better if we can look at every *other* item, as we know the in-between text is of no interest, and it is also probably much longer than the text we have pinpointed.

Does that make sense?
Kaveh
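The "every other item" idea above can be sketched outside LiveCode. This is a minimal illustration in Python, not the code from the test stack: splitting on a pattern that has one capturing group returns a list in which the in-between text and the matches alternate, so only the odd-numbered items need any work. The function name `swap_every_other_item` and the sample string are assumptions made for this sketch.

```python
import re

# One capturing group around the whole pattern, so that re.split keeps
# the matched text in the result instead of discarding it.
WHOLE_MATCH = re.compile(r"(\d+[A-Z]+)")
PARTS = re.compile(r"(\d+)([A-Z]+)")


def swap_every_other_item(text: str) -> str:
    """Hypothetical helper: split once, then touch only the matched items."""
    items = WHOLE_MATCH.split(text)
    # items[0], items[2], ... are in-between text; items[1], items[3], ...
    # are the matches, so we step through the odd indices only.
    for i in range(1, len(items), 2):
        digits, letters = PARTS.fullmatch(items[i]).groups()
        items[i] = f"{letters} {digits}"
    return "".join(items)


print(swap_every_other_item("foo 12AB bar 7XYZ baz"))
```

The split itself still scans the whole text once, so the saving, if any, comes from never re-examining the in-between items.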

mwieder
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Location: Berkeley, CA, US

Post by mwieder » Sun Jul 14, 2019 3:33 am

Yeah, you have to run matchtext() more times in the faster solution; so my guess is that matchtext on a single word is going to be faster than running matchtext on the entire text to see if there's a hit.

What I still don't get is this:

If you need to go through the entire text looking for hits and delimiting them, and then go through the entire text again converting the delimited items, why not just convert them when you find them the first time?
Am I missing something here?
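For illustration, converting each hit at the moment it is found can be written as a single pass. The sketch below is in Python rather than LiveCode and uses the `(\d+)([A-Z]+)` pattern quoted later in the thread; the function name `convert_in_one_pass` is hypothetical.

```python
import re

PATTERN = re.compile(r"(\d+)([A-Z]+)")


def convert_in_one_pass(text: str) -> str:
    """Hypothetical helper: walk the text once, converting each hit as found."""
    pieces = []
    last_end = 0
    for match in PATTERN.finditer(text):
        # Copy the untouched text between the previous hit and this one.
        pieces.append(text[last_end:match.start()])
        # Convert the hit immediately: letters first, then digits.
        pieces.append(f"{match.group(2)} {match.group(1)}")
        last_end = match.end()
    # Copy whatever follows the final hit.
    pieces.append(text[last_end:])
    return "".join(pieces)


print(convert_in_one_pass("abc 12XY def"))
```

Collecting the pieces in a list and joining once at the end avoids rebuilding the whole string after every replacement.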

Post by kaveh1000 » Sun Jul 14, 2019 11:13 am

Very good question, Mark. The simple answer is that right now the script assumes each match is a whole word. But regex is totally general. The script fails if you have:

Code: Select all

Lorem ipsum dolor sit 1235ABCD1288ABCD feugait nulla facilisi.

It is converted to:

Code: Select all

Lorem ipsum dolor sit ABCD 1235 feugait nulla facilisi.

instead of:

Code: Select all

Lorem ipsum dolor sit ABCD 1235ABCD 1288 feugait nulla facilisi.

So I am trying to mark each pattern first. But then we are going around in circles, as we are slowing the script down again.
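As a point of comparison, a regex engine with a global replace handles two adjacent matches inside one word without any delimiter step. The sketch below uses Python's `re.sub` purely as an illustration; it is not LiveCode and not the script under discussion, and the name `swap_all` is hypothetical.

```python
import re


def swap_all(text: str) -> str:
    """Hypothetical helper: replace every (digits)(letters) run in the text."""
    # \2 \1 puts the letters first, then a space, then the digits.
    return re.sub(r"(\d+)([A-Z]+)", r"\2 \1", text)


sample = "Lorem ipsum dolor sit 1235ABCD1288ABCD feugait nulla facilisi."
print(swap_all(sample))
# prints: Lorem ipsum dolor sit ABCD 1235ABCD 1288 feugait nulla facilisi.
```

The output agrees with the intended result quoted above, because the engine resumes scanning right after each match rather than moving on to the next word.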

Still thinking...
Kaveh

Post by mwieder » Sun Jul 14, 2019 5:10 pm

While you're thinking...

Maybe you could post what your real-world problem is. It's hard to craft a generic solution without knowing the parameters. Do you really have something like 1235ABCD1288ABCD in the corpus of the text you're working with, or is this just blue-skying?

Post by kaveh1000 » Sun Jul 14, 2019 6:54 pm

Mark, every real-world problem is different, so we cannot predict what comes next. This was just a quick example to show that we cannot guarantee that each item is a word. I am going to have another think and come back.

On this particular point, it was blue sky. But it shows that whatever delimiters you choose, you might have more than one match in each item. So if you take return as the delimiter, then you would need to loop within each paragraph, which would increase the time again.

Thanks again for engaging and giving energy to this thread. ;-)
Kaveh

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Post by Thierry » Mon Aug 19, 2019 7:39 pm

kaveh1000 wrote: So I have produced a test stack, attached.

There are 5000 lines to change in the top field (4998 unique).
This takes some 10 seconds in LiveCode. In BBEdit, using

Code: Select all

(\d+)([A-Z]+) > \2 \1
it is instantaneous (less than 1/4 sec I estimate). Regex is fast!!

So the challenge is how to make this faster.

Love to hear expert views.

Hi Kaveh,

Somehow I missed your post.

Using the sunnYrex library,
here are my results you might find interesting:

With the script in your convert button:


[Image: Screenshot 2019-08-19 at 20.24.19.png]


With the use of sunnYreplace(...):


[Image: Screenshot 2019-08-19 at 20.30.51.png]

The figures say it all.

And of course, it's only 1 line of code!

But never expect regex replacements in LC to be faster than in BBEdit; not a chance.
And in fact, this has nothing to do with the PCRE engine!!!

Regards,


Thierry
Last edited by Thierry on Thu Nov 17, 2022 12:18 pm, edited 2 times in total.

Post by kaveh1000 » Mon Aug 19, 2019 7:50 pm

Great to hear this, Thierry and thanks for trying it. Good to know...
Kaveh
