Update of RegEx library (libpcre) done

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Post by Thierry » Mon Sep 30, 2013 11:59 am

Hi,

I'm happy to announce the successful update of the libpcre library.

Please note that I started with Git, GitHub and the LC sources from scratch.
The main goal of this exercise was to see if I could work on LC sources
for a limited time per day (an hour or so).

Thanks to Mark Wieder who was always willing to answer
my beginner's questions on Git precisely and with a lot of patience.

The old libpcre library was 6.7, from July 2006.
The new one is 8.33, from May 2013.

You can read about the changes:
- http://www.pcre.org/news.txt
- http://www.pcre.org/changelog.txt

For all the details:
- https://github.com/thierrydouez/livecode-thirdparty
- new branch: libpcre_8_33

I've been playing with it for some time on Mac OS X without any problems.

Mark Wieder managed to compile the library on Linux, but it has not been tested there yet.

And now?

Regards,

Thierry
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.

LCMark
Livecode Staff Member
Posts: 1209
Joined: Thu Apr 11, 2013 11:27 am

Post by LCMark » Tue Oct 01, 2013 12:13 pm

@Thierry: Welcome to the world of the engine :)

If you send a pull-request, then I'll take a look. I've had a quick look at the changes needed for the Linux Makefile and it should be easy enough for us to iterate the changes into the other platforms. All being well, we can pull this in for 6.5.

Post by Thierry » Tue Oct 01, 2013 1:06 pm

@runrevmark: Ok, done.


I have done some more successful tests with the new library.

Named subpatterns and references to them, e.g.:

"(?<pal>(?<char>\w)(?:\w?|(?&pal))\k{char})"

The escape sequence \K to reset the match start, e.g.:

"\w+\K([^\d]+)\d+"

And last but not least, Unicode (UTF-8) regex, e.g.:

"(*UTF8)(*UCP)\w+\s+\d+"
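
For readers without the new build at hand, the effect of the last two patterns can be approximated in Python. This is only a sketch under assumptions: Python's `re` module is not PCRE, it has no `\K`, and it does not accept `(*UTF8)(*UCP)`. The sample strings and the helper names `kept_part` and `matches_unicode` are made up for illustration.

```python
import re

# Python's `re` has no \K, so a named group stands in for "everything after \K".
# PCRE pattern from the post: \w+\K([^\d]+)\d+
KEEP = re.compile(r"\w+(?P<kept>[^\d]+\d+)")


def kept_part(text):
    """Return the part of the match that PCRE would report after \\K, or None."""
    m = KEEP.search(text)
    return m.group("kept") if m else None


# With (*UCP), PCRE's \w, \s and \d follow Unicode properties. Python 3 str
# patterns behave that way by default; re.ASCII switches it off for comparison.
UNICODE_AWARE = re.compile(r"\w+\s+\d+")
ASCII_ONLY = re.compile(r"\w+\s+\d+", re.ASCII)


def matches_unicode(text):
    """Return (unicode_aware_result, ascii_only_result) for a full match."""
    return (UNICODE_AWARE.fullmatch(text) is not None,
            ASCII_ONLY.fullmatch(text) is not None)


if __name__ == "__main__":
    print(repr(kept_part("abc --- 123")))
    # "h\u00e9llo" contains a non-ASCII letter, so only the Unicode-aware
    # pattern accepts it as a word.
    print(matches_unicode("h\u00e9llo 42"))
```

The capture group only imitates what `\K` does to the reported match; it does not change where the engine starts matching, so results can differ from PCRE on other inputs.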


Unfortunately, over the years I never kept the regexes that I couldn't run with the old pcre library.
If any of you have valid regexes that do not work,
please send them to me so I can increase the number of tests.


Regards,
Thierry

Post by Thierry » Sat Oct 05, 2013 8:16 pm

Thierry wrote: @runrevmark: Ok, done.

</snip>

I have done some more successful tests with the new library.

And last but not least, Unicode (UTF-8) regex, e.g.:

"(*UTF8)(*UCP)\w+\s+\d+"

Regards,
Thierry

I have just changed some settings in config.h for Unicode.

How should I proceed, given that I already have a pull request waiting?
Is it OK to commit my changes to the same branch,
push them to my GitHub repo and send another pull request?

Regards,

Thierry

monte
VIP Livecode Opensource Backer
Posts: 1564
Joined: Fri Jan 13, 2012 1:47 am

Post by monte » Sat Oct 05, 2013 9:44 pm

If you push to the same branch then it automatically gets added to the pull request until the pull request is closed by someone at RunRev.
LiveCode User Group on Facebook : http://FaceBook.com/groups/LiveCodeUsers/

Post by Thierry » Wed Oct 09, 2013 10:23 am

@monte: Great! Worked as you said :)

Next, I would like to add some features to replaceText().
A new thread is coming.

Thierry

Post by LCMark » Wed Oct 09, 2013 12:05 pm

@Thierry: I pulled in your branch as livecode-thirdparty/libpcre_8_33 and tweaked the Android, iOS and Windows makefiles to work with the new source files. The only minor change I had to make was to ensure PCRE_STATIC was defined; otherwise Visual C complained about linkage issues (it was trying to link pcre as a dynamic library). The resulting branch has been merged into 'develop', so it will appear in the next build of 6.5 :)

Troy
VIP Livecode Opensource Backer
Posts: 35
Joined: Sun May 21, 2006 2:21 am

Post by Troy » Fri Jan 24, 2014 7:14 pm

Hi, sorry to step into this thread, but our search for back-referencing capture groups in LiveCode landed me here.

It sounds like this updated library was being rolled into the distribution? Does that change anything about the way we can use RegEx in LiveCode? Because we're hitting some real stumpers of the sort that require much more complex use of Regex than LiveCode seems to support.

Can anyone involved clarify the (latest) status of Regex at all? If we can't do capture groups internally to LiveCode's implementation, is there some external which improves upon it?

Thanks for any information or pointers.

mwieder
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Location: Berkeley, CA, US

Post by mwieder » Fri Jan 24, 2014 8:31 pm

@Troy: what sort of "stumpers" are you running into? I believe the 6.5 LC releases have access to the entire capabilities of the latest pcre library. Is this just a matter of LC text parsing?

Post by Troy » Sat Jan 25, 2014 1:40 am

mwieder wrote: @Troy: what sort of "stumpers" are you running into? I believe the 6.5 LC releases have access to the entire capabilities of the latest pcre library. Is this just a matter of LC text parsing?

Well.. I guess. Yeah. ;-)

We just can't seem to find any "magic" combination of matchChunk, filter, offset, replaceText, etc. that will do the same things for us as having several named capture groups. Basically, we need ALL of RegEx, not the partial implementation, AFAICT. I'm not saying it can't be done, I'm saying we haven't been able to do it here.

Like parsing the following markdown properly...

* For **one** *thing*, ***I'm*** **not** the *right* **one** to ***ask***

With full regex, this isn't actually a problem... but it stumps us in LiveCode.
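
As a point of comparison, here is a rough sketch of how the emphasis runs in that line could be picked out with one named group and a backreference. It is written in Python, whose `re` module is not the PCRE engine LiveCode uses, so it says nothing about what matchText or replaceText accept. The function name `emphasis_spans` and the em/strong labels are assumptions for illustration, and only simple, non-nested emphasis is handled.

```python
import re

# One to three asterisks open a span; the same run must close it.
# The first character of the text may not be whitespace, so a leading
# "* " list bullet is not mistaken for an opening marker.
EMPHASIS = re.compile(
    r"(?<!\*)(?P<mark>\*{1,3})(?!\*)"   # opening marker, not part of a longer run
    r"(?P<text>[^\s*][^*]*)"            # emphasized text, no asterisks inside
    r"(?P=mark)(?!\*)"                  # backreference: identical closing marker
)

# Hypothetical labels for the three marker lengths.
KIND = {1: "em", 2: "strong", 3: "strong+em"}


def emphasis_spans(line):
    """Return (kind, text) pairs for each simple emphasis span in a line."""
    return [(KIND[len(m.group("mark"))], m.group("text"))
            for m in EMPHASIS.finditer(line)]


if __name__ == "__main__":
    sample = "* For **one** *thing*, ***I'm*** **not** the *right* **one** to ***ask***"
    for kind, text in emphasis_spans(sample):
        print(kind, text)
```

The backreference `(?P=mark)` is what ties each closing marker to its opening one, which is the part that is awkward to express without named or numbered capture groups.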

Post by Thierry » Sat Jan 25, 2014 7:06 am

Troy wrote: Basically, we need ALL of RegEx, not the partial implementation, AFAICT. I'm not saying it can't be done, I'm saying we haven't been able to do it here.

OK, maybe it would first be useful to reduce this sentence to either
"ALL of Regex" or "I don't know how to do it".

Troy wrote: Like parsing the following markdown properly...
* For **one** *thing*, ***I'm*** **not** the *right* **one** to ***ask***
With full regex, this isn't actually a problem... but it stumps us in LiveCode.

Umm, I don't understand *full* regex. Could you elaborate?

Otherwise, if you send me some of your text with a working regex and the expected results,
in whatever language you wrote it, I will check whether this can be done or not...

Please zip your data to avoid any alteration in transit.

Regards,

Thierry
