Regular Expressions and matchText Help


ARAS
Posts: 55
Joined: Sat Nov 02, 2013 5:35 pm

Post by ARAS » Fri Nov 15, 2013 8:00 pm

Hi All,

I wrote the code below to find the value for a person called "Name Surname". The value is 0,5, as you can see in the field content.
I want to get the result in the message box.
The code seems to work, because it doesn't return "not found". However, I only get an empty message box. What is the mistake in the code?

Could you help me find it?

Thanks,
ARAS

Code: Select all

on mouseUp
   answer funwithRegex( "Name Surname" )
end mouseUp

function funwithRegex theName
   put quote into q

   repeat for 44
      put space after s
   end repeat

   if matchText( field 1, "(?m)<td class="&q&"kfnt"&q&" bgcolor="&q&"#ffffff"&q&" height="&q&"20"&q&">&nbsp;"&theName&"</td>\n"&s&"<td class="&q&"kfnt"&q&" bgcolor="&q&"#ffffff"&q&" height="&q&"20"&q&" width="&q&"107"&q&">\n"&s&"<p align="&q&"center"&q&">([\d,]+)</p>\n"&s&"</td>", theValue) then
   else
      return "not found"
   end if
end funwithRegex

Field 1 Content

Code: Select all

<td class="kfnt" bgcolor="#ffffff" height="20">&nbsp;Name Surname</td>
                                            <td class="kfnt" bgcolor="#ffffff" height="20" width="107">
                                            <p align="center">0,5</p>
                                            </td>

Mark
Livecode Opensource Backer
Posts: 5150
Joined: Thu Feb 23, 2006 9:24 pm

Post by Mark » Sat Nov 16, 2013 12:22 am

Hi Aras,

You forgot the line "return theValue" in the "then" branch, so the function returns empty when the match succeeds. Also, you can generate the spaces much more quickly. I would put spaces around ampersands. That makes your script much(!) more readable.

Code: Select all

function funwithRegex theName
   put quote into q
   set the itemDel to space
   put space into item 44 of s
   if matchText( field 1, "(?m)<td class="&q&"kfnt"&q&" bgcolor="&q&"#ffffff"&q&" height="&q&"20"&q&">&nbsp;"&theName&"</td>\n"&s&"<td class="&q&"kfnt"&q&" bgcolor="&q&"#ffffff"&q&" height="&q&"20"&q&" width="&q&"107"&q&">\n"&s&"<p align="&q&"center"&q&">([\d,]+)</p>\n"&s&"</td>", theValue) then
      return theValue
   else
      return "not found"
   end if
end funwithRegex

Kind regards,

Mark
The biggest LiveCode group on Facebook: https://www.facebook.com/groups/livecode.developers
The book "Programming LiveCode for the Real Beginner"! Get it here! http://tinyurl.com/book-livecode
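
A minimal sketch for checking the space trick above. The variable names tItemWay and tLoopWay are made up for illustration. It builds the run of spaces both ways and shows the two lengths, which should be equal if setting item 44 really yields 43 delimiter spaces plus one more space.

```livecode
on mouseUp
   local tItemWay, tLoopWay
   -- item trick: items 1 to 43 stay empty, item 44 holds one space
   set the itemDelimiter to space
   put space into item 44 of tItemWay
   -- plain loop, as in the original question
   repeat 44 times
      put space after tLoopWay
   end repeat
   -- show both lengths side by side
   answer length(tItemWay) && length(tLoopWay)
end mouseUp
```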

Post by ARAS » Sat Nov 16, 2013 7:05 pm

It works! Thank you Mark,

Yes, you are right. I'd better put spaces around ampersands.

Thanks for the tips :)

Best wishes,
ARAS

Post by ARAS » Tue Nov 19, 2013 5:34 pm

Hi,

I am having another problem.

This code doesn't work for the text below. It always returns "not found", and I don't know why. However, I have noticed something: there are 6 spaces, but they are not normal spaces.

I would appreciate it if someone could help me.

Code: Select all

on mouseUp
  answer funwithRegex( "NAME SURNAME" )
end mouseUp

function funwithRegex theName
   put quote into q
   set the itemDel to space
   put space into item 6 of s
   if matchText( field 1, "(?m)<td bgcolor=" &q& "#FFFFFF" &q& " class=" &q& "kfnt" &q& " height=" &q& "20" &q& ">&nbsp;" & theName & "</td>\n" &s& "<td bgcolor=" &q& "#FFFFFF" &q& " class=" &q& "kfnt" &q& " width=" &q& "107" &q& " height=" &q& "20" &q& "><p align=" &q& "center" &q& ">([\d,]+)</td>", theValue) then
      return theValue
   else
      return "not found"
   end if
end funwithRegex

Code: Select all

						<td bgcolor="#FFFFFF" class="kfnt" height="20">&nbsp;NAME SURNAME</td>
						<td bgcolor="#FFFFFF" class="kfnt" width="107" height="20"><p align="center">0,5</td>

Post by ARAS » Tue Nov 19, 2013 7:47 pm

In the field text in my last reply, there seem to be many spaces, but there are actually only 6. When I click edit, I can see that there are 6. However, they are wider than a normal space character, maybe as wide as a tab.

Post by ARAS » Wed Nov 20, 2013 5:21 pm

I copied the whitespace character and checked its ASCII code. It is indeed a tab: I got ASCII 9. I changed the code to use the line below instead of a space, but it made no difference.

This one also seems to create commas.

Code: Select all

put numToChar(9) into item 6 of s

However, this one didn't work either.

Code: Select all

put numToChar(9) & numToChar(9) & numToChar(9) & numToChar(9) & numToChar(9) & numToChar(9) into s

I also tried copying the characters straight from the source code, but that doesn't work either.

Code: Select all

put "						" into s
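
A quick way to see which characters really sit in front of a tag is to list their character codes. This is only a sketch: whitespaceCodes is a hypothetical helper, and the usage handler assumes the line of interest is line 2 of field 1.

```livecode
function whitespaceCodes pLine
   local tCodes
   repeat for each char tChar in pLine
      -- stop at the first character of the tag
      if tChar is "<" then exit repeat
      -- collect the code of each leading character
      put charToNum(tChar) & space after tCodes
   end repeat
   return tCodes
end whitespaceCodes

on mouseUp
   answer whitespaceCodes(line 2 of field 1)
end mouseUp
```

A tab shows up as 9 and a space as 32, so a mixed run of tabs and spaces is easy to spot.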

Simon
VIP Livecode Opensource Backer
Posts: 3901
Joined: Sat Mar 24, 2007 2:54 am

Post by Simon » Wed Nov 20, 2013 10:05 pm

Hi ARAS,
Have you tried:

Code: Select all

put tab & tab & tab & tab & tab & tab into s

tab is a constant.

Simon
I used to be a newbie but then I learned how to spell teh correctly and now I'm a noob!

Post by ARAS » Thu Nov 21, 2013 9:23 am

Simon wrote:
Hi ARAS,
Have you tried:

Code: Select all

put tab & tab & tab & tab & tab & tab into s

tab is a constant.

Simon

Hi Simon,

I tried this one, but it didn't work. Maybe it is because of something else. Do you see any other mistakes? I can't see one :(

Code: Select all

put tab into item 6 of s

I will try your code later, when I am able to.

Thanks
ARAS

Post by ARAS » Thu Nov 21, 2013 5:01 pm

Simon, I have tried it, but it returns "not found".

It seems the problem is not the tab.
ARAS
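
One way to sidestep counting whitespace altogether, sketched under assumptions: valueAfterName is a hypothetical function, the markup is assumed to look like the two samples in this thread, and the name is assumed to contain no regex metacharacters. In the pattern, \s* absorbs any run of spaces, tabs and line breaks, and [^>]* skips whatever attributes the tags carry.

```livecode
function valueAfterName pHtml, pName
   local tRegex, tValue
   -- name cell, then any whitespace, then the next cell and its paragraph
   put ";" & pName & "</td>\s*<td[^>]*>\s*<p[^>]*>([\d,]+)" into tRegex
   if matchText(pHtml, tRegex, tValue) then return tValue
   return "not found"
end valueAfterName

on mouseUp
   answer valueAfterName(field 1, "NAME SURNAME")
end mouseUp
```

Because the pattern no longer depends on the exact number or kind of whitespace characters, it does not matter whether the indentation is 44 spaces or 6 tabs.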

Thierry
VIP Livecode Opensource Backer
Posts: 875
Joined: Wed Nov 22, 2006 3:42 pm

Post by Thierry » Thu Nov 21, 2013 8:22 pm

Hello Aras,

I'm coming with a quick solution...

Here I took 2 sets of your previous text samples and changed the values for testing:

Code: Select all

<td class="kfnt" bgcolor="#ffffff" height="20">&nbsp;Aras fromTurkey</td>
                                            <td class="kfnt" bgcolor="#ffffff" height="20" width="107">
                                            <p align="center">0,5</p>
                                            </td>


                  <td bgcolor="#FFFFFF" class="kfnt" height="20">&nbsp;Thierry Douez</td>
                  <td bgcolor="#FFFFFF" class="kfnt" width="107" height="20"><p align="center">0,42</td>

Notice that class="kfnt" is not at a fixed position (it is swapped with bgcolor)!

Then I apply this piece of code:

Code: Select all

function seriouswithRegex theName
   put "(?msi)<td\s.*?class=.kfnt..*?;" & theName & "</td>[^<]+" into r1
   put "<td\s.*?class=.kfnt..*?<p align=.center.>([^<]+?)</" into r2
   put r1 & r2 into regex
   if matchText( field 1, regex, theValue) then
      return theValue
   else
      return "not found"
   end if
end seriouswithRegex

and test it with:

Code: Select all

on mouseUp
   answer seriouswithRegex( "Aras fromTurkey" )
   answer seriouswithRegex( "Thierry Douez" )
end mouseUp

I'm sorry, I'm too busy to explain everything in detail, but you can probably
take something out of this. I'll be back in a couple of days...

HTH,

Thierry
!
SUNNY-TDZ.COM doesn't belong to me since 2021.
To contact me, use the Private messages. Merci.
!

Post by ARAS » Fri Nov 22, 2013 5:15 pm

Hi Thierry,

Thank you!!! It works!

Yes, I know they are swapped because they are different. Even though they are different, your code works. It is exactly what I want.

I like the idea of r1+r2. You are clever.

It will take some time for me to understand the code in detail. I am at work right now. When I am home, I will study your code.

Thank you again!!

ARAS

Post by Thierry » Fri Nov 22, 2013 6:12 pm

ARAS wrote:
Hi Thierry,

Thank you!!! It works!

You're welcome :)

ARAS wrote:
Yes, I know they are swapped because they are different.
Even though they are different, your code works. It is exactly what I want.

Yes, that's what I meant. Taking into account different patterns.

ARAS wrote:
I like the idea of r1+r2. You are clever.

Well, this one is quite simple, isn't it?

For my personal coding, I'm more used to this one:

Code: Select all

get "blah blah blah"
get IT & "xxxxxxxxxxxx"
get IT & "yyyyyyy"

callwhatever IT

And when regexes are too looooooong and complex, I put them in a custom property.


Regards,

Thierry
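
Keeping a long regex in a custom property could look roughly like the sketch below. The property name cValueRegex, the %NAME% placeholder and both handler names are assumptions made for illustration. The pattern itself is the one from seriouswithRegex earlier in the thread.

```livecode
-- run once, for example from a button, to store the pattern
on storeValueRegex
   local tRegex
   put "(?msi)<td\s.*?class=.kfnt..*?;%NAME%</td>[^<]+" into tRegex
   put "<td\s.*?class=.kfnt..*?<p align=.center.>([^<]+?)</" after tRegex
   set the cValueRegex of this stack to tRegex
end storeValueRegex

function valueFromStoredRegex pName
   local tRegex, tValue
   -- fetch the stored pattern and drop the name into the placeholder
   put the cValueRegex of this stack into tRegex
   replace "%NAME%" with pName in tRegex
   if matchText(field 1, tRegex, tValue) then return tValue
   return "not found"
end valueFromStoredRegex
```

The script then stays short, and the pattern can be edited in the property inspector without touching the code.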

Post by Thierry » Sat Nov 23, 2013 12:02 pm

Hi,

Here is the same regex as two posts above, but split into parts, with a comment on every step.
Let me know if it's easier to follow this way...
One trick when writing regexes *in* LiveCode (but be careful with it!):

The LC compiler is unhappy with escaped quotes such as: class=\"kfnt\" bgcolor=\"
and to avoid hellish lines like: class=" &quote& "kfnt" &quote& " bgcolor=" &quote& "

we can use a '.' instead of \", giving: class=.kfnt. bgcolor=.

Obviously, this also matches something like: class=XkfntX bgcolor=Z, but nothing like that occurs in our context.

Code: Select all

function seriouswithRegex theName
   local theValue
   -- As we need 2 times next pattern, we can put it in a LC var
   local KFNTclass
   
   -- <td class="kfnt" bgcolor="#ffffff" height="20">&nbsp
   -- <td bgcolor="#FFFFFF" class="kfnt" width="107" height="20">
   put "<td\s.*?class=.kfnt..*?" into KFNTclass
   
   -- multiline, dot catches \n and not case sensitive
   get "(?msi)"
   
   -- <td bgcolor="#FFFFFF" class="kfnt" height="20">&nbsp;Thierry Douez</td>
   -- <td class="kfnt" bgcolor="#ffffff" height="20">&nbsp;Aras fromTurkey</td>
   -- theName is enclosed by ';' then '</td>'
   -- and we don't care about other chars before ';'
   get IT & KFNTclass & ";" & theName & "</td>"
   
   -- consume spaces tabs and returns before next '<'
   get IT & "[^<]+"
   
   -- <td class="kfnt" bgcolor="#ffffff" height="20" width="107">
   --               <p align="center">
   -- <td bgcolor="#FFFFFF" class="kfnt" width="107" height="20"><p align="center">
   get IT & KFNTclass & "<p align=.center.>"
   
   -- capture number before </td> or </p>
   get IT & "([^<]+?)</"
   
   if matchText( field 1, IT, theValue) then return theValue
   return "not found"
end seriouswithRegex

Code: Select all

on mouseUp
   get seriouswithRegex( "Aras fromTurkey" )
   answer IT &cr& seriouswithRegex( "Thierry Douez" )
end mouseUp

Regards,

Thierry
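
If the looseness of '.' is a concern, a stricter alternative is the PCRE hex escape \x22, which stands for the double-quote character and needs no quote inside the LiveCode string. The following is only a sketch with a made-up sample tag, not part of the solution above.

```livecode
on mouseUp
   local tSample
   -- build a sample tag containing real double quotes
   put "<td class=" & quote & "kfnt" & quote & " bgcolor=" & quote & "#ffffff" & quote & ">" into tSample
   -- \x22 matches only a double quote, unlike '.'
   answer matchText(tSample, "<td\s[^>]*class=\x22kfnt\x22")
end mouseUp
```

This should answer true for the sample, while text such as class=XkfntX would no longer match.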

Post by ARAS » Sun Nov 24, 2013 7:33 pm

Thanks Thierry.

ARAS

SparkOut
Posts: 2943
Joined: Sun Sep 23, 2007 4:58 pm

Post by SparkOut » Sat Nov 30, 2013 12:18 pm

Thierry swings into action again!