I have an RSS feed (XML) file that I need to extract links from

Post by shawnblc » Fri Jan 16, 2015 6:39 am

I have an RSS feed (XML) file that I need to extract links from. This is the part where I need to extract the link (there are several links in this file):

Code: Select all

<weblink>
<![CDATA[
https://www.domainName.com/p.php?l=0&p=0056&id=171
]]>
</weblink>
I'm having a difficult time figuring it out. If anyone can give me a hand or point me in the right direction, I'd appreciate it. Thanks.

Post by Simon » Fri Jan 16, 2015 8:34 am

Hi Shawn,
Isn't it just this?

Code: Select all

put lineOffset("<![CDATA[",myXML) into myVar
add 1 to myVar
Then you can use the "lines to skip" parameter of lineOffset to find the next one.


Simon
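
A minimal sketch of the lineOffset idea above, assuming the link always sits on the line directly after the "<![CDATA[" marker, as in the sample from the first post. The handler name linksAfterCdata and its variable names are made up for illustration.

```livecode
-- Hypothetical helper: returns one link per line.
-- Assumes each link is on the line right after a "<![CDATA[" line.
function linksAfterCdata pXML
   local tSkip, tLine, tLinks
   put 0 into tSkip
   repeat forever
      -- lineOffset counts from the first line after the skipped ones
      put lineOffset("<![CDATA[", pXML, tSkip) into tLine
      if tLine is 0 then exit repeat
      -- tSkip becomes the absolute line number of the marker
      add tLine to tSkip
      -- "word 1 to -1" trims leading and trailing whitespace
      put word 1 to -1 of line (tSkip + 1) of pXML & return after tLinks
   end repeat
   delete the last char of tLinks
   return tLinks
end linksAfterCdata
```

Calling linksAfterCdata(myXML) on the feed text should give a return-delimited list of links, provided the feed really follows that layout.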

Post by shawnblc » Fri Jan 16, 2015 4:39 pm

Hmmm. Not having any luck. I'll continue trying and post some code.

Post by shawnblc » Fri Jan 16, 2015 5:12 pm

OK. I went with the PHP file instead of the XML file and almost have what I need. Getting close. Here's some code; any help is greatly appreciated.
-- Things I need to do
A) Loop through fld "fld1" and find a random link.
B) See the second block of code: I need to find the string plus the next 4 chars.
* The second code block is what I'm trying to achieve, but it obviously doesn't work the way I've written it.

Code: Select all

on mouseUp
   put URL "http://mydomain.com/rss.php" into tURL
   put tURL into fld "fld1"
   find string "https://www.myotherdomain.com/show.php?l=0&u=17156&id=" in fld "fld1"
   put the foundText into tFound
   put tFound into fld "fld2"
end mouseUp
This is what I'd like

Code: Select all

on mouseUp
   put URL "http://mydomain.com/rss.php" into tURL
   put tURL into fld "fld1"
   find random string "https://www.myotherdomain.com/show.php?l=0&u=17156&id=" & + 4 char in fld "fld1"
   put the foundText into tFound
   put tFound into fld "fld2"
end mouseUp

Post by mattmaier » Fri Jan 16, 2015 6:28 pm

So you know how you want the target link to start? Maybe the "begins with" operator will help: http://livecode.com/developers/api/6.0. ... ns%20with/
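
As a rough sketch of that suggestion, assuming each link sits on its own line of the returned text: the handler name linesBeginningWith is hypothetical, and whitespace is trimmed first so indented lines still match.

```livecode
-- Hypothetical helper: keeps only the lines that start with pPrefix.
function linesBeginningWith pText, pPrefix
   local tTrimmed, tHits
   repeat for each line tLine in pText
      -- strip leading and trailing whitespace before testing
      put word 1 to -1 of tLine into tTrimmed
      if tTrimmed begins with pPrefix then
         put tTrimmed & return after tHits
      end if
   end repeat
   delete the last char of tHits
   return tHits
end linesBeginningWith
```

This only helps when the link is at the start of a line; a link embedded in the middle of a line needs a different approach.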

Post by shawnblc » Fri Jan 16, 2015 7:02 pm

I can find instances of the URL using this (although not all of them in one go), but I also need the next few characters, which always change but are always 5 digits.

Code: Select all

on mouseUp
   find characters "https://www.mydomain.com/rss.php?z=0&p=156&id="
end mouseUp

Post by Simon » Sat Jan 17, 2015 3:27 am

Hi shawn,
Can you post some of the XML/PHP output, whatever is returned?

And never loop through a field (way slow). Always put the text into a variable and loop over that.

Simon
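
One way to combine the points in this thread, offered as a sketch rather than a tested solution: read the feed into a variable, collect every occurrence of the known prefix plus a fixed number of following characters with offset, then pick one at random. The handler name linksWithPrefix is made up, and the 5 in the usage is taken from the "always 5 digits" remark above.

```livecode
-- Hypothetical helper: returns every occurrence of pPrefix
-- together with the pExtra characters that follow it, one per line.
function linksWithPrefix pText, pPrefix, pExtra
   local tSkip, tPos, tLinks
   put 0 into tSkip
   repeat forever
      -- offset counts from the first char after the skipped ones
      put offset(pPrefix, pText, tSkip) into tPos
      if tPos is 0 then exit repeat
      -- tSkip becomes the absolute position where this match starts
      add tPos to tSkip
      put char tSkip to (tSkip + length(pPrefix) + pExtra - 1) of pText & return after tLinks
   end repeat
   delete the last char of tLinks
   return tLinks
end linksWithPrefix

on mouseUp
   local tFeed, tLinks
   -- work on a variable, not on the field
   put URL "http://mydomain.com/rss.php" into tFeed
   put linksWithPrefix(tFeed, "https://www.myotherdomain.com/show.php?l=0&u=17156&id=", 5) into tLinks
   -- "any line" picks one line at random
   put any line of tLinks into fld "fld2"
end mouseUp
```

If the number of trailing digits can vary, a regular expression is probably a better fit than a fixed character count.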

Post by MaxV » Thu Jan 22, 2015 4:29 pm

This page could help you. It explains how to create an RSS feed reader using LiveCode XML functions: http://livecodeitalia.blogspot.it/2014/ ... e-rss.html
Use the Google Translate button on the right to translate it into your language. :D

Post by Martin Koob » Wed Jan 28, 2015 11:16 pm

I have been working on learning regex and I thought this question would be a good one to try to see if you could extract the URLs with regex.

If the data you are looking for is always in the form <![CDATA[ the url]]> then the regex <!\[CDATA\[(.*)\]\] will capture the URL.

I put a couple of lines with urls in this format in the following code to test.

Code: Select all

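on mouseUp
   local tURLtoExtract, tURL, tStart, tEnd, tSuccess
   -- two sample lines, each wrapping one URL in a CDATA section
   put "<![CDATA[https://www.domainName.com/p.php?l=0&p=0056&id=181]]>"  & CR & "<![CDATA[https://www.domainName.com/p.php?l=0&p=0064&id=151]]>" into tURLtoExtract
   -- matchText puts the text captured by (.*) into tURL
   put matchText(tURLtoExtract, "<!\[CDATA\[(.*)\]\]",tURL) into tSuccess
   -- matchChunk puts the start and end char positions of that capture into tStart and tEnd
   put matchChunk(tURLtoExtract, "<!\[CDATA\[(.*)\]\]",tStart,tEnd) into tSuccess
   -- show the results in the message box
   put tSuccess into line 1 of msg
   put tStart into line 2 of msg
   put tEnd into line 3 of msg
   put tURL into line 4 of msg
end mouseUp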
MatchText will find and extract the first match and put it in tURL. This won't return subsequent matches so you would have to iterate through your text to find subsequent matches. If there is only one URL per line in your feed you could iterate for each line.

If that did not work you could also use matchChunk which returns the start and end position of the match. You could have a repeat loop that uses the end position to delete characters to that point in the text and then use matchText and matchChunk again to get the next URL.

Not sure if this will do what you want but would be interested to see if it did.

Martin
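
The matchChunk loop described above might look something like the sketch below. It is untested against the real feed, and it changes the pattern slightly: "(?s)" lets "." match line breaks, because the sample in the first post has the URL on its own line between "<![CDATA[" and "]]>", and the lazy "(.*?)" with "\s*" on both sides trims the surrounding whitespace. These rely on the PCRE syntax that LiveCode's regex functions use. The handler name extractCdataLinks is made up.

```livecode
-- Hypothetical helper: returns the contents of every CDATA
-- section in pText, one per line.
function extractCdataLinks pText
   local tStart, tEnd, tLinks
   repeat while matchChunk(pText, "(?s)<!\[CDATA\[\s*(.*?)\s*\]\]>", tStart, tEnd)
      -- tStart and tEnd are the char positions of the captured URL
      put char tStart to tEnd of pText & return after tLinks
      -- drop everything up to the end of this URL, then search again
      delete char 1 to tEnd of pText
   end repeat
   delete the last char of tLinks
   return tLinks
end extractCdataLinks
```

Because pText is a copy of the argument, deleting from it inside the function leaves the caller's text untouched.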
