Web crawler

buchacho
Livecode Opensource Backer
Posts: 50
Joined: Fri Jun 14, 2013 10:22 pm

Web crawler

Post by buchacho » Thu Jan 23, 2014 10:34 pm

Is anyone aware of a LiveCode web crawler? I have a website I want to back up, and I'd like the crawler to do things with the data it collects, such as categorizing, tagging, and storing it in a database structure, based on the content. I don't have a really concrete idea of what it will do yet. I am wondering if there are any projects people are working on here or have seen, and it sounds like something interesting to try to develop with LiveCode. Any suggestions on how to crawl and parse the pages?

FourthWorld
VIP Livecode Opensource Backer
Posts: 10043
Joined: Sat Apr 08, 2006 7:05 am

Re: Web crawler

Post by FourthWorld » Thu Jan 23, 2014 11:18 pm

It's a lot of work, esp. handling robots.txt correctly and making sure you're kind to other people's server resources, but doable. I found this book helpful - nothing LC specific, but full of good advice that's easily adaptable to LC:
http://shop.oreilly.com/product/9781593273972.do
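
To make the robots.txt and "be kind to the server" points concrete, here is a minimal sketch. It is written in Python rather than LiveCode, using the standard library's urllib.robotparser, so treat it as an outline of the logic to port, not as LC code. The bot name, the sample robots.txt text, the example.com URLs and the fetch callback are all made up for illustration.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent name for the crawler (an assumption, not a real bot).
USER_AGENT = "ExampleBackupBot"


def build_rules(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_delay(parser, default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay, else a default."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


def allowed_urls(parser, urls):
    """Keep only the URLs that robots.txt permits this user agent to fetch."""
    return [u for u in urls if parser.can_fetch(USER_AGENT, u)]


def crawl(parser, urls, fetch):
    """Fetch each permitted URL in turn, pausing between requests.

    `fetch` is a caller-supplied function (hypothetical here) that
    downloads one URL and returns its content.
    """
    pages = {}
    for url in allowed_urls(parser, urls):
        pages[url] = fetch(url)
        time.sleep(polite_delay(parser))  # do not hammer the server
    return pages


# Made-up robots.txt content, standing in for the site's real file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

rules = build_rules(SAMPLE_ROBOTS)
candidates = [
    "http://example.com/index.html",
    "http://example.com/private/notes.html",
]
to_fetch = allowed_urls(rules, candidates)
```

With the sample rules, only the index page passes the filter and the delay between requests is two seconds. The same three steps (load the rules once, filter every URL against them, wait between requests) should carry over to a LiveCode script.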
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
