Building a search engine

dalkin
Posts: 183
Joined: Wed Jul 04, 2007 2:32 am

Building a search engine

Post by dalkin » Mon Aug 24, 2009 8:49 am

I'm not even sure this is the right place to post, but within the OnRev hosting setup, is there any way of building a search engine? If the answer is "Yes", is there a published project that would give me a head start?

Regards to all.
The underlying purpose of AI is to allow wealth to access skill
while removing from the skilled the ability to access wealth.

BvG
VIP Livecode Opensource Backer
Posts: 1239
Joined: Sat Apr 08, 2006 1:10 pm

Post by BvG » Mon Aug 24, 2009 11:12 am

It's possible. However, to me the question is more "why???". Making a search engine is not only hard work; sooner or later you need a dedicated server (or, eventually, a dozen server farms). Does Google suck so much in your eyes?
Various teststacks and stuff:
http://bjoernke.com

Chat with other RunRev developers:
chat.freenode.net:6666 #livecode

dalkin
Posts: 183
Joined: Wed Jul 04, 2007 2:32 am

Post by dalkin » Tue Aug 25, 2009 1:18 am

Hi. It's not so much that Google sucks; it's more that they charge $100 per site. It doesn't take many sites to eat up a big chunk of profit.

mcgrath3
VIP Livecode Opensource Backer
Posts: 149
Joined: Thu Feb 23, 2006 8:49 pm

Re: Building a search engine

Post by mcgrath3 » Fri Nov 27, 2009 6:20 pm

Google business search costs $100.

But site:yoursitehere.com yoursearchhere -dontsearchthis works for free.

site:lazyriversoftware.com software -quartz

So a simple search function that combines "site:" and the site name with the search terms and the terms to exclude should be easy to do, and it still makes use of Google's hard-working algorithms, etc.
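To make that concrete, here is a minimal sketch in Python of composing such a query. The function names are hypothetical, and the `https://www.google.com/search?q=` address is an assumption about where the query would be sent; only the `site:` and `-term` syntax comes from the post above.

```python
from urllib.parse import urlencode


def build_site_query(site, terms, exclude=()):
    """Compose a query limited to one site, with optional excluded words."""
    parts = [f"site:{site}", terms.strip()]
    # Each excluded word gets a leading minus sign, as in "-quartz".
    parts += [f"-{word}" for word in exclude]
    return " ".join(part for part in parts if part)


def build_search_url(site, terms, exclude=()):
    """Wrap the query in a search URL (the base address is an assumption)."""
    query = build_site_query(site, terms, exclude)
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_site_query("lazyriversoftware.com", "software", ["quartz"])
print(query)  # site:lazyriversoftware.com software -quartz
print(build_search_url("lazyriversoftware.com", "software", ["quartz"]))
```

A search box on the site would then only need to redirect the visitor to the URL this produces.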


HTHs

Tom
Tom McGrath III
Lazy River Software
3mcgrath@comcast.net

sturgis
Livecode Opensource Backer
Posts: 1685
Joined: Sat Feb 28, 2009 11:49 pm

Re: Building a search engine

Post by sturgis » Mon Dec 14, 2009 7:35 am

Doubt it's what you'd want, but this is interesting: http://www.google.com/enterprise/search/gsa.html And of course there's the Mini also.

Also, while I'm sure it's possible to roll your own Rev crawler, there are already open source solutions available that might do what you need. I've never used the scripts themselves, but there seem to be quite a few PHP-based crawlers out there. If nothing else, should you decide to roll your own, the PHP scripts might give some insight.
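For anyone wondering what the core of a home-grown crawler involves, here is a rough sketch in Python (not Rev or PHP) of the link-gathering step only. The class and function names are hypothetical, and the sample page is made up; a real crawler would also need to fetch pages, respect robots.txt, and keep track of what it has already visited.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def same_site_links(base_url, html):
    """Return the links on a page that stay on the same host, without repeats."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    kept = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in kept:
            kept.append(link)
    return kept


# Made-up page content, used only to exercise the functions above.
sample_html = (
    '<a href="/about.html">About</a> '
    '<a href="http://other.example/x">Elsewhere</a> '
    '<a href="news.html">News</a>'
)
found = same_site_links("http://example.com/index.html", sample_html)
print(found)
```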

calieigh
Posts: 2
Joined: Wed Jan 13, 2010 5:44 am

Re: Building a search engine

Post by calieigh » Wed Jan 20, 2010 8:20 am

Hi.
Can I have a list of some PHP-based crawlers that are good in your experience?

sturgis
Livecode Opensource Backer
Posts: 1685
Joined: Sat Feb 28, 2009 11:49 pm

Re: Building a search engine

Post by sturgis » Wed Jan 20, 2010 4:30 pm

I've never actually used any of them, just googled to see what was out there. You might look on devshed and hotscripts, as well as googling for "php search engine" and "php crawler". Then experiment with what you find to see if there is anything that matches your needs. The last time I used a home-grown search engine, it was for a relatively small local site and used Ingres as a back end. I don't even remember the name of the engine itself or what language it was in. It's been 12 years or so since then. I'm sure things have improved greatly in the meantime.

Edit: My mistake, it used GDBM. Here's a link to the search engine: http://harvest.sourceforge.net/harvest/doc/index.html From what I recall from way back then, setup was a real bear, but once it was working, it was really good. As I said above though, it's been over 12 years, so my memory is almost nil at this point. Hard to remember breakfast, much less that far back.

FourthWorld
VIP Livecode Opensource Backer
Posts: 10043
Joined: Sat Apr 08, 2006 7:05 am

Re: Building a search engine

Post by FourthWorld » Wed Jan 20, 2010 5:39 pm

FWIW I use Atomz.com at my site. They have a free option which is quite useful for most sites with a reasonable number of pages.

I've built a couple search engines for desktop apps, and like BvG says it's a lot of work. In my case it was necessary because we have unusual data which needs to be handled in unusual ways, but I wouldn't recommend writing one from scratch unless you absolutely need to; the time required is often better spent on other features.

That said, if you have unique needs that can only be addressed by a custom solution, it may be helpful to keep this old programmer's adage in mind: "Show me your data structures and I'll show you your algorithm".

Decide up front what you need to accomplish with your SE, then design the data structures you'll need to make that happen. Once you have that figured out you'll be in a good position to work through the tedious details of indexing and retrieving that data store.
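As one illustration of "data structures first", here is a minimal sketch in Python of an inverted index, a common structure for text search that maps each word to the set of pages containing it. This is an assumption about what a simple site search might use, not a description of any engine mentioned in this thread; the page names and text are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Invented sample pages, used only to exercise the index.
pages = {
    "home.html": "Lazy river software for the Mac",
    "about.html": "About the software team",
    "contact.html": "Contact the team",
}
index = build_index(pages)
print(search(index, "software team"))  # {'about.html'}
```

Once the index exists, retrieval is just set intersection; ranking, stemming, and keeping the index up to date are where the tedious details come in.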

This article provides some background on the Google engine, which may provide some good ideas for your own:
http://infolab.stanford.edu/~backrub/google.html
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
