Building a search engine
I'm not even sure if this is the place to post, but within the auspices of the OnRev hosting setup, is there any way of building a search engine? If the answer is "Yes", is there a published project that would give me a head start?
Regards to all.
The underlying purpose of AI is to allow wealth to access skill
while removing from the skilled the ability to access wealth.
It's possible. However, to me the question is more along the lines of "why???". Making a search engine is not only hard work; sooner or later you need a dedicated server (or rather a dozen server farms). Does Google suck so much in your eyes?
Various teststacks and stuff:
http://bjoernke.com
Chat with other RunRev developers:
chat.freenode.net:6666 #livecode
Re: Building a search engine
Google business search costs $100.
But site:yoursitehere.com yoursearchhere -dontsearchthis works for free.
site:lazyriversoftware.com software -quartz
So, a simple search function that combines "site:" & the site name with the search terms and the terms to exclude should be easy to do, and it still makes use of Google's hard-working algorithms etc.
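For anyone who wants to see the idea in code, here is a minimal sketch in Python (not LiveCode) of the query building described above. The function names `build_site_query` and `build_search_url` are made up for illustration, and the `https://www.google.com/search?q=` address is an assumption about where the query gets sent; swap in whatever fits your setup.

```python
from urllib.parse import urlencode


def build_site_query(site, terms, exclude=()):
    """Join a site restriction, search terms, and excluded terms into one query string."""
    parts = [f"site:{site}"]                          # restrict results to one site
    parts.extend(terms)                               # words to search for
    parts.extend(f"-{word}" for word in exclude)      # words to leave out
    return " ".join(parts)


def build_search_url(site, terms, exclude=()):
    """Wrap the query in a URL (the base address is an assumption, see above)."""
    query = build_site_query(site, terms, exclude)
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_site_query("lazyriversoftware.com", ["software"], ["quartz"]))
    print(build_search_url("lazyriversoftware.com", ["software"], ["quartz"]))
```

The first function reproduces the example query from this post; the second only URL-encodes it so it can be opened in a browser or linked from a search form.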
HTHs
Tom
Tom McGrath III
Lazy River Software
3mcgrath@comcast.net
Re: Building a search engine
Doubt it's what you'd want, but this is interesting: http://www.google.com/enterprise/search/gsa.html and of course there's the Mini also.
Also, while I'm sure it's possible to roll your own Rev crawler, there are already open source solutions available that might do what you need. I've never used the scripts themselves, but there seem to be quite a few PHP-based crawlers out there. If nothing else, if you decide to roll your own, the PHP scripts might give some insight.
Re: Building a search engine
Hi.
Can I have a list of some good PHP-based crawlers that have worked well in your experience?
Re: Building a search engine
I've never actually used any of them, just googled to see what was out there. You might look on DevShed and HotScripts, as well as googling for "php search engine" and "php crawler". Then experiment with what you find to see if anything matches your needs. The last time I used a home-grown search engine it was for a relatively small local site and used Ingres as a back end. I don't even remember the name of the engine itself or what language it was in. It's been 12 years or so since then. I'm sure things have improved greatly in the meantime.
Edit: My mistake, it used GDBM. Here's a link to the search engine: http://harvest.sourceforge.net/harvest/doc/index.html From what I recall from way back then, setup was a real bear, but once it was working, it was really good. As I said above, though, it's been over 12 years, so my memory is almost nil at this point. Hard to remember breakfast, much less that far back.
Re: Building a search engine
FWIW I use Atomz.com at my site. They have a free option which is quite useful for most sites with a reasonable number of pages.
I've built a couple search engines for desktop apps, and like BvG says it's a lot of work. In my case it was necessary because we have unusual data which needs to be handled in unusual ways, but I wouldn't recommend writing one from scratch unless you absolutely need to; the time required is often better spent on other features.
That said, if you have unique needs that can only be addressed by a custom solution, it may be helpful to keep this old programmer's adage in mind: "Show me your data structures and I'll show you your algorithm".
Decide up front what you need to accomplish with your SE, then design the data structures you'll need to make that happen. Once you have that figured out you'll be in a good position to work through the tedious details of indexing and retrieving that data store.
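To make the "data structures first" point concrete, here is a minimal sketch in Python (not LiveCode) of one common choice, an inverted index that maps each word to the set of pages containing it. This is only one possible design under simple assumptions: the page IDs, the sample texts, and the names `tokenize`, `build_index`, and `search` are all hypothetical, and a real engine would also need ranking, stemming, and persistent storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> set of page IDs containing that word."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the IDs of pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))    # copy so the index is not modified
    for word in words[1:]:
        results &= index.get(word, set())        # keep only pages that have this word too
    return results


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    pages = {
        "a": "LiveCode search engine",
        "b": "search the web",
        "c": "LiveCode hosting",
    }
    index = build_index(pages)
    print(sorted(search(index, "livecode search")))
```

Once the index shape is fixed, the retrieval step falls out almost for free: a multi-word query is just a set intersection, which is the adage in miniature.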
This article provides some background on the Google engine, which may provide some good ideas for your own:
http://infolab.stanford.edu/~backrub/google.html
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn