MySQL and full text indexing

Mike Garfias mike at garfias.org
Mon Oct 23 21:31:11 MST 2006


I work for a search company, so I tend to know a bit about these things.

Yes, Hadoop is a filesystem.  The benefit is that it is distributed and
redundant.  So instead of one huge-ass box to hold your index, you get
10 little boxes that you can build from parts at Fry's, and you increase
the bandwidth to your index.

On Oct 23, 2006, at 4:13 PM, Joshua Zeidner wrote:

> Mike,
>
>   There are a number of minor projects of this nature, e.g.
> http://www.mnogosearch.org .
>
>   Lucene is the dominant OSS indexer.  The only problem I can
> imagine is Java, which PHP people seem to avoid like the plague and
> needlessly spread FUD about.  In addition, Java usually introduces a
> minor (if not negligible) hosting cost.
>
>   re: Hadoop, I'm not sure you want to use that on its own; it's
> a file system that is optimized for large volumes, and I believe it
> has distributed capability.
>
> thanks, jmz
>
>
> On 10/23/06, Mike Garfias <mike at garfias.org> wrote:
>
> Left out Swish-E as well:
>
> http://swish-e.org/
>
> Also consider using the filesystem that grew out of Nutch:
> http://lucene.apache.org/hadoop/
>
> On Oct 23, 2006, at 2:36 PM, Joshua Zeidner wrote:
>
> > Josh,
> >
> > I left out Sphinx, which is a lesser known option:
> >
> > http://sphinxsearch.com/
> >
> > -jmz
> >
> > On 10/23/06, Josh Coffman <josh_coffman at yahoo.com> wrote:
> >
> > Hi,
> >
> >   Anyone have any experience or opinions on full text indexing with
> > MySQL? We currently use MS SQL with full text indexing, and it's a
> > pain.
> > We are preparing for our db to add tens of millions of rows soon;
> > currently those tables are in the 600,000 - 800,000 row range. So it's
> > a big jump.
> > This data is fed to us and reloaded nightly. This data is used by
> > websites, and traffic increases with time.
> >
> > I'm concerned about performance in general, especially in text
> > searches. In case the topic starts to come up, I'd like to have an
> > idea how well MySQL would handle something like this.  Or PostgreSQL,
> > for that matter.
> > Any difference between running those DBs on Linux versus Windows?
> >
> > Thanks
> > -j
> >
> > ---------------------------------------------------
> > PLUG-discuss mailing list - PLUG-discuss at lists.plug.phoenix.az.us
> > To subscribe, unsubscribe, or to change  you mail settings:
> > http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
> >
> >
> >
> > --
> > .0000. communication.
> > .0001. development.
> > .0010. strategy.
> > .0100. appeal.
> >
> > JOSHUA M. ZEIDNER
> > IT Consultant
> >
> > ++power; ++perspective; ++possibilities;
> > ( 602 ) 490 8006
> > jjzeidner at gmail.com


