java-user@lucene.apache.org
[Top] [All Lists]

Re: how to index a large database

Subject: Re: how to index a large database
From: "Chris Lu"
Date: Sat, 31 Mar 2007 11:22:04 -0700
BTW: DBSight uses the same approach that Erick describes.
Like Phil said, this will incur multiple joins for each document.
But this works well and efficient for incremental indexing.
And this approach does not need to change any database schema.

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

On 3/31/07, Phil Myers <jlpmyers@xxxxxxxxx> wrote:
I've used the approach that Erick describes, and it
works well. Another approach is to create a single new
table in your database that holds all of the data you
want to index. This allows you to copy the various
fields from other tables using separate SQL statements
before you index, rather than relying on one massive
join during the indexing process. Having everything in
a single table helps avoid the overhead associated
with making multiple calls from your indexing process
for each row.

This approach may not be practical, of course, if you
don't have enough disk space to keep multiple copies
of your data.

Phil
--- Erick Erickson <erickerickson@xxxxxxxxx> wrote:
> So, try something like creating a SQL statement that
> selects
> the ID for table 1 and write it to a file. At the
> end of this, you'll
> have a list of all the IDs from table 1. Then, read
> those in one
> at a time and execute your query for use in indexing
> on one
> row at a time. In other words, iterate over each row
> rather
> than try to execute it in one massive statement.
>
> Erick
>
> On 3/31/07, Mohammad Norouzi <mnrz57@xxxxxxxxx>
> wrote:
> >
> > Hi all
> > I am going to index our database. one approach is
> to join them and then
> > index the fields. but the information are very
> large say more than 3
> > millions. so the Sql Server fails to select them.
> >
> > I want to know if anyone has such this experience
> to indexing huge
> > information of database using lucene.
> >
> > can anyone give me some advice?
> >
> > --
> > Regards,
> > Mohammad
> >
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx

<Prev in Thread] Current Thread [Next in Thread>