java-user@lucene.apache.org

RE: Lucene fields not analyzed

Subject: RE: Lucene fields not analyzed
From: "Uwe Schindler"
Date: Tue, 9 Feb 2010 09:27:45 +0100
If you can't get it working that way, you have to ask yourself: why do you
want the field indexed that way? Is it because you don't want to match every
person in that field when only "Mr." is added to a search query? It looks like
you are using StandardAnalyzer; in that case I would add "mr", not "mr!", to
the stop word list and index the name field like any other field. Before doing
this, it would be good to explain what you intend to do or prevent by indexing
with NOT_ANALYZED, which is the source of your problem.
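
To illustrate the mismatch, here is a toy model in plain Java. It is not
Lucene code: analyze(), notAnalyzed() and matches() are made-up helpers, the
tokenizing rule is a rough approximation of what an analyzer does, and the
stop word list is just the "mr" entry suggested above. It only shows why a
value indexed verbatim cannot be found by analyzed query terms, and why
analyzing both sides the same way makes the two line up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class NameFieldSketch {

    // Hypothetical stop word list: "mr" is the lowercased form of "Mr."
    // once the trailing period has been dropped by tokenization.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("mr"));

    // Rough stand-in for an analyzer: lowercase, split on anything that is
    // not a letter or digit, drop stop words. Real analyzers have more rules.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty() && !STOP_WORDS.contains(piece)) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Model of a NOT_ANALYZED field: the whole value is one term, unchanged.
    static List<String> notAnalyzed(String text) {
        return Collections.singletonList(text);
    }

    // Simplified matching: every query term must be among the indexed terms.
    static boolean matches(List<String> indexedTerms, List<String> queryTerms) {
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        String name = "Mr. Kumar";

        // Index side verbatim, query side analyzed: the terms never line up.
        System.out.println(matches(notAnalyzed(name), analyze(name)));

        // Same analysis on both sides: the terms line up.
        System.out.println(matches(analyze(name), analyze(name)));
    }
}
```

In this model the first check fails because the index holds the single term
"Mr. Kumar" while the query side produces only "kumar"; the second succeeds
because both sides reduce to the same token.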

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@xxxxxxxxxxx


> -----Original Message-----
> From: Rohit Banga [mailto:iamrohitbanga@xxxxxxxxx]
> Sent: Tuesday, February 09, 2010 9:03 AM
> To: java-user@xxxxxxxxxxxxxxxxx
> Subject: Re: Lucene fields not analyzed
> 
> let us assume this is the only field that is relevant (others are stored and
> not indexed).
> i tried termquery and it does not work.
> i also tried keyword analyzer and still could not make it work.
> 
> @Mark
> i cannot escape the spaces in my query as i am using Lucene to identify
> occurrences of names among other things in the unstructured sentence.
> so while adding names to the index, i used keyword analyzer and changed the
> name to be added to the index to "Mr.\\ Kumar"
> but still couldn't get it to work.
> 
> Rohit Banga
> 
> 
> On Tue, Feb 9, 2010 at 1:06 PM, Mark Harwood <markharw00d@xxxxxxxxxxx> wrote:
> 
> > I suspect it is because QueryParser uses space characters to separate
> > different clauses in a query string while you want the space to represent
> > some content in your "name" field. Try escaping the space character.
> >
> > Cheers
> > Mark
> >
> >
> >
> > On 9 Feb 2010, at 07:26, Rohit Banga wrote:
> >
> > > Hello
> > >
> > > i have a field that stores names of people. i have used the NOT_ANALYZED
> > > parameter to index the names.
> > >
> > > this is what happens during indexing
> > >
> > >    doc.add(new Field("name", "\"" + name + "\"", Field.Store.YES,
> > >            Field.Index.NOT_ANALYZED));
> > >
> > >
> > >
> > > when i search it, i create a query parser using standardanalyzer and append
> > > ~0.5 to the search query.
> > >
> > > the problem is that if the indexed name is "Mr. Kumar", my search does not
> > > work for "Mr. Kumar" while it does work for "Mr.Kumar" (without the space).
> > >
> > > // searching code
> > >        File index_directory = new File(INDEX_DIR_PATH);
> > >        IndexReader reader =
> > >                IndexReader.open(FSDirectory.open(index_directory), true);
> > >        Searcher searcher = new IndexSearcher(reader);
> > >
> > >        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
> > >
> > >        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "name",
> > >                analyzer);
> > >
> > >        Query query;
> > >        query = parser.parse(text + "~0.5");
> > >
> > > how to make it work?
> > >
> > > Rohit Banga
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
> > For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx
> >
> >

