java-user@lucene.apache.org
Subject: Re: No hits while searching!
From: Matthew Hall
Date: Mon, 01 Jun 2009 13:51:00 -0400
Just build your own.

Here's exactly what you are looking for:

(Mind you I just whipped this out, and didn't compile it... so there could be minor syntax errors here.)

You will also obviously have to make your own package declaration, and your own imports.

Anyhow, the really neat thing about Lucene is being able to do exactly what we just did here: you can chain tokenizers and filters together in almost any way you want and build custom analyzers out of them.

It's a good thing to become familiar with, because I can nearly promise you that this analyzer will ALSO probably turn out to be insufficient for your needs.

Anyhow, hope this helps.

Matt

/**
 * Custom Lowercase Analyzer
 *
 * This analyzer tokenizes on whitespace, and then lowercases each token.
 *
 * @author mhall
 */
public class LowerCaseAnalyzer extends Analyzer {

   public LowerCaseAnalyzer() {
      super();
   }

   /**
    * Worker for this Analyzer.
    *
    * Specifically, this analyzer chains WhitespaceTokenizer ->
    * LowerCaseFilter to produce whitespace-delimited, lowercased tokens.
    */
   @Override
   public TokenStream tokenStream(String fieldName, Reader reader) {
       return new LowerCaseFilter(new WhitespaceTokenizer(reader));
   }

}
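
To make concrete what a whitespace-then-lowercase chain does to a name, here is a rough plain-Java sketch with no Lucene dependency. The class and method names (ChainSketch, analyze) are made up for illustration; the code only mimics the effect of WhitespaceTokenizer followed by LowerCaseFilter and is not how Lucene implements them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the WhitespaceTokenizer -> LowerCaseFilter chain.
class ChainSketch {

    // Split on runs of whitespace, then lowercase each token.
    // Punctuation such as quotes and hyphens is left untouched.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [angela, o'brien-johns]
        System.out.println(analyze("Angela O'Brien-Johns"));
    }
}
```

The point of the sketch is that the quote and the hyphen survive, which is why a chain like this suits names better than an analyzer that strips punctuation.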

vanshi wrote:
Thanks Matt & Sithu. Yes, it was due to the stop word analyzer. For now
I'm using a simple analyzer temporarily, though I know even the simple
analyzer cannot handle quotes in names. Can somebody please point me
towards how to handle quotes in a name in the query using a lowercase
analyzer?

thanks,
Vanshi

Matthew Hall-7 wrote:
Yeah, he's gotta be.

You might be better off using something like a lowercase analyzer here, since punctuation in a name is likely important.

Matt

Sudarsan, Sithu D. wrote:
Do you use stopword filtering?

Sincerely,
Sithu D Sudarsan

-----Original Message-----
From: vanshi [mailto:nilu.thakur@xxxxxxxxx]
Sent: Monday, June 01, 2009 11:39 AM
To: java-user@xxxxxxxxxxxxxxxxx
Subject: Re: No hits while searching!


Thanks Erick, I was able to get this to work. As you said, Luke is a
great tool for looking into what gets stored in the index, though in my
case I was searching before the index had been created, so I was
getting zero hits.

On a side note, I'm seeing strange output with the prefix query: it
only works when I have 3 or more letters in the first name/last name.
Any idea what is going on here? Please see the log output below.

02:05:20,996 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms in PhysicianQuerybuilder with exactName=true
02:05:20,996 INFO  [PhysicianQueryBuilder] Before running Prefix query, First name: ang
02:05:20,996 INFO  [PhysicianQueryBuilder] Before running  Prefix query, Last name: john
02:05:21,012 INFO  [LuceneIndexService] the query is: +(FIRST_NAME_EXACT:ang*) +(LAST_NAME_EXACT:john*)
02:05:21,012 INFO  [LuceneIndexService] Result Size: 1

02:06:03,578 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms in PhysicianQuerybuilder with exactName=true
02:06:03,578 INFO  [PhysicianQueryBuilder] Before running term query, First name: a
02:06:03,578 INFO  [PhysicianQueryBuilder] Before running term query, Last name: johns
02:06:03,578 INFO  [LuceneIndexService] the query is: +() +(LAST_NAME_EXACT:johns*)
02:06:03,578 INFO  [LuceneIndexService] Result Size: 0

02:08:01,548 INFO  [PhysicianQueryBuilder] Entered addTypeSpecificTerms in PhysicianQuerybuilder with exactName=true
02:08:01,548 INFO  [PhysicianQueryBuilder] Before running term query, First name: an
02:08:01,548 INFO  [PhysicianQueryBuilder] Before running term query, Last name: johns
02:08:01,548 INFO  [LuceneIndexService] the query is: +() +(LAST_NAME_EXACT:johns*)
02:08:01,580 INFO  [LuceneIndexService] Result Size: 0

As one can see, the query works with first name=ang but not with first
name=a or an.
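
The replies in this thread trace the empty +() clause to stop-word filtering: "a" and "an" are common English stop words, while "ang" is not. A rough plain-Java sketch of that effect follows. The stop list is a small sample chosen for illustration (an assumption, not Lucene's actual list), the class and method names are invented, and the clause format simply imitates the log lines above rather than the poster's real query builder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of a stop filter swallowing short name prefixes.
class StopWordSketch {

    // Small sample stop list (assumption; real stop lists are longer).
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "the", "of"));

    // Whitespace-split, lowercase, then drop anything on the stop list.
    static List<String> analyze(String text) {
        List<String> kept = new ArrayList<String>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    // Build a required prefix clause; empty when no token survives analysis.
    static String prefixClause(String field, String input) {
        List<String> tokens = analyze(input);
        if (tokens.isEmpty()) {
            return "+()";
        }
        return "+(" + field + ":" + tokens.get(0) + "*)";
    }

    public static void main(String[] args) {
        System.out.println(prefixClause("FIRST_NAME_EXACT", "ang")); // +(FIRST_NAME_EXACT:ang*)
        System.out.println(prefixClause("FIRST_NAME_EXACT", "an"));  // +()
        System.out.println(prefixClause("FIRST_NAME_EXACT", "a"));   // +()
    }
}
```

If the input is run through an analyzer with a stop filter before the prefix query is built, the cutoff has less to do with length than with which short strings happen to be stop words.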

Appreciate all your inputs.

Vanshi

Erick Erickson wrote:
The most common issue with this kind of thing is that UN_TOKENIZED
implies no case folding. So if your case differs you won't get a match.
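
A small sketch of the case-folding point, using a plain HashSet as a stand-in for the untokenized terms in an index. No Lucene is involved and every name here is hypothetical; it only shows that an exact lookup misses when case differs unless the same normalization is applied on both the indexing and the query side.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for exact-term lookup on an untokenized field.
class ExactMatchSketch {

    private final Set<String> terms = new HashSet<String>();
    private final boolean foldCase;

    ExactMatchSketch(boolean foldCase) {
        this.foldCase = foldCase;
    }

    // Apply the same normalization at index time and at query time.
    private String normalize(String value) {
        return foldCase ? value.toLowerCase(Locale.ROOT) : value;
    }

    void index(String value) {
        terms.add(normalize(value));
    }

    boolean matches(String query) {
        return terms.contains(normalize(query));
    }

    public static void main(String[] args) {
        ExactMatchSketch raw = new ExactMatchSketch(false);
        raw.index("Johns");
        System.out.println(raw.matches("johns"));    // false: stored as-is, case differs

        ExactMatchSketch folded = new ExactMatchSketch(true);
        folded.index("Johns");
        System.out.println(folded.matches("johns")); // true: both sides lowercased
    }
}
```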

That aside, the very first thing I'd do is get a copy of Luke (google
Lucene Luke) and examine the index to see if what's in your index is
what you *think* is in there.

The second thing I'd do is look at query.toString() to see what the
actual query is. You can even paste the output of toString() into Luke
and see what happens.

I'm not sure what buildMultiTermPrefixQuery is all about, but I assume
you have a good reason for using that. But the other strategy I use for
this kind of "what happened?" question is to peel back to simpler cases
until I get what I expect, then build back up until it breaks.

But really, get a copy of Luke. It's a wonderful tool that'll give you
lots of insight about what's *really* going on...

Best
Erick

On Wed, May 27, 2009 at 12:43 AM, vanshi <nilu.thakur@xxxxxxxxx> wrote:
In my web application, I need search functionality on first name and
last name in 2 different ways: one search must be based on the
'Metaphone Analyzer', returning all similar-sounding names, and the
other should be an exact match on either first name or last name. The
sounds-like search has already been coded, and I need to add the exact
match search to the application. For this, I have a Lucene index built
from fields in database tables, which already had the name fields
indexed with the metaphone analyzer. I added 2 more fields to the
existing document, which index first name/last name as UN_TOKENIZED.
When searching for an exact match, I create a term query against the
newly created UN_TOKENIZED fields, as shown in the code snippets below.
However, this is not getting any hits. I would like to know whether
there is anything wrong conceptually.

//creating fields for the document
FIRST_NAME(Field.Store.NO, Field.Index.TOKENIZED),
FIRST_NAME_EXACT(Field.Store.NO, Field.Index.UN_TOKENIZED),
LAST_NAME(Field.Store.NO, Field.Index.TOKENIZED),
LAST_NAME_EXACT(Field.Store.NO, Field.Index.UN_TOKENIZED),

//name sounds like analyzer class....used while indexing and searching
public class NameSoundsLikeAnalyzer extends Analyzer {
        PerFieldAnalyzerWrapper wrapper;

        public NameSoundsLikeAnalyzer() {
                wrapper = new PerFieldAnalyzerWrapper(new StopAnalyzer());
                wrapper.addAnalyzer(
                        PhysicianDocumentBuilder.PhysicianFieldInfo.FIRST_NAME.toString(),
                        new MetaphoneReplacementAnalyzer());
                wrapper.addAnalyzer(
                        PhysicianDocumentBuilder.PhysicianFieldInfo.LAST_NAME.toString(),
                        new MetaphoneReplacementAnalyzer());
        }

        /**
         * @see PerFieldAnalyzerWrapper#tokenStream(String, Reader)
         */
        @Override
        public TokenStream tokenStream(String fieldName, Reader reader) {
                return wrapper.tokenStream(fieldName, reader);
        }

}

//lastly the query builder
if (physicianQuery.getExactNameSearch()) {
        //exact match: term queries against the UN_TOKENIZED fields
        if (StringUtils.isNotEmpty(physicianQuery.getFirstNameStartsWith())) {
                TermQuery term = new TermQuery(new Term(FIRST_NAME_EXACT.toString(),
                                physicianQuery.getFirstNameStartsWith()));
                query.add(term, MUST);
        }

        if (StringUtils.isNotEmpty(physicianQuery.getLastNameStartsWith())) {
                TermQuery term = new TermQuery(new Term(LAST_NAME_EXACT.toString(),
                                physicianQuery.getLastNameStartsWith()));
                query.add(term, MUST);
        }
} else {
        //we want metaphone search
        if (StringUtils.isNotEmpty(physicianQuery.getFirstNameStartsWith())) {
                query.add(buildMultiTermPrefixQuery(FIRST_NAME.toString(),
                                physicianQuery.getFirstNameStartsWith()), MUST);
        }

        if (StringUtils.isNotEmpty(physicianQuery.getLastNameStartsWith())) {
                query.add(buildMultiTermPrefixQuery(LAST_NAME.toString(),
                                physicianQuery.getLastNameStartsWith()), MUST);
        }
}


--
View this message in context:
http://www.nabble.com/No-hits-while-searching%21-tp23735920p23735920.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx


--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mhall@xxxxxxxxxxxxxxxxxxx
(207) 288-6012


