java-user@lucene.apache.org
[Top] [All Lists]

Re: document diversity

Subject: Re: document diversity
From: Paul Libbrecht
Date: Tue, 6 Oct 2009 22:59:56 +0200
Just as you can add a query that will boost better things with a higher quality, you can add a query for a higher revenue.

Basically, the default operator "should" in boolean-clauses can be used exactly for that: do not force this query to be matched but raise boost if there's something that matches.

The translation of the user-query, itself, is marked as "must" (and inside this one is orred in all the various flavours, e.g. per language, phonetic...).

paul


Le 06-oct.-09 à 22:33, Michael Masters a écrit :

My initial description may have been a little abstract. Maybe I should
explain exactly what I'm trying to do. My company has various revenue
channels, one of which is per click. If a user does a search, we would
like to show results with the greatest revenue, although we don't want
people to be able to buy all the top results. Hence, we would like to
have some way of mixing results. The mixing of results could be based
of potential revenue, relevancy, which revenue stream the result is
associated with, etc.

The previously mentioned ideas are great btw.

-Mike


On Sat, Oct 3, 2009 at 4:25 PM, Grant Ingersoll <gsingers@xxxxxxxxxx> wrote:
I'm curious, can you elaborate more on the deeper use case for this?

Perhaps just implementing faceting on doc type would be sufficient? That way users can drill in on doc type. Alternatively, I suppose you could implement a hit collector that accesses a field cache on the doc type field and promotes lesser seen doc types until they are evenly represented. Could also likely write a Function query that does a similar thing. I'd imagine
you need to be careful to control your memory.

-Grant

On Oct 1, 2009, at 12:56 PM, Michael Masters wrote:

I was wondering if there is any way to control what kind of documents
are returned from a search. For example, lets say we have an index
built from different types of documents (pdf, txt, html, etc.). Is
there a way to have the first x results have a specified distribution of document types? It would be nice to have an even number of results
that are from pdfs, txt files, and html files.


Any help would greatly be appreciated.


-Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx


<Prev in Thread] Current Thread [Next in Thread>