[ http://issues.apache.org/jira/browse/LUCENE-469?page=comments#action_12358233 ]
Luc Vanlerberghe commented on LUCENE-469:
-----------------------------------------
I discovered the problem in my production system a while ago, assumed it would
have been fixed in 1.9, and noticed it wasn't.
I created test cases to reproduce it and used them to find and eliminate the
problem. I haven't paid much attention to compatibility issues yet.
I have now backported the patches to 1.4.3, tested them, and put them in
production.
I'll post the 1.4.3 versions of the patches later.
- TopDocs and TopFieldDocs are indeed public. I could add the old constructor
again with a @deprecated tag that sets maxScore to 1.0
- Other Sorts: You are right, I made a mistake by concentrating too much on the
Sort.RELEVANCE case. A similar problem exists for ParallelMultiSearcher.
- I am also in favour of making maxScore private with public accessors. I only
made it public because the other members were public...
I'll post corrected patches later today...
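
To make the first and third points concrete, here is a minimal sketch of what
a backward-compatible result class could look like. It is not the actual
patch: TopDocsSketch and ScoreDocSketch are hypothetical stand-ins for TopDocs
and ScoreDoc, and the accessor names are assumptions. The old two-argument
constructor is kept, marked deprecated, and defaults maxScore to 1.0.

```java
/** Hypothetical stand-in for ScoreDoc: a document number and its score. */
class ScoreDocSketch {
    final int doc;
    final float score;

    ScoreDocSketch(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

/** Hypothetical stand-in for TopDocs, not the real Lucene class. */
class TopDocsSketch {
    // Existing members stay public so current callers keep compiling.
    public int totalHits;
    public ScoreDocSketch[] scoreDocs;

    // The new member is private and reached only through accessors.
    private float maxScore;

    /** New constructor: the caller supplies the highest raw score. */
    TopDocsSketch(int totalHits, ScoreDocSketch[] scoreDocs, float maxScore) {
        this.totalHits = totalHits;
        this.scoreDocs = scoreDocs;
        this.maxScore = maxScore;
    }

    /**
     * Old constructor, kept for compatibility.
     *
     * @deprecated use the three-argument constructor; this one assumes
     *             a maxScore of 1.0.
     */
    @Deprecated
    TopDocsSketch(int totalHits, ScoreDocSketch[] scoreDocs) {
        this(totalHits, scoreDocs, 1.0f);
    }

    float getMaxScore() {
        return maxScore;
    }

    void setMaxScore(float maxScore) {
        this.maxScore = maxScore;
    }
}
```

Code written against the old constructor would keep compiling (with a
deprecation warning) and would see a maxScore of 1.0, which means no
rescaling is applied.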
> (Parallel-)MultiSearcher: using Sort object changes the scores
> --------------------------------------------------------------
>
> Key: LUCENE-469
> URL: http://issues.apache.org/jira/browse/LUCENE-469
> Project: Lucene - Java
> Type: Bug
> Components: Search
> Versions: CVS Nightly - Specify date in submission
> Environment: 21 november 2005, revision 345901
> Reporter: Luc Vanlerberghe
> Attachments: MultiSearcherSort.patch, TestMultiSearcher.patch
>
> Example:
> Hits hits=multiSearcher.search(query);
> returns different scores for some documents than
> Hits hits=multiSearcher.search(query, Sort.RELEVANCE);
> (both for MultiSearcher and ParallelMultiSearcher)
> The documents returned will be the same and in the same order, but the scores
> in the second case will seem out of order.
> Inspecting the Explanation objects shows that the scores themselves are ok,
> but there's a bug in the normalization of the scores.
> The document with the highest score should have score 1.0, so all document
> scores are divided by the highest score (assuming the highest score was > 1.0).
> However, for MultiSearcher and ParallelMultiSearcher, this normalization
> factor is applied *per index*, before merging the results together (the merge
> itself is ok though).
> An example: if you use
> Hits hits=multiSearcher.search(query, Sort.RELEVANCE);
> for a MultiSearcher with two subsearchers, the first document will have score
> 1.0.
> The next documents from the same subsearcher will have decreasing scores.
> The first document from the other subsearcher will however have score 1.0
> again !
> The same applies for other Sort objects, but it is less visible.
> I will post a TestCase demonstrating the problem and suggested patches to
> solve it in a moment...
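
The normalization problem described above can be reproduced without Lucene.
The following is a simplified model, not the MultiSearcher code itself: the
class NormalizationSketch, its methods, and the raw scores are all made up for
illustration. Hits from two sub-indexes are merged by raw score, and the
reported score is the raw score divided either by the maximum of the hit's own
index (the behaviour described in the report) or by the maximum over all
indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NormalizationSketch {

    /** One hit: its raw score and the sub-index it came from. */
    static final class Hit {
        final float raw;
        final int index;

        Hit(float raw, int index) {
            this.raw = raw;
            this.index = index;
        }
    }

    /** Scale factor: 1/max when max exceeds 1.0, otherwise no scaling. */
    static float norm(float max) {
        return max > 1.0f ? 1.0f / max : 1.0f;
    }

    /**
     * Merges all hits by raw score (descending) and returns the reported
     * scores. perIndex = true normalizes each hit by its own index's
     * maximum; perIndex = false uses the maximum over all indexes.
     */
    static float[] reportedScores(float[][] rawPerIndex, boolean perIndex) {
        List<Hit> merged = new ArrayList<>();
        float[] indexMax = new float[rawPerIndex.length];
        float globalMax = 0.0f;
        for (int i = 0; i < rawPerIndex.length; i++) {
            for (float s : rawPerIndex[i]) {
                merged.add(new Hit(s, i));
                indexMax[i] = Math.max(indexMax[i], s);
            }
            globalMax = Math.max(globalMax, indexMax[i]);
        }
        // The merge itself orders by raw score, so document order is correct.
        merged.sort((a, b) -> Float.compare(b.raw, a.raw));

        float[] out = new float[merged.size()];
        for (int k = 0; k < out.length; k++) {
            Hit h = merged.get(k);
            float max = perIndex ? indexMax[h.index] : globalMax;
            out[k] = h.raw * norm(max);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up raw scores for two sub-indexes.
        float[][] raw = { {4.0f, 3.0f}, {2.0f, 1.0f} };
        // Per-index normalization: prints [1.0, 0.75, 1.0, 0.5]
        System.out.println(Arrays.toString(reportedScores(raw, true)));
        // Global normalization: prints [1.0, 0.75, 0.5, 0.25]
        System.out.println(Arrays.toString(reportedScores(raw, false)));
    }
}
```

With per-index normalization the third hit, which is the best document of the
second index, is reported with score 1.0 again even though it ranks below two
documents from the first index. Dividing by the single global maximum keeps
the reported scores in decreasing order.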