Thanks for your suggestion, Michael, and thanks to Uwe for clarifying.
As I understand it, Lucene currently stores only the start position of
each token. What I gathered from your suggestion is that we could use the
payload to store the end position, the span length, or some other
encoding of the extra information.
Am I right?
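
To check my understanding, here is a minimal sketch in plain Java. It uses
no Lucene classes at all: Token, encodeEnd, decodeEnd and matches are
hypothetical names of my own, and the 4-byte "exclusive end position"
payload layout is only an assumption about what such an encoding might
look like. It models the example sentence with SYN stacked on WORD1 and
compares a position-only phrase check with one that reads the end
position from the payload.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

public class SpanPayloadSketch {

    /** Hypothetical stand-in for an indexed token: term, start position, optional payload. */
    public static final class Token {
        final String term;
        final int position;
        final byte[] payload; // null when the token covers a single position

        Token(String term, int position, byte[] payload) {
            this.term = term;
            this.position = position;
            this.payload = payload;
        }
    }

    /** Assumed encoding: the payload holds the exclusive end position as a 4-byte int. */
    public static byte[] encodeEnd(int endPosition) {
        return ByteBuffer.allocate(4).putInt(endPosition).array();
    }

    /** Tokens without a payload are treated as ending one position after their start. */
    public static int decodeEnd(Token t) {
        return t.payload == null ? t.position + 1 : ByteBuffer.wrap(t.payload).getInt();
    }

    /** AAA BBB WORD1 WORD2 EEE FFF GGG, with SYN at WORD1's position, spanning two positions. */
    public static List<Token> exampleStream() {
        return Arrays.asList(
            new Token("AAA", 0, null),
            new Token("BBB", 1, null),
            new Token("WORD1", 2, null),
            new Token("SYN", 2, encodeEnd(4)),
            new Token("WORD2", 3, null),
            new Token("EEE", 4, null),
            new Token("FFF", 5, null),
            new Token("GGG", 6, null));
    }

    /**
     * Phrase check. With useEnd == false the next term must start at previous start + 1
     * (start positions only). With useEnd == true it must start at the previous token's
     * decoded end position.
     */
    public static boolean matches(List<Token> tokens, boolean useEnd, String... phrase) {
        for (Token first : tokens) {
            if (first.term.equals(phrase[0]) && matchesFrom(tokens, useEnd, first, 1, phrase)) {
                return true;
            }
        }
        return false;
    }

    private static boolean matchesFrom(List<Token> tokens, boolean useEnd,
                                       Token prev, int i, String[] phrase) {
        if (i == phrase.length) {
            return true;
        }
        int expected = useEnd ? decodeEnd(prev) : prev.position + 1;
        for (Token next : tokens) {
            if (next.position == expected && next.term.equals(phrase[i])
                    && matchesFrom(tokens, useEnd, next, i + 1, phrase)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Token> tokens = exampleStream();
        System.out.println("position-only SYN WORD2: " + matches(tokens, false, "SYN", "WORD2"));
        System.out.println("end-aware     SYN WORD2: " + matches(tokens, true, "SYN", "WORD2"));
        System.out.println("end-aware     SYN EEE:   " + matches(tokens, true, "SYN", "EEE"));
    }
}
```

Under these assumptions the position-only check accepts "SYN WORD2"
(the problem Uwe described), while the end-aware check rejects it and
accepts "SYN EEE". A real implementation would need a custom query that
reads the payload, as you suggested.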
--Sumukh
Michael McCandless-2 wrote:
>
>
> Since Lucene doesn't represent/store end position for a token, I don't
> think the index can properly represent SYN spanning two positions?
>
> I suppose you could encode this into payloads, and create a custom
> query that would look at the payload to enforce the constraint.
>
> Or, if you switch to doing SYN expansion only at runtime (not adding
> it to the index), that might work.
>
> Mike
>
> Uwe Schindler wrote:
>
>> I think his problem is that "SYN" is a synonym for the phrase "WORD1
>> WORD2". Using these positions, a phrase like "SYN WORD2" would also
>> match (or cause other problems in queries that depend on word order).
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@xxxxxxxxxxx
>>
>>> -----Original Message-----
>>> From: Michael McCandless [mailto:lucene@xxxxxxxxxxxxxxxxxx]
>>> Sent: Monday, March 02, 2009 4:07 PM
>>> To: java-user@xxxxxxxxxxxxxxxxx
>>> Subject: Re: Indexing synonyms for multiple words
>>>
>>>
>>> Shouldn't WORD2's position be 1 more than your SYN?
>>>
>>> I.e., don't you want these positions?
>>>
>>> WORD1 2
>>> WORD2 3
>>> SYN 2
>>>
>>> The position is the starting position of the token; Lucene doesn't
>>> store an ending position.
>>>
>>> Mike
>>>
>>> Sumukh wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm fairly new to Lucene. I'd like to know how we can index synonyms
>>>> for multiple words.
>>>>
>>>> This is the scenario:
>>>>
>>>> Consider a sentence: AAA BBB WORD1 WORD2 EEE FFF GGG.
>>>>
>>>> Now assume the two words WORD1 WORD2 together can be replaced by
>>>> another word, SYN.
>>>>
>>>> If I place SYN after WORD1 with positionIncrement set to 0, WORD2
>>>> will follow SYN, which is incorrect; and the other way round if I
>>>> place it after WORD2.
>>>>
>>>> If any of you have solved a similar problem, I'd be thankful if you
>>>> could shed some light on the solution.
>>>>
>>>> Regards,
>>>> Sumukh
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
>>> For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx
--
View this message in context:
http://www.nabble.com/Indexing-synonyms-for-multiple-words-tp22289069p22300656.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.