| Subject: | Keep URLs intact and not tokenized by the StandardTokenizer |
|---|---|
| From: | Sudha Verma |
| Date: | Wed, 18 Nov 2009 22:58:11 -0700 |

Hi,

I am using Lucene 2.9.1. I am reading in free-text documents, which I currently index using Lucene and the StandardAnalyzer.

The StandardAnalyzer keeps email addresses intact and does not tokenize them. Is there something similar for URLs? This seems like a common need, so I thought I'd check whether anything out there already does it.

I'd appreciate any help.

Thanks,
sudha
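
To make the desired behavior concrete, here is a minimal sketch in plain Java using only `java.util.regex`. It does not use Lucene's API, and the class name `UrlAwareTokenizer` and the deliberately loose URL pattern are assumptions made for illustration. The idea is that the URL alternative is tried before the generic word alternative, so a URL comes out as a single token while the rest of the text is split into words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: splits text into word tokens
// but emits each URL as one unbroken token.
public class UrlAwareTokenizer {

    // Alternation order matters: the URL branch is tried first, so a URL is
    // consumed whole before the word branch can split it at '.', '/' or ':'.
    private static final Pattern TOKEN = Pattern.compile(
        "(?:https?|ftp)://[^\\s<>\"]+"   // assumed, loose URL pattern (scheme is case-sensitive)
        + "|[\\p{L}\\p{N}]+");           // otherwise: runs of letters and digits

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }
}
```

For example, `UrlAwareTokenizer.tokenize("See http://lucene.apache.org/java/docs/ for docs")` yields the four tokens `See`, `http://lucene.apache.org/java/docs/`, `for`, and `docs`. The pattern is greedy up to the next whitespace, so punctuation directly following a URL (such as a sentence-ending period) would be kept as part of the token; a real analyzer would need to handle that case.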