We are using lucene in our project to search through information objects which
works fine. For indexing we use the StandardAnalyzer.
Now, we have to support the Chinese language. I found out that the Chinese
words and letters are correctly saved in the index but the query to search for
them does not work. Example: in English language the query is “text” which we
parse to “*text*”. If we search for Chinese words / phrases like “佛山东方书城”the
query is “*佛山东方书城*“ but there are no search results. If the query places blanks
between the single letters / symbols like this “*佛 山 东 方 书 城*“ we are getting
results. Does the StandardAnalyzer interpret each Chinese letter as one word?
What are best practices for this case? Shall we use another analyzer (Chinese
analyzer)? Or is it better to replace the query parser in this case?