Apache Solr: searching for part of a word

I'm using the Apache Solr search engine to index my website's database.

I'm using Django with Haystack (http://haystacksearch.org/).

Say I have a document that contains the word "Chicken".

When I search for "chicken", Solr finds this document.

But when I search for "chick", it does not find anything.

Is there a way to fix this?

Best answer

Note: The following solution is specific to Solr 1.4 and above!

For the most flexibility, I would recommend indexing your data with the NGramTokenizerFactory, which lets a query match at the front, middle, or back of a string (the equivalent of a leading and trailing wildcard). If you only need to match substrings at the beginning or end of the string, consider using the EdgeNGramTokenizerFactory.

Here's a drop-in replacement for the text field type that would accommodate your need:

<fieldType name="text" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15" />
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
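
To see why this field type lets "chick" match "Chicken", here is a minimal Python sketch of the idea, not Solr's actual implementation. It assumes a single word as input and uses a hypothetical helper, ngrams, with the same 3 to 15 size limits as the config above.

```python
def ngrams(text, min_size=3, max_size=15):
    """Return every substring of text with a length from min_size to max_size."""
    grams = []
    for size in range(min_size, min(max_size, len(text)) + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


# Index side: n-grams of the stored word, lowercased as the filter would do.
indexed = set(gram.lower() for gram in ngrams("Chicken"))

# Query side: the whole query term is lowercased and looked up as-is.
print("chick" in indexed)  # True, "chick" is one of the indexed grams
print(len(indexed))        # 15 terms indexed for one 7-letter word
```

The term count grows quickly with word length, which is the trade-off of this approach: a larger index in exchange for substring matching.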
Other answers

If you want to find all words that start with chick, search for chick*.
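
As a rough illustration of sending such a prefix query over HTTP, here is a small Python sketch. The base URL, the field name text, and the helper prefix_query_url are assumptions for the example, not values from the question. The prefix is lowercased on the assumption that the index lowercases its terms and that wildcard terms may not pass through the query analyzer.

```python
from urllib.parse import urlencode


def prefix_query_url(prefix, field="text",
                     base="http://localhost:8983/solr/select"):
    """Build a select URL for a prefix (trailing wildcard) query.

    The base URL and field name are placeholders; adjust them to your setup.
    """
    query = f"{field}:{prefix.lower()}*"
    return base + "?" + urlencode({"q": query})


print(prefix_query_url("Chick"))
```

urlencode percent-encodes the colon and the asterisk, so the query survives intact as a URL parameter.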

When I used

<tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15" />

for wildcard search, as in Brian's answer, Solr indexing time increased dramatically, by more than 20 times! I found another solution to the wildcard search problem here:

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

You just need to add this filter:

<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />

(It goes in the index analyzer block of the field type, which keeps its default tokenizer, solr.WhitespaceTokenizerFactory.) For me the results were the same, at a lower system cost.
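
The difference from full n-grams can be sketched in Python. This is only an illustration of the idea, with hypothetical helpers edge_ngrams and index_terms. It assumes whitespace tokenization followed by lowercasing, and uses the 1 to 25 size limits of the filter above.

```python
def edge_ngrams(token, min_size=1, max_size=25):
    """Return the prefixes of token with lengths from min_size to max_size."""
    longest = min(max_size, len(token))
    return [token[:size] for size in range(min_size, longest + 1)]


def index_terms(text):
    """Split on whitespace, lowercase each token, and collect its prefixes."""
    terms = set()
    for token in text.split():
        terms.update(edge_ngrams(token.lower()))
    return terms


terms = index_terms("Fried Chicken")
print("chick" in terms)  # True, prefixes of each word are indexed
print("icken" in terms)  # False, substrings from the middle or end are not
print(len(terms))        # 12 terms: one per prefix of each word
```

Each word yields only as many terms as it has letters, which is consistent with the lower indexing cost reported above, but only matches anchored at the start of a word are found.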

A different approach, if you are having trouble with a small set of words, would be to use the solr.SynonymFilterFactory

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

You just have to maintain a simple text file of synonyms, with one comma-separated group per line:

chick, peep, chicken
dawg, hound, dog
moggie, puss, kitten, cat

Plurals should take care of themselves with other filters.
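
Conceptually, the synonym filter maps each token to its whole group, so any word in a group matches documents containing any other. A minimal Python sketch of that idea follows. It is not Solr's implementation: the groups are written as Python lists rather than read from a file, and build_lookup and expand are hypothetical names.

```python
SYNONYM_GROUPS = [
    ["chick", "peep", "chicken"],
    ["dawg", "hound", "dog"],
    ["moggie", "puss", "kitten", "cat"],
]


def build_lookup(groups):
    """Map every word to the full set of words in its synonym group."""
    lookup = {}
    for group in groups:
        for word in group:
            lookup[word] = set(group)
    return lookup


def expand(token, lookup):
    """Return the token's synonym group, or just the token if it has none."""
    token = token.lower()
    return lookup.get(token, {token})


lookup = build_lookup(SYNONYM_GROUPS)
print("chicken" in expand("Chick", lookup))  # True
```

This only helps for words that have been listed by hand, so it suits a small, known vocabulary rather than general partial-word search.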

I haven't changed any configuration. I just put an asterisk at the front and at the back of my search string: *chicke*




