Sphinx delta indexing -- still necessary to rebuild the main index?

I've been reading up on the Sphinx search engine and the Thinking Sphinx gem. In the TS docs it says...

Sphinx has one major limitation when compared to a lot of other search services: you cannot update the fields [of] a single document in an index, but have to re-process all the data for that index.

If I understand correctly, that means when a user adds or edits something, the change is not reflected in the index. So if they add a record it won't come up in searches until the entire index is rebuilt. Or if they delete a record, it will come up in searches, and then cause some kind of error or frustrating behavior.

Moreover, while rebuilding the index Sphinx is shut down. So your app's search functionality goes offline regularly (once an hour, once every few hours), and anyone who tries to do a search then will get an error or a "try later" message.

OK, clearly none of that is acceptable in a real-world app. So you pretty much have to use delta indexing.

But apparently you still need to regularly shut down your search engine and do a full indexing...

Turning on delta indexing does not remove the need for regularly running a full re-index, as otherwise the delta index itself will grow to become just as large as the core indexes, and this removes the advantage of keeping it separate. It also slows down your requests to your server that make changes to the model records.

I don't really understand what the docs are saying here. Maybe someone can help me out. I thought the whole point of delta indexing was that you don't need to regularly rebuild the index. It's updated instantly whenever the data changes.

Because rebuilding the index every hour or every anything would be totally messed up, right?

Best answer

If I understand correctly, that means when a user adds or edits something, the change is not reflected in the index. So if they add a record it won't come up in searches until the entire index is rebuilt. Or if they delete a record, it will come up in searches, and then cause some kind of error or frustrating behavior. Moreover, while rebuilding the index Sphinx is shut down. ...

You don't need to rebuild your indexes, just reindex them, which means there's no need to stop the daemon. Rebuilding is only needed after changing the structure of the index, and that is not the case here.

As for the second part: again, you don't rebuild the index, so stopping the daemon isn't necessary. When using delta indexing, there are actually two indexes used for searching: the main index (which should be reindexed once in a while) and the delta index (which is refreshed after each relevant operation on a record). If I understand it correctly, when the main index is reindexed (e.g. via a cron task), the delta index is simply merged into the main index, so it won't take up much space and stays fast.
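
To make the main/delta split concrete, here is a toy model in Python. It is only a sketch of the idea, not Sphinx's or Thinking Sphinx's actual API: the class `ToySearch` and its methods are hypothetical names, and the "indexes" are plain dictionaries. It assumes that a periodic reindex re-processes all records into the main index and empties the delta, and it ignores deletions.

```python
class ToySearch:
    """Toy model of a main index plus a delta index (not Sphinx's real API)."""

    def __init__(self, records):
        self.records = dict(records)  # id -> text; stands in for the database
        self.main = {}                # snapshot taken at the last full reindex
        self.delta = {}               # records changed since that reindex
        self.reindex()

    def save(self, doc_id, text):
        # A write updates the database and refreshes only the small delta index.
        self.records[doc_id] = text
        self.delta[doc_id] = text

    def reindex(self):
        # Periodic job (e.g. from cron): re-process everything into main,
        # then empty the delta so it stays small.
        self.main = dict(self.records)
        self.delta = {}

    def search(self, word):
        # Query both indexes; a delta entry overrides a stale main entry.
        merged = {**self.main, **self.delta}
        return sorted(i for i, text in merged.items() if word in text.split())


s = ToySearch({1: "red apple", 2: "green pear"})
s.save(3, "red plum")      # new record is searchable right away via the delta
print(s.search("red"))
print(len(s.delta))        # the delta grows with every change...
s.reindex()
print(len(s.delta))        # ...until the periodic reindex empties it
```

In this model searches keep working throughout, because `reindex` only swaps in a fresh main snapshot. The delta grows with every write until the periodic reindex runs, which matches the docs' point that a regular full reindex is still needed.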
