Search engine ideas for description of results

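I am building a search engine for full-text search, and I have a performance problem when displaying results with descriptions. Producing the results for the current query is fast; the slowdown happens when I fetch the document text and highlight the part where the keywords occur. I work with pdf, txt, doc, docs, html, etc. My search engine is set up like this: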

  • I have a db table where I store the document text
  • I have a db table where I index the text with its frequency
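
As a rough illustration of the two-table layout described above, here is a minimal sketch using SQLite. The table names, column names, and the `tokenize`, `add_document`, and `search` helpers are all assumptions made for this example; the actual schema is not shown in the question.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical schema: one table for document text, one for term frequencies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
    CREATE TABLE term_index (
        term TEXT NOT NULL,
        doc_id INTEGER NOT NULL REFERENCES documents(id),
        frequency INTEGER NOT NULL,
        PRIMARY KEY (term, doc_id)
    );
""")

def tokenize(text):
    # \w+ matches Unicode word characters, so Cyrillic text is handled too.
    return re.findall(r"\w+", text.lower())

def add_document(text):
    # Store the raw text, then one index row per distinct term.
    cur = conn.execute("INSERT INTO documents (text) VALUES (?)", (text,))
    doc_id = cur.lastrowid
    counts = Counter(tokenize(text))
    conn.executemany(
        "INSERT INTO term_index (term, doc_id, frequency) VALUES (?, ?, ?)",
        [(term, doc_id, n) for term, n in counts.items()],
    )
    return doc_id

def search(term):
    # Look up a single term; most frequent documents come first.
    rows = conn.execute(
        "SELECT doc_id, frequency FROM term_index "
        "WHERE term = ? ORDER BY frequency DESC, doc_id",
        (term.lower(),),
    )
    return rows.fetchall()
```

With this layout the lookup only touches `term_index`; the `documents` table is read only when text is needed for display.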

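Is this a good design? To build the descriptions I have to search the index, fetch the documents, parse the text, extract the sentences, and filter the sentences by keyword. The search performance without descriptions is: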

**Крушевското Востание 1903** 0,00518989562988
**Даме Груев** 0,00394678115845
**Даме Груев и Гоце Делчев**  0,0916090011597
**Државен празник Илинден** 0,0072648525238
**Даме** 0,00195503234863
**Александар Македонски** 0,0423209667206
**Бранко Црвенковски и Никола Груевски** 0,0233609676361
**СДСМ и ВМРО-ДПМНЕ** 0,0295231342316
**Македонија** 0,0435738563538
**Никола Груевски и Македонија** 0,0451180934906

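The search keywords are written in my native language, and the collection contains 3679 documents. With descriptions built from the matching sentences, displaying the results is 10x to 20x slower (something like 2-3 seconds). The search is implemented in Python.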
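
One possible way to reduce the description cost, sketched under assumptions: instead of parsing every matching document into sentences, cut a fixed-size character window around the first keyword hit. This is a different technique from the sentence filtering described above, and `make_snippet` and its `width` parameter are hypothetical names used only for this example.

```python
import re

def make_snippet(text, keywords, width=80):
    """Return an excerpt around the first keyword hit, with hits wrapped in **...**."""
    if not keywords:
        return text[:width]
    # One alternation pattern for all keywords; case-insensitive, Unicode-aware.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    match = pattern.search(text)
    if match is None:
        # No hit in this text: fall back to the beginning of the document.
        return text[:width]
    # Keep roughly width/2 characters of context on each side of the hit.
    start = max(0, match.start() - width // 2)
    end = min(len(text), match.end() + width // 2)
    excerpt = text[start:end]
    return pattern.sub(lambda m: "**" + m.group(0) + "**", excerpt)
```

Calling this only for the results on the page being displayed, rather than for every match, keeps the work proportional to the page size instead of the result count.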

Any suggestions?

Answer

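I would really suggest that you look at projects such as Elasticsearch and Solr (both based on Lucene). They support what you want to do (full-text search, result highlighting, ...).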
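
To give an idea of what result highlighting looks like in Elasticsearch, here is a minimal sketch that only builds the JSON body of a search request. The field name `content` and the helper `build_highlight_query` are assumptions for illustration; the body would be sent to an index's `_search` endpoint with whatever client you use.

```python
import json

def build_highlight_query(keywords, field="content", fragment_size=150):
    # "content" is a hypothetical field holding the extracted document text.
    return {
        "query": {"match": {field: keywords}},
        "highlight": {
            "fields": {
                # Ask for a single fragment of about fragment_size characters.
                field: {"fragment_size": fragment_size, "number_of_fragments": 1}
            }
        },
    }

body = build_highlight_query("Даме Груев")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Each hit in the response then carries a `highlight` section with the matching fragments, where matches are wrapped in `<em>` tags by default, so the description is produced by the engine instead of by application code.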




