Can I use Lucene.net to index and join results from multiple sources
  • Asked: 2011-11-14 22:48:02
  • Tags: lucene.net

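I want to use Lucene.net to index data from a variety of sources (for example, the local file system and a database). However, I want to join the data from two sources (based on a common field, such as an ID field) and present the combined information to the user. As far as I can see, I have three options. After indexing each source: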

  1. Use Lucene.net to combine the indexes in a search query into a single result set
  2. Create some custom code to correlate results retrospectively; or
  3. Store separate result sets in a database (in my case, it won't be the same database as the source). Then create a new index based on a query that joins the data

Option 1 is what I would like to do, but I am not sure how feasible it is with Lucene, for a couple of reasons:

  • Lucene isn't a relational database. Is this attempting something that Lucene is not really designed to do?
  • Can combining indexes result in a noticeable performance hit?

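The only reason to choose option 2 would be if I believed I could create a more efficient algorithm than option 1. By that logic, I have to ask whether I should be using Lucene to correlate the data at all.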

Which leads me to option 3. I am happy that it will work, but it seems like a compromise:

  • Data will be stored in a database as well as Lucene (as well as the original source)
  • By introducing an extra step, it'll take longer to complete the process. I'm not sure how this will affect the user experience

Any suggestions?

Best answer

Yes, you can, but you need to stop thinking relationally and start thinking in terms of documents rather than rows. Put another way, option 3 is the right approach. What you want to do is create a single document per item holding:

a) whatever you want to search on -- analyzed fields in Lucene terms
b) pointers to the full, extant records -- basically the ID number or file location
c) if possible, enough data to show search results without having to reach out to the file system or the database -- stored fields in Lucene parlance.
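
To make the document-oriented idea concrete, here is a minimal, language-neutral sketch in plain Python. It does not use the Lucene.net API: the sample records, the field names and the toy inverted index are all assumptions made for illustration. It joins two sources on a shared ID once, at indexing time, so that each document carries (a) the searchable text, (b) the pointers back to the originals, and (c) the fields needed to display a hit.

```python
from collections import defaultdict

# Hypothetical sample data: file-system records and database rows sharing an "id".
file_records = [
    {"id": "42", "path": "/data/reports/42.txt",
     "text": "Quarterly sales report for the north region"},
    {"id": "43", "path": "/data/reports/43.txt",
     "text": "Minutes of the planning meeting"},
]
db_rows = [
    {"id": "42", "title": "Q3 Sales", "owner": "alice"},
    {"id": "43", "title": "Planning", "owner": "bob"},
]


def build_documents(file_records, db_rows):
    """Join both sources on the shared id and emit one flat document per id."""
    rows_by_id = {row["id"]: row for row in db_rows}
    documents = []
    for record in file_records:
        row = rows_by_id.get(record["id"])
        if row is None:
            continue  # no matching database row; this sketch simply skips it
        documents.append({
            # (a) text to search on (an analyzed field in Lucene terms)
            "searchable": f'{row["title"]} {record["text"]}',
            # (b) pointers back to the full records
            "id": record["id"],
            "path": record["path"],
            # (c) enough data to render a hit (stored fields in Lucene terms)
            "title": row["title"],
            "owner": row["owner"],
        })
    return documents


def build_index(documents):
    """Toy inverted index: lowercase token -> set of document positions."""
    index = defaultdict(set)
    for position, doc in enumerate(documents):
        for token in doc["searchable"].lower().split():
            index[token].add(position)
    return index


def search(index, documents, term):
    """Return the display fields of every document containing the term."""
    hits = sorted(index.get(term.lower(), set()))
    return [
        {key: documents[i][key] for key in ("id", "title", "owner", "path")}
        for i in hits
    ]


documents = build_documents(file_records, db_rows)
index = build_index(documents)
print(search(index, documents, "sales"))
```

Because the join happens while the documents are built, a search only has to look up one index and can render results without going back to the file system or the database.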

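As for performance, there should not be too much overhead. Adding items to the index is not a big performance hit, and Lucene itself is very fast. I would get it built out in a sensible, focused manner first and then turn to performance if needed.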
