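I want to use Lucene.net to index data from various sources (e.g. the local file system and databases). However, I also want to join data from two sources (on a common field, such as an ID field) and present the combined information to the user. As far as I can tell, I have three options. After indexing each source: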
- Use Lucene.net to combine the indexes in a search query into a single result set
- Create some custom code to correlate results retrospectively; or
- Store separate result sets in a database (in my case, it won't be the same database as the source), then create a new index based on a query that joins the data
Option 1 is what I would like to do, but I'm not sure how feasible it is with Lucene, for a couple of reasons:
- Lucene isn't a relational database. Is this attempting something that Lucene is not really designed to do?
- Can combining indexes result in a noticeable performance hit?
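The only reason to choose option 2 would be if I believed I could create a more efficient algorithm than option 1. By that logic, I have to ask whether I should be using Lucene to correlate the data at all.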
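To make option 2 concrete, one way the retrospective correlation could look is a simple hash join on the shared ID. This is only a minimal sketch in Python, not Lucene.net code: plain dictionaries stand in for search hits, and the names `correlate`, `file_hits`, `db_hits` and the fields `id`, `path`, `title` are all hypothetical.

```python
from collections import defaultdict


def correlate(file_hits, db_hits, key="id"):
    """Inner-join two lists of hit dicts on a shared key (hash join)."""
    # Build a lookup from key value to the database hits that carry it.
    by_key = defaultdict(list)
    for hit in db_hits:
        if key in hit:
            by_key[hit[key]].append(hit)

    combined = []
    for hit in file_hits:
        if key not in hit:
            continue  # nothing to join on
        # Hits with no partner in the other source are dropped (inner join).
        for match in by_key.get(hit[key], []):
            merged = dict(match)
            merged.update(hit)  # on a field-name clash, the file-system value wins
            combined.append(merged)
    return combined


# Hypothetical hits, as if returned by two separate index searches.
file_hits = [{"id": 1, "path": "a.txt"}, {"id": 3, "path": "c.txt"}]
db_hits = [{"id": 1, "title": "Alpha"}, {"id": 2, "title": "Beta"}]
print(correlate(file_hits, db_hits))
```

The join itself is linear in the number of hits, but it assumes both result sets are pulled fully into memory first, which is presumably where it would win or lose against option 1.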
Which leads me to option 3. I'm happy that it would work, but it seems like a compromise:
- Data will be stored in a database as well as Lucene (as well as the original source)
- By introducing an extra step, it'll take longer to complete the process. I'm not sure how this will affect the user experience
Any suggestions?