How can I use Lucene to search for documents that do not contain a term?
  2011-11-04
  • java
  • lucene

I know that the Lucene documentation says:

Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:

NOT "jakarta apache"

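However, I would like to write a query that returns all documents that do not contain a term. I have looked into stacking queries together.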

If I index the following two documents

Doc0: content:The quick brown fox jumps over the lazy dog.
Doc1: (empty string)

The query *:* -content:fox returns both documents when I only want one.

The RegexQuery content:^((?!fox).)*$ suggested by this StackOverflow answer returns one document but it does not seem to be working correctly because content:^((?!foo).)*$ returns one document as well when I expect it to return two documents.
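One possible explanation, offered as an assumption rather than something confirmed in this thread: RegexQuery may test the pattern against each indexed term instead of the whole field value, in which case a document with no terms can never match. The sketch below uses only java.util.regex, not Lucene, to contrast the two behaviours on the two sample documents. The class name RegexScopeDemo and the tokenize helper are hypothetical and only approximate what an analyzer would do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Toy comparison, not Lucene code: whole-value matching vs. per-term matching.
class RegexScopeDemo {

    // True if the regex matches the entire field value (what the question expects).
    static boolean matchesWholeValue(String regex, String value) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    // True if the regex matches at least one term (the assumed RegexQuery behaviour).
    static boolean matchesAnyTerm(String regex, List<String> terms) {
        Pattern p = Pattern.compile(regex);
        return terms.stream().anyMatch(t -> p.matcher(t).matches());
    }

    // Hypothetical tokenizer: lowercase, split on non-letters; an empty value yields no terms.
    static List<String> tokenize(String value) {
        return Arrays.stream(value.toLowerCase().split("[^a-z]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("The quick brown fox jumps over the lazy dog.", "");
        for (String regex : new String[] {"^((?!fox).)*$", "^((?!foo).)*$"}) {
            long whole = docs.stream().filter(d -> matchesWholeValue(regex, d)).count();
            long perTerm = docs.stream().filter(d -> matchesAnyTerm(regex, tokenize(d))).count();
            System.out.println(regex + " whole-value hits: " + whole + ", per-term hits: " + perTerm);
        }
        // Prints:
        // ^((?!fox).)*$ whole-value hits: 1, per-term hits: 1
        // ^((?!foo).)*$ whole-value hits: 2, per-term hits: 1
    }
}
```

Under per-term matching both patterns hit only Doc0, because terms such as "quick" satisfy either pattern while the empty document contributes no terms at all. That would be consistent with seeing one document in both cases.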

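I am aware of the performance implications of what I want to do. The query will only be run against a few documents, so I am not worried about performance.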




import org.apache.lucene.index.Term;
import org.apache.lucene.search.*;

// Uses the older Hits-based search API.
IndexSearcher searcher = new IndexSearcher("path_to_index");
// Select every document, then exclude those containing the term.
MatchAllDocsQuery everyDocClause = new MatchAllDocsQuery();
TermQuery termClause = new TermQuery(new Term("text", "exclude_term"));
BooleanQuery query = new BooleanQuery();
query.add(everyDocClause, BooleanClause.Occur.MUST);
query.add(termClause, BooleanClause.Occur.MUST_NOT);
Hits hits = searcher.search(query);

Alternatively, add a dummy field with some fixed value to every document and use the query:

+dummy_field:dummy_value -exclude_term
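Both approaches rely on the same idea: a prohibited clause only removes documents that some other clause has already selected, so a query needs a positive clause that selects everything first. The sketch below is a toy in-memory model of that logic, not Lucene code. The ToyIndex class, its methods, and the "field:value" term strings are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of required / prohibited clause evaluation; not Lucene code.
class ToyIndex {
    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> allDocs = new TreeSet<>();

    void add(int docId, String... terms) {
        allDocs.add(docId);
        for (String t : terms) {
            postings.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        }
    }

    Set<Integer> allDocs() {
        return new TreeSet<>(allDocs);
    }

    Set<Integer> postings(String term) {
        return new TreeSet<>(postings.getOrDefault(term, Collections.emptySet()));
    }

    // Start from the documents selected by the positive clause, then drop the prohibited ones.
    Set<Integer> search(Set<Integer> selected, String prohibitedTerm) {
        Set<Integer> hits = new TreeSet<>(selected);
        hits.removeAll(postings(prohibitedTerm));
        return hits;
    }

    // The two documents from the question, each with the dummy field added.
    static ToyIndex sample() {
        ToyIndex idx = new ToyIndex();
        idx.add(0, "dummy_field:dummy_value", "content:quick", "content:brown", "content:fox");
        idx.add(1, "dummy_field:dummy_value");
        return idx;
    }

    public static void main(String[] args) {
        ToyIndex idx = sample();
        // Prohibited clause alone: nothing was selected, so nothing is left.
        System.out.println(idx.search(new TreeSet<Integer>(), "content:fox"));                 // []
        // Match-all clause plus prohibited clause.
        System.out.println(idx.search(idx.allDocs(), "content:fox"));                          // [1]
        // Dummy-field clause plus prohibited clause.
        System.out.println(idx.search(idx.postings("dummy_field:dummy_value"), "content:fox")); // [1]
    }
}
```

In this model the dummy field plays the same role as the match-all clause, which is why it only works if every document carries it.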


