Full-text Search on documents and related data mssql

I am currently building a knowledge base and am a little unsure of the best way to store and index the document information.

The user uploads the document and, when doing so, selects a number of options from dropdown lists (such as category, topic, area...; note that these are not all mandatory). They also enter some keywords and a description of the document. At the moment the selected category (and the others) is stored as a foreign key in the documents table using the id from the categories table. What we want to be able to do is run a FREETEXTTABLE or CONTAINSTABLE query not only on the varchar(max) column where the document is stored, but also on the category name, topic name, area name, etc.

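I looked at the option of creating an indexed view, but that was not possible because of the LEFT JOIN against the category column (indexed views cannot contain outer joins). So I am not sure how to go about this; any ideas would be highly appreciated.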

Answers

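I assume you want to AND the two searches together. For example, find all documents that contain the text "foo" AND are in the category "Automotive Repair".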

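Perhaps you do not need full-text search on the additional data at all and can just use a plain comparison such as = or LIKE? If the additional data is small, it may not justify the extra complexity of full-text search.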
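As a minimal sketch of that simpler option, the query below runs full-text search on the document text only and filters the category with ordinary equality. The table and column names (documents, Categories, Id, CategoryId, Category, [text]) are assumptions taken from the example in this answer, not from a known schema.

```sql
-- Sketch only: all table and column names are assumed, not confirmed.
-- Full-text search is applied to the document body; the category is
-- matched with a plain equality comparison instead of CONTAINSTABLE.
select d.Id, d.[text], c.Category, ft.[RANK]
from containstable(documents, ([text]), '"foo*"') ft
inner join documents d on d.Id = ft.[KEY]
inner join Categories c on c.Id = d.CategoryId
where c.Category = 'Automotive Repair'
order by ft.[RANK] desc;
```

Because the category filter is an inner join plus equality, documents without a category are excluded, which is the intended behaviour when the user explicitly asks for one category.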

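However, if you do want to use full-text search on both, use a stored procedure that pulls the results together for you. The trick here is to stage the results rather than trying to get the final result set in a single step.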

This is a rough starting point.

-- a staging table variable for the document results
declare @documentResults table (
    Id int,       
    Rank int
)

insert into @documentResults
select d.Id, results.[rank]
from containstable (documents, ([text]), '"foo*"') results
inner join documents d on results.[key] = d.Id

-- now you have all of the primary keys that match the search criteria
-- whittle this list down to only include keys that are in the correct categories

-- a staging table variable for each of the metadata results
declare @categories table (
    Id int        
)

insert into @categories
select results.[KEY]
from containstable (Categories, (Category), '"Automotive Repair*"') results

declare @topics table (
    Id int        
)

insert into @topics
select results.[KEY]
from containstable (Topics, (Topic), '"Automotive Repair*"') results

declare @areas table (
    Id int        
)

insert into @areas
select results.[KEY]
from containstable (Areas, (Area), '"Automotive Repair*"') results


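-- join the real lookup tables for the names; the staging tables only hold matching keys
select d.[text], c.Category, t.Topic, a.Area
from @documentResults r
inner join documents d on d.Id = r.Id
inner join Categories c on c.Id = d.CategoryId
inner join Topics t on t.Id = d.TopicId
inner join Areas a on a.Id = d.AreaId
where c.Id in (select Id from @categories)
  and t.Id in (select Id from @topics)
  and a.Id in (select Id from @areas)
order by r.[Rank] desc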

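You could create a new column for your full-text index that contains the original document plus the categories appended as metadata. A search on that column then searches the document and its categories at the same time. You would need to invent a tagging scheme that keeps the tags unique within your documents, while making the tags themselves unlikely to be used as search phrases. Perhaps something like: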

This is my regular document text. <FTCategory: Automotive Repair> <FTCategory: Transmissions>
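One way the combined column might be populated and queried is sketched below. The SearchText column and the FTTopic and FTArea tags are hypothetical names introduced for illustration, and the lookup table and key names are assumed from the earlier example. The LEFT JOINs keep documents whose optional metadata is missing, and SearchText is assumed to be included in the full-text index on documents.

```sql
-- Sketch only: SearchText, FTTopic and FTArea are hypothetical names.
-- Build the combined column; a missing category/topic/area yields NULL
-- inside isnull(...) and therefore contributes an empty string.
update d
set SearchText = d.[text]
    + isnull(' <FTCategory: ' + c.Category + '>', '')
    + isnull(' <FTTopic: ' + t.Topic + '>', '')
    + isnull(' <FTArea: ' + a.Area + '>', '')
from documents d
left join Categories c on c.Id = d.CategoryId
left join Topics t on t.Id = d.TopicId
left join Areas a on a.Id = d.AreaId;

-- Search text and category together in a single full-text query.
-- Punctuation is ignored in phrase matching, so the tag is written
-- here without the angle brackets and colon.
select d.Id, ft.[RANK]
from containstable(documents, (SearchText),
     '"foo*" AND "FTCategory Automotive Repair"') ft
inner join documents d on d.Id = ft.[KEY]
order by ft.[RANK] desc;
```

The trade-off is that SearchText has to be rebuilt whenever a document or one of its lookup names changes, for example from the same stored procedure that saves the upload.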



