How does one retrieve the most frequently occurring keyterms from a Sphinx index?

I have a Sphinx index of text files and I'd like to retrieve a list of the keyterms Sphinx found when indexing them, ordered from highest to lowest by how frequently they occurred in the dataset. How do I do this?

If possible, I would like to retrieve both the actual terms and the stems.

I'm using the PHP API to make calls to the index.

Below are my sphinx.conf settings for this index:

source srcDatasheets
{
    type                = mysql
    sql_host            = localhost
    sql_user            = user
    sql_pass            = pass
    sql_db              = db
    sql_port            = 3306

    sql_query           = \
         SELECT id, company_id, title, brief, content_file_path \
         FROM datasheets

    sql_attr_uint       = company_id
    sql_file_field      = content_file_path
    sql_query_info      = SELECT * FROM datasheets WHERE id=$id
}


index datasheets
{
    source              = srcDatasheets
    path                = /usr/local/sphinx/var/data/datasheetsStemmed
    docinfo             = extern
    charset_type        = sbcs
    morphology          = stem_en
    min_stemming_len    = 1
}

Best answer

One cannot retrieve keyword density directly from a live index with Sphinx. The data is not stored in a way that allows this. Here is a response from the Sphinx forums.

What you can do, however, is run the indexer with --buildstops and --buildfreqs (see the docs). The indexer will write a text file listing the most frequently occurring terms and their frequencies, based on the settings in the .conf file for that index.
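As a rough sketch of what that invocation looks like, the snippet below assembles the indexer command line in Python. The index name `datasheets` comes from the question; the output file name `word_freq.txt`, the limit of 1000 terms, and the helper names `buildstops_command` and `run_buildstops` are illustrative assumptions, not anything Sphinx prescribes.

```python
import subprocess


def buildstops_command(index, out_file, limit, config=None, with_freqs=True):
    """Assemble an indexer command line for a --buildstops run.

    out_file and limit are the two arguments --buildstops takes:
    the text file to write and the maximum number of terms to list.
    """
    cmd = ["indexer"]
    if config:
        # Optional explicit path to sphinx.conf (hypothetical path in usage).
        cmd += ["--config", config]
    cmd += [index, "--buildstops", out_file, str(limit)]
    if with_freqs:
        # --buildfreqs only has an effect together with --buildstops.
        cmd.append("--buildfreqs")
    return cmd


def run_buildstops(index, out_file, limit, config=None):
    """Run the indexer; requires Sphinx to be installed on this machine."""
    subprocess.run(buildstops_command(index, out_file, limit, config), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(buildstops_command("datasheets", "word_freq.txt", 1000)))
```

Run from a shell, the printed line is the same command you would type by hand. Raise the limit if you want more than the top 1000 terms.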

This run processes the data only to build the list and write the text file; it does not actually build a new searchable index.

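I tested this on an index of text files (converted PDFs) with a minimum word length of 5 characters. It processed 70,000 files in about 20 seconds (about 5 minutes with the limit set to 1).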
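Once the text file exists, it can be read back into an application. The sketch below assumes each line of the file holds a term and its count separated by whitespace, which is how I understand the --buildfreqs output; check a real file before relying on it. The function name `parse_freq_lines` and the sample terms are made up for illustration.

```python
def parse_freq_lines(lines):
    """Turn 'term count' lines into (term, count) pairs, most frequent first.

    Assumes one whitespace-separated term/count pair per line; lines that
    do not match that shape (blank lines, extra columns) are skipped.
    """
    terms = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        terms.append((parts[0], int(parts[1])))
    # Sort defensively, in case the file is not already ordered by frequency.
    terms.sort(key=lambda pair: pair[1], reverse=True)
    return terms


if __name__ == "__main__":
    # Made-up sample data standing in for the indexer's output file.
    sample = ["datasheet 120", "", "voltage 45", "sensor 300"]
    for term, count in parse_freq_lines(sample):
        print(term, count)
```

To use it on a real file, pass the open file object, for example `parse_freq_lines(open("word_freq.txt", encoding="utf-8"))`, adjusting the encoding to match the index's charset.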
