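I am building a full-text search facility for my website, which runs on ASP.NET MVC with a MySQL database. The site is non-English. I have started using Lucene as the text search engine, but I cannot find out whether it supports Unicode.
Does anyone know whether Lucene supports Unicode? I don't want any surprises.
Also, links to good introductory articles on implementing Lucene.NET would be appreciated.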
Yes, it fully supports Unicode.
For analysis, however, you should explicitly assign appropriate stemmers and the correct stop words for your language.
As for a sample, here is a copy from our last project:
directory = new RAMDirectory();
analyzer = new StandardAnalyzer(version, new Hashtable());
var indexWriter = new IndexWriter(directory, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
using (var session = sessionFactory.OpenStatelessSession())
{
    organizations = session.CreateCriteria(typeof(Organization)).List<Organization>();
    foreach (var organization in organizations)
    {
        var document = new Document();
        document.Add(new Field("Id", organization.ID.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        document.Add(new Field("FullName", organization.FullName, Field.Store.NO, Field.Index.ANALYZED_NO_NORMS));
        document.Add(new Field("ObjectTypeInvariantName", typeof(Organization).FullName, Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        indexWriter.AddDocument(document);
    }
    var persistentType = typeof(Order);
    var classMetadata = DbContext.SessionFactory.GetClassMetadata(persistentType);
    var properties = new List<PropertyInfo>();
    for (int i = 0; i < classMetadata.PropertyTypes.Length; i++)
    {
        var propertyType = classMetadata.PropertyTypes[i];
        if (propertyType.IsCollectionType || propertyType.IsEntityType) continue;
        properties.Add(typeof(Order).GetProperty(classMetadata.PropertyNames[i]));
    }
    orders = session.CreateCriteria(typeof(Order)).List<Order>();
    var idProperty = typeof(Order).GetProperty(classMetadata.IdentifierPropertyName);
    foreach (var order in orders)
    {
        var document = new Document();
        document.Add(new Field("Id", idProperty.GetValue(order, null).ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        document.Add(new Field("ObjectTypeInvariantName", typeof(Order).FullName, Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
        foreach (var property in properties)
        {
            var value = property.GetValue(order, null);
            if (value != null)
            {
                document.Add(new Field(property.Name, value.ToString(), Field.Store.NO, Field.Index.ANALYZED_NO_NORMS));
            }
        }
        indexWriter.AddDocument(document);
    }
    indexWriter.Optimize(true);
    indexWriter.Commit();
    return indexWriter.GetReader();
}
I'm querying Organization objects from NHibernate and putting them into Lucene.NET.
Here is a simple search:
var searchValue = textEdit1.Text;
var parser = new QueryParser(version, "FullName", analyzer);
parser.SetLocale(new CultureInfo("ru-RU"));
Query query = parser.Parse(searchValue);
var indexSearcher = new IndexSearcher(directory, true);
var docs = indexSearcher.Search(query, 10);
lblSearchTotal.Text = string.Format(totalPattern, docs.totalHits, organizations.Count() + orders.Count);
resultPanel.Controls.Clear();
foreach (var found in docs.scoreDocs)
{
    var document = indexSearcher.Doc(found.doc);
    var objectId = document.Get("Id");
    var objectType = document.Get("ObjectTypeInvariantName");
    if (resultPanel.Controls.Count > 0)
    {
        var labelSeparator = CreateSeparatorLabelControl();
        resultPanel.Controls.Add(labelSeparator);
    }
    var labelCard = CreateFoundLabelControl();
    resultPanel.Controls.Add(labelCard);
    var organization = organizations.Where(o => o.ID.ToString() == objectId).FirstOrDefault();
    if (organization != null)
    {
        labelCard.Text = string.Format("<b>{0}</b></br>{1}", organization.AccountNumber, organization.FullName);
        labelCard.Tag = organization;
        //labels[count].Text = string.Format("<b>{0}</b></br>{1}", organization.AccountNumber, organization.FullName);
        //labels[count].Visible = true;
    }
    else
    {
        labelCard.Text = string.Format("Найден объект типа {0} с идентификатором {1} ", objectType, objectId);
        labelCard.Tag = mainForm.GetObject(objectType, objectId);
    }
    labelCard.Visible = true;
    //count++;
}
Yes, Lucene supports Unicode, because it stores strings in UTF-8 format.
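Lucene writes Unicode character sequences as UTF-8 encoded bytes.
String
Lucene writes strings as UTF-8 encoded bytes. First the length, in bytes, is written as a VInt, followed by the bytes.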
String -> VInt, Chars
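To make the layout above concrete, here is a minimal sketch of the "VInt length, then UTF-8 bytes" scheme in plain C#. It is not Lucene's or Lucene.NET's own code: the class name LuceneStringSketch and its three methods are hypothetical names made up for this illustration, and the VInt rule used here (7 data bits per byte, low-order group first, high bit set on every byte except the last) is my understanding of the format rather than something stated in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of Lucene.NET.
public static class LuceneStringSketch
{
    // Encode a non-negative int as a VInt: 7 data bits per byte,
    // low-order group first, high bit set when more bytes follow.
    public static byte[] EncodeVInt(int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        var bytes = new List<byte>();
        while ((value & ~0x7F) != 0)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        bytes.Add((byte)value);
        return bytes.ToArray();
    }

    // String -> VInt (length in bytes), followed by the UTF-8 bytes.
    public static byte[] EncodeString(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] length = EncodeVInt(utf8.Length);
        var result = new byte[length.Length + utf8.Length];
        Buffer.BlockCopy(length, 0, result, 0, length.Length);
        Buffer.BlockCopy(utf8, 0, result, length.Length, utf8.Length);
        return result;
    }

    // Read the VInt length prefix, then decode that many UTF-8 bytes.
    public static string DecodeString(byte[] data)
    {
        int length = 0, shift = 0, pos = 0;
        byte b;
        do
        {
            b = data[pos++];
            length |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return Encoding.UTF8.GetString(data, pos, length);
    }
}
```

For example, LuceneStringSketch.EncodeString("Привет") yields a one-byte length prefix followed by 12 UTF-8 bytes, because each Cyrillic letter takes two bytes in UTF-8. The length counts bytes, not characters, which is why non-English text round-trips without loss.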
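Lucene does support Unicode, but there are limitations. For example, some document readers do not support Unicode. In addition, Lucene does things such as stemming plurals for English. When you use another language, some of that functionality is lost.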