Semantic similarity between sentences
  • Date: 2010-01-10 17:29:44
  • Tags:
  • java
  • nlp

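I'm working on a project. I need an open-source tool or technique for finding the semantic similarity of two sentences: I give two sentences as input and get a score (i.e., the semantic similarity) as output. Any help?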

Answers

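Salma, I'm afraid this is not the right forum for your question, as it is not directly related to programming. I suggest you ask it again on the corpora list. You may also want to search its archives first.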

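Apart from that, your question is not precise enough, and I'll explain what I mean by that. I assume your project is about computing the semantic similarity between sentences specifically. If so, there are several things to consider. First of all, it is not clear, from either a computational or a theoretical linguistic point of view, what exactly "semantic similarity" is. There are numerous views and definitions of it, all depending on the type of problem to be solved, the tools and techniques at hand, the background of the person approaching the task, and so on. Consider these examples: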

  1. Pete and Rob have found a dog near the station.
  2. Pete and Rob have never found a dog near the station.
  3. Pete and Rob both like programming a lot.
  4. Patricia found a dog near the station.
  5. It was a dog who found Pete and Rob under the snow.

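Which of sentences 2-5 are similar to sentence 1? Sentence 2 is the opposite of 1, yet it is still about Pete and Rob (not) finding a dog. Sentence 3 is about Pete and Rob, but in a completely different context. Sentence 4 is about finding a dog near the station, but the finder is someone else. Sentence 5 involves Pete, Rob, a dog, and a "finding" event, but in a different way than in 1. As for me, I could not rank these examples by similarity even without having to write a computer program.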

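To compute semantic similarity, you first have to decide what you want to count as "semantically similar" and what you do not. To compute it at the sentence level, you would ideally compare some kind of meaning representation of the sentences. Meaning representations usually come as logical formulas and are extremely complex to generate and process. However, there are tools that attempt this, e.g. Boxer.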

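As a simple but often practical approach, you could define the semantic similarity of two sentences as the sum of the similarities between the words of one sentence and the words of the other. This makes the problem a lot easier, although some hard issues remain, because semantic similarity between words is just as poorly defined as similarity between sentences. To get an impression of this, have a look at Lexical Semantics by D.A. Cruse (1986). That said, there are quite a few tools and techniques for computing the semantic similarity of two words. Some of them define it essentially as the negative distance between the two words in a taxonomy such as WordNet or the Wikipedia taxonomy. Others compute statistical measures over large text corpora to estimate semantic similarity. They are based on the insight that similar words occur in similar contexts. A third way of computing semantic similarity involves vector space models, which you may know from information retrieval. For an overview of these latter techniques, study chapter 8.5 of Foundations of Statistical Natural Language Processing by Manning and Schütze.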
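To make the vector space idea a little more concrete, here is a minimal sketch in Java, assuming the simplest possible representation: each sentence becomes a vector of raw word counts, and the score is the cosine of the angle between the two vectors. The class and method names (BagOfWordsSimilarity, termCounts, cosine) are made up for illustration and do not come from any tool mentioned in this thread. A real system would probably add stop-word removal, stemming, and some term weighting.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class for illustration only; not part of any existing library.
class BagOfWordsSimilarity {

    // Lowercase the sentence, split on anything that is not a letter, count each word.
    static Map<String, Integer> termCounts(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a word-count vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine similarity of the two word-count vectors; 0.0 if either sentence has no words.
    static double cosine(String a, String b) {
        Map<String, Integer> va = termCounts(a);
        Map<String, Integer> vb = termCounts(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : va.entrySet()) {
            dot += entry.getValue() * vb.getOrDefault(entry.getKey(), 0);
        }
        double normA = norm(va);
        double normB = norm(vb);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (normA * normB);
    }

    public static void main(String[] args) {
        String s1 = "Pete and Rob have found a dog near the station.";
        String s2 = "Pete and Rob have never found a dog near the station.";
        String s3 = "Pete and Rob both like programming a lot.";
        System.out.println("1 vs 2: " + cosine(s1, s2));
        System.out.println("1 vs 3: " + cosine(s1, s3));
    }
}
```

On the example sentences above, this sketch scores sentence 2 as much closer to sentence 1 than sentence 3 is, even though 2 negates 1. That is exactly the kind of limitation to expect from purely word-based measures.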

I hope this helps you get started for now.

I have developed a simple open-source tool that does the semantic comparison according to categories: https://sourceforge.net/projects/semantics/files/

It works with sentences of any length, is simple, stable, fast, small in size... Here is a sample output:
Similarity between the sentences
-Pete and Rob have found a dog near the station.
-Pete and Rob have never found a dog near the station.
is: 1.0000000000

Similarity between the sentences
-Patricia found a dog near the station.
-It was a dog who found Pete and Rob under the snow.
is: 0.7363210405107239

Similarity between the sentences
-Patricia found a dog near the station.
-I am fine, thanks!
is: 0.0

Similarity between the sentences
-Hello there, how are you?
-I am fine, thanks!
is: 0.29160592175990213

USAGE:

import semantics.Compare;

public class USAGE {

    public static void main(String[] args) {

        String a = "This is a first sentence.";
        String b = "This is a second one.";

        // Compare the two sentences and print the resulting score
        Compare c = new Compare(a, b);
        System.out.println("Similarity between the sentences\n-" + a + "\n-" + b + "\n is: " + c.getResult());

    }

}

You can try the UMBC Semantic Similarity Service, which is based on the WordNet KB. There is also a UMBC STS (Semantic Textual Similarity) Service. Here is the link: http://swoogle.umbc.edu/StsService/sts.html
