Need a proper data structure or index for fast user lookup based on 3D points and importance factors

I have multiple 3D points, each with an importance factor.

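Every user has six points. For example, Charlie has these six points: (22, 44, 55) is his first point, with an importance factor of 3; (10, 0, 0) is his second vector, with an importance factor of 2.8; and so on down to his sixth point, (100, 300, 200), with an importance factor of 0.4.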

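What I want to do is find the person most similar to Charlie without going through every other user. Basically, that means minimizing this function for each user (i.e., matching each of the user's six points with Charlie's corresponding point):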

pythagoras(point, point2) * max(importance_factor, importance_factor2) * (abs(importance_factor - importance_factor2) + 1)

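Then I find the user most similar to Charlie by picking the one with the lowest cost. I wrote the code the dumb way (with lots of loops), but I am looking for a way to properly handle the fact that there are multiple points and importance factors.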
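For reference, the brute-force approach described above could be sketched as follows. This is only an illustration: the question does not say how the six per-point costs are combined, so summing them is an assumption, and the names `pair_cost`, `user_cost`, `most_similar`, and the sample data are hypothetical.

```python
import math

def pair_cost(point, factor, point2, factor2):
    """Cost of matching one weighted 3D point against another (formula from the question)."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(point, point2)))
    return distance * max(factor, factor2) * (abs(factor - factor2) + 1)

def user_cost(user, other):
    """Total cost over corresponding points. Summing is an assumption."""
    return sum(pair_cost(p, f, p2, f2) for (p, f), (p2, f2) in zip(user, other))

def most_similar(target_name, users):
    """Linear scan: return the name of the user with the lowest total cost."""
    target = users[target_name]
    candidates = (name for name in users if name != target_name)
    return min(candidates, key=lambda name: user_cost(target, users[name]))

# Hypothetical data, shortened to two points per user for readability.
users = {
    "charlie": [((22, 44, 55), 3.0), ((10, 0, 0), 2.8)],
    "alice": [((20, 40, 50), 3.0), ((12, 1, 0), 2.5)],
    "bob": [((300, 10, 5), 0.5), ((90, 90, 90), 1.0)],
}
print(most_similar("charlie", users))
```

This scan is linear in the number of users, which is exactly the work an index would need to avoid.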

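I started looking into spatial indexes, but I don't think they will work because I have multiple points per user. Maybe I could unfold the points into a single higher-dimensional point? So instead of six points in 3 dimensions, I could have one point in 18 dimensions? That still would not handle the importance factor, but it would be better than nothing.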
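A minimal sketch of the unfolding idea, using only the standard library (`math.dist` needs Python 3.8 or later). The helper `flatten` and the sample users are hypothetical. It also shows one caveat: the Euclidean distance between the 18-dimensional vectors is the square root of the summed squared per-point distances, which is not the same as the sum of the per-point distances, so a spatial index over the flattened vectors would only approximate the ranking even before importance factors are considered.

```python
import math

def flatten(points):
    """Concatenate six (x, y, z) points into one 18-dimensional vector."""
    return [coord for point in points for coord in point]

# Hypothetical users with six 3D points each (importance factors left out).
charlie = [(22, 44, 55), (10, 0, 0), (5, 5, 5), (1, 2, 3), (7, 8, 9), (100, 300, 200)]
other = [(20, 40, 50), (12, 1, 0), (5, 6, 5), (1, 2, 4), (7, 8, 9), (90, 310, 200)]

flat_distance = math.dist(flatten(charlie), flatten(other))
per_point = [math.dist(p, q) for p, q in zip(charlie, other)]

# The squared 18-D distance equals the sum of the squared per-point distances.
print(len(flatten(charlie)))
print(math.isclose(flat_distance ** 2, sum(d ** 2 for d in per_point)))
# It never exceeds the plain sum of the per-point distances.
print(flat_distance <= sum(per_point))
```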

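Unfortunately, I can't use vectors and cosine similarity here, because (1, 1, 1) and (400, 400, 400) point in the same direction but are very much opposite things in this problem.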
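The objection to cosine similarity can be checked with a few lines of standard-library Python. The helper `cosine_similarity` is written here for illustration only; the two points are the ones from the question.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; it ignores their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a = (1, 1, 1)
b = (400, 400, 400)

# Same direction, so the cosine similarity is (up to rounding) 1.0 ...
print(cosine_similarity(a, b))
# ... even though the points are far apart in Euclidean terms.
print(math.dist(a, b))
```

Cosine similarity treats the two points as identical, while the Euclidean distance that the cost function relies on is large.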

Any ideas?

Best answer

Since you haven't gotten any answers yet, I thought I would at least contribute some thoughts. I have used a Python k-d tree module for quickly searching nearest-neighbor points:
http://code.google.com/p/python-kdtree/downloads/detail?name=kdtree.py
It takes points of arbitrary length, as long as they all have the same number of dimensions.

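I'm not sure how you would apply the "importance" weights, but here is a quick brainstorm on how to use the kdtree module to at least get the closest "people" to each point of a given person: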

import numpy
from kdtree import KDTree
from itertools import chain

class PersonPoint(object):

    def __init__(self, person, point, factor):
        self.person = person 
        self.point = point 
        self.factor = factor 

    def __repr__(self):
        return "<%s: %s, %0.2f>" % (self.person,
            ["%0.2f" % p for p in self.point], self.factor)

    def __iter__(self):
        # Return a real iterator; returning the list itself raises TypeError.
        return iter(self.point)

    def __len__(self):
        return len(self.point)

    def __getitem__(self, i):
        return self.point[i]


people = {}
for name in ("bill", "john", "mary", "jenny", "phil", "george"):
    factors = numpy.random.rand(6)
    points = numpy.random.rand(6, 3).tolist()
    people[name] = [PersonPoint(name, p, f) for p,f in zip(points, factors)]

bill_points = people["bill"]
others = list(chain(*[people[name] for name in people if name != "bill"]))

tree = KDTree.construct_from_data(others)

for point in bill_points:
    # t=1 means only return the 1 closest.
    # You could set it higher to return more.
    print("%s => %s" % (point, tree.query(point, t=1)[0]))

Result:

<bill: ['0.22', '0.64', '0.14'], 0.07> => 
    <phil: ['0.23', '0.54', '0.11'], 0.90>

<bill: ['0.31', '0.87', '0.16'], 0.88> => 
    <phil: ['0.36', '0.80', '0.14'], 0.40>

<bill: ['0.34', '0.64', '0.25'], 0.65> => 
    <jenny: ['0.29', '0.77', '0.28'], 0.40>

<bill: ['0.24', '0.90', '0.23'], 0.53> => 
    <jenny: ['0.29', '0.77', '0.28'], 0.40>

<bill: ['0.50', '0.69', '0.06'], 0.68> => 
    <phil: ['0.36', '0.80', '0.14'], 0.40>

<bill: ['0.13', '0.67', '0.93'], 0.54> => 
    <jenny: ['0.05', '0.62', '0.94'], 0.84>

I figured that with the result, you could look at the most frequently matched "person" and then consider the weights. Or maybe you could total up the importance factors in the results and take the highest-rated one. That way, if mary only matched once but had a factor of 10, and phil matched 3 times but only totaled 5, mary might be more relevant?
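Both aggregation ideas can be sketched in a few lines. The `matches` list below is hypothetical data shaped like the mary/phil example (one match with factor 10 versus three matches totaling 5); it is not output from the k-d tree code.

```python
from collections import Counter, defaultdict

# Hypothetical nearest-neighbour matches as (person, importance factor) pairs.
matches = [("phil", 2.0), ("phil", 1.5), ("mary", 10.0), ("phil", 1.5)]

# Option 1: the person who shows up most often among the matches.
match_counts = Counter(person for person, _ in matches)
most_frequent = match_counts.most_common(1)[0][0]

# Option 2: the person whose matched importance factors add up highest.
factor_totals = defaultdict(float)
for person, factor in matches:
    factor_totals[person] += factor
highest_total = max(factor_totals, key=factor_totals.get)

print(most_frequent, highest_total)
```

The two rules disagree on this data: phil wins on match count, while mary wins on total factor, so the choice between them depends on how much the importance factors should dominate.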

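I know you have a more robust cost function, but building the index for it would still require going through every point in your collection.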
