Question

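I want to find the unique values in an array. Floating-point precision makes this a problem, of course, so I would like to be able to specify a delta value to use for comparisons when working out which elements are unique.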

Is there a way to do this? At the moment I am simply doing:

unique(array)

which gives me something like:

array([       -Inf,  0.62962963,  0.62962963,  0.62962963,  0.62962963,
    0.62962963])

where the values that look the same (to the number of decimal places displayed) are evidently slightly different.
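
For readers who want to reproduce the effect, here is a minimal sketch with made-up values (not the asker's data). It only illustrates that np.unique compares floats exactly, so values that print identically can still count as distinct.

```python
import numpy as np

# Made-up data: 0.1 + 0.2 is not exactly equal to 0.3 in binary floating point.
a = np.array([0.1 + 0.2, 0.3, 0.3])

u = np.unique(a)
print(u)            # two entries that look alike at the default print precision
print(u[1] - u[0])  # a tiny but non-zero difference
```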

Answer 1

Don't `floor` and `round` both fail the OP's requirement in some cases?

np.floor([5.99999999, 6.0]) # array([ 5.,  6.])
np.round([6.50000001, 6.5], 0) #array([ 7.,  6.])

The way I would do it (which may not be optimal, and is surely slower than other answers) is something like this:

import numpy as np
TOL = 1.0e-3
a = np.random.random((10,10))
i = np.argsort(a.flat)
d = np.append(True, np.diff(a.flat[i]))
result = a.flat[i[d>TOL]]

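Of course, this method drops every member of a run of values that lie within the tolerance of a neighbour, except the first (smallest) one. So if all the values are chained closely together, you may get back little more than the minimum, even though max - min is larger than the tolerance.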

Here is essentially the same algorithm, but easier to understand, and it should be faster because it avoids an indexing step:

a = np.random.random((10,))
b = a.copy()
b.sort()
d = np.append(True, np.diff(b))
result = b[d>TOL]
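
To make the chaining caveat concrete, here is a small sketch on made-up data (the array `b` below is an assumption for illustration, not part of the original answer). Consecutive values sit closer together than TOL, so only the first survives even though the whole range is several times wider than TOL.

```python
import numpy as np

TOL = 1.0e-3

# Hypothetical data: neighbours are 0.0005 apart (below TOL),
# yet the whole run spans 0.0045 (well above TOL).
b = np.arange(10) * 0.0005

d = np.append(True, np.diff(b))  # True becomes 1.0, so b[0] is always kept
result = b[d > TOL]

print(result)               # only the first value of the chain is kept
print(b[-1] - b[0] > TOL)   # the range is nonetheless wider than TOL
```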

The OP may also want to consider `scipy.cluster` (for a fancier version of this method) or `numpy.digitize` (for a fancier version of the other two methods).
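
As a rough idea of what the `numpy.digitize` route could look like, here is a sketch on made-up sample data. The bin layout (edges spaced TOL apart, starting at the minimum) and the choice of the first element in each bin as its representative are assumptions, not something the answer prescribes.

```python
import numpy as np

TOL = 0.01
a = np.array([0.951, 0.952, 0.963, 0.9701, 0.9702, 0.984])  # made-up sample

# Bin edges spaced TOL apart, covering the data range.
edges = np.arange(a.min(), a.max() + TOL, TOL)
bin_index = np.digitize(a, edges)  # bin number for every element

# Keep the first element encountered in each occupied bin.
_, first = np.unique(bin_index, return_index=True)
representatives = a[np.sort(first)]
print(representatives)
```

Note that values closer than TOL can still fall on opposite sides of a bin edge, so this is binning rather than true tolerance-based clustering.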

Answer 2

Another possibility is to just round to the nearest desirable tolerance:

np.unique(a.round(decimals=4))
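
One thing to keep in mind, in line with the floor/round examples in Answer 1: rounding groups values by the grid cell they fall in, not by their distance from each other. The sketch below uses made-up numbers to show both sides of that.

```python
import numpy as np

# Made-up values: the first pair differs by only 2e-8 but straddles a
# rounding boundary at 4 decimals; the second pair differs by 3e-5 yet
# rounds to the same number.
a = np.array([0.12344999, 0.12345001, 0.20001, 0.20004])

u = np.unique(a.round(decimals=4))
print(u)
```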


Edit: Just want to note that by my timings, my solution and @unutbu's are nearly identical speed-wise (mine is maybe 5% faster), so either is a good solution.

Edit #2: This is meant to address Paul's concern. It is definitely slower, and there may be some optimizations one can make, but I am posting it as-is to demonstrate the strategy:

import numpy as np

def eclose(a, b, rtol=1e-05, atol=1e-08):
    # Element-wise closeness test, same formula as np.isclose
    return np.abs(a - b) <= (atol + rtol * np.abs(b))

x = np.array([6.4, 6.500000001, 6.5, 6.51])
y = x.flat.copy()
y.sort()
ci = 0

U = np.empty((0,), dtype=y.dtype)

while ci < y.size:
    ii = eclose(y[ci], y)        # mask of values close to the current one
    mi = np.max(ii.nonzero())    # last index in this run of close values
    U = np.concatenate((U, [y[mi]]))
    ci = mi + 1

print(U)

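This should work decently fast if there are many repeated values within the precision range, but if many of the values are unique it is going to be slow. Also, it may be better to set `U` up as a list and append to it in the while loop, but that falls under "further optimization".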
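
A sketch of that list-based variant might look as follows. The function name `unique_close` and its signature are hypothetical, introduced only for illustration; the logic is the same loop as above, collecting results in a Python list instead of concatenating arrays.

```python
import numpy as np

def eclose(a, b, rtol=1e-05, atol=1e-08):
    # Same closeness test as in the answer above.
    return np.abs(a - b) <= (atol + rtol * np.abs(b))

def unique_close(x, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: keep the last value of each run of close values."""
    y = np.sort(np.asarray(x, dtype=float).ravel())
    kept = []  # a plain list avoids re-allocating an array on every step
    ci = 0
    while ci < y.size:
        close = eclose(y[ci], y, rtol=rtol, atol=atol)
        mi = np.flatnonzero(close).max()  # y[ci] is always close to itself
        kept.append(y[mi])
        ci = mi + 1
    return np.array(kept)

result = unique_close([6.4, 6.500000001, 6.5, 6.51])
print(result)
```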

Answer 3

I have just noticed that the accepted answer doesn't work. Take this case, for example:

a = 1-np.random.random(20)*0.05
# a now holds 20 uniformly chosen values between 0.95 and 1.0
np.sort(a)
>>>> array([ 0.9514548 ,  0.95172218,  0.95454535,  0.95482343,  0.95599525,
             0.95997008,  0.96385762,  0.96679186,  0.96873524,  0.97016127,
             0.97377579,  0.98407259,  0.98490461,  0.98964753,  0.9896733 ,
             0.99199411,  0.99261766,  0.99317258,  0.99420183,  0.99730928])
TOL = 0.01

This results in:

a.flat[i[d>TOL]]
>>>> array([], dtype=float64)

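That is simply because no two neighbouring values of the sorted input array are spaced at least "TOL" apart, whereas the correct result should be: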

>>>> array([ 0.9514548,  0.96385762,  0.97016127,  0.98407259,
             0.99199411])

(although it depends on how you decide which value to take within "TOL")

You should use the fact that integers do not suffer from this machine-precision effect:

np.unique(np.floor(a/TOL).astype(int))*TOL
>>>> array([ 0.95,  0.96,  0.97,  0.98,  0.99])

This runs about 5 times faster than the proposed solution (according to %timeit).

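Note that `.astype(int)` is optional, although removing it degrades performance by a factor of 1.5, because extracting unique values from an array of ints is much faster.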

You might want to add half of "TOL" to the unique results to compensate for the flooring effect:

(np.unique(np.floor(a/TOL).astype(int))+0.5)*TOL
>>>> array([ 0.955,  0.965,  0.975,  0.985,  0.995])

Answer 4

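In the current version of NumPy (1.23), numpy.unique has an optional parameter return_index that returns the indices of the first occurrence of each unique value. So you can simply call numpy.unique with return_index=True on a rounded array and use the result to index the original array, which gives you the original, non-rounded values. Like this: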

decimals = 3
# X is the original array; index it with the first-occurrence indices
X_unique_with_tolerance = X[np.unique(X.round(decimals), return_index=True)[1]]
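
Here is a self-contained sketch of the same idea; the contents of `X` are made up for illustration, and the intermediate name `first_index` is not from the original answer.

```python
import numpy as np

decimals = 3
# Made-up data: three near-duplicates of 0.6296... and two of 1.5.
X = np.array([0.62962963, 0.62962961, 0.62962965, 1.5, 1.50000002])

# Unique on the rounded copy, but keep indices into the original array.
_, first_index = np.unique(X.round(decimals), return_index=True)
X_unique_with_tolerance = X[first_index]
print(X_unique_with_tolerance)
```

The result holds the original values, not the rounded ones, with one representative per rounded value.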

Answer 5

How about something like

np.unique(np.floor(1e7*x)/1e7)

where `x` is your original array.

Answer 6

I have also added support for this to npx (https://github.com/nschloe/npx), a small NumPy extension package of mine.

import npx

a = [0.1, 0.15, 0.7]
a_unique = npx.unique(a, tol=2.0e-1)

assert all(a_unique == [0.1, 0.7])
