How to find the distance between 2 points in 2 different dataframes in pandas?

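I have two dataframes, each with a set of coordinates. Dataframe 1 is a list of biomass sites, with coordinates in the columns lat and lng. Dataframe 2 is a list of postcode coordinates, linked to sale prices, with coordinates in the columns pc_lat and pc_lng.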

I used this Stack Overflow question (https://stackoverflow.com/questions/36843122/how-to-find-the-closest-match-based-on-2-keys-from-one-dataframe-to-another) to work out the nearest biomass site for each property. This is the code I used:

import numpy as np

def dist(lat1, long1, lat2, long2):
    return np.abs((lat1-lat2)+(long1-long2))

def find_site(lat, long):
    distances = biomass.apply(
        lambda row: dist(lat, long, row['lat'], row['lng']),
        axis=1)
    return biomass.loc[distances.idxmin(), 'Site Name']

hp1995['BiomassSite'] = hp1995.apply(
    lambda row: find_site(row['pc_lat'], row['pc_long']),
    axis=1)

print(hp1995.head())

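This works well, in that I get the name of the nearest biomass site, but I also want to know the distance between the two locations.

  1. How do I calculate the distance?

  2. What unit would the output distance be in? I am trying to find properties within 2 km of a biomass site.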

Answer

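To calculate the distance between two global coordinates you should use the Haversine formula. Working from a page describing the formula, I implemented the following method: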

import math
def distanceBetweenCm(lat1, lon1, lat2, lon2):
    dLat = math.radians(lat2-lat1)
    dLon = math.radians(lon2-lon1)

    lat1 = math.radians(lat1)
    lat2 = math.radians(lat2)

    a = math.sin(dLat/2) * math.sin(dLat/2) + math.sin(dLon/2) * math.sin(dLon/2) * math.cos(lat1) * math.cos(lat2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
    return c * 6371 * 100000 #multiply by 100k to get distance in cm

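You can also modify it to return different units by multiplying by a different power of 10. In the example, multiplying by 100k gives the result in centimetres. Without the multiplication, the method returns the distance in kilometres. From there you can perform further unit conversions as needed.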
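Applied to the question's dataframes, the same formula can return the distance along with the site name. The following is a minimal sketch rather than a tuned implementation: the helper names `haversine_km` and `nearest_site`, the column `BiomassDistKm`, and the sample rows are invented for illustration, while the column names `lat`, `lng`, `pc_lat`, `pc_long` and `Site Name` are taken from the question's code.

```python
import math

import pandas as pd


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    d_lat = math.radians(lat2 - lat1)
    d_lon = math.radians(lon2 - lon1)
    lat1 = math.radians(lat1)
    lat2 = math.radians(lat2)
    a = (math.sin(d_lat / 2) ** 2
         + math.sin(d_lon / 2) ** 2 * math.cos(lat1) * math.cos(lat2))
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return c * 6371


def nearest_site(lat, lon, sites):
    """Return the name of, and distance in km to, the row of `sites` closest to (lat, lon)."""
    distances = sites.apply(
        lambda row: haversine_km(lat, lon, row["lat"], row["lng"]), axis=1)
    idx = distances.idxmin()
    return pd.Series({"BiomassSite": sites.loc[idx, "Site Name"],
                      "BiomassDistKm": distances.loc[idx]})


# Made-up sample data standing in for the question's dataframes.
biomass = pd.DataFrame({
    "Site Name": ["Site A", "Site B"],
    "lat": [51.50, 53.48],
    "lng": [-0.12, -2.24],
})
hp1995 = pd.DataFrame({
    "pc_lat": [51.51, 53.40],
    "pc_long": [-0.13, -2.98],
})

# One row of (site name, distance) per property, joined back onto the properties.
nearest = hp1995.apply(
    lambda row: nearest_site(row["pc_lat"], row["pc_long"], biomass), axis=1)
hp1995 = hp1995.join(nearest)

# Properties within 2 km of their nearest biomass site.
within_2km = hp1995[hp1995["BiomassDistKm"] <= 2]
print(within_2km)
```

Because the function returns kilometres, the 2 km requirement becomes a plain comparison on the new column. For large dataframes the nested `apply` calls will be slow, so a vectorised or spatial-index approach may be preferable.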

Edit: as suggested in the comments, one possible optimization is to use the power operator instead of regular multiplication, for example:

a = math.sin(dLat/2)**2 + math.sin(dLon/2)**2 * math.cos(lat1) * math.cos(lat2)

See this question to read more about the speed of the different ways of computing powers in Python.
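One way to check this on your own machine is the standard-library `timeit` module. The sketch below is illustrative only: the sample values and the names `a_mul` and `a_pow` are made up, and timings differ between Python versions and machines, so no particular outcome is assumed.

```python
import math
import timeit

# Arbitrary sample inputs, already converted to radians.
d_lat, d_lon = math.radians(0.5), math.radians(0.7)
lat1, lat2 = math.radians(51.5), math.radians(52.0)


def a_mul():
    # Repeated multiplication, as in the original method.
    return (math.sin(d_lat / 2) * math.sin(d_lat / 2)
            + math.sin(d_lon / 2) * math.sin(d_lon / 2)
            * math.cos(lat1) * math.cos(lat2))


def a_pow():
    # Power operator, as in the edited line.
    return (math.sin(d_lat / 2) ** 2
            + math.sin(d_lon / 2) ** 2 * math.cos(lat1) * math.cos(lat2))


# Both forms compute the same quantity, up to floating-point rounding.
same = math.isclose(a_mul(), a_pow(), rel_tol=1e-12)
print("same result:", same)

# Total seconds for 100,000 calls of each variant.
print("multiplication:", timeit.timeit(a_mul, number=100_000))
print("power operator:", timeit.timeit(a_pow, number=100_000))
```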


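EDIT: A few years later, I needed to calculate the haversine distance between lat, lon points again. This answer was still useful to me, because it computes the correct distance* without requiring an external library.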

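However, if we get into the fine detail, the algorithm I provided hard-codes the earth's radius as 6371 km and does not take into account that the radius is not uniform (spoiler alert: it is smaller closer to the poles and larger closer to the equator).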

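In most cases we can probably live with that, because it introduces only a small approximation error, whose size depends on how far the "real" radius at your input points is from 6371 km. (I took some measurements against a more accurate method/source: the worst-case error was under 2 metres, and in most cases it was sub-metre.)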

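An alternative is to use a library that takes a geodetic model into account to calculate the actual radius. One such library I found is geopy, specifically the geopy.distance.geodesic() method. The trade-off is that this introduces an external dependency.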




