How do I set values in a pandas dataframe to the minimum of a group if a condition is met

For example:

import pandas as pd

lst = [["PF2", "E1", -500, -127, 199971, 200164, True, True],
       ["PR2", "E1", -500, -167, 199655, 200124, True, True],
       ["PF2", "E1", -500, -167, 199645, 200124, False, True],
       ["PF2", "E1", -400, -127, 199971, 200564, True, True],
       ["PR2", "E1", -400, -167, 199155, 200324, True, True]]

df = pd.DataFrame(lst, columns=["Name", "Part", "Rel_s",
                                "Rel_e", "Abs_s", "Abs_e",
                                "Quality_Start", "Quality_End"])

I want to modify this dataframe to change the values of Abs_s to the smallest for each combination of Part and Rel_s (and the same thing for Abs_e with Part and Rel_e with the largest value). This part works well with this code:

df["Abs_s"] = df.groupby(["Part", "Rel_s"])["Abs_s"].transform("min")

df["Abs_e"] = df.groupby(["Part", "Rel_e"])["Abs_e"].transform("max")

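I like this solution because it seems simple and easy to understand; however, I also want to take the quality values into account, so that only the minimum Abs_s among rows where Quality_Start is True and the maximum Abs_e among rows where Quality_End is True are used. So here, for E1, -500, the correct Abs_s would be 199655. If there is no good-quality (True) value to replace it with, the bad-quality (False) value should be kept.

Can I add these conditions? How would I do this transform?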

Answers

You can first replace the values whose Quality_Start or Quality_End flag is not True with missing values, here with Series.where:

# Mask low-quality values as missing, then group once
g = (df.assign(Abs_s=df["Abs_s"].where(df["Quality_Start"]),
               Abs_e=df["Abs_e"].where(df["Quality_End"]))
       .groupby(["Part", "Rel_s"]))

df["Abs_s"] = g["Abs_s"].transform("min")
df["Abs_e"] = g["Abs_e"].transform("max")

print(df)
  Name Part  Rel_s  Rel_e     Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127  199655.0  200164           True         True
1  PR2   E1   -500   -167  199655.0  200164           True         True
2  PF2   E1   -500   -167  199655.0  200164          False         True
3  PF2   E1   -400   -127  199155.0  200564           True         True
4  PR2   E1   -400   -167  199155.0  200564           True         True

How it works (the masked frame, shown here for the original values of df):

print(df.assign(Abs_s=df["Abs_s"].where(df["Quality_Start"]),
                Abs_e=df["Abs_e"].where(df["Quality_End"])))

  Name Part  Rel_s  Rel_e     Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127  199971.0  200164           True         True
1  PR2   E1   -500   -167  199655.0  200124           True         True
2  PF2   E1   -500   -167       NaN  200124          False         True
3  PF2   E1   -400   -127  199971.0  200564           True         True
4  PR2   E1   -400   -167  199155.0  200324           True         True
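The masking step can be looked at in isolation. The following is a minimal sketch, not part of the original answer; the two small Series are made-up stand-ins for the Abs_s and Quality_Start columns of the E1, -500 group. It shows that Series.where keeps values where the condition is True, turns the rest into NaN (which upcasts the integers to float), and that min then skips the NaN.

```python
import pandas as pd

# Stand-ins for Abs_s and Quality_Start of one group (illustrative values)
abs_s = pd.Series([199971, 199655, 199645])
quality_start = pd.Series([True, True, False])

# Keep values where the flag is True; everything else becomes NaN
masked = abs_s.where(quality_start)

print(masked.dtype)   # float64, because NaN cannot live in an int64 column
print(masked.min())   # 199655.0, the NaN in the last row is skipped
```

This is why the smallest value overall (199645) is ignored: it belongs to a row whose Quality_Start is False.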

If you need integer values, convert the columns that contain missing values to the nullable Int64 dtype:

g = (df.assign(Abs_s=df["Abs_s"].where(df["Quality_Start"]).astype("Int64"),
               Abs_e=df["Abs_e"].where(df["Quality_End"]).astype("Int64"))
       .groupby(["Part", "Rel_s"]))

df["Abs_s"] = g["Abs_s"].transform("min")
df["Abs_e"] = g["Abs_e"].transform("max")

print(df)
  Name Part  Rel_s  Rel_e   Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127  199655  200164           True         True
1  PR2   E1   -500   -167  199655  200164           True         True
2  PF2   E1   -500   -167  199655  200164          False         True
3  PF2   E1   -400   -127  199155  200564           True         True
4  PR2   E1   -400   -167  199155  200564           True         True

How it works:

print(df.assign(Abs_s=df["Abs_s"].where(df["Quality_Start"]).astype("Int64"),
                Abs_e=df["Abs_e"].where(df["Quality_End"]).astype("Int64")))

  Name Part  Rel_s  Rel_e   Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127  199655  200164           True         True
1  PR2   E1   -500   -167  199655  200164           True         True
2  PF2   E1   -500   -167    <NA>  200164          False         True
3  PF2   E1   -400   -127  199155  200564           True         True
4  PR2   E1   -400   -167  199155  200564           True         True

Test with a group that has no match (no True value in Quality_Start):

lst = [["PF2", "E1", -500, -127, 199971, 200164, False, True],
       ["PR2", "E1", -500, -167, 199655, 200124, False, True],
       ["PF2", "E1", -500, -167, 199645, 200124, False, True],
       ["PF2", "E1", -400, -127, 199971, 200564, True, True],
       ["PR2", "E1", -400, -167, 199155, 200324, True, True]]

df = pd.DataFrame(lst, columns=["Name", "Part", "Rel_s",
                                "Rel_e", "Abs_s", "Abs_e",
                                "Quality_Start", "Quality_End"])

g = (df.assign(Abs_s=df["Abs_s"].where(df["Quality_Start"]).astype("Int64"),
               Abs_e=df["Abs_e"].where(df["Quality_End"]).astype("Int64"))
       .groupby(["Part", "Rel_s"]))

df["Abs_s"] = g["Abs_s"].transform("min")
df["Abs_e"] = g["Abs_e"].transform("max")

print(df)
  Name Part  Rel_s  Rel_e   Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127    <NA>  200164          False         True
1  PR2   E1   -500   -167    <NA>  200164          False         True
2  PF2   E1   -500   -167    <NA>  200164          False         True
3  PF2   E1   -400   -127  199155  200564           True         True
4  PR2   E1   -400   -167  199155  200564           True         True

print(df.assign(Abs_s=df["Abs_s"].where(df["Quality_Start"]).astype("Int64"),
                Abs_e=df["Abs_e"].where(df["Quality_End"]).astype("Int64")))
  Name Part  Rel_s  Rel_e   Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127    <NA>  200164          False         True
1  PR2   E1   -500   -167    <NA>  200164          False         True
2  PF2   E1   -500   -167    <NA>  200164          False         True
3  PF2   E1   -400   -127  199155  200564           True         True
4  PR2   E1   -400   -167  199155  200564           True         True
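The all-missing case above can be reproduced on a smaller frame. This is a minimal sketch with made-up rows (only the column names Rel_s, Abs_s and Quality_Start come from the question), showing that a group whose values are all masked yields a missing minimum, which is why a fallback step is needed afterwards.

```python
import pandas as pd

# Small illustrative frame: the -500 group has no good-quality row
df = pd.DataFrame({
    "Rel_s": [-500, -500, -400, -400],
    "Abs_s": [199971, 199655, 199971, 199155],
    "Quality_Start": [False, False, True, True],
})

# Mask bad-quality values and keep an integer dtype that can hold <NA>
masked = df["Abs_s"].where(df["Quality_Start"]).astype("Int64")

# Per-group minimum of the masked values, broadcast back to every row
group_min = masked.groupby(df["Rel_s"]).transform("min")

print(group_min)  # rows of the -500 group are <NA>, rows of -400 get 199155
```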

EDIT:

# Assumes df has been rebuilt from lst, so Abs_s / Abs_e hold the original values
# One groupby on a frame with two temporary masked columns
g = (df.assign(Abs_s_t=df["Abs_s"].where(df["Quality_Start"]).astype("Int64"),
               Abs_e_t=df["Abs_e"].where(df["Quality_End"]).astype("Int64"))
       .groupby(["Part", "Rel_s"]))

# Option 1: where a group has no good-quality value, keep each row's original value
df["Abs_s"] = g["Abs_s_t"].transform("min").fillna(df["Abs_s"])
df["Abs_e"] = g["Abs_e_t"].transform("max").fillna(df["Abs_e"])

# Option 2: where a group has no good-quality value, use the unconditional group min/max
df["Abs_s1"] = g["Abs_s_t"].transform("min").fillna(g["Abs_s"].transform("min"))
df["Abs_e1"] = g["Abs_e_t"].transform("max").fillna(g["Abs_e"].transform("max"))

print(df)
  Name Part  Rel_s  Rel_e   Abs_s   Abs_e  Quality_Start  Quality_End  Abs_s1  Abs_e1
0  PF2   E1   -500   -127  199971  200164          False         True  199645  200164
1  PR2   E1   -500   -167  199655  200164          False         True  199645  200164
2  PF2   E1   -500   -167  199645  200164          False         True  199645  200164
3  PF2   E1   -400   -127  199155  200564           True         True  199155  200564
4  PR2   E1   -400   -167  199155  200564           True         True  199155  200564
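The snippets in this answer group both columns by ["Part", "Rel_s"], whereas the question groups Abs_e by ["Part", "Rel_e"]. If the question's grouping is what is wanted, one possible variant is sketched below. The helper name masked_group_extreme is hypothetical and not from the answer; it assumes the value column has no missing values to begin with, so the result can be cast back to the original dtype.

```python
import pandas as pd

lst = [["PF2", "E1", -500, -127, 199971, 200164, True, True],
       ["PR2", "E1", -500, -167, 199655, 200124, True, True],
       ["PF2", "E1", -500, -167, 199645, 200124, False, True],
       ["PF2", "E1", -400, -127, 199971, 200564, True, True],
       ["PR2", "E1", -400, -167, 199155, 200324, True, True]]
df = pd.DataFrame(lst, columns=["Name", "Part", "Rel_s", "Rel_e", "Abs_s",
                                "Abs_e", "Quality_Start", "Quality_End"])


def masked_group_extreme(frame, value_col, flag_col, keys, how):
    """Hypothetical helper: per-group min/max over good-quality rows only."""
    # Hide values whose quality flag is False
    masked = frame[value_col].where(frame[flag_col])
    # Group by the requested key columns and broadcast the result to every row
    result = masked.groupby([frame[k] for k in keys]).transform(how)
    # Groups without any good value fall back to each row's original value
    return result.fillna(frame[value_col]).astype(frame[value_col].dtype)


# Compute both columns from the original values before assigning either one
new_abs_s = masked_group_extreme(df, "Abs_s", "Quality_Start", ["Part", "Rel_s"], "min")
new_abs_e = masked_group_extreme(df, "Abs_e", "Quality_End", ["Part", "Rel_e"], "max")
df = df.assign(Abs_s=new_abs_s, Abs_e=new_abs_e)

print(df[["Rel_s", "Rel_e", "Abs_s", "Abs_e"]])
```

Using a separate groupby per column costs one extra grouping pass but lets each column use its own keys.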

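Filtering the rows before the groupby will do the trick: only the values where the condition is True are changed. All other values, where the quality flag is False, stay as they are.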

start = df["Quality_Start"]
end = df["Quality_End"]
# Group only the good-quality rows and write the result back to those same rows
df.loc[start, "Abs_s"] = df[start].groupby(["Part", "Rel_s"])["Abs_s"].transform("min")
df.loc[end, "Abs_e"] = df[end].groupby(["Part", "Rel_s"])["Abs_e"].transform("max")

End result:

  Name Part  Rel_s  Rel_e   Abs_s   Abs_e  Quality_Start  Quality_End
0  PF2   E1   -500   -127  199655  200164           True         True
1  PR2   E1   -500   -167  199655  200164           True         True
2  PF2   E1   -500   -167  199645  200164          False         True
3  PF2   E1   -400   -127  199155  200564           True         True
4  PR2   E1   -400   -167  199155  200564           True         True
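The assignment in this approach relies on index alignment: the transform result only carries the index labels of the filtered rows, and df.loc writes it back to exactly those labels. A minimal sketch on a single made-up group (the values are taken from the E1, -500 rows of the question) illustrates this:

```python
import pandas as pd

df = pd.DataFrame({
    "Rel_s": [-500, -500, -500],
    "Abs_s": [199971, 199655, 199645],
    "Quality_Start": [True, True, False],
})

start = df["Quality_Start"]

# The result keeps the index of the filtered rows only (labels 0 and 1)
group_min = df[start].groupby("Rel_s")["Abs_s"].transform("min")

# Only the rows selected by the mask are overwritten; row 2 keeps 199645
df.loc[start, "Abs_s"] = group_min

print(df["Abs_s"].tolist())
```

Unlike the masking approach, a bad-quality row here keeps its own value even when its group does contain good-quality values.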



