I have a data frame like this:
user1,product1,0
user1,product2,2
user1,product3,1
user1,product4,2
user2,product3,0
user2,product2,2
user3,product4,0
user3,product5,3
The data frame has millions of rows. I need to go through each row: if the value in the last column is 0, keep that product number as is; otherwise, append the product number to the most recent preceding product number whose value is 0. The result should be written to a new data frame.
For example, the resulting data frame should be:
user1,product1
user1,product1product2
user1,product1product3
user1,product1product4
user2,product3
user2,product3product2
user3,product4
user3,product4product5
I wrote a for loop to go through each row, and it works, but it is very slow. How can I speed it up? I tried to vectorize it, but I'm not sure how, because each row depends on the value in a previous row.
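
For reference, the rule described above can be written without a row loop by treating it as a forward fill. The sketch below uses Python with pandas, which is an assumption since the question does not name a library, and the column names `user`, `product`, and `value` are made up for illustration. It also assumes the fill should not cross from one user to the next, which matches the sample data but is not stated explicitly.

```python
import pandas as pd

# Sample data from the question; the column names are hypothetical.
df = pd.DataFrame(
    {
        "user": ["user1"] * 4 + ["user2"] * 2 + ["user3"] * 2,
        "product": [
            "product1", "product2", "product3", "product4",
            "product3", "product2", "product4", "product5",
        ],
        "value": [0, 2, 1, 2, 0, 2, 0, 3],
    }
)

# Rows whose value is 0 define the "base" product.
is_base = df["value"].eq(0)

# Keep the product only on base rows, then carry it forward to the rows
# that follow. Grouping by user is an assumption: it stops the fill from
# leaking across users.
base = df["product"].where(is_base).groupby(df["user"]).ffill()

# Non-base rows append their own product; base rows append nothing.
suffix = df["product"].where(~is_base, "")

# If a user's first row has a nonzero value, there is no base to attach
# to and the result for that row is missing (NaN).
result = pd.DataFrame({"user": df["user"], "product": base + suffix})
print(result)
```

The dependence on "the previous row with value 0" disappears once the base product is forward-filled, so every step is a whole-column operation. The same idea should carry over to other data frame libraries that offer a forward-fill operation.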