How to mitigate the overfitting problem with LSTM, or maybe I'm misunderstanding LSTM training?
  • Time: 2024-02-09 01:21:06
  • Tags:
  • python
  • lstm

My goal is to predict whether a basketball shot goes in from the trajectory of the ball (there are already papers that do the same; I'm just reproducing them: https://arxiv.org/abs/1608.03793). It's a very simple problem, but for another purpose I want to try prediction with an LSTM. However, the network was not only slow to converge but also started overfitting very quickly. Each time series has a length of 12 and a feature dimension of 4 (x, y, z coordinates and time).


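# Imports assume PaddlePaddle, which nn.Layer suggests.
import paddle.nn as nn
from paddle.nn import LSTM

class PolicyHead(nn.Layer):
    def __init__(self):
        super(PolicyHead, self).__init__()
        # 4 input features, 64 hidden units, 2 stacked layers
        self.lstm = LSTM(4,64,2,dropout=0.3)
        self.fc = nn.Linear(64,2)

    def forward(self, past_traj):
        b,n,c = past_traj.shape
        # The original post passed an undefined batch_id here; it is dropped.
        out, _ = self.lstm(past_traj)  # out: [b, n, 64]
        # Classify from the last time step (assumed intent).
        return self.fc(out[:, -1])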

[Figure: graph for the LSTM model]

In contrast, a very simple MLP structure predicts very well.

class PolicyHead(nn.Layer):
    def __init__(self):
        super(PolicyHead, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(12*4, 512),
            nn.ReLU(),
            nn.Linear(512,512),
            nn.ReLU(),
            nn.Linear(512,2)
        )

    def forward(self, past_traj):
        b,n,c = past_traj.shape
        return self.fc(past_traj.reshape([b,-1]))

[Figure: graph for the MLP model]

Answer

Your graph does suggest overfitting, especially if it keeps increasing as more epochs are added. I don't see where you set up the training optimizer, but you can also add L1/L2 regularization there (it penalizes large weights; it does not change the learning rate). This will help with overfitting, and I think it will be beneficial regardless of what you decide to do about the overfitting issue.
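As a rough, framework-independent sketch of what L2 regularization does: it adds a penalty of `(weight_decay / 2) * sum(w**2)` to the loss, so each update also shrinks the weights toward zero. The function name `sgd_step_l2` and the default values are made up for illustration; frameworks typically expose the same idea as a weight-decay option on the optimizer.

```python
def sgd_step_l2(weights, grads, lr=0.1, weight_decay=0.01):
    """One SGD step on loss + (weight_decay / 2) * sum(w**2).

    The penalty adds weight_decay * w to each gradient, which shrinks
    every weight toward zero. The learning rate itself is unchanged.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


# With a zero data gradient, only the penalty acts and the weights shrink.
weights = [1.0, -2.0]
for _ in range(3):
    weights = sgd_step_l2(weights, grads=[0.0, 0.0])
print(weights)
```

Each step here multiplies the weights by `1 - lr * weight_decay`, so larger `weight_decay` values regularize more strongly.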

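Now, to address the overfitting directly, you can add batch normalization, raise the dropout rate to a higher value (e.g. 0.5), and/or use recurrent dropout (you can combine them).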

Batch normalization is very beneficial if your data has high variance and your batch size is already large.
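A minimal sketch of the normalization step itself, in plain Python: each feature is rescaled using the mean and variance computed over the current batch, which is why the batch has to be large enough for those statistics to be reliable. This leaves out the learnable scale and shift and the running statistics that real batch-normalization layers keep; the name `batch_norm` and the sample numbers are illustrative only.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance over the batch."""
    n = len(batch)
    dims = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
        for row in batch
    ]


# Two features on very different scales end up on the same scale.
batch = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
normalized = batch_norm(batch)
print(normalized)
```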

Dropout and recurrent dropout only act during training, so there is no need to worry about them affecting the inference stage.
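To illustrate why inference is unaffected, here is a small sketch of inverted dropout in plain Python. The function name `dropout` and its `training` flag are assumptions made for this example; frameworks usually switch the same behavior through a train/eval mode on the model.

```python
import random


def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each value with probability p while training.

    Survivors are scaled by 1 / (1 - p) so the expected activation matches
    inference, where the input passes through unchanged. Requires p < 1.
    """
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(activations, p=0.5, training=True, rng=random.Random(0))
eval_out = dropout(activations, p=0.5, training=False)
print(train_out, eval_out)
```

In training mode every output is either zero or the input scaled up; in inference mode the output equals the input.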

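class PolicyHead(nn.Layer):
    def __init__(self):
        super(PolicyHead, self).__init__()
        # NOTE: recurrent_dropout is a Keras-style LSTM argument. paddle.nn.LSTM
        # and torch.nn.LSTM only accept dropout (applied between stacked layers),
        # so this call needs an LSTM implementation that supports it.
        self.lstm = LSTM(4,64,2,dropout=0.3, recurrent_dropout=0.3)
        self.fc = nn.Linear(64,2)

    def forward(self, past_traj):
        b,n,c = past_traj.shape
        # The undefined batch_id argument is dropped; the LSTM is assumed to
        # return (outputs, final_states).
        out, _ = self.lstm(past_traj)  # out: [b, n, 64]
        return self.fc(out[:, -1])     # classify from the last time step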

You can use these values together, but the trick is trial and error: use only recurrent dropout, try a few values, and check how the model behaves.
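The trial-and-error loop can be sketched as a small sweep. Everything here is hypothetical: `sweep` is a helper written for this example, and `fake_validation_loss` is a placeholder standing in for a real training run, so its numbers are not measurements.

```python
def sweep(candidates, evaluate):
    """Return (best_value, results), where results maps each candidate to its score.

    `evaluate` is a hypothetical callback that trains a model with the given
    recurrent-dropout rate and returns its validation loss (lower is better).
    """
    results = {rate: evaluate(rate) for rate in candidates}
    best = min(results, key=results.get)
    return best, results


def fake_validation_loss(rate):
    # Placeholder standing in for a real training run; NOT measured data.
    return (rate - 0.3) ** 2


best_rate, results = sweep([0.1, 0.2, 0.3, 0.4, 0.5], fake_validation_loss)
print(best_rate, results)
```

In practice `evaluate` would build the model with the given rate, train it, and return the loss on held-out data.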

Now, if you want to apply batch normalization, you will need to change a little more. In your case:

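class PolicyHead(nn.Layer):
    def __init__(self):
        super(PolicyHead, self).__init__()
        # NOTE: recurrent_dropout is a Keras-style argument that paddle.nn.LSTM
        # and torch.nn.LSTM do not accept; see which LSTM you are using.
        self.lstm = LSTM(4,64,2,dropout=0.3, recurrent_dropout=0.3) #Assuming this is nn.LSTM
        # Spelled nn.BatchNorm1D in PaddlePaddle (nn.Layer), nn.BatchNorm1d in PyTorch.
        self.batch_norm = nn.BatchNorm1D(64)
        self.fc = nn.Linear(64,2)

    def forward(self, past_traj):
        b,n,c = past_traj.shape
        # The undefined batch_id argument is dropped.
        lstm_out, _ = self.lstm(past_traj)  # lstm_out: [b, n, 64]
        # Keep only the last time step (assumed intent of the original slice).
        last_step = lstm_out[:, -1]         # [b, 64]
        return self.fc(self.batch_norm(last_step))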

Hope this helps




