Description
RecurrentModel inherits from RecurrentNeuralNetwork. In this model, the out_features argument is optional. However, in the __init__ method, the following call is made:
super().__init__(
    in_features,
    hidden_features=hidden_features[:-1],
    out_features=hidden_features[-1],
    batch_first=batch_first,
    return_cell_state=return_cell_state,
)

This implementation causes self.out_features to always be initialized with the last element of hidden_features, even when out_features is not explicitly provided by the user.
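To make the slicing concrete, here is a small illustration (the value [32, 2] is hypothetical, chosen only for demonstration):

hidden_features = [32, 2]  # hypothetical example value

# What the super().__init__ call above forwards to the parent class:
print(hidden_features[:-1])  # [32] -> the parent's hidden layers
print(hidden_features[-1])   # 2    -> always becomes self.out_features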
The problem
In the forward method, RecurrentModel checks whether self.out_features is set. If it is, the model applies a dense layer (head):
if self.out_features is not None:
    outputs = self.head(outputs)
...
return outputs

However, because the parent's __init__ always sets self.out_features, this check always passes. The model therefore applies a dense head the user never asked for, even though the branch is meant to be conditioned on a user-supplied out_features that may not exist.
One simple solution might be to change the argument name to "model_out_features", but that would disrupt naming consistency in deeplay.
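An alternative that keeps the argument name is to record whether the caller actually passed out_features and gate the head on that flag instead of on self.out_features. Below is a minimal, self-contained sketch of this pattern, not deeplay's actual code; the class TinyRecurrent and the attribute _head_requested are hypothetical names.

import torch
import torch.nn as nn

class TinyRecurrent(nn.Module):
    # Hypothetical sketch: remember whether the caller asked for a head,
    # independently of whatever value out_features later holds.
    def __init__(self, in_features, hidden_features, out_features=None):
        super().__init__()
        self._head_requested = out_features is not None  # record user intent
        self.rnn = nn.LSTM(in_features, hidden_features, batch_first=True)
        # The head is only created (and later applied) when it was requested.
        self.head = (
            nn.Linear(hidden_features, out_features)
            if self._head_requested
            else None
        )

    def forward(self, x):
        outputs, _ = self.rnn(x)
        if self._head_requested:  # gate on the flag, not on out_features
            outputs = self.head(outputs)
        return outputs

With this gating in place, the constructor could still forward hidden_features[-1] to the parent class without silently enabling the head.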
The following code snippet reproduces the issue:
import torch
import deeplay as dl

rnn = dl.RecurrentModel(
    in_features=16,
    hidden_features=[2],
    rnn_type="LSTM",
)
rnn = rnn.create()
x = torch.rand((32, 20, 16))
o = rnn(x)
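Here out_features is never specified, yet self.out_features is initialized to 2 (the last element of hidden_features), so the head branch in forward is taken; inspecting rnn.out_features after construction should make the unintended initialization visible.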