A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction
Implements the algorithm described in https://arxiv.org/pdf/1704.02971.pdf in PyTorch. The implementation also includes a Dataset class with additional utilities for handling messy data sets, as well as an implementation of Adam with weight decay and cosine annealing of the learning rate to speed up training.
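
For readers unfamiliar with the optimizer setup mentioned above, here is a minimal pure-Python sketch of the two ideas: a cosine-annealed learning rate and a single Adam update with weight decay. It is an illustration only, not the code in this repository. The function names `cosine_annealing_lr` and `adamw_step`, the default hyperparameters, and the choice of decoupled weight decay (decay applied directly to the weight rather than added to the gradient) are all assumptions made for the example.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Learning rate at `step` on one cosine cycle from lr_max down to lr_min.

    Hypothetical helper for illustration; not this repository's API.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the cycle's ends.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def adamw_step(w, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0):
    """One Adam update on a scalar weight, with decoupled weight decay (assumed).

    `t` is the 1-based step count. Returns the updated (w, m, v).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialised moving averages.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Adaptive gradient step plus a decay term proportional to the weight itself.
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v


if __name__ == "__main__":
    # Toy problem: minimise (w - 3)^2 while annealing the learning rate.
    w, m, v = 0.0, 0.0, 0.0
    total_steps = 200
    for t in range(1, total_steps + 1):
        lr = cosine_annealing_lr(t - 1, total_steps, lr_max=0.1)
        grad = 2.0 * (w - 3.0)
        w, m, v = adamw_step(w, grad, m, v, t, lr, weight_decay=0.01)
    print(f"final w = {w:.3f}")
```

In this sketch the learning rate starts at `lr_max`, follows half a cosine wave, and reaches `lr_min` at the final step, so early updates are large and late updates are small. PyTorch itself ships `torch.optim.AdamW` and `torch.optim.lr_scheduler.CosineAnnealingLR`, which cover the same ground; the classes in this repository may differ from both in details.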