Research project exploring short-horizon stock forecasting by combining time-series signals with textual and macroeconomic context.
The repository compares multiple modeling approaches across notebooks and focuses on whether adding richer contextual inputs, such as sentiment and market indicators, can improve prediction quality over purely price-based baselines.
- Forecast short-term stock movement and price behavior
- Compare classical and deep learning approaches
- Incorporate textual information and market context into the modeling pipeline
- Evaluate how different modeling choices affect forecast quality and stability
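
As a rough illustration of the last goal, here is a minimal sketch of scoring a model against a purely price-based baseline. The persistence baseline, the two metrics, and all numbers are assumptions chosen for illustration; the notebooks may use different baselines and metrics.

```python
from typing import List, Sequence


def naive_forecast(prices: Sequence[float]) -> List[float]:
    """Persistence baseline: predict that the next close equals the current close."""
    return list(prices[:-1])


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def directional_accuracy(
    prev: Sequence[float], actual: Sequence[float], predicted: Sequence[float]
) -> float:
    """Share of steps where the predicted move (up vs. not up) matches the actual move."""
    hits = 0
    for last, a, p in zip(prev, actual, predicted):
        hits += ((a - last) > 0) == ((p - last) > 0)
    return hits / len(actual)


# Made-up closing prices and hypothetical model outputs, for illustration only.
closes = [100.0, 101.0, 100.5, 102.0, 101.0]
prev, actual = closes[:-1], closes[1:]
baseline_pred = naive_forecast(closes)
model_pred = [100.8, 100.9, 101.5, 101.4]

baseline_mae = mean_absolute_error(actual, baseline_pred)
model_mae = mean_absolute_error(actual, model_pred)
model_da = directional_accuracy(prev, actual, model_pred)
print(f"baseline MAE={baseline_mae:.3f}  model MAE={model_mae:.3f}  model DA={model_da:.2f}")
```

Reporting a directional metric alongside an error metric helps separate "close in price" from "right about the move", which matter differently for short-horizon forecasts.
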
The experiments in this repository draw from several types of input:
- historical stock prices
- ticker-level sequences
- VIX and market context
- textual summaries and sentiment-style signals
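
One way these inputs could be combined into a single modeling table is sketched below. The field names, the carry-forward rule for missing VIX values, and the neutral default for missing sentiment are all assumptions; the actual schema of processed_vix.csv and the notebooks' feature pipeline are not described here.

```python
from bisect import bisect_right
from datetime import date
from typing import Dict, List, Optional, Tuple


def latest_on_or_before(dates: List[date], values: List[float], day: date) -> Optional[float]:
    """Return the most recent value dated on or before `day`, or None if there is none."""
    i = bisect_right(dates, day)
    return values[i - 1] if i > 0 else None


def build_rows(
    prices: List[Tuple[date, float]],
    vix: List[Tuple[date, float]],
    sentiment: Dict[date, float],
) -> List[dict]:
    """Join each day's close with context features and the next day's close as target.

    `prices` and `vix` must be sorted by date. Only information available on
    `day` is used as a feature, so the target never leaks into the inputs.
    """
    vix_dates = [d for d, _ in vix]
    vix_levels = [v for _, v in vix]
    rows = []
    for (day, close), (_, next_close) in zip(prices, prices[1:]):
        rows.append({
            "date": day,
            "close": close,
            "vix": latest_on_or_before(vix_dates, vix_levels, day),
            "sentiment": sentiment.get(day, 0.0),  # assumed neutral when no text exists
            "target_next_close": next_close,
        })
    return rows


# Made-up values for illustration only.
prices = [(date(2024, 1, 2), 185.0), (date(2024, 1, 3), 184.0), (date(2024, 1, 4), 182.0)]
vix = [(date(2024, 1, 2), 13.2), (date(2024, 1, 4), 14.1)]  # Jan 3 deliberately missing
sentiment = {date(2024, 1, 2): 0.4}

rows = build_rows(prices, vix, sentiment)
for row in rows:
    print(row)
```

The last price has no next-day target, so it produces no row; the missing Jan 3 VIX value is filled from Jan 2 instead of from the future.
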
- SARIMA.ipynb: classical statistical baseline
- LSTM.ipynb: recurrent neural network baseline
- T5.ipynb: sequence-to-sequence forecasting experiments
- TFT.ipynb, TFT_1.ipynb, TFT_Phase1.ipynb, TFT_Phase2.ipynb, TFT_Text.ipynb: Temporal Fusion Transformer experiments, including multimodal and sentiment-aware variants
- CSCI_566___Final_Project_Report.pdf: project report
- 65_G.O.A.T.pdf: additional report artifact
- processed_vix.csv: processed macro/market signal input
- all_tickers (1).txt: ticker universe reference
- notebook-based experimentation and evaluation files
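
Comparing model families fairly usually means scoring each one on the same held-out periods. The sketch below shows an expanding-window (walk-forward) split as one way to do that. The function name and parameters are hypothetical, and the notebooks may split their data differently.

```python
from typing import Iterator, Tuple


def expanding_window_splits(
    n_samples: int, initial_train: int, horizon: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs in time order.

    Each test window starts right after its training window, so a model is
    never evaluated on data that precedes what it was trained on.
    """
    start = initial_train
    while start + horizon <= n_samples:
        yield range(0, start), range(start, start + horizon)
        start += horizon


# Ten time steps, first six for initial training, two-step test windows.
splits = list(expanding_window_splits(n_samples=10, initial_train=6, horizon=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Each candidate model would be fitted on every training range and scored on the matching test range, so the resulting metrics are comparable across model families.
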
- Start with CSCI_566___Final_Project_Report.pdf for the research framing.
- Review SARIMA.ipynb and LSTM.ipynb for baseline approaches.
- Move to the TFT* notebooks to see the richer multimodal forecasting work.
- Inspect T5.ipynb for alternative sequence-model experimentation.
- Python
- Jupyter Notebook
- PyTorch Forecasting
- Transformers
- statsmodels
- yfinance
- pandas and NumPy
This project demonstrates applied ML research thinking: comparing baselines, layering in richer signals, and using multiple model families to study a difficult forecasting problem rather than relying on a single technique.