The project involved detailed sentiment analysis of financial news articles and sophisticated time series analysis of stock prices. It determined that LSTM models, reinforced with differential privacy (LSTM-DP), surpassed conventional time series models in forecasting accuracy.

CeciliaYu0821/LSTM-DP-for-Stock-Prediction

Differential Privacy LSTM for Stock Prediction with Financial News

This project combines sentiment analysis of financial news articles with time series analysis of stock prices. It found that LSTM models augmented with differential privacy (LSTM-DP) outperformed conventional time series models in forecasting accuracy, with a 60% reduction in MSE, which is attributed to their more robust feature integration.

Project Structure

The project is organized into the following components:

  • 'news_with_sentiment.csv': The dataset contains three columns: published_date, source_name (cnbc/fortune/reuters/wsj), and compound (the compound score extracted from financial news articles via sentiment analysis).

  • 'Project_all_models.ipynb': The notebook contains every step of the data analysis process.
    Data Preprocessing - Extract the S&P 500, the Nasdaq 100 Index, the Dow Jones Industrial Average, and the Russell 2000 Index from the Yahoo Finance API. Calculate the daily compound sentiment score for each of the four news sources.
    Model Construction - Build a suite of three model types (ARIMA, Random Forest, and LSTM with and without sentiment features and DP).
    Model Evaluation - Evaluate the performance of the constructed models based on MSE, R-squared, and accuracy.

  • 'Project Report.pdf': The final report of the project.

  • 'Slide.pdf': The slides presented in class.
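
The daily sentiment step listed under Data Preprocessing can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy rows are made up, the helper name daily_sentiment is hypothetical, and averaging the compound scores within each day is an assumption (the notebook may aggregate differently). Only the column names come from 'news_with_sentiment.csv'.

```python
import pandas as pd

# Toy rows in the same shape as news_with_sentiment.csv (values are made up).
news = pd.DataFrame({
    "published_date": [
        "2023-01-02 09:15",
        "2023-01-02 14:30",
        "2023-01-02 10:00",
        "2023-01-03 08:45",
    ],
    "source_name": ["cnbc", "cnbc", "reuters", "cnbc"],
    "compound": [0.50, -0.10, 0.20, 0.40],
})


def daily_sentiment(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per day and one column per source.

    Assumption: the daily score is the mean of the compound scores
    published by that source on that day.
    """
    out = df.copy()
    # Drop the time of day so that all articles from one day share a key.
    out["date"] = pd.to_datetime(out["published_date"]).dt.normalize()
    return out.pivot_table(
        index="date",
        columns="source_name",
        values="compound",
        aggfunc="mean",
    )


daily = daily_sentiment(news)
print(daily)
```

Days on which a source published nothing show up as NaN in that source's column; how to fill them (zero, forward fill, or dropping the day) is a modelling choice this sketch leaves open.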

Conclusions

The LSTM model outperformed ARIMA and Random Forest for index time series prediction. All time series models work better on return data, because returns put every index on the same normalized scale. LSTM with DP achieves the best performance in most scenarios, as the features it takes are more robust.
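
To make the point about return data concrete, here is a small sketch, again under stated assumptions rather than a copy of the notebook. The function names are hypothetical, simple (not log) returns are assumed, and "accuracy" is interpreted here as directional accuracy, meaning the share of days on which the predicted and actual returns have the same sign; the report may define it differently.

```python
import numpy as np


def simple_returns(prices):
    """Convert a price series into period-over-period simple returns."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0


def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def directional_accuracy(y_true, y_pred):
    """Share of periods where prediction and outcome have the same sign."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))


# Two made-up indices at different price levels but with identical moves.
index_a = [4000.0, 4040.0, 4000.0]
index_b = [2000.0, 2020.0, 2000.0]
returns_a = simple_returns(index_a)
returns_b = simple_returns(index_b)
print(returns_a, returns_b)  # same values despite different price levels

# Made-up actual and predicted returns, only to exercise the metrics.
actual = np.array([0.01, -0.02, 0.03])
predicted = np.array([0.02, -0.01, -0.01])
print(mse(actual, predicted))
print(r_squared(actual, predicted))
print(directional_accuracy(actual, predicted))
```

Because the two toy indices yield identical return series, an error metric such as MSE is directly comparable across them, which is not the case for raw price levels.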

Contact Information

If you have any questions, please contact [simin.yu@columbia.edu].
