Certificate of Completion

THIS ACKNOWLEDGES THAT

Xiaokang Wang

HAS COMPLETED THE FALL 2024 DATA SCIENCE BOOT CAMP

Director: Roman Holowinsky, PhD

Date: December 11, 2024

TEAM

Analyzing the Impact of News Topics on Stock Prices

Xiangwei Peng, Xiaokang Wang

Stock prices are influenced by numerous factors. We use daily news to model a stock's abnormal return, that is, the part of its return that market information does not explain. More precisely, we build an automated pipeline that covers:
- Ingesting stock prices, news, and factors;
- Preprocessing both stock and news data;
- Classifying the news and predicting the future price.
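
As an illustration of the classification step, the project annotates news with a soft-voting classifier. A minimal sketch of soft voting follows, assuming each base classifier returns a probability per category for a headline. The function name `soft_vote`, the category labels, and the probabilities are hypothetical and are not taken from the project's code.

```python
from typing import Dict, List


def soft_vote(probabilities: List[Dict[str, float]]) -> str:
    """Average per-class probabilities across classifiers; return the top class."""
    if not probabilities:
        raise ValueError("need at least one classifier output")
    totals: Dict[str, float] = {}
    for probs in probabilities:
        for label, p in probs.items():
            totals[label] = totals.get(label, 0.0) + p
    n = len(probabilities)
    averaged = {label: total / n for label, total in totals.items()}
    return max(averaged, key=averaged.get)


# Hypothetical outputs of three base classifiers for a single headline.
model_outputs = [
    {"BUSINESS": 0.6, "POLITICS": 0.3, "TECH": 0.1},
    {"BUSINESS": 0.2, "POLITICS": 0.7, "TECH": 0.1},
    {"BUSINESS": 0.5, "POLITICS": 0.4, "TECH": 0.1},
]
predicted = soft_vote(model_outputs)
print(predicted)
```

In this made-up example two of the three classifiers rank BUSINESS first, yet the averaged probabilities favor POLITICS. That is the difference between soft voting and a simple majority vote.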

We use the Fama-French five-factor model to compute the abnormal return. We annotate the news with a soft-voting classifier and cluster topics with the Hierarchical Dirichlet Process (HDP). Finally, we regress the abnormal return on the normalized daily topic counts, using XGBoost as the model.
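
For reference, the five-factor regression and one common definition of the abnormal return can be written as below. The notation is an assumption made for illustration: $R_{i,t}$ is the return of stock $i$ on day $t$, $R_{f,t}$ the risk-free rate, $R_{m,t}$ the market return, and hats mark estimated coefficients. The project's exact convention, for example whether the intercept is counted as part of the abnormal return, is not stated here.

```latex
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \,(R_{m,t} - R_{f,t})
    + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t
    + r_i\,\mathrm{RMW}_t + c_i\,\mathrm{CMA}_t + \varepsilon_{i,t}

\mathrm{AR}_{i,t} = (R_{i,t} - R_{f,t}) - \bigl[\hat{\alpha}_i
    + \hat{\beta}_i \,(R_{m,t} - R_{f,t})
    + \hat{s}_i\,\mathrm{SMB}_t + \hat{h}_i\,\mathrm{HML}_t
    + \hat{r}_i\,\mathrm{RMW}_t + \hat{c}_i\,\mathrm{CMA}_t\bigr]
```

Under this convention the abnormal return is the regression residual: the part of the excess return that the five factors do not account for.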

Here are the datasets:
- Headlines: https://www.kaggle.com/datasets/rmisra/news-category-dataset 
- News: https://components.one/datasets/all-the-news-articles-dataset/ 
- Factors: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
- Stock price: Yahoo Finance


©2017-2025 by The Erdős Institute.