AI-first Finance: Discovering and Forecasting with Alternative Data
In finance, decision-making has traditionally relied on quantitative indicators collected manually from financial statements. Over the last decade, the volume of alternative data, such as financial news, recordings of earnings conference calls, and SEC 10-K reports, has grown explosively. This data presents a major opportunity and plays an increasing role in asset management and other decision-making tasks. Each type of data has its own advantages and disadvantages. High-frequency text data, such as social media and blog posts, is relatively short and can reflect events in real time, but it is often very noisy. Medium-frequency text data, including financial news, usually has clean content because it comes from official providers, which makes it easy to process and analyze, but the information it carries arrives with some lag. Low-frequency professional financial documents, such as 10-K reports, are more reliable and contain a large amount of valuable information, but such documents are released only quarterly or annually. The aim of the project is to combine the advantages of these different types of data with modern natural language processing (NLP) and artificial intelligence (AI) techniques to predict financial markets more precisely and to assist investors' decision-making.

Objective 1: Develop an approach that incorporates multi-source text data, i.e. low-frequency, medium-frequency, and high-frequency financial text sources, into financial prediction.

Objective 2: Build a Graph Convolutional Network (GCN) over economic entities to extract the relationships between different entities and promote the use of indirectly correlated text data.
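To illustrate the idea behind Objective 2, the following is a minimal sketch of one graph-convolution step in plain Python, using the common symmetric normalization ReLU(D^{-1/2}(A + I)D^{-1/2} H W). It is not the project's actual model: the three-entity graph, the single text-derived feature, and the identity weight matrix are all hypothetical values chosen for illustration.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each entity keeps part of its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]

# Hypothetical graph: entity 0 is linked to entity 1, and entity 1 to entity 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
# Hypothetical text-derived signal: only entity 0 has any news coverage.
features = [[1.0], [0.0], [0.0]]
# Assumed identity weights so the propagation itself is easy to read.
weights = [[1.0]]

h1 = gcn_layer(adj, features, weights)  # signal reaches the direct neighbor
h2 = gcn_layer(adj, h1, weights)        # signal reaches the two-hop neighbor
print(h1)
print(h2)
```

After one step, entity 2 still has a zero signal because it is not directly linked to entity 0. After the second step it receives a nonzero value through entity 1, which is the sense in which a GCN can make use of indirectly correlated text data. In a trained model the weights would be learned rather than fixed.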