Renault, T., 2017, Journal of Banking and Finance 84, 25-40
We implement a novel approach to derive investor sentiment from messages posted on social media before we explore the relation between online investor sentiment and intraday stock returns. Using an extensive dataset of messages posted on the microblogging platform StockTwits, we construct a lexicon of words used by online investors when they share opinions and ideas about the bullishness or the bearishness of the stock market. We demonstrate that a transparent and replicable approach significantly outperforms standard dictionary-based methods used in the literature while remaining competitive with more complex machine learning algorithms. Aggregating individual message sentiment at half-hour intervals, we provide empirical evidence that online investor sentiment helps forecast intraday stock index returns. After controlling for past market returns, we find that the first half-hour change in investor sentiment predicts the last half-hour S&P 500 index ETF return. Examining users’ self-reported investment approach, holding period and experience level, we find that the intraday sentiment effect is driven by the shift in the sentiment of novice traders. Overall, our results provide direct empirical evidence of sentiment-driven noise trading at the intraday level.
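As a rough illustration of the pipeline described in this abstract (score each message with a lexicon, then average the scores over half-hour intervals), here is a minimal sketch. The lexicon entries, their weights, the sample messages, and all function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy lexicon (word -> weight in [-1, 1]); NOT the paper's lexicon.
LEXICON = {
    "bullish": 1.0, "buy": 0.6, "long": 0.5,
    "bearish": -1.0, "sell": -0.6, "short": -0.5,
}

def message_sentiment(text):
    """Average lexicon weight of the known words in a message (0.0 if none)."""
    weights = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(weights) / len(weights) if weights else 0.0

def half_hour_bucket(ts):
    """Floor a timestamp to the start of its 30-minute interval."""
    return ts.replace(minute=(ts.minute // 30) * 30, second=0, microsecond=0)

def aggregate_sentiment(messages):
    """Mean message sentiment per half-hour interval, in chronological order."""
    buckets = defaultdict(list)
    for ts, text in messages:
        buckets[half_hour_bucket(ts)].append(message_sentiment(text))
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}

# Made-up messages for illustration only.
messages = [
    (datetime(2017, 1, 3, 9, 35), "bullish on SPY buy the dip"),
    (datetime(2017, 1, 3, 9, 50), "bearish sell now"),
    (datetime(2017, 1, 3, 10, 5), "going long here"),
]
half_hourly = aggregate_sentiment(messages)
```

The change in the first half-hour value could then serve as a regressor for the last half-hour return, which is the relation the abstract reports; that regression step is not shown here.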
Picault, M., Renault, T., 2017, Journal of International Money and Finance 79, 136-156
We develop a field-specific dictionary to measure the stance of the European Central Bank monetary policy (dovish, neutral, hawkish) and the state of the Eurozone economy (positive, neutral, negative) through the content of ECB press conferences. In contrast with traditional textual analysis, we propose a novel approach using term-weighting and contiguous sequences of words (n-grams) to better capture the subtlety of central bank communication. We find that quantifying ECB communication using our field-specific weighted lexicon helps predict future ECB monetary decisions and European stock market volatility. Our indicators significantly outperform a textual classification based on the Loughran-McDonald or Apel-Blix-Grimaldi dictionaries and a media-based measure of economic policy uncertainty.
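To make the idea of a weighted n-gram lexicon concrete, the sketch below scores a sentence using both single words and two-word sequences. The entries and weights (positive for hawkish, negative for dovish) are invented for illustration and do not come from the paper's dictionary; the scoring rule (mean weight of matched terms) is likewise an assumption.

```python
# Hypothetical weighted lexicon mixing unigrams and bigrams.
# Positive = hawkish, negative = dovish. Values are made up.
WEIGHTS = {
    "upside risks": 0.8,
    "downside risks": -0.8,
    "vigilance": 0.7,
    "accommodative": -0.6,
}

def ngrams(tokens, n):
    """All contiguous sequences of n tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def stance_score(text, max_n=2):
    """Mean weight of all lexicon terms (up to max_n words) found in the text."""
    tokens = text.lower().split()
    matched = [
        WEIGHTS[g]
        for n in range(1, max_n + 1)
        for g in ngrams(tokens, n)
        if g in WEIGHTS
    ]
    return sum(matched) / len(matched) if matched else 0.0

hawkish = stance_score("upside risks to price stability warrant strong vigilance")
dovish = stance_score("downside risks call for an accommodative stance")
```

The bigrams matter here because the word "risks" alone carries no direction: only "upside risks" and "downside risks" separate the two sentences.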
GDP statistics in France are published on a quarterly basis, 30 days after the end of the quarter. In this article, we consider media content as an additional data source to traditional economic tools to improve short-term forecast/nowcast of French GDP. We use a database of more than a million articles published in the newspaper Le Monde between 1990 and 2017 to create a new synthetic indicator capturing media sentiment about the state of the economy. We compare an autoregressive model augmented by the media sentiment indicator with a simple autoregressive model. We also consider an autoregressive model augmented with the Insee Business Climate indicator. Adding a media indicator improves French GDP forecasts compared to these two reference models. We also test an automated approach using penalised regression, where we use the frequencies at which words or expressions appear in the articles as regressors, rather than aggregated information. Although this approach is easier to implement than the former, its results are less accurate.
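The model comparison described above can be sketched as follows: fit a simple AR(1) model and an AR(1) model augmented with a sentiment indicator, then compare their errors. Everything here is an assumption made for illustration: the data are synthetic, the variable names are hypothetical, and the comparison is an in-sample fit rather than the out-of-sample forecast evaluation a nowcasting study would use.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 80  # number of synthetic "quarters"

# Synthetic data: growth depends on its own lag and on a sentiment indicator.
sentiment = rng.normal(size=T)
gdp = np.zeros(T)
for t in range(1, T):
    gdp[t] = 0.3 * gdp[t - 1] + 0.5 * sentiment[t] + rng.normal(scale=0.5)

def ols_rmse(X, y):
    """Fit y on X (plus an intercept) by least squares; return in-sample RMSE."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(np.sqrt(np.mean(resid ** 2)))

y = gdp[1:]        # growth in quarter t
lagged = gdp[:-1]  # growth in quarter t-1

rmse_ar = ols_rmse(lagged, y)                                      # AR(1)
rmse_aug = ols_rmse(np.column_stack([lagged, sentiment[1:]]), y)   # AR(1) + sentiment
```

Because the augmented model nests the AR(1) model, its in-sample RMSE can never be higher; a real evaluation would compare forecasts on data not used for estimation.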
Gillet, R., Renault, T., 2019, Finance, 40-2, 7-49.
We investigate the efficient market hypothesis at the intraday level by analyzing market reactions to negative tweets and reports published on the Internet by an activist short seller. Conducting event studies, we find that fast-moving traders can generate small, albeit significant, abnormal profits by trading on public information published on social media. The market reaction to tweets is stronger when a company is mentioned for the first time on Twitter, showing that investors can disentangle new information from noise in real time. We also find that traders who manage to identify the information on the short seller’s website before the dissemination of the same news on Twitter can generate much greater abnormal returns. As acquiring information on a website is more costly and difficult than acquiring the same information on Twitter, our findings provide empirical evidence supporting the Grossman–Stiglitz paradox at the intraday level. Very short-lived market anomalies do exist in the stock market to compensate investors who spend time and money setting up bots and algorithms to trade on new information before the crowd.
We use a large dataset of one million messages sent on the microblogging platform StockTwits to evaluate the performance of a wide range of preprocessing methods and machine learning algorithms for sentiment analysis in finance. We find that adding bigrams and emojis significantly improves sentiment classification performance. However, more complex and time-consuming machine learning methods, such as random forests or neural networks, do not improve the accuracy of the classification. We also provide empirical evidence that the preprocessing method and the size of the dataset have a strong impact on the correlation between investor sentiment and stock returns. While investor sentiment and stock returns are highly correlated, we do not find that investor sentiment derived from messages sent on social media helps in predicting large-capitalization stock returns at a daily frequency.
Social media can help investors gather and share information about stock markets. However, it also presents opportunities for fraudsters to spread false or misleading statements in the marketplace. Analyzing millions of messages sent on the social media platform Twitter about small-capitalization firms, we find that an abnormally high number of messages on social media is associated with a large price increase on the event day, followed by a sharp price reversal over the next trading week. Examining users' characteristics, and controlling for lagged abnormal returns, press releases, tweet sentiment and firms' characteristics, we find that the price reversal pattern is stronger when the events are generated by the tweeting activity of stock promoters or by the tweeting activity of accounts dedicated to tracking pump-and-dump schemes. Overall, our findings are consistent with the patterns of a pump-and-dump scheme, where fraudsters/promoters use social media to temporarily inflate the price of small-capitalization stocks.
What makes cryptocurrencies special? Investor sentiment and price predictability in the absence of fundamental value.
Chen, C., Despres, R., Guo, L., Renault, T., 2018 (Work in progress)
Using a novel dataset of millions of messages published on the social media platform StockTwits and on the discussion website Reddit, we explore the relation between investor sentiment and aggregate cryptocurrency returns. We first construct a "crypto-specific lexicon" to precisely capture the semantic orientations and avoid misspecification due to the specific language used by individual investors when they talk about cryptocurrencies on the Internet. Then, conducting regressions and implementing various trading strategies, we find that opinions quantified through our crypto-specific lexicon drive movements in the cryptocurrency market. Investor sentiment positively predicts returns from 1 day to 1 week without price reversal. The results suggest that, in a market driven by individual investors with a higher risk preference, and when there is only limited information about the fundamental value of the underlying asset, investor sentiment has a permanent effect on prices.
This dissertation makes methodological and empirical contributions to three issues related to the informational efficiency of financial markets through the use of Big Data analytics. More precisely, it analyzes: (1) how to measure intraday investor sentiment and determine the relation between investor sentiment and aggregate market returns, (2) how to measure investor attention to news in real time, and identify the relation between investor attention and the price dynamics of large capitalization stocks, and (3) how to detect suspicious behaviors that could undermine the informational role of financial markets and determine the relation between the level of posting activity on social media and small-capitalization stock returns. In that regard, the research design of each essay involves the construction of new datasets of messages published on social media sites to create novel indicators in order to: (1) measure investor sentiment, (2) proxy investor attention to news, and (3) detect suspicious stock recommendations that could be related to market manipulation. Using textual analysis, network theories, event studies, or predictive regressions, this dissertation provides empirical evidence that textual content published on social media contains value-relevant information about asset price formation.