Thomas Renault

Publications

Community-based fact-checking reduces the spread of misleading posts on social media

Nature Communications (Forthcoming), with T. Chuai, M. Pilarski, D. Restrepo-Amariles, A. Troussel-Clément, G. Lenzini, N. Pröllochs

Community-based fact-checking is a promising approach to verify social media content and correct misleading posts at scale. Yet, causal evidence regarding its effectiveness in reducing the spread of misinformation on social media is missing. Here, we performed a large-scale empirical study to analyze whether community notes reduce the spread of misleading posts on X. Using a Difference-in-Differences design and repost time series data for N=237,677 (community fact-checked) cascades that had been reposted more than 431 million times, we found that exposing users to community notes reduced the spread of misleading posts by, on average, 62.0%. Furthermore, community notes increased the odds that users delete their misleading posts by 103.4%. However, our findings also suggest that community notes might be too slow to intervene in the early (and most viral) stage of the diffusion. Our work offers important implications to enhance the effectiveness of community-based fact-checking approaches on social media.

https://arxiv.org/abs/2409.08781
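
The Difference-in-Differences logic named in the abstract above can be illustrated with the simplest two-group, two-period case. This is a toy sketch, not the paper's specification (which works on repost time series): the function name `did_estimate` and all numbers are made up for illustration.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences.

    Subtracts the change observed in the control group from the change
    observed in the treated group, so that a shared time trend cancels out.
    """
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical average reposts per hour (invented values, not from the paper).
effect = did_estimate(
    treated_pre=100.0,   # posts that later receive a note, before the note
    treated_post=60.0,   # the same posts, after the note is displayed
    control_pre=100.0,   # comparison posts, same early window
    control_post=90.0,   # comparison posts, same late window
)
print(effect)
```

In this toy case the treated group falls by 40 and the control group by 10, so the estimate attributes a drop of 30 to the treatment; the estimate is only meaningful if both groups would have followed parallel trends without it.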

Republicans are flagged more often than Democrats for sharing misinformation on X's Community Notes

PNAS (2025), with M. Mosleh and D. Rand

We use crowd-sourced assessments from X's Community Notes program to examine whether there are partisan differences in the sharing of misleading information. Unlike previous studies, misleadingness here is determined by agreement across a diverse community of platform users, rather than by fact-checkers. We find that 131% more posts by Republicans are flagged as misleading compared to posts by Democrats. These results are not base rate artifacts, as we find no meaningful over-representation of Republicans among X users. Our findings provide strong evidence of a partisan asymmetry in misinformation sharing which cannot be attributed to political bias on the part of raters, and indicate that Republicans will be sanctioned more than Democrats even if platforms transition from professional fact-checking to Community Notes.

https://osf.io/preprints/psyarxiv/vk5yj

Social media and suicide: empirical evidence from the quasi-exogenous geographical adoption of Twitter

Annals of the New York Academy of Sciences (2025), with A. Du

Social media usage is often cited as a potential driver behind rising suicide rates. However, distinguishing the causal effect (whether social media increases the risk of suicide) from reverse causality, where individuals already at higher risk of suicide are more likely to use social media, remains a significant challenge. In this paper, we use an instrumental variable approach to study the quasi-exogenous geographical adoption of Twitter and its causal relationship with suicide rates. Our analysis first demonstrates that Twitter’s geographical adoption was driven by the presence of certain users at the 2007 SXSW festival, which led to long-term disparities in adoption rates across counties in the United States. Then, using a two-stage least squares (2SLS) regression and controlling for a wide range of geographic, socioeconomic and demographic factors, we find no significant relationship between Twitter adoption and suicide rates.

The Usual Suspects: Offender Origin, Media Reporting and Natives’ Attitudes Towards Immigration

The Economic Journal (2024), with S. Keita and J. Valette

This paper analyses whether the systematic disclosure of criminals’ origins in the press affects natives’ attitudes towards immigration. It takes advantage of the unilateral change in reporting policy announced by the German newspaper Sächsische Zeitung in July 2016. Combining individual-level panel data from the German Socio-Economic Panel from 2014 to 2018 with 402,819 crime-related articles in German newspapers and those newspapers’ market shares, we find that systematically mentioning the origins of criminals increases the relative salience of natives’ criminality and reduces natives’ concerns about immigration, breaking the implicit link between immigration and crime.

https://academic.oup.com/ej/advance-article-abstract/doi/10.1093/ej/uead059/7238467

Media sentiment on monetary policy: Determinants and relevance for inflation expectations

Journal of International Money and Finance (2022), with M. Picault and J. Pinter

We construct a new indicator to capture media sentiment about the European Central Bank monetary policy and its relevant environment by analyzing 25,000 articles from five major international newspapers. Using named entity recognition and part-of-speech tagging, we propose a methodology to dissociate the dissemination of official communications of the central bank from the media comments. The resulting (daily) index correlates with some (monthly) standard measures of economic sentiment but reveals idiosyncratic information on monetary policy. Analyzing the determinants of our index, we find that both press conference and inter-meeting communications of the President significantly affect media sentiment. We then show that, controlling for a large range of factors, daily changes in media sentiment have predictive power for financial market inflation expectations.

https://www.sciencedirect.com/science/article/pii/S0261560622000298

Social distancing beliefs and human mobility: Evidence from Twitter

PLoS One (2021), with S. Porcher

We construct a novel database containing hundreds of thousands of geotagged messages related to the COVID-19 pandemic sent on Twitter. We create a daily, state-level index of social distancing to capture social distancing beliefs by analyzing the number of tweets containing keywords such as “stay home”, “stay safe”, “wear mask”, “wash hands” and “social distancing”. We find that an increase in the Twitter index of social distancing on day t-1 is associated with a decrease in mobility on day t. We also find that state orders, an increase in the number of COVID-19 cases, precipitation and temperature contribute to reducing human mobility. Republican states are also less likely to enforce social distancing. Beliefs shared on social networks could both reveal the behavior of individuals and influence the behavior of others. Our findings suggest that policy makers can use geotagged Twitter data, in conjunction with mobility data, to better understand individual voluntary social distancing actions.

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0246949
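
The keyword-count index described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `daily_keyword_index`, the `(state, day, text)` input layout, and the sample tweets are assumptions; only the five keywords come from the abstract. Whether the published index is a raw count or a normalized share is not stated here, so the sketch uses a raw count.

```python
from collections import Counter

# Keywords quoted in the abstract.
KEYWORDS = ("stay home", "stay safe", "wear mask", "wash hands", "social distancing")


def daily_keyword_index(tweets):
    """Count, per (state, day), the tweets that contain at least one keyword.

    `tweets` is an iterable of (state, day, text) tuples; this layout is a
    hypothetical choice made for the example.
    """
    counts = Counter()
    for state, day, text in tweets:
        lowered = text.lower()  # case-insensitive substring match
        if any(keyword in lowered for keyword in KEYWORDS):
            counts[(state, day)] += 1
    return dict(counts)


# Made-up sample data, for illustration only.
sample = [
    ("NY", "2020-03-20", "Please stay home and wash hands!"),
    ("NY", "2020-03-20", "Nice weather today"),
    ("NY", "2020-03-21", "Social distancing works"),
    ("TX", "2020-03-20", "Stay safe everyone"),
]
index = daily_keyword_index(sample)
print(index)
```

A tweet matching several keywords is counted once, which keeps the index a count of tweets rather than of keyword occurrences.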

Does investor sentiment on social media provide robust information for Bitcoin returns predictability?

Finance Research Letters (2021), with D. Guégan

We use a dataset of approximately one million messages sent on StockTwits to explore the relationship between investor sentiment on social media and intraday Bitcoin returns. We find a statistically significant relationship between investor sentiment and Bitcoin returns for frequencies of up to 15 minutes. For lower frequencies, the relation disappears. We also find that the impact of sentiment on returns is concentrated on the period around the Bitcoin bubble. However, the magnitude of the effect is rather small, making it impossible for a trader to make economic profits by trading on the information published on social media.

https://www.sciencedirect.com/science/article/pii/S1544612319314199

Economic Uncertainty Before and During the COVID-19 Pandemic

Journal of Public Economics (2020), with D. Altig, S. Baker, J.M. Barrero, N. Bloom, P. Bunn, S. Chen, S. Davis, J. Leather, B. Meyer, E. Mihaylov, P. Mizen, N. Parker, P. Smietanka and G. Thwaites

We consider several economic uncertainty indicators for the US and UK before and during the COVID-19 pandemic: implied stock market volatility, newspaper-based policy uncertainty, Twitter chatter about economic uncertainty, subjective uncertainty about business growth, forecaster disagreement about future GDP growth, and a model-based measure of macro uncertainty. Four results emerge. First, all indicators show huge uncertainty jumps in reaction to the pandemic and its economic fallout. Indeed, most indicators reach their highest values on record. Second, peak amplitudes differ greatly – from a 35% rise for the model-based measure of US economic uncertainty (relative to January 2020) to a 20-fold rise in forecaster disagreement about UK growth. Third, time paths also differ: Implied volatility rose rapidly from late February, peaked in mid-March, and fell back by late March as stock prices began to recover. In contrast, broader measures of uncertainty peaked later and then plateaued, as job losses mounted, highlighting differences between Wall Street and Main Street uncertainty measures. Fourth, in Cholesky-identified VAR models fit to monthly U.S. data, a COVID-size uncertainty shock foreshadows peak drops in industrial production of 12–19%.

https://www.sciencedirect.com/science/article/pii/S0047272720301389

When Machines Read the Web: Market Efficiency and Costly Information Acquisition at the Intraday Level

Finance (2019), with R. Gillet

We investigate the efficient market hypothesis at the intraday level by analyzing market reactions to negative tweets and reports published on the Internet by an activist short seller. Conducting event studies, we find that fast-moving traders can generate small, albeit significant, abnormal profit by trading on public information published on social media. The market reaction to tweets is stronger when a company is mentioned for the first time on Twitter, showing that investors can disentangle new information from noise in real time. We also find that traders who manage to identify the information on the short seller’s website before the dissemination of the same news on Twitter can generate much greater abnormal returns. As acquiring information on a website is more costly and difficult than acquiring the same information on Twitter, our findings provide empirical evidence supporting the Grossman–Stiglitz paradox at the intraday level. Very short-lived market anomalies do exist in the stock market to compensate investors who spent time and money in setting up bots and algorithms to trade on new information before the crowd.

https://www.cairn.info/revue-finance-2019-2-page-7.htm

Sentiment analysis and machine learning in finance: a comparison of methods and models on one million messages

Digital Finance (2019)

We use a large dataset of one million messages sent on the microblogging platform StockTwits to evaluate the performance of a wide range of preprocessing methods and machine learning algorithms for sentiment analysis in finance. We find that adding bigrams and emojis significantly improves sentiment classification performance. However, more complex and time-consuming machine learning methods, such as random forests or neural networks, do not improve the accuracy of the classification. We also provide empirical evidence that the preprocessing method and the size of the dataset have a strong impact on the correlation between investor sentiment and stock returns. While investor sentiment and stock returns are highly correlated, we do not find that investor sentiment derived from messages sent on social media helps in predicting large-capitalization stock returns at a daily frequency.

https://doi.org/10.1007/s42521-019-00014-x

Nowcasting GDP with traditional media content

Economics and Statistics (2018), with C. Bortolli and S. Combes

GDP statistics in France are published on a quarterly basis, 30 days after the end of the quarter. In this article, we consider media content as an additional data source to traditional economic tools to improve short-term forecast/nowcast of French GDP. We use a database of more than a million articles published in the newspaper Le Monde between 1990 and 2017 to create a new synthetic indicator capturing media sentiment about the state of the economy. We compare an autoregressive model augmented by the media sentiment indicator with a simple autoregressive model. We also consider an autoregressive model augmented with the Insee Business Climate indicator. Adding a media indicator improves French GDP forecasts compared to these two reference models. We also test an automated approach using penalised regression, where we use the frequencies at which words or expressions appear in the articles as regressors, rather than aggregated information. Although this approach is easier to implement than the former, its results are less accurate.

https://www.insee.fr/en/statistiques/3705981?sommaire=3706269

Words are not all created equal: a new measure of the ECB communication

Journal of International Money and Finance (2017), with M. Picault

We develop a field-specific dictionary to measure the stance of the European Central Bank monetary policy (dovish, neutral, hawkish) and the state of the Eurozone economy (positive, neutral, negative) through the content of ECB press conferences. In contrast with traditional textual analysis, we propose a novel approach using term-weighting and contiguous sequences of words (n-grams) to better capture the subtlety of central bank communication. We find that quantifying ECB communication using our field-specific weighted lexicon does help predict future ECB monetary decisions and European stock market volatility. Our indicators significantly outperform a textual classification based on the Loughran-McDonald or Apel-Blix-Grimaldi dictionaries and a media-based measure of economic policy uncertainty.

https://doi.org/10.1016/j.jimonfin.2017.09.005
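
The idea of scoring text with a weighted lexicon of n-grams, as described in the abstract above, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's dictionary or weighting scheme: the functions `ngrams` and `lexicon_score`, the averaging rule, and every term and weight in `TOY_LEXICON` are invented for the example.

```python
def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexicon_score(text, lexicon, max_n=2):
    """Average the weights of all lexicon terms (1- to max_n-grams) found in text.

    Tokenization is a plain lowercase whitespace split (punctuation is not
    stripped). Returns 0.0 when no lexicon term is found.
    """
    tokens = text.lower().split()
    weights = [
        lexicon[term]
        for n in range(1, max_n + 1)
        for term in ngrams(tokens, n)
        if term in lexicon
    ]
    return sum(weights) / len(weights) if weights else 0.0


# Hypothetical lexicon: positive weights read as hawkish, negative as dovish.
TOY_LEXICON = {
    "upside risks": 1.0,
    "vigilance": 0.5,
    "downside risks": -1.0,
    "accommodative": -0.5,
}

score = lexicon_score("strong vigilance is warranted given upside risks", TOY_LEXICON)
print(score)
```

The bigrams show why n-grams matter here: "risks" alone carries no direction, while "upside risks" and "downside risks" point in opposite directions.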

Intraday online investor sentiment and return patterns in the U.S. stock market

Journal of Banking and Finance (2017)

We implement a novel approach to derive investor sentiment from messages posted on social media before we explore the relation between online investor sentiment and intraday stock returns. Using an extensive dataset of messages posted on the microblogging platform StockTwits, we construct a lexicon of words used by online investors when they share opinions and ideas about the bullishness or the bearishness of the stock market. We demonstrate that a transparent and replicable approach significantly outperforms standard dictionary-based methods used in the literature while remaining competitive with more complex machine learning algorithms. Aggregating individual message sentiment at half-hour intervals, we provide empirical evidence that online investor sentiment helps forecast intraday stock index returns. After controlling for past market returns, we find that the first half-hour change in investor sentiment predicts the last half-hour S&P 500 index ETF return. Examining users’ self-reported investment approach, holding period and experience level, we find that the intraday sentiment effect is driven by the shift in the sentiment of novice traders. Overall, our results provide direct empirical evidence of sentiment-driven noise trading at the intraday level.

https://doi.org/10.1016/j.jbankfin.2017.07.002

Three Essays on the Informational Efficiency of Financial Markets through the use of Big Data Analytics

Renault, T., 2017

This dissertation makes methodological and empirical contributions to three issues related to the informational efficiency of financial markets through the use of Big Data analytics. More precisely, it analyzes: (1) how to measure intraday investor sentiment and determine the relation between investor sentiment and aggregate market returns, (2) how to measure investor attention to news in real time, and identify the relation between investor attention and the price dynamics of large capitalization stocks, and (3) how to detect suspicious behaviors that could undermine the informational role of financial markets and determine the relation between the level of posting activity on social media and small-capitalization stock returns. In that regard, the research design of each essay involves the construction of new datasets of messages published on social media sites to create novel indicators in order to: (1) measure investor sentiment, (2) proxy investor attention to news, and (3) detect suspicious stock recommendations that could be related to market manipulation. Using textual analysis, network theories, event studies, or predictive regressions, this dissertation provides empirical evidence that textual content published on social media contains value-relevant information about asset price formation.

University Paris-Saclay
