Working Papers
-
Does machine learning really help to select mutual funds with positive alpha?
with Jürg Fausch, Moreno Frigg, Thomas Johann, Emil Mussbach, and Wolfgang Drobetz, 2025
We revisit the recent finding of DeMiguel, Gil-Bazo, Nogales, and Santos (2023) that machine learning (ML) methods can identify long-only portfolios of actively managed mutual funds with positive net annual alpha of up to 2.69%. Using their replication package, we show that these conclusions are driven by a coding error that introduces a forward-looking bias in the construction of portfolio returns. Removing the bias eliminates the long-only alpha. Based on an independent replication, we find that nonlinear ML methods consistently identify underperforming funds; however, the same holds for linear models. Consequently, long-short portfolios deliver robust and economically meaningful alphas, driven predominantly by the short leg. We further extend the analysis by expanding the sample period to the end of 2024, considering forecast horizons of up to 36 months, and conducting univariate sorts across fund characteristics. These extensions do not materially enhance long-only predictability but provide empirical evidence on the relative advantages of ML at longer horizons. Overall, our results suggest that ML adds value primarily by consistently avoiding poorly performing funds rather than by identifying long-only outperformers. This has important implications both for the academic interpretation of ML performance and for its practical application in active mutual fund selection.
Paper
-
Broker colocation and the execution costs of customer and proprietary orders
with Satchit Sagade and Stefan Scharnowski, 2024
Colocation services offered by stock exchanges enable market participants to achieve execution costs for large orders that are substantially lower and less sensitive to interaction with high-frequency traders. However, these benefits manifest only for orders executed on the colocated brokers' own behalf, whereas customers' order execution costs are substantially higher. Analyses of individual order executions indicate that customer orders originating from colocated brokers are less actively monitored. This suggests that brokers do not make effective use of their technology, possibly due to agency frictions or to poor algorithm selection and parameter choices by customers.
Paper
-
A Tale of Two Cities – Inter-Market Latency and Fast-Trader Competition
with Satchit Sagade, Stefan Scharnowski, and Erik Theissen, 2024
We examine the impact of increasing competition among the fastest traders by analyzing a new low-latency microwave network connecting exchanges that trade the same stocks. Using a difference-in-differences approach comparing German stocks with similar French stocks, we find improved market integration, faster incorporation of stock-specific information, and an increased contribution to price discovery by the smaller exchange. Liquidity worsens for large caps due to increased sniping but improves for mid caps due to fast liquidity provision. Trading volume on the smaller exchange declines across all stocks. We thus uncover nuanced effects of fast-trader participation that depend on the traders' prior involvement.
Paper
-
The Sources of Researcher Variation in Economics
with Nick Huntington-Klein, Claus C. Pörtner et al., 2025
We use a rigorous three-stage many-analysts design to assess how different researcher decisions (specifically data cleaning, research design, and the interpretation of a policy question) affect the variation in estimated treatment effects. A total of 146 research teams completed the same causal inference task three times: first with few constraints, then using a shared research design, and finally with pre-cleaned data in addition to a specified design. We find that even when analyzing the same data, teams reach different conclusions. In the first stage, the interquartile range (IQR) of the reported policy effect was 3.1 percentage points, with substantial outliers. Surprisingly, the second stage, which restricted research design choices, exhibited a slightly higher IQR (4.0 percentage points), largely attributable to imperfect adherence to the prescribed protocol. By contrast, the final stage, featuring standardized data cleaning, narrowed the variation in estimated effects to an IQR of 2.4 percentage points. Reported sample sizes also converged markedly under more restrictive conditions, with the IQR dropping from 295,187 in the first stage to 29,144 in the second, and to effectively zero in the third. Our findings underscore the critical importance of data cleaning in shaping applied microeconomic results and highlight avenues for future replication efforts.
Paper
-
Comparing Human-Only, AI-Assisted, and AI-Led Teams on Assessing Research Reproducibility in Quantitative Social Science
with Abel Brodeur et al., 2025
This study evaluates the effectiveness of varying levels of human and artificial intelligence (AI) integration in reproducibility assessments. We computationally reproduced quantitative results from published articles in the social sciences with 288 researchers, randomly assigned to 103 teams across three groups: human-only teams, AI-assisted teams, and teams whose task was to minimally guide an AI to conduct reproducibility checks (the AI-led approach). Findings reveal that, when working independently, human teams matched the reproducibility success rates of teams using AI assistance, while both groups substantially outperformed AI-led approaches (human teams achieved success rates 57 percentage points higher than AI-led teams). Human teams found significantly more major errors than both AI-assisted and AI-led teams. AI-assisted teams demonstrated an advantage over more automated approaches, detecting 0.4 more major errors per team than AI-led teams, though still significantly fewer than human-only teams. Finally, both human and AI-assisted teams significantly outperformed AI-led approaches in both proposing and implementing comprehensive robustness checks.
Paper
-
Mass Reproducibility and Replicability: A New Hope
with Abel Brodeur, Derek Mikola, Nikolai Cook et al., 2024
This study pushes our understanding of research reliability by reproducing and replicating claims from 110 papers in leading economic and political science journals. The analysis involves computational reproducibility checks and robustness assessments. It reveals several patterns. First, we uncover a high rate of fully computationally reproducible results (over 85%). Second, excluding minor issues like missing packages or broken pathways, we uncover coding errors for about 25% of studies, with some studies containing multiple errors. Third, we test the robustness of the results to 5,511 re-analyses. We find a robustness reproducibility of about 70%. Robustness reproducibility rates are relatively higher for re-analyses that introduce new data and lower for re-analyses that change the sample or the definition of the dependent variable. Fourth, 52% of re-analysis effect size estimates are smaller than the original published estimates and the average statistical significance of a re-analysis is 77% of the original. Lastly, we rely on six teams of researchers working independently to answer eight additional research questions on the determinants of robustness reproducibility. Most teams find a negative relationship between replicators' experience and reproducibility, while finding no relationship between reproducibility and the provision of intermediate or even raw data combined with the necessary cleaning codes.
Paper
-
Quasi-Dark Trading: The Effects of Banning Dark Pools in a World of Many Alternatives
with Thomas Johann, Talis Putnins, and Satchit Sagade, 2019
We show that “quasi-dark” trading venues, i.e., markets with somewhat non-transparent trading mechanisms, are important parts of modern equity market structure alongside lit markets and dark pools. Using the European MiFID II regulation as a quasi-natural experiment, we find that dark pool bans lead to (i) volume spill-overs into quasi-dark trading mechanisms including periodic auctions and order internalization systems; (ii) little volume returning to transparent public markets; and consequently, (iii) a negligible impact on market liquidity and short-term price efficiency. These results show that quasi-dark markets serve as close substitutes for dark pools and consequently mitigate the effectiveness of dark pool regulation. Our findings highlight the need for a broader approach to transparency regulation in modern markets that takes into consideration the many alternative forms of quasi-dark trading.
Paper
-
High-Frequency Trading and Price Informativeness
with Jasmin Gider and Simon Schmickler, 2021
We study how the informativeness of stock prices changes with the start of high-frequency trading (HFT). Our estimate is based on the staggered start of HFT participation in a panel of international exchanges. With HFT presence, market prices are a less reliable predictor of future cash flows and investment, even more so for longer horizons. Further, firm-level idiosyncratic volatility decreases, and the holdings and trades by institutional investors deviate less from the market-capitalization weighted portfolio as a benchmark. Our results document that the informativeness of prices decreases subsequent to the start of HFT. These findings are consistent with theoretical models of HFTs' ability to anticipate informed order flow, resulting in decreased incentives to acquire fundamental information.
Paper
-
Corporate Bond Issuance Fragmentation, Liquidity, and Issuance Costs
with Mohammad Izadi, 2022
Many companies issue multiple bonds that are outstanding simultaneously. Market participants surmise that this fragmentation of issuance is one reason for the relative lack of liquidity in the corporate bond market. Using data from the US, we confirm that bond issuance fragmentation predicts lower bond liquidity, as measured by estimates of the bid-ask spread and the interquartile range of trade prices. These results hold for both retail- and institutional-sized trades. The effect was most severe during the Global Financial Crisis and has generally diminished over time. Our results are most consistent with predictions from models of search and bargaining costs, though they do not rule out a role for dealer inventory costs. Bond issuers nevertheless do not benefit from more concentrated issuance, because more concentrated bonds are associated with higher yields despite their higher liquidity.
-
The Dynamics and Spillovers of Management Interventions: A Comment on Bianchi and Giorcelli (2022)
with Gonçalo Lima, Jakob Moeller, and Marco Schmandt
-
The Anatomy of Designated Market Maker Trading in Limit Order Markets
with Erik Theissen, 2018
Many exchanges allow or require firms to hire designated market makers (DMMs) to improve liquidity. Available evidence suggests that DMMs indeed improve liquidity and that share prices react favorably to their presence. Little is known, however, about what DMMs actually do. We analyze their trading activity in detail. DMM participation decreases in firm size and trading volume and declines over the trading day. DMMs not only provide liquidity but also take liquidity. Other traders take liquidity supplied by DMMs in times of high spreads and informational asymmetries. DMMs do not earn trading profits and thus need to be compensated otherwise for the services they provide.
-
The Effects of Post-Trade Transparency in Equity Markets: Evidence from MiFID Large Trade Disclosure Rules
with Stefan Scharnowski, 2016