Data-First Venture is Coming
That Was The Week 2023 #27
I spent a lot of this week using ChatGPT with the data we maintain at SignalRank. We have data for over one million funding events, covering over 40 fields per row. That is more than 50 million data points altogether.
We learn from the data and build predictive models with the goal of selecting the top 5% of Series B rounds each year.
The models perform very well. On average, a selected Series B grows 6x in value after 5 years, and 25-30% become unicorns.
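To put the 6x figure in annual terms, here is a minimal sketch. It assumes value compounds at a constant yearly rate over the 5 years, which is a simplification for illustration and not something the data above states. The function name is hypothetical.

```python
def annualized_growth(multiple: float, years: float) -> float:
    """Constant yearly growth rate that compounds to `multiple` over `years`.

    Assumption: smooth compounding. Real venture returns are lumpy, and this
    ignores fees, dilution, and the timing of cash flows.
    """
    return multiple ** (1.0 / years) - 1.0


# 6x growth in value over 5 years, the average quoted in the text.
rate = annualized_growth(6.0, 5.0)
print(f"Implied annual growth: {rate:.1%}")  # about 43% per year
```

Under that assumption, a 6x outcome over 5 years corresponds to roughly 43% growth per year.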
So you would think, as CEO of SignalRank, that I am a big believer in data-driven investing in startups. But the truth is far more subtle. There is a very big difference between data-driven investing (human decision-making, supported by data) and data-first investing, where humans simply execute the recommendations of a model.
But I digress. First, some background.
Venture Capital is not best understood as a single asset class. It is at least three (seed, venture, and growth) and arguably even more.
The Seed and Series A part of venture capital is hard to apply data to. Companies have little in the way of real businesses. Founder background and hiring patterns are both very large producers of false positive signals. Those who claim to have insight rarely do. The best investors at these stages combine experience, often as operators, with intuition and analysis. The best of them pick future winners repeatedly, and they play a huge individual role in that. Seed investors are the true venture investors, and that is unlikely to change.
Look at the top early-stage investors from the point of view of unicorn production.
This measures their Seed and Series A investments. They produced 769 unicorns. No data could have found them at that stage. Even SignalRank's algorithms would need over 7,500 Series A investments to get to that number of unicorns. That is a very good one-in-ten hit rate, but the best seed investors also do that, or better.
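The one-in-ten figure follows directly from the two numbers above. A quick check, treating "over 7,500" as exactly 7,500 (so the true hit rate would be slightly lower); the variable names are illustrative:

```python
unicorns = 769                 # unicorns produced, from the text
series_a_investments = 7500    # "over 7,500" in the text; used here as a lower bound

# Share of investments that become unicorns.
hit_rate = unicorns / series_a_investments

# How many investments it takes, on average, to find one unicorn.
investments_per_unicorn = series_a_investments / unicorns

print(f"Hit rate: {hit_rate:.1%}")                                   # 10.3%
print(f"Investments per unicorn: {investments_per_unicorn:.2f}")     # 9.75
```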
So early-stage investing is mostly art, with a little analytics built in. Mike Maples, Hunter Walk, Aileen Lee, Garry Tan, Paul Graham, Ron Conway, Reshma Sohoni, Carlos Espinal, Sia Houchangnia, Suzanne Ashman Blair, Saul Klein, and many more emerging managers are the heroes of this early-stage ecosystem. For a peek at the data, here is the SignalRank top 20 seed investor list from the past 5 years. Scores are correlated to outcomes over 5 years from the date of investment, or less if 5 years have not elapsed. If you want to see all 850 scoring seed investors, you can subscribe to SignalRank AI by emailing us.
By the time a Series B term sheet comes along, these early-stage investors are no longer capitalized to invest, and they suffer dilution. But data is capable of differentiating between their likely future fund returners and the companies that are not likely to perform at that level.
The data can be used in varying degrees to filter companies based on likely outcomes. And models can be trained to recognize these likely good outcomes based on features discovered in the data.
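As a rough illustration of that filtering step, here is a minimal sketch that keeps the top 5% of rounds per year by score. It is not SignalRank's actual method: the function name, the tuple layout, and the synthetic scores are all assumptions, and a real model would produce the scores from features in the data.

```python
from collections import defaultdict


def select_top_fraction(rounds, fraction=0.05):
    """Keep the highest-scoring `fraction` of rounds within each year.

    `rounds` is a list of (company, year, score) tuples. The scores are
    assumed to come from some predictive model; this function only filters.
    """
    by_year = defaultdict(list)
    for company, year, score in rounds:
        by_year[year].append((score, company))

    selected = {}
    for year, scored in by_year.items():
        scored.sort(reverse=True)  # highest score first
        keep = max(1, round(len(scored) * fraction))  # always keep at least one
        selected[year] = [company for _, company in scored[:keep]]
    return selected


# Synthetic example: 100 hypothetical Series B rounds in one year.
rounds = [(f"co{i}", 2023, i / 100) for i in range(100)]
picks = select_top_fraction(rounds)
print(picks[2023])  # the 5 highest-scoring companies
```

Grouping by year before ranking mirrors the goal of selecting the top 5% of rounds each year, rather than the top 5% across all years.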
The later the funding round, the more data can help.
Not to sell my startup's vision here, but at SignalRank, we have shown that a 100% data-driven Series B investment selection process can beat any human Series B investor in outcomes and efficiency. And with no human override.
And the same at C and D.
If we are right (and we are), it seems likely that venture investing will increasingly be human-led at the early stage and machine-led later. And by machine-led, I mean no human oversight. Early-stage investors can expect to have capital available to their best companies from a pool of capital designated for later rounds and earn carry from the profits. This pool of capital would remove the need for opportunity funds or special one-off vehicles.
Early-stage investors will partner with later-stage data-driven allocators to secure the future outcomes of their high-scoring portfolio companies. Venture Capitalists will not be replaced but augmented.
Data-first investing is being born.
Capital allocators have yet to catch up with this trend. But soon, rather than looking at the track record of an individual or firm, later-stage allocators will want to know how a model backtests and will allocate to the best models. If the average performance of a Series B investment after 5 years is 6x growth in value and 25-30% become unicorns, it will be hard not to allocate capital to the model.
This future is built on the human-led seed and Series A investing ecosystem.
Rohit Krishnan has a quote describing the social context in which ideation (and early-stage investing) thrives:
In the 17th and 18th century, there began a coffee drinking scene in London, bringing with it an incredible scene of intellectual debates and spun off innovation. Soon London was only second to Constantinople in number of coffeehouses!
Silicon Valley is basically a large coffee house. And humans are drinking the coffee. That is the subtlety: human-first seed investing, and data-first investing at the later stages, from the Series B onwards. You read it here first.
So, no surprise, I read a lot about data-driven venture investing this week. This was random. It just happened to be in my feed. Abhishek Bhatia and Gary Dushnitsky from the London Business School get essay of the week for "The Future of Venture Capital? Insights Into Data-Driven VCs". It is worth a deep dive, especially the section called "Algorithms as Startup Investors". There is also a second paper, with good data, on whether good investors are persistent. Koble Moneyball's "Why more is less in Investing" also focuses on data processes to filter out losers.
All these approaches focus on how data can lower portfolio risk, possibly to zero. If that is the case, then data-first investments in private companies, and indexes built by securitizing the assets invested in, will become very popular with investors. For me, once a company is post-Series A and has a Series B term sheet, data can help the early-stage investor allocate to their likely future winners with significantly reduced risk.
More in this week's video and podcast.
Essays of the Week