The Alt Data Revolution: Tackling stakeholder asymmetry | Ali Saif


As the old adage goes: “Knowledge is power”.

We need only look back in time for context: merchants capitalizing on advance knowledge of regional battles, overthrown kings and war outcomes; ticker-tape, price, and technical trading becoming a prevalent source of alpha during the Livermore/Morgan era; the digitization of financial statements in the 1970s; satellite imagery and Twitter sentiment in the last few years. Tapping insights others don't have amounts to a profitable “edge”.

Today’s concept of "alt data" has been around for a while: the "alt" in alt data has lasted only as long as the data remained subject to fair use within a closed ecosystem of participants (the data “haves”).

As humans have evolved, so has the landscape of data they’ve created around them. It has always been - and still is to this day - constantly evolving by virtue of dilution at the hands of an increasing number of participants motivated by profit.

As such, any data that was at one point coveted, unique and difficult to source eventually became commonplace; i.e. alpha always decays, eventually diffusing into the environment as a common risk factor.

The data “haves” always have an advantage over the “have nots”. Survival of the fittest, nothing more, nothing less. The question is: is access to this edge (and the capital that backs it) meritocratic? In this article, we discuss today’s structurally asymmetric alternative data ecosystem, and how Darwinex movement members collaborate to level the playing field.

Extractive Finance

Unfortunate as it may be, we live in an age of increasing disparity between our two main classes of market participants:

Main Street vs. Wall Street

1. Main street generates high volume data

2. Custodians (brokers, banks, etc) store (1)

3. Data vendors cleanse and transform (2)

4. Institutions purchase and mine (3)


The more you know (information from main street)

-> the more you make (alpha vs common risk)

-> the more you have (capital gains)

-> the more you know (purchase more information with capital gains)

The cyclical nature of this data transformation value chain systematically amplifies capital availability for the participants who are already the best capitalized, which in turn strengthens the case for such participants to become the custodians themselves, ensuring continuity of new data and reduced stakeholdership.

Over time, this extractive practice reduces the markets to a venue where a “cartel” of the wealthiest perpetuates capital and information asymmetry.
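The compounding cycle described above can be illustrated with a toy simulation. This is only a sketch under loudly simplified assumptions that are not taken from any dataset or from Darwinex: participants above a hypothetical capital threshold (`min_capital`) buy the data each period and earn extra return (`alpha`) on top of a common return (`base_return`), while everyone else earns the common return only. All figures are made up for illustration.

```python
def simulate_feedback_loop(capitals, data_cost, min_capital, base_return, alpha, periods):
    """Toy model of the 'know more -> make more -> have more' cycle.

    A participant whose capital is at least `min_capital` buys the dataset
    each period, paying `data_cost` and earning `base_return + alpha`.
    Everyone else earns `base_return` only.
    """
    wealth = list(capitals)
    for _ in range(periods):
        for i, w in enumerate(wealth):
            if w >= min_capital:
                # Data "have": common return plus alpha, minus the cost of the data
                wealth[i] = w * (1 + base_return + alpha) - data_cost
            else:
                # Data "have-not": common return only
                wealth[i] = w * (1 + base_return)
    return wealth


# Hypothetical figures, chosen only to illustrate the mechanism.
start = [1_000_000.0, 10_000.0]  # one data "have", one "have-not"
end = simulate_feedback_loop(
    start,
    data_cost=50_000.0,
    min_capital=500_000.0,
    base_return=0.05,
    alpha=0.10,
    periods=10,
)
print(f"capital ratio before: {start[0] / start[1]:.0f}x")
print(f"capital ratio after:  {end[0] / end[1]:.0f}x")
```

Under these assumptions the gap between the two participants widens every period, even after the "have" pays for the data, which is the asymmetry the cycle describes.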

Cause & Effect

Over the past decade alone, technological innovation in both storage and computational power has pushed digitization to levels we haven’t seen before. Innovation has turned novel and diverse datasets, from consumer purchase behaviour to satellite imagery, structured and unstructured alike, into treasure troves of previously inaccessible alpha. Isn’t it paradoxical, then, that this advancement has exacerbated age-old challenges, reinforcing rather than resolving pre-existing structural problems such as information and capital asymmetry between the haves and the have-nots? Let’s review a few examples to expose the paradox.

Access & Availability

An increasing number of data providers

The number of alternative data providers has grown exponentially over the past decade, to currently 445 (source:

Increased supply has made more data available, bringing issues of privacy, protection of intellectual property, and regulatory risk associated with legislative uncertainty and grey areas.

Despite exponential growth in availability, access to these novel datasets remains “exclusive”, in that access premia are priced competitively to harvest consumer surplus from the most highly capitalized stakeholders [1].

Exclusivity: competitors with access to similar datasets

The risk of overcrowding in alt data-driven strategies is high, which helps explain accelerating decay: reportedly, 90% of all data in circulation was created in the last two years [2].

The race to access, process and structure the data has positive externalities (more data becomes available to more stakeholders), but are the rewards shared meritocratically?

Data efficacy

Reportedly, 78% of funds use or expect to use alternative data from 2019 onwards. The explosion in data providers calls into question the efficacy of the data they are selling. With highly capitalized stakeholders competing for finite edge, the rate of alpha decay accelerates, triggering a winner-takes-all data war in which winners move on to fresh sources of edge before merit can gain access.

Institutional spend on alt datasets is reportedly expected to reach an estimated 1.71 billion USD by 2020.

Alpha decay and its consequent diffusion into common risk are hence proportional not only to distribution, but also to over-capitalization within a finite space of eligible participants.
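One way to picture this relationship is a toy crowding model. It is an assumption made purely for illustration, not an empirical result or a method from the references: a dataset supports a fixed amount of capital (`capacity`) at its full edge (`gross_alpha`), and once total allocations exceed that capacity, the same edge is shared across more capital.

```python
def realized_alpha(gross_alpha, capacity, allocations):
    """Per-unit alpha after crowding, under a simple pro-rata dilution assumption.

    Below capacity every allocator earns the full `gross_alpha`.
    Above capacity the fixed pool of edge (gross_alpha * capacity)
    is spread across all capital allocated.
    """
    total = sum(allocations)
    if total <= capacity:
        return gross_alpha
    return gross_alpha * capacity / total


# Hypothetical: 10% gross alpha, capacity of 100 units, 50 units per participant.
for participants in (1, 2, 4, 8):
    alpha = realized_alpha(0.10, 100.0, [50.0] * participants)
    print(f"{participants} participants -> realized alpha {alpha:.2%}")
```

In this sketch realized alpha falls both when more participants gain access (distribution) and when each participant allocates more (over-capitalization), since only the total capital chasing the finite edge matters.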

Measuring alpha decay

Empirical evidence on the capacities of alternative data-driven strategies is difficult to source. A good reference here is Chakravarty and Deb’s paper Capacity Constraints and New Hedge Fund Openings [3]. To adequately estimate the capacity constraints affecting a closed ecosystem of institutional participants drawing on a finite space of alt data providers, an alternative scheme for provisioning the data in the first place may be needed.

Protection of privacy and intellectual property

The questions that need to be asked of data providers include:

1. How is a dataset being sourced?

2. Have providers secured explicit provider consent prior to dissemination?

3. What legalities apply to dissemination in the case of consumer-centric data, e.g. purchase behaviours, credit card transactions, etc.?

4. How are data generators (especially in the case of consumer-centric data) being compensated?

5. If data generators are being compensated, is compensation a function of the value capital derives from edge?

Information asymmetry

Referring back to the argument presented in the data efficacy section above: A finite space of data providers favouring well-capitalized stakeholders adds to existing information asymmetry prevalent in the financial markets.

Capital asymmetry

As a direct consequence of the above, alt data capacity is largely shared between highly-capitalized participants, adding to existing capital asymmetry prevalent in the financial markets.

Data composition & precision

While structured datasets lend themselves to immediate consumption by models, unstructured datasets may dampen alpha through additional cost overheads, for example the need for specialist infrastructure and personnel dedicated to the quantitative decomposition of unstructured data into structured datasets for subsequent alpha research.

Furthermore, in terms of data precision, lagged datasets (older than real-time) will naturally experience faster pricing and diffusion effects than real-time data. While provision of real-time satellite imagery and sensor datasets is not beyond the realm of possibility, provisioning behavioural consumer data such as credit card transactions, purchase patterns, and other sensitive information in real-time is a practice that likely raises serious questions of intellectual property protection, adequate compensation and privacy.

Solution: Inclusive Finance

A meritocratic league of alternative data accessible to all stakeholders under one roof, generators and acquirers alike:

Accessible to all stakeholders

Retail and institutional -> size does not (and should not) matter.

Directly tradeable as an asset class

Stakeholders take alpha directly to the market, swapping cost of acquisition and process for fair compensation to alpha generators instead.

Stakeholders replicate alpha generators’ signal in real-time; generators charge a performance fee for providing the service.

Stakeholders are informed of generator capacity in advance; generators have complete control over capacity management in case of over-capitalization risk.
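As a rough illustration of how a performance fee on replicated signal might be computed, here is a minimal sketch. The fee rate and the high-water-mark convention are assumptions made for this example; the article does not specify how the fee is actually calculated.

```python
def performance_fee(equity, high_water_mark, fee_rate):
    """Fee on new profit only, using a high-water mark (an assumed convention).

    Returns a tuple (fee, new_high_water_mark). No fee is charged while
    equity sits at or below the previous high-water mark.
    """
    profit = max(0.0, equity - high_water_mark)
    fee = fee_rate * profit
    new_high_water_mark = max(high_water_mark, equity)
    return fee, new_high_water_mark


# Hypothetical investor account replicating a generator's signal,
# with an assumed 20% fee rate and a starting high-water mark of 100.
hwm = 100.0
for equity in (110.0, 105.0, 120.0):
    fee, hwm = performance_fee(equity, hwm, fee_rate=0.20)
    print(f"equity {equity:.0f}: fee {fee:.2f}, high-water mark {hwm:.0f}")
```

The point of the high-water mark in this sketch is that generators are paid only for new profit delivered to the stakeholder, which ties compensation directly to the value of the signal.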

No capital asymmetry

All stakeholders have equal access to the dataset, competing fairly for a share of the available capacity.

No information asymmetry

All stakeholders have equal, non-preferential access to the dataset. No way to disseminate outside the ecosystem without fairly compensating alpha generators.

Protection of intellectual property

Containment within the ecosystem and regulatory cover for generators to legally charge performance fees ensure generators are adequately compensated for signal provision.

If the solutions above have piqued your interest, we look forward to seeing you at the Quant Conference in London on November 01, 2019, where Darwinex CEO Juan Colón will be shedding light on why and how Darwinex is democratizing the financial landscape.

About Darwinex

Darwinex is a Broker, Asset Manager and Technology Provider, authorized and regulated by the Financial Conduct Authority (FCA) in the United Kingdom.

Founded in 2012, we employ more than 40 people across our London headquarters and our development office in Spain.

In our quest to democratize the financial landscape, we’ve:

1. Engineered the technology necessary to wrap investment strategies as an asset class. The DARWIN asset/wrapper lets investors invest in / trade manager talent, alpha and community sentiment in real-time, while fairly compensating for trader IP.

2. Eliminated the startup costs and regulatory overhead associated with starting a hedge fund, enabling under-capitalized talent to compete at an institutional scale under our regulatory umbrella - the foundation of our Hedge Fund as a Service business model.

3. Successfully assetized both positive and negative alpha, enabling anyone with access to our community data to engineer investment strategies uncorrelated to any other asset class in existence.

We’ve hence created an ecosystem for traders and investors to pool collective talent, information, and capital for private and social benefit, one that:

a) Structurally solves the “manage peanuts, make peanuts” problem faced by under-capitalized talent, and facilitates equal access for all stakeholders.

b) Facilitates merit-driven access to investor leverage with no downside risk. Investors compete fairly for their share of alpha capacity.

c) Demonstrably improves trader satisfaction more than our competition does.

d) Enables investors to swap alt data acquisition and processing costs for fair compensation to alpha generators.

