...

In today’s digital advertising world, data is often treated like a magic bullet — the more, the better. Deterministic measurement, real-time tracking, and the illusion of total transparency have created a deep scepticism toward traditional, sample-based methods. But this distrust is misplaced. Representative samples, combined with inferential statistics, have long delivered reliable insights — in media research and beyond.

Using the example of Germany’s recent federal election, this piece shows how surveys of just a few thousand people consistently come remarkably close to the actual election outcome — even though over 50 million people voted. The lesson for media research is clear: when done right, small samples are not a weakness, but a strength. In times of increasing data protection and media fragmentation, clinging to the dream of complete data collection is neither realistic nor necessary. It’s time to rediscover the power of smart sampling — and remind ourselves that more data is not always better data.

Advertising measurement: From panel to tracking – and back again

Deriving findings from representative samples has a long history in media research. Television audience research in Germany, for example, began in the 1960s, and since 1988, AGF has been providing the industry with data on television reach with the help of a panel comprising around 5,000 households.

The advent of the commercial internet in the 1990s brought about a paradigm shift. Suddenly, there was the prospect of recording every single action in detail and deterministically with the help of methods known at the time as ‘web mining’. ‘Place an advert and find out the user’s reaction to it in real time!’ This click-based feedback logic made the internet attractive in its early years particularly as a channel for direct marketing activities.

In the meantime, however, online advertising has emancipated itself as a genre and is rightly regarded as equivalent to traditional brand and image advertising genres. The performance metrics of earlier times make little sense for classic branding campaigns. In addition, technical restrictions to protect personal data, users’ sensitivity to their privacy, and the fragmentation of the media landscape make such an approach difficult.

Small Samples, Big Misunderstandings

Therefore, it is not surprising that advertisers evaluate the success of their digital branding campaigns using the same criteria as their traditional branding campaigns: net reach, average contact frequency, advertising pressure in the target group, awareness, consideration, and the like.

For digital media campaigns, such criteria are typically collected in the same way as for campaigns in traditional media: by measuring within a representative sample and then using statistical inference to project the findings to the population as a whole.

So far, nothing new. Strangely enough, however, a fundamental scepticism towards such probabilistic methods has developed in our industry. Coupled with a fundamental preference for deterministic methods, this leads to results being doubted, often with reference to an apparently small sample size.
In media research, there is no possibility of surveying 100% of the population to validate the results of a sample-based inference and counteract such scepticism.

Forecasting an Election With Just 1,000 People

However, such an opportunity arises in political elections like the most recent federal election on the 23rd of February 2025. In the run-up to the election, many market and opinion research institutes drew representative, randomly selected samples and asked them the so-called ‘Sunday question’ (‘Which party would you vote for if the federal election were next Sunday?’). By the 24th of February at the latest, when the Federal Returning Officer announced the provisional results of the election, it was possible to compare the institutes’ forecasts with the actual election result. So, how close were the predictions based on the survey of a small number of people versus the results of surveying the 50 million voters?

Let’s take a look:

First, the provisional official result of the Bundestag election:

          Union   SPD     Grüne   FDP    Linke   AfD     BSW
Results   28.6%   16.4%   11.6%   4.3%   8.8%    20.8%   4.97%

And now the forecasts of some institutes shortly before the election:

Institute (date)          Union   SPD     Grüne   FDP    Linke   AfD     BSW    n
INSA (22/02/25)           29.5%   15%     12.5%   4.5%   7.5%    21%     5%     2,005
FG Wahlen (20/02/25)      28%     16%     14%     4.5%   8%      21%     4.5%   1,349
YouGov (21/02/25)         29%     16%     13%     4%     8%      20%     5%     1,681
WK Prognose (21/02/25)    29.5%   14.5%   12%     4.5%   8%      20.5%   4.5%   1,536
IPSOS (21/02/25)          30%     16%     12%     4.5%   7%      21%     4.5%   1,000

As can be seen, the forecasts are very close to the official result. Averaged across all their projections, the five institutes were off by only around 6% in relative terms, which corresponds to a mean deviation of roughly 0.7 percentage points per party. The largest deviations occurred in the forecasts for the Greens and the Left Party. Studies have shown that there were still large voter shifts between these two parties in the election’s final days (from the Greens to the Left), which could explain the deviations in the forecasts.
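
For readers who want to check the deviation themselves, the following is a minimal sketch that recomputes it from the two tables above. The figures are copied from those tables; the function and variable names are our own, and the choice of error measures (mean absolute error in percentage points, mean relative error in percent of the actual result) is an assumption about how such a comparison is usually made, not a description of any institute's method.

```python
# Official provisional result and final pre-election polls, in percent,
# copied from the two tables above.
PARTIES = ["Union", "SPD", "Grüne", "FDP", "Linke", "AfD", "BSW"]
RESULT = [28.6, 16.4, 11.6, 4.3, 8.8, 20.8, 4.97]
POLLS = {
    "INSA": [29.5, 15.0, 12.5, 4.5, 7.5, 21.0, 5.0],
    "FG Wahlen": [28.0, 16.0, 14.0, 4.5, 8.0, 21.0, 4.5],
    "YouGov": [29.0, 16.0, 13.0, 4.0, 8.0, 20.0, 5.0],
    "WK Prognose": [29.5, 14.5, 12.0, 4.5, 8.0, 20.5, 4.5],
    "IPSOS": [30.0, 16.0, 12.0, 4.5, 7.0, 21.0, 4.5],
}


def absolute_errors(forecast, result=RESULT):
    """Absolute deviation per party, in percentage points."""
    return [abs(f - r) for f, r in zip(forecast, result)]


def relative_errors(forecast, result=RESULT):
    """Deviation per party as a percentage of the actual result."""
    return [abs(f - r) / r * 100 for f, r in zip(forecast, result)]


def mean(values):
    return sum(values) / len(values)


# Pool all 35 projections (5 institutes x 7 parties).
all_abs = [e for f in POLLS.values() for e in absolute_errors(f)]
all_rel = [e for f in POLLS.values() for e in relative_errors(f)]

mean_abs_error = mean(all_abs)  # percentage points
mean_rel_error = mean(all_rel)  # percent of the actual result

# Average absolute deviation per party across the five institutes.
per_party = {
    party: mean([abs(f[i] - RESULT[i]) for f in POLLS.values()])
    for i, party in enumerate(PARTIES)
}

print(f"Mean absolute error: {mean_abs_error:.2f} percentage points")
print(f"Mean relative error: {mean_rel_error:.1f} %")
for party, err in sorted(per_party.items(), key=lambda kv: -kv[1]):
    print(f"{party:6s} {err:.2f}")
```

Sorting by party shows the Greens and the Left Party at the top of the list, in line with the observation above.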

Not everything that is possible is also sensible

Despite such deviations, the comparison shows that observations in small representative samples are very well suited to making statements about the population. In media research, too, inferential statistics is an efficient method for making relatively precise statements about a population:

  • How many people between 20 and 49 years old have had contact with my advertising campaign?
  • How large were the reach overlaps between TV and digital?
  • How much did the brand awareness increase, and how much of that can be attributed to our campaign?

For these and similar questions, there is no ‘federal election’ in which the relevant population could be asked to vote. However, we can see that this is not even necessary. If only a small representative sample is observed, valid conclusions can be drawn from these observations.
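
To make this concrete, here is a minimal sketch of how a reach figure for a target group could be projected from a sample to the population, together with a confidence interval. Everything in it is an illustrative assumption: the function name `estimate_reach`, the panel numbers, and the population size are hypothetical, and the calculation assumes a simple random sample. Real panels are usually weighted, which typically widens the interval. The interval used is the Wilson score interval, a common textbook choice for proportions.

```python
import math
from statistics import NormalDist


def estimate_reach(contacted, sample_size, population, confidence=0.95):
    """Project a sample share to the population with a Wilson score interval.

    Assumes a simple random sample (hypothetical simplification).
    """
    if sample_size <= 0 or not 0 <= contacted <= sample_size:
        raise ValueError("need sample_size > 0 and 0 <= contacted <= sample_size")

    # Two-sided standard normal quantile, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = contacted / sample_size

    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / sample_size + z**2 / (4 * sample_size**2))
        / denom
    )
    low, high = centre - half_width, centre + half_width

    return {
        "share": p,
        "share_low": low,
        "share_high": high,
        "people": p * population,
        "people_low": low * population,
        "people_high": high * population,
    }


# Hypothetical example: 420 of 1,500 panellists aged 20 to 49 were exposed
# to the campaign; the target group is assumed to be 30 million people.
reach = estimate_reach(contacted=420, sample_size=1_500, population=30_000_000)
print(f"Net reach: {reach['share']:.1%} "
      f"({reach['share_low']:.1%} to {reach['share_high']:.1%})")
print(f"People reached: {reach['people_low']:,.0f} to {reach['people_high']:,.0f}")
```

The point of the sketch is the width of the interval, not the specific numbers: with a sample of this size, the estimated share is pinned down to within a few percentage points.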

Even if it sometimes seems as if we have almost unlimited technical possibilities, it rarely makes sense to want to collect everything deterministically to one decimal place when we also have efficient methods such as inferential statistics that lead to similarly good results at a fraction of the effort. Less is sometimes more. We should remind ourselves of this more often in our industry.
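
Why more data quickly stops paying off can be illustrated with the textbook margin of error for a share. The sketch below is a simplification under stated assumptions: simple random sampling, a 95% confidence level, and the worst case of a 50% share. The function name `margin_of_error` and the chosen sample sizes are our own. The optional finite population correction shows how little the size of the population matters once it is much larger than the sample.

```python
import math

Z_95 = 1.96  # standard normal quantile for a 95% confidence level


def margin_of_error(n, p=0.5, population=None, z=Z_95):
    """Half-width of the confidence interval for a share, in percentage points.

    p=0.5 is the worst case. If `population` is given, the finite
    population correction is applied.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error * 100


# The margin shrinks with the square root of n: quadrupling the sample
# only halves the margin of error.
for n in (1_000, 2_000, 5_000, 50_000, 1_000_000):
    print(f"n={n:>9,}  +/- {margin_of_error(n):.2f} percentage points")

# With 50 million voters, the correction barely changes the result.
print(f"{margin_of_error(1_000, population=50_000_000):.4f}")
```

Going from 1,000 to 1,000,000 respondents multiplies the effort by a thousand, but narrows the margin only from about three percentage points to about a tenth of a point.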
