Attribution models tell you which ads touched a converter. Incrementality testing tells you which ads actually caused conversions. This distinction is critical because a significant portion of attributed conversions would have happened anyway, driven by brand awareness, organic search, direct visits, or other factors. Incrementality testing isolates the true causal impact of advertising by comparing outcomes between exposed and unexposed groups in a controlled experimental framework.
What Is Incrementality?
Incrementality measures the lift in conversions or other desired outcomes that are directly caused by advertising exposure, above and beyond what would have occurred organically. In other words, it answers the question: how many additional conversions did this campaign produce that would not have happened without it?
Consider a retargeting campaign that shows ads to users who visited a product page. Some of those users would have returned and purchased regardless of seeing the ad, perhaps through a bookmarked link, a branded search, or simply remembering the product. Attribution models would credit the retargeting campaign for all conversions where the ad was shown, but incrementality testing reveals how many of those conversions the ad actually influenced.
The difference between attributed conversions and incremental conversions can be substantial. Research across industries has shown that retargeting campaigns often claim three to five times more conversions through attribution than they actually cause incrementally. Understanding this gap is essential for making accurate budget allocation decisions.
How Incrementality Testing Works
Incrementality testing borrows its methodology from scientific experimentation and clinical trials. The core principle is the same: create equivalent groups, expose one to the treatment (the ad) while withholding it from the other (or serving a placebo), then measure the difference in outcomes.
Test Group Design
The most common incrementality test design involves splitting the eligible audience into two randomly assigned groups:
- Test group (exposed): This group sees the actual advertising campaign as intended. They receive the real ad creative, targeting, and frequency throughout the test period.
- Control group (holdout): This group is deliberately prevented from seeing the campaign. In some implementations, they see a public service announcement or charity ad instead, ensuring the auction mechanics function normally while the actual brand ad is withheld.
Random assignment is essential. If the groups are not equivalent before the test begins, any difference in outcomes could reflect group composition rather than the advertising. Randomization delivers balance in expectation, and pre-test balance checks confirm that the test and control groups are comparable across key characteristics like demographics, geography, past purchase behavior, and engagement levels.
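One common way to implement the random split is a deterministic hash of a salted user ID, so the same user always lands in the same group across sessions. The sketch below illustrates the idea in Python; the salt name and the 10 percent holdout share are illustrative assumptions, not prescribed values.

```python
import hashlib

def assign_group(user_id: str, salt: str = "lift_test_q3",
                 holdout_share: float = 0.10) -> str:
    """Deterministically assign a user to the test or control (holdout) group.

    Hashing a salted user ID (rather than drawing a random number at
    serve time) guarantees the same user always lands in the same group,
    which is what keeps the holdout clean across repeat visits.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # uniform draw in [0, 1]
    return "control" if bucket < holdout_share else "test"

# Roughly holdout_share of users should fall into the control group
groups = [assign_group(f"user-{i}") for i in range(10_000)]
control_share = groups.count("control") / len(groups)
```

Because assignment depends only on the ID and salt, changing the salt between tests reshuffles users into fresh groups, which prevents one test's holdout from carrying over into the next.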
Measuring Lift
During and after the test period, conversion rates are compared between the two groups. The incremental lift is calculated as the difference between the test group's conversion rate and the control group's conversion rate, expressed as a percentage of the control group's rate.
For example, if the test group converts at 4.2 percent and the control group converts at 3.5 percent, the incremental lift is 20 percent. This means the advertising campaign caused a 20 percent increase in conversions above what would have occurred naturally. The 3.5 percent conversion rate in the control group represents organic demand that would have existed without any advertising.
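The lift arithmetic from the example above is a one-line calculation:

```python
def incremental_lift(test_rate: float, control_rate: float) -> float:
    """Relative lift: the test group's excess conversion rate as a share of the control rate."""
    return (test_rate - control_rate) / control_rate

# The example above: 4.2% test conversion vs. 3.5% control conversion
lift = incremental_lift(0.042, 0.035)
print(f"{lift:.0%}")  # prints "20%"
```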
Statistical Significance
For results to be reliable, they must reach statistical significance. This means the observed difference between test and control groups is unlikely to have occurred by random chance. Achieving significance requires adequate sample sizes, sufficient test duration, and a meaningful effect size.
In practice, this means incrementality tests require large audiences and enough time to accumulate statistically reliable conversion data. A test with too few users or too short a duration may produce inconclusive results, where the observed lift cannot be distinguished from random variation.
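A standard way to check whether an observed lift clears the bar is a two-proportion z-test. The sketch below uses only the Python standard library and the normal approximation; the 50,000-user group sizes are hypothetical numbers chosen to match the conversion rates from the earlier example.

```python
from math import sqrt, erfc

def two_proportion_z_test(conversions_t: int, n_t: int,
                          conversions_c: int, n_c: int) -> tuple[float, float]:
    """Two-sided z-test for a difference in conversion rates between two groups."""
    p_t, p_c = conversions_t / n_t, conversions_c / n_c
    p_pool = (conversions_t + conversions_c) / (n_t + n_c)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    z = (p_t - p_c) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value, normal approximation
    return z, p_value

# Hypothetical test: 50,000 users per group at the conversion rates above
z, p = two_proportion_z_test(2_100, 50_000, 1_750, 50_000)
significant = p < 0.05
```

At these group sizes the 0.7-point difference is comfortably significant; with a few thousand users per group instead, the same observed lift would not be distinguishable from noise.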
Types of Incrementality Tests
Several experimental designs are used for incrementality measurement, each suited to different situations.
Ghost Ads (PSA Control)
In a PSA-controlled test, the control group is served a public service announcement or charity ad in place of the actual campaign ad. This ensures that the control group participates in the same auctions and experiences the same ad load as the test group, isolating the impact of the specific campaign creative. The related ghost ads methodology refines this design: rather than paying for placebo impressions, the ad platform logs which control-group users would have been served the campaign ad, identifying the counterfactual audience at lower cost.
These designs are considered the gold standard for digital incrementality testing because they control for the mere presence of advertising. They answer the question: did this specific campaign produce more conversions than would have occurred if these users had simply seen another ad?
Geographic Holdout Tests
Geographic tests assign entire geographic regions to test and control groups rather than individual users. A campaign might run in certain designated market areas (DMAs) while being withheld from comparable DMAs. Conversion rates are then compared across the geographic groups.
Geographic tests are useful when user-level randomization is not possible, such as for television, radio, or out-of-home campaigns. They also avoid the technical challenges of cookie-based user assignment. However, they require careful selection of comparable geographies and typically need larger sample sizes to account for inherent geographic variation.
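At its simplest, a geo test compares average conversion rates across the two sets of markets. The sketch below is a deliberately naive unweighted comparison with made-up DMA identifiers and rates; production geo tests typically weight by market size and use matched-market or synthetic-control methods to handle the geographic variation noted above.

```python
from statistics import mean

def geo_holdout_lift(test_rates: dict[str, float],
                     holdout_rates: dict[str, float]) -> float:
    """Compare mean conversion rates between test DMAs and matched holdout DMAs."""
    test_mean = mean(test_rates.values())
    holdout_mean = mean(holdout_rates.values())
    return (test_mean - holdout_mean) / holdout_mean

# Hypothetical per-DMA conversion rates for a matched-market test
test_dmas = {"DMA-501": 0.041, "DMA-803": 0.044, "DMA-618": 0.040}
holdout_dmas = {"DMA-504": 0.034, "DMA-807": 0.036, "DMA-623": 0.035}
lift = geo_holdout_lift(test_dmas, holdout_dmas)
```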
Time-Based Holdout Tests
Time-based tests alternate periods of advertising activity with periods of inactivity for the same audience or geography. By comparing conversion rates during active and inactive periods, advertisers can estimate the campaign's incremental contribution.
These tests are simpler to implement but less rigorous than randomized experiments. External factors like seasonality, competitive activity, and market trends can confound the results unless carefully controlled.
Incrementality Testing by Channel
Different advertising channels present unique considerations for incrementality testing:
- Paid search: Incrementality is a critical question for branded search, where advertisers may be paying for clicks from users who were already searching for their brand. Tests often show that branded search incrementality is lower than attribution suggests, while generic search terms tend to be more incremental.
- Social media: Platforms like Meta offer built-in conversion lift studies that use randomized test and control groups within their ecosystems. These tests leverage the platforms' extensive user-level data for precise randomization.
- Display and video: Programmatic campaigns are well-suited to ghost ad tests, where DSPs can manage the randomization and control ad serving at the user level.
- Retargeting: Incrementality testing is especially important for retargeting because these campaigns target users who have already demonstrated purchase intent. The baseline conversion rate for this audience is high, making the true incremental contribution of the advertising potentially modest.
- Connected TV: CTV incrementality testing often relies on geographic holdouts due to the challenges of user-level targeting and measurement on television devices.
Common Pitfalls in Incrementality Testing
Several common mistakes can undermine the validity of incrementality tests:
- Contamination: When control group users are inadvertently exposed to the campaign through other channels or devices, the measured lift will be understated. Cross-device exposure is a frequent source of contamination.
- Insufficient sample size: Running tests with too few users produces results that are not statistically significant, leading to inconclusive findings or misleading conclusions drawn from random noise.
- Short test duration: Campaigns with longer purchase cycles need longer test windows. A one-week test for a product with a 30-day consideration period will miss most of the campaign's impact.
- Selection bias: If the randomization process is flawed, the test and control groups may differ in ways that affect conversion rates independently of the advertising treatment.
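The sample-size pitfall can be checked before launch with a standard two-proportion power calculation. The sketch below uses the conventional 5 percent significance and 80 percent power defaults; the 3.5 percent baseline and 20 percent target lift are carried over from the earlier example.

```python
from math import ceil

def required_sample_size(control_rate: float, relative_lift: float,
                         z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Approximate per-group sample size to detect a given relative lift.

    Standard two-proportion formula at 5% two-sided significance (z_alpha)
    and 80% power (z_beta); both defaults are conventional choices.
    """
    test_rate = control_rate * (1 + relative_lift)
    variance = (control_rate * (1 - control_rate)
                + test_rate * (1 - test_rate))
    return ceil((z_alpha + z_beta) ** 2 * variance
                / (test_rate - control_rate) ** 2)

# Detecting a 20% lift on a 3.5% baseline takes roughly 12,000 users per group
n = required_sample_size(0.035, 0.20)
```

Note how the required size explodes as the expected lift shrinks: halving the detectable lift roughly quadruples the sample needed, which is why tests on low-lift channels so often come back inconclusive.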
Building an Incrementality Testing Program
Organizations should approach incrementality testing as an ongoing program rather than a one-time exercise. Start by testing the channels and campaigns that consume the largest share of budget, as small improvements in understanding their true effectiveness can yield significant savings. Establish a regular testing cadence, running tests quarterly or semi-annually for major campaigns and channels. Use the results to recalibrate attribution models, adjusting the credit assigned to each channel based on measured incrementality rather than click paths alone.
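Recalibrating an attribution model with lift results can be as simple as scaling each channel's attributed conversions by its measured incrementality factor. The channel names and factors below are hypothetical; the retargeting factor of 0.25 reflects the earlier observation that retargeting often claims three to five times more conversions than it causes.

```python
def calibrate_attribution(attributed: dict[str, float],
                          incrementality_factors: dict[str, float]) -> dict[str, float]:
    """Scale each channel's attributed conversions by its measured incrementality factor.

    A factor of 0.25 means lift tests found that only a quarter of the
    conversions the attribution model credits to that channel were
    actually caused by it. Channels without a test default to 1.0.
    """
    return {channel: count * incrementality_factors.get(channel, 1.0)
            for channel, count in attributed.items()}

# Hypothetical last-click attribution counts and lift-test factors
attributed = {"retargeting": 1_000, "branded_search": 800, "social": 600}
factors = {"retargeting": 0.25, "branded_search": 0.40, "social": 0.80}
calibrated = calibrate_attribution(attributed, factors)
```

After calibration, retargeting drops from the largest apparent driver to the smallest incremental one, which is exactly the kind of reallocation signal the attribution numbers alone would never surface.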
Incrementality testing is not a replacement for attribution. It is a complement that provides ground truth to calibrate other measurement approaches. Together, they give advertisers a more complete and accurate picture of how their advertising spending translates into real business outcomes.