Measuring the true effects of advertising remains a persistent challenge for professionals trying to quantify a campaign's return on investment.

Granted, the resources currently available for evaluating a digital campaign's effectiveness are vast: advertisers and the platforms they use have access to detailed, granular data (clicks, ad exposure rates, page visits, and more) that yields extensive demographic and behavioral information, allowing marketers to target consumers and, ultimately, lead them toward a purchase.

According to a study released in March, however, the methods the industry commonly uses to measure digital ads' effectiveness don't account for a range of other important factors, and could therefore be inadequate for producing accurate, reliable estimates of a campaign's impact.

The study, jointly authored by researchers at Facebook and Northwestern University's Kellogg School of Management, pointed out that the measurement methods frequently used by the industry rely on "observable," individual-level data (they look at where and how an ad appeared to be effective based on internal variables), which often biases their estimates and fails to account for important, unobserved factors that determine whether a campaign produced any incremental outcomes. Namely: did more consumers buy something because they saw an ad? How many would have bought the product without ever seeing the ad at all?

Using data from 15 separate Facebook advertising studies totaling 500 million user-experiment observations and 1.6 billion ad impressions, the study assessed whether the observational measurement methods often used by the ad industry arrived at the same findings as large-scale randomized controlled trials (RCTs), in which users are randomly assigned to treatment and control groups.

The study concludes that those observational methods often overestimated an ad's effectiveness and, in some cases, significantly underestimated it, and that large-scale randomized controlled trials could be a more accurate and reliable way to assess the incremental impact of a digital ad campaign.
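To make that distinction concrete, here is a minimal, hypothetical sketch (illustrative numbers only, not figures or code from the study) contrasting an RCT lift estimate with a naive comparison of exposed and unexposed users; the gap between the two is the kind of bias the researchers set out to measure.

    # Illustrative sketch with made-up numbers; not data or code from the study.
    def conversion_rate(purchases, users):
        """Share of users who ended up purchasing."""
        return purchases / users

    # Randomized controlled trial: users are randomly assigned before the campaign,
    # so the control group is a valid counterfactual for the treated group.
    rct_treatment = conversion_rate(purchases=5_500, users=500_000)  # ads shown
    rct_control = conversion_rate(purchases=5_000, users=500_000)    # ads withheld
    rct_lift = (rct_treatment - rct_control) / rct_control           # ~10% incremental lift

    # Naive observational comparison: exposed vs. unexposed users. Exposure is not
    # random (targeting favors likely buyers), so the difference mixes the ad's
    # effect with selection bias.
    obs_exposed = conversion_rate(purchases=7_000, users=500_000)
    obs_unexposed = conversion_rate(purchases=5_000, users=500_000)
    obs_lift = (obs_exposed - obs_unexposed) / obs_unexposed         # ~40% apparent lift

    print(f"RCT lift:           {rct_lift:.0%}")
    print(f"Observational lift: {obs_lift:.0%}")

In this toy example, the observational comparison overstates the true lift by a factor of four, the same flavor of discrepancy the paper quantifies.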

“We find that across the advertising studies, on average, a significant discrepancy exists between the observational approaches and RCTs,” the study’s authors wrote. “The observational methods we analyze mostly overestimate the RCT lift, although in some cases they significantly underestimate this lift. The bias can be high: in 50 percent of our studies, the estimated percentage increase in purchase outcomes is off by a factor of three across all methods.”

The authors also noted, however, that given the small number of studies used, "we could not identify campaign characteristics that are associated with strong biases. We also find that observational methods do a better job of approximating RCT lift for registration and page-view outcomes than for purchases. Finally, we do not find that one method consistently dominates. Instead, a given approach may perform better for one study but not another."

The study’s findings will appear in the March edition of Marketing Science, a peer-reviewed marketing journal published by the Institute for Operations Research and the Management Sciences (INFORMS).