When Your Data Lies: The Hidden Dangers in CME Inputs
Every capital market expectation is only as good as the data behind it. Yet the data analysts rely on is routinely contaminated by errors, biases, and structural breaks that can silently corrupt portfolio allocations. The CFA Level III curriculum identifies two broad categories of problems: errors and biases in the data itself, and fundamental limitations of using historical estimates to forecast the future.
This article covers both categories in depth, with original worked examples and practical defenses.
Part 1: Data Measurement Errors and Biases
Transcription Errors: The Simplest Problem
Transcription errors are mistakes made during data gathering, recording, or entry. A monthly return entered as 18.7% instead of 1.87%. A price recorded in the wrong currency. A volume figure pasted into a price field.
These errors add noise to every estimate derived from the data. Unlike systematic biases, transcription errors have no consistent direction — but their impact on optimizer outputs can be dramatic. Consider Meridian Research building a covariance matrix for 15 asset classes. A single erroneous data point for one asset class shifts its sample mean, inflates its variance, and distorts its correlations with every other asset. Fed into an optimizer, this can swing allocations by several percentage points.
Defense: Implement range checks (flag returns beyond three standard deviations), cross-reference against independent data sources, and use sequential checks to identify suspicious spikes.
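As a minimal sketch, the three-standard-deviation range check can be automated in a few lines of Python. The monthly return series and the injected error below are illustrative, not data from the Meridian example:

```python
import numpy as np
import pandas as pd

# Illustrative monthly return series with one injected transcription error.
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.007, 0.04, size=120))  # ten years of monthly returns
returns.iloc[57] = 0.187                                 # 18.7% entered instead of 1.87%

# Range check: flag anything more than three standard deviations from the mean.
z_scores = (returns - returns.mean()) / returns.std()
suspects = returns[z_scores.abs() > 3]
print(suspects)  # the erroneous observation is flagged for manual review
```

In practice, flagged observations go to a review queue rather than being deleted automatically, since some genuine returns are simply extreme.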
Survivorship Bias: The Database That Forgets Failures
Survivorship bias arises when a database only includes entities that survived to the end of the measurement period. Funds that closed due to poor performance disappear from the record, leaving a sample that systematically overstates average returns.
This problem is particularly severe for hedge funds and alternative investments. Research estimates that survivorship bias inflates reported hedge fund returns by one to three percentage points per year. A pension fund using biased data to set CMEs for its alternatives allocation will systematically overweight strategies whose apparent performance is partly illusory.
Defense: Use databases that track defunct funds (survivorship-bias-free databases), or apply statistical adjustments to standard databases.
Appraisal Smoothing: When Volatility Disappears
For assets without liquid public markets — real estate, private equity, timber, infrastructure — appraisal data substitutes for transaction data. But appraisers tend to anchor to previous valuations and adjust incrementally, creating artificially smooth return series.
The consequences cascade through CME inputs. Measured volatility for real estate might be reported at eight percent when true economic volatility is sixteen to twenty-four percent. Correlations with public equities appear to be 0.15 when the genuine figure is 0.40 to 0.60. In a mean-variance optimizer, these understated risk figures make illiquid assets look like diversification miracles — high returns with low risk and low correlation.
Defense: Apply statistical unsmoothing techniques (such as the Geltner adjustment) to remove serial correlation from appraisal-based return series before using them as optimizer inputs.
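As an illustration, here is a minimal first-order unsmoothing sketch in the spirit of the Geltner adjustment. It assumes the observed appraisal return is a weighted average of the true current return and the prior observed return; the quarterly figures are illustrative only:

```python
import pandas as pd

# Unsmoothing sketch: assume r_obs_t = (1 - a) * r_true_t + a * r_obs_{t-1},
# where a is estimated from the first-order autocorrelation of the observed series.
def unsmooth(observed: pd.Series) -> pd.Series:
    a = observed.autocorr(lag=1)                       # smoothing parameter estimate
    true_ret = (observed - a * observed.shift(1)) / (1 - a)
    return true_ret.dropna()

# Illustrative quarterly appraisal-based returns
appraisal = pd.Series([0.020, 0.022, 0.019, 0.015, 0.010, 0.012, 0.018, 0.021])
unsmoothed = unsmooth(appraisal)
print(appraisal.std(), unsmoothed.std())               # unsmoothed volatility is noticeably higher
```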
Part 2: Limitations of Historical Estimates
The Core Problem
History is a guide, not gospel. Two issues undermine the use of historical data for forecasting. First, the past may not represent the future. Second, even if it does, statistics calculated from finite samples may be imprecise estimates of true parameters.

Both problems can be mitigated by imposing structure — that is, using models that describe how data was generated in the past and how it is expected to behave in the future.
Regime Changes and Nonstationarity
Changes in technology, politics, regulation, monetary policy, and market structure can fundamentally alter risk-return relationships. These shifts create nonstationarity: different segments of a data series reflect different underlying statistical properties.
An analyst estimating bond CMEs using data that spans several distinct policy and inflation regimes is averaging across fundamentally different environments. The resulting estimate, perhaps a 5.5% average annual return, describes none of the individual regimes and provides a poor forecast for any of them.
The Practical Framework:
The curriculum recommends asking two questions before deciding how much history to use:
- Is there reason to believe the full sample period is no longer relevant? Has there been a fundamental change in the governing regime?
- Do statistical tests (Chow test, Bai-Perron procedure) support the hypothesis that a structural break occurred?
If both answers are yes, use only the relevant portion of history — or employ regime-switching models that account for structural breaks.
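For readers who want to see the mechanics of the second question, below is a minimal Chow test sketch for a break at a known date, implemented directly from the F-statistic definition. The series, break point, and 6%-to-2% mean shift are illustrative assumptions, not curriculum data:

```python
import numpy as np
from scipy import stats

def chow_test(y, X, break_idx):
    """F-test that the regression coefficients are equal before and after break_idx."""
    def ssr(y_sub, X_sub):
        beta, *_ = np.linalg.lstsq(X_sub, y_sub, rcond=None)
        resid = y_sub - X_sub @ beta
        return resid @ resid

    k = X.shape[1]
    ssr_pooled = ssr(y, X)
    ssr_split = ssr(y[:break_idx], X[:break_idx]) + ssr(y[break_idx:], X[break_idx:])
    f_stat = ((ssr_pooled - ssr_split) / k) / (ssr_split / (len(y) - 2 * k))
    p_value = stats.f.sf(f_stat, k, len(y) - 2 * k)
    return f_stat, p_value

# Illustrative data: the mean annual bond return shifts from 6% to 2% midway through.
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(0.06, 0.02, 120), rng.normal(0.02, 0.02, 120)])
X = np.ones((240, 1))                          # intercept-only model (tests a mean shift)
print(chow_test(y, X, break_idx=120))          # tiny p-value: evidence of a structural break
```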
Guideline: Use the longest data history for which there is reasonable assurance of stationarity. More data improves precision, but only when the data-generating process has been consistent.
Data Frequency Trade-Offs
It is tempting to assume that higher-frequency data (daily instead of monthly) always produces better estimates. The reality is more nuanced.
Higher-frequency data does improve the precision of variance, covariance, and correlation estimates — more observations reduce sampling error for second moments. But it does NOT improve the precision of the sample mean. Mean return precision depends on the total time span of the data, not the number of observations within that span. Switching from 60 monthly observations to 1,260 daily observations within a five-year window provides no additional information about the average annual return.
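A quick back-of-the-envelope calculation makes the point. Assuming a 15% annual volatility and a five-year window (both figures are illustrative), the standard error of the annualized mean is identical whether the data are monthly or daily:

```python
import numpy as np

# Standard error of the annualized mean return depends only on the span in years,
# not on the number of observations within that span (assumes i.i.d. returns).
sigma_annual = 0.15   # assumed annual volatility
years = 5

for label, periods_per_year in [("monthly", 12), ("daily", 252)]:
    n_obs = years * periods_per_year
    sigma_period = sigma_annual / np.sqrt(periods_per_year)
    se_annual_mean = (sigma_period / np.sqrt(n_obs)) * periods_per_year
    print(f"{label}: {n_obs} observations, SE of annualized mean = {se_annual_mean:.4f}")

# Both lines print the same value: sigma_annual / sqrt(years) ≈ 0.0671
```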
Higher-frequency data also introduces asynchronicity. Daily returns for markets in different time zones do not reflect the same hours, even when they carry the same calendar date. For example, Tokyo closes approximately 14 hours before New York. News released during US trading hours affects US prices on day t but Japanese prices on day t+1. This splits co-movements across calendar dates, biasing measured correlations downward and creating spurious lead-lag relationships.
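A small simulation illustrates the mechanism. In the sketch below, a common shock moves the US return on day t and the Japanese return on day t+1 (a deliberately stylized assumption); the same-date daily correlation nearly vanishes, while correlations of five-day aggregated returns recover most of the co-movement:

```python
import numpy as np

# Stylized asynchronicity: a shared global shock hits the US return on day t and
# the Japanese return on day t+1.
rng = np.random.default_rng(2)
n_days = 2000
common = rng.normal(0, 0.010, n_days)                   # shared global shock
us = common + rng.normal(0, 0.005, n_days)
jp = np.roll(common, 1) + rng.normal(0, 0.005, n_days)  # Japan reacts one day later

daily_corr = np.corrcoef(us, jp)[0, 1]
weekly_corr = np.corrcoef(us.reshape(-1, 5).sum(axis=1),
                          jp.reshape(-1, 5).sum(axis=1))[0, 1]
print(f"daily correlation:  {daily_corr:.2f}")          # close to zero
print(f"weekly correlation: {weekly_corr:.2f}")         # recovers most of the co-movement
```

This is why analysts often measure cross-market correlations from weekly or monthly data even when daily data are available.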
The Covariance Matrix Problem
When the number of assets exceeds the number of observations, the sample covariance matrix becomes singular — some portfolio combinations appear to have zero variance. This happens frequently in investment analysis where analysts estimate risk for large asset universes using limited data.
The standard remedy is factor-model covariance estimation. By assuming returns are driven by a smaller set of common factors plus uncorrelated asset-specific components, analysts can produce well-conditioned covariance matrices even with limited observations.
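A minimal sketch of the idea, using simulated data with 100 assets, 60 observations, and three hypothetical factors: the sample covariance matrix is rank-deficient, while the factor-structured estimate is full rank.

```python
import numpy as np

# Factor-model covariance sketch: 100 assets, 60 observations, 3 hypothetical factors.
rng = np.random.default_rng(0)
T, N, K = 60, 100, 3
F = rng.normal(0, 0.04, size=(T, K))                     # factor returns
B_true = rng.normal(size=(N, K))                         # true loadings
R = F @ B_true.T + rng.normal(0, 0.02, size=(T, N))      # asset returns

# Regress each asset on the factors to estimate loadings and residual variances.
X = np.column_stack([np.ones(T), F])
coefs, *_ = np.linalg.lstsq(X, R, rcond=None)
B_hat = coefs[1:].T                                      # N x K estimated loadings
resid = R - X @ coefs
D = np.diag(resid.var(axis=0, ddof=K + 1))               # diagonal idiosyncratic variances

# Structured estimate: B * cov(F) * B' + D is well conditioned even though N > T.
cov_factor = B_hat @ np.cov(F, rowvar=False) @ B_hat.T + D
print(np.linalg.matrix_rank(np.cov(R, rowvar=False)))    # < 100: sample covariance is singular
print(np.linalg.matrix_rank(cov_factor))                 # 100: full rank
```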
Part 3: The Peso Problem
Perhaps the most conceptually important data challenge is the peso problem: the situation where asset prices reflect the possibility of a major negative event that does not occur during the sample period.
How It Works
When markets price a risk premium for a catastrophic event, investors earn that premium as excess return for every period the event fails to materialize. Looking backward at the data, the analyst sees high returns and low volatility — an apparently attractive asset. But the high returns were compensation for bearing a risk that simply didn't occur in the sample. The true expected return, weighted by the probability of the adverse event, is lower.
Original Example — Ironshore Sovereign Bonds
Imagine the Republic of Ironshore issues 10-year bonds at a 9.5% yield while comparable US Treasuries yield 3.5%. The 6% spread compensates investors for a perceived 12% annual probability of default with 50% recovery.
Over a ten-year period with no default:
- Realized annual return: approximately 9.5%
- Realized volatility: moderate (normal bond volatility)
- Sharpe ratio: exceptionally high
An analyst using this history as a forward CME would project 9.5% returns with moderate risk. But the true expected return, adjusted for the default probability, is:
E[Return] = 0.88 × 9.5% + 0.12 × (-50%) = 8.36% - 6.0% = 2.36%
The true risk-adjusted return is dramatically lower than what the historical data shows.
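The adjustment is easy to reproduce. The sketch below recomputes the default-adjusted expected return from the article's assumed inputs and also shows how likely a deceptively clean ten-year sample is under the same 12% annual default probability:

```python
# Peso-problem adjustment using the article's assumed Ironshore inputs.
p_default, coupon, loss_on_default = 0.12, 0.095, -0.50

expected_return = (1 - p_default) * coupon + p_default * loss_on_default
prob_clean_decade = (1 - p_default) ** 10

print(f"Default-adjusted expected annual return: {expected_return:.2%}")          # 2.36%
print(f"Probability of a default-free ten-year sample: {prob_clean_decade:.1%}")  # ~27.9%
```

Roughly one sample history in four would show no default at all, which is exactly the kind of history that produces seductive but misleading CMEs.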
The Mirror Image
The opposite problem exists for rare events that DO appear in the sample. If a once-in-fifty-years crash occurs in your ten-year dataset, its observed frequency (10%) vastly exceeds its true probability (2%). Risk measures based on this sample — VaR, CVaR — will substantially overstate the likelihood of such events recurring.
Defenses Against the Peso Problem
- Adjust historical returns for the estimated probability and magnitude of priced risks that didn't materialize
- Use scenario analysis that explicitly models tail events
- Compare returns across similar assets in lower-risk environments to identify the unrealized risk premium component
- Be skeptical of any asset class with an unusually smooth, high-return track record
Non-Normality: A Practical Note
Historical return distributions consistently exhibit negative skewness and excess kurtosis (fat tails). Formal normality tests routinely reject the null hypothesis. However, the analytical cost of fully modeling non-normal distributions — additional parameters, more complex optimization, reduced transparency — is substantial.
For strategic asset allocation with long horizons, the central limit theorem pushes multi-period (log) returns toward normality, and the marginal improvement from modeling fat tails is often small. The pragmatic approach is to use normal assumptions for the core allocation framework and layer on separate tail-risk analysis (stress tests, scenario analysis) to address extreme events.
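As a quick illustration of why formal tests reject normality, the sketch below applies the Jarque-Bera test to simulated fat-tailed returns; a Student-t distribution stands in for real return data:

```python
import numpy as np
from scipy import stats

# Simulated fat-tailed daily returns (Student-t stands in for real return data).
returns = stats.t.rvs(df=5, size=2500, random_state=np.random.default_rng(3)) * 0.01

print("skewness:", stats.skew(returns))
print("excess kurtosis:", stats.kurtosis(returns))        # > 0 indicates fat tails
jb_stat, jb_p = stats.jarque_bera(returns)
print("Jarque-Bera p-value:", jb_p)                       # tiny p-value: normality rejected
```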
Putting It All Together
The best analysts are not those with the most sophisticated models. They are those most disciplined about recognizing what can go wrong with their inputs. A checklist approach works well:
- Check for transcription errors — range tests, cross-source verification
- Assess survivorship bias — is the database survivorship-bias-free? If not, adjust.
- Unsmooth appraisal data — apply Geltner or similar techniques before optimization
- Test for regime breaks — use the two-question framework before selecting sample length
- Choose data frequency deliberately — higher frequency for risk estimates, longer span for return estimates
- Evaluate the peso problem — are high historical returns compensating for unrealized tail risks?
- Acknowledge non-normality — use normal assumptions pragmatically but supplement with tail-risk analysis
Test your understanding of these data challenges in our CFA Level III question bank, or explore the community Q&A for detailed discussions on each topic.