Swapping the variable without changing the name or the formula.
Same column name.
Different underlying quantity.
That’s the shell game.
The shell isn’t geography.
The shell is the variable label.
If this were just geometric mismatch, we would have:
But because the estimate silently shifts, what we’re actually doing is:
Imputing population via a proxy and continuing to treat it as observed.
This is how the distortion compounds.
We’re no longer propagating data.
We’re propagating assumptions.
Assumptions are sticky. They don’t decay; they accumulate.
When you apply standard administrative crosswalks in succession, each step amplifies the assumptions and allocation choices of the prior steps, and adds its own. In this case study, shellgame uses HUD and TOT_RATIO, which treats residential address distributions as a proxy for population and allocates accordingly. The resulting estimates retain the population label even though they no longer represent directly observed population quantities. As a result, successive users inherit increasingly imputed values while treating them as empirical measurements.
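To make the compounding concrete, here is a minimal sketch of two ratio-weighted hops in plain Python. It is not shellgame's implementation: the `allocate` helper, the unit identifiers, and every number are hypothetical, and the ratio column only stands in for something like TOT_RATIO.

```python
# Hypothetical crosswalk rows: (source_unit, target_unit, ratio).
# Every identifier and number here is invented for illustration.
ZCTA_TO_ZIP = [
    ("Z1", "P1", 0.5),
    ("Z1", "P2", 0.5),
    ("Z2", "P2", 1.0),
]
ZIP_TO_COUNTY = [
    ("P1", "C1", 1.0),
    ("P2", "C1", 0.75),
    ("P2", "C2", 0.25),
]


def allocate(values, crosswalk):
    """Split each source value across its targets in proportion to the ratio."""
    out = {}
    for source, target, ratio in crosswalk:
        out[target] = out.get(target, 0.0) + values.get(source, 0.0) * ratio
    return out


# Observed at the ZCTA level: the only step where the label is literally true.
zcta_population = {"Z1": 1000.0, "Z2": 500.0}

# After one hop the values are ratio-weighted estimates.
zip_population = allocate(zcta_population, ZCTA_TO_ZIP)

# After two hops the second split is applied to already-imputed values.
county_population = allocate(zip_population, ZIP_TO_COUNTY)

print(zip_population)
print(county_population)
```

In this toy case the grand total is unchanged at every hop, which is part of why the substitution is easy to miss: the numbers still add up, and the column is still called population.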
I am going to quantify mismatch at each hop, nothing more, nothing less.
ZCTA → ZIP → COUNTY
Some Census tabulation areas are being asked to stand in for multiple postal service geographies, with no guidance on how population should be divided.
This is the exact moment analysts silently switch from “joining data” to “inventing rules.”
At this stage, we:
have not:
- applied TOT_RATIO
- averaged anything
- redistributed population

have only:
- expanded relationships

Yet already:
- The unit count has changed
- The geography has fragmented
- The analytical surface has shifted
Any downstream “fix” is compensating for damage already done, not refining a neutral process.
We can already say, truthfully and precisely:
“Using a lookup-style ZCTA-ZIP association increases the number of spatial units representing a county by 32% (74 ZCTAs to 98 ZIPs), prior to any allocation or weighting.”
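The unit-count claim can be checked before any weighting is applied. The sketch below counts distinct units on each side of a lookup table and flags ZCTAs that fan out to more than one ZIP. The `LOOKUP` rows and helper names are hypothetical; only the 74 and 98 figures come from the statement above.

```python
from collections import defaultdict

# Hypothetical lookup rows (zcta, zip); identifiers are invented.
LOOKUP = [
    ("Z1", "P1"),
    ("Z1", "P2"),
    ("Z2", "P3"),
    ("Z3", "P4"),
    ("Z3", "P5"),
]


def unit_counts(lookup):
    """Return the number of distinct ZCTAs and distinct ZIPs in the lookup."""
    zctas = {zcta for zcta, _ in lookup}
    zips = {zip_code for _, zip_code in lookup}
    return len(zctas), len(zips)


def one_to_many(lookup):
    """List ZCTAs associated with more than one ZIP."""
    targets = defaultdict(set)
    for zcta, zip_code in lookup:
        targets[zcta].add(zip_code)
    return sorted(z for z, ps in targets.items() if len(ps) > 1)


def pct_increase(before, after):
    """Percent change in unit count from `before` to `after`."""
    return 100.0 * (after - before) / before


n_zcta, n_zip = unit_counts(LOOKUP)
split_zctas = one_to_many(LOOKUP)

# The figures quoted in the text: 74 ZCTAs expanding to 98 ZIPs.
quoted_growth = pct_increase(74, 98)
print(n_zcta, n_zip, split_zctas, round(quoted_growth))
```

Nothing here touches a population value. The expansion is purely relational, which is the point: the fragmentation exists before any allocation rule is chosen.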
For Hennepin County, Minnesota:
Input: Population at ZCTA level
Transformation: ZCTA → ZIP → County
Output: Population at each level
Result: delta(Population) = Baseline - Recovered
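The delta on the last line can be computed per unit as a simple difference. The sketch below assumes that Baseline is a reference county figure and Recovered is what comes out of the ZCTA → ZIP → County chain. The `population_delta` helper and all values are hypothetical and are not results for Hennepin County.

```python
def population_delta(baseline, recovered):
    """Return Baseline - Recovered for every unit present in either input."""
    units = sorted(set(baseline) | set(recovered))
    return {u: baseline.get(u, 0.0) - recovered.get(u, 0.0) for u in units}


# Hypothetical reference values and hypothetical chain output.
baseline = {"C1": 1200.0, "C2": 300.0}
recovered = {"C1": 1250.0, "C2": 250.0}

delta = population_delta(baseline, recovered)
net = sum(delta.values())
print(delta, net)
```

In this toy case the per-unit deltas cancel to a net of zero, so a check on totals alone would not reveal the mismatch. Reporting the delta per unit and per hop avoids that blind spot.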
The shell game happens regardless of:
The transformation is the cause, not the tool or variable.
The shell game happens regardless of who’s shuffling the shells.
shellgame provides tools to:
Use it to audit yourself. Use it to understand what’s really happening when you transform geographic data.
Same column name. Different underlying quantity. That’s the shell game.