Math

The art of quantifying our world.

Simpson's Paradox

Simpson's paradox is a statistical phenomenon in which a pattern that appears within groups of data disappears or reverses when those groups are combined.

A famous example of this is UC Berkeley's graduate admission rates for fall 1973. The school discovered that it had admitted 44% of male applicants and 35% of female applicants.

On the surface, this looked like a considerable gender bias, but when they examined the data more closely, they discovered that women tended to apply to departments with more competitive rates of admission, while men tended to apply to less competitive departments.

Not only that, but because 6 departments were biased towards women, while only 4 departments were biased towards men, that year's admissions skewed slightly in favor of women once departments were taken into account, even though women had a lower overall acceptance rate.
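To make the mechanism concrete, here is a minimal sketch with two made-up departments. The numbers are invented for illustration and are not Berkeley's actual figures. Women are admitted at a higher rate in each department, yet at a lower rate overall, because most of them apply to the more competitive one.

```python
# Hypothetical data: (applicants, admitted) per department and group.
# These numbers are invented for illustration; they are NOT Berkeley's figures.
applications = {
    "less competitive dept": {"men": (100, 80), "women": (20, 18)},
    "more competitive dept": {"men": (20, 4), "women": (100, 30)},
}


def rate(applicants, admitted):
    """Admission rate as a fraction between 0 and 1."""
    return admitted / applicants


# Admission rate for each group within each department.
by_department = {
    dept: {group: rate(*counts) for group, counts in groups.items()}
    for dept, groups in applications.items()
}


def overall_rate(group):
    """Pool applicants and admits across departments, then take the rate."""
    applicants = sum(groups[group][0] for groups in applications.values())
    admitted = sum(groups[group][1] for groups in applications.values())
    return rate(applicants, admitted)


overall = {group: overall_rate(group) for group in ("men", "women")}

for dept, rates in by_department.items():
    print(f"{dept}: men {rates['men']:.0%}, women {rates['women']:.0%}")
print(f"overall: men {overall['men']:.0%}, women {overall['women']:.0%}")
```

In this toy data, women lead 90% to 80% and 30% to 20% within the departments, but trail 40% to 70% once everything is pooled.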

This idea can be tricky to grok at first, so here's another example:

In baseball, player A can have a lower batting average than player B two years in a row. But if their at bats are spread unevenly across the two years, player A can still have the higher batting average over both years combined.

Here's how this played out for Derek Jeter and David Justice in the 1995-96 seasons:

[Table: batting averages for Jeter and Justice in 1995 and 1996. Source: Wikipedia]

Justice had a higher batting average in both years, but the two players had very different numbers of at bats each year, so when the data are combined, Justice's lead reverses and Jeter comes out ahead.
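Here is a short sketch that reruns the arithmetic. The hits and at bats are the figures listed in Wikipedia's article on Simpson's paradox, as best I can reproduce them; the function names are my own.

```python
# (hits, at bats) per season, as listed in Wikipedia's Simpson's paradox article.
seasons = {
    "Derek Jeter": {1995: (12, 48), 1996: (183, 582)},
    "David Justice": {1995: (104, 411), 1996: (45, 140)},
}


def batting_average(hits, at_bats):
    return hits / at_bats


def combined_average(player):
    """Batting average over both seasons: total hits / total at bats."""
    hits = sum(h for h, _ in seasons[player].values())
    at_bats = sum(ab for _, ab in seasons[player].values())
    return batting_average(hits, at_bats)


for player, years in seasons.items():
    per_year = ", ".join(
        f"{year}: {batting_average(*stats):.3f}" for year, stats in years.items()
    )
    print(f"{player} -> {per_year}, combined: {combined_average(player):.3f}")
```

Justice wins each season (.253 to .250, then .321 to .314), but Jeter's combined average is about .310 against Justice's .270, because Jeter's at bats are concentrated in his better year.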

The common thread across both these examples is that there are hidden variables at play. The competitiveness of each of UC Berkeley's departments and the number of at bats are both concealed by the summarized data.

Simpson's paradox is a reminder that our intuition still matters when we analyze data.

Our intuition helps us figure out which questions to ask, and knowing what to ask is half the battle.

Compounding

As humans, we are terrible at understanding compounding effects. We tend to think in linear terms, but this blinds us to just how powerful compounding is.

A classic way to illustrate this is the following scenario: A genie appears and offers you either a million dollars now, or a sum of money every day for a month, starting with one penny today and doubling the amount you receive every day.

Which offer should you accept?

That million dollar offer is tempting, but the second option is significantly more lucrative: By day 30, you would net a whopping $10,737,418.23.

The compounding plan would initially be tough from a budgeting perspective, though. You'd have to wait a whole week to make your first dollar, and halfway through the month you would still only have $327.67.
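The genie's numbers are easy to check. This is a minimal sketch that assumes day 1 pays one penny and that each day's payment is double the previous day's; it works in whole cents to avoid rounding issues.

```python
from itertools import accumulate

DAYS = 30

# Payment in cents for each day: 1, 2, 4, 8, ...
daily_cents = [2 ** day for day in range(DAYS)]

# Cumulative total in cents after each day.
running_total = list(accumulate(daily_cents))


def dollars(cents):
    """Format a whole number of cents as a dollar amount."""
    return f"${cents // 100:,}.{cents % 100:02d}"


# First day (1-indexed) on which the running total reaches one dollar.
first_dollar_day = next(
    day for day, total in enumerate(running_total, start=1) if total >= 100
)

print("Total after day 15:", dollars(running_total[14]))
print("Total after day 30:", dollars(running_total[29]))
print("First day the total reaches $1:", first_dollar_day)
```

The totals come out to $327.67 after day 15 and $10,737,418.23 after day 30, with the first dollar arriving on day 7.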

It's hard to internalize just how dramatically something that grows like this can change over time, in part because it requires so much patience and delayed gratification.

But the truth is, many of the most important things in life compound. From relationships to wisdom, your investments in many areas grow exponentially over time. If you consistently put in the work, you'll reap the rewards.

For example, going to the gym once isn't going to have much effect. But if you go five times a week for six months, you'll see substantial changes in your fitness. The impact of that first workout actually increases over time if you maintain the habit.

If you want your efforts to compound, just keep going, and don't give up too early.

The Monty Hall Problem

Imagine you're on a game show facing three closed doors. The host tells you that one door has a car behind it, but the other two have goats.

You're asked to pick a door, in hopes of winning the new car. After you do so, the host opens one of the other doors that has a goat behind it. She then gives you the option to switch your choice to the other remaining closed door.

Here's the brain teaser: Are your odds of winning the car better if you switch your choice to the other door?

[Image: Two closed doors with question marks on them, next to an open door with a goat behind it. Cepheus, Public domain, via Wikimedia Commons]

This is known as the Monty Hall problem. It was popularized in Marilyn vos Savant's column in Parade magazine in 1990, and the solution is so unintuitive that thousands of people wrote letters of disagreement to her after she published it.

If you're like me, your instinct is that switching shouldn't matter. After the host opens a door, you have a 50/50 chance either way of picking the car, right?

Wrong.

In fact, your odds of winning the car are twice as good if you switch doors: A 2/3 chance if you switch, and a 1/3 chance if you don't. Though it might not seem like it at first, you have a lot more information than you did previously.
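If the 2/3 figure is hard to believe, you can simulate the game. This is a rough sketch, assuming the host always opens a goat door the contestant didn't pick; the function names and the fixed random seed are my own choices.

```python
import random


def play(switch, rng):
    """Play one round and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the car is and opens a goat door the player did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of simulated rounds won with the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

Over many rounds, the stay rate should settle near 1/3 and the switch rate near 2/3.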

Understanding why this is true is easier if you consider a version of the problem with 100 doors: If you pick one door, and the host opens 98 of the remaining doors, should you switch to the other remaining closed door?

Which door seems more likely to have a car behind it? Your random pick or the door the host intentionally left closed?

Your chance of picking the correct door the first time was 1%. The host isn't opening doors at random – she knows which door the car is behind and is only opening doors with goats behind them.

So if you switch your choice to the other door, you have a 99% chance of winning the car, because so many of the wrong doors have already been opened.

No matter how many doors you imagine the problem with, your odds flip if you switch once the host has revealed every door except one: with N doors, your first pick is right 1/N of the time, so switching wins the other (N-1)/N of the time.
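You can confirm the pattern exactly, without simulating anything, by counting every combination of car location and first pick. This is a small sketch under the same assumption as above: the host opens every goat door except one, leaving your pick and one other door closed.

```python
from fractions import Fraction


def win_probabilities(num_doors):
    """Return exact (stay, switch) win probabilities for num_doors doors."""
    stay_wins = switch_wins = total = 0
    for car in range(num_doors):
        for pick in range(num_doors):
            total += 1
            if pick == car:
                # The other closed door hides a goat, so staying wins.
                stay_wins += 1
            else:
                # The host had to leave the car's door closed, so switching wins.
                switch_wins += 1
    return Fraction(stay_wins, total), Fraction(switch_wins, total)


for n in (3, 10, 100):
    stay, switch = win_probabilities(n)
    print(f"{n} doors: stay {stay}, switch {switch}")
```

For 3 doors this gives 1/3 and 2/3, and for 100 doors it gives 1/100 and 99/100.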

Jim Frost calls this a statistical illusion. Just like an optical illusion can trick your brain into seeing something impossible, this problem can deceive you into thinking that the odds are 50/50.

Here's another scenario that illustrates why this illusion is so compelling: Imagine you walk into the room after the host opens the door to reveal a goat. Since you don't know which door the contestant initially picked, your odds of picking the correct door at this point are 50/50.

It's a coin toss for you, because you have less information. But the contestant, who knows which door they initially picked, still has the better odds if they switch, because they know which door the host chose not to reveal.
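The difference between the latecomer and the contestant is a short calculation. This sketch assumes the standard three-door game and a latecomer who picks between the two closed doors at random.

```python
from fractions import Fraction

# Probability the car is behind each of the two closed doors.
p_original_pick = Fraction(1, 3)
p_other_door = Fraction(2, 3)

# The latecomer can't tell the doors apart, so each is chosen with probability 1/2.
p_latecomer_wins = Fraction(1, 2) * p_original_pick + Fraction(1, 2) * p_other_door

# The contestant knows which door is which and can always take the better one.
p_contestant_switches = p_other_door

print("latecomer:", p_latecomer_wins)
print("contestant who switches:", p_contestant_switches)
```

The car is no more evenly spread for the latecomer than for the contestant. The latecomer just can't tell which door carries the 2/3, so the two cases average out to 1/2.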

So why do we care about this? On the surface, it's an inherently interesting problem, because it's a bit of an illusion. But it's also representative of something we experience regularly: When you're presented with multiple options and make a decision, be prepared to change your mind if you receive more information – even if it goes against your intuition.

It might make all the difference.

Context Matters

I'll never forget the season I scored half of my soccer team's goals.

To be honest, I was not a particularly good soccer player. The actual number of goals I scored that season?

One.

Context matters everywhere, but especially with statistics. It's easy to be misled by stats because they seem so objective, but they're easy to use in a dishonest way, even when they’re technically true.

A 100% year-over-year increase in people getting eaten by mountain lions sounds scary. But if last year that number was three, we probably don't have much to fear.

The average human has half a uterus, but that's not a useful representation of our anatomy if there aren't many people close to the average.

And just because we've observed the sun rising 100% of the time so far, that doesn't mean we can extrapolate that out forever.
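The first few examples in this section come down to very simple arithmetic, which is part of why they mislead so easily. A quick sketch follows; the evenly split population at the end is my own simplifying assumption.

```python
from statistics import mean

# A relative change hides the base: going from 3 to 6 is a 100% increase.
last_year, this_year = 3, 6
percent_increase = (this_year - last_year) / last_year * 100

# A share hides the total: 1 goal out of 2 is half of the team's goals.
my_goals, team_goals = 1, 2
share_of_goals = my_goals / team_goals

# A mean can describe nobody: a population split between 0 and 1 averages 0.5.
values = [0] * 50 + [1] * 50  # hypothetical, evenly split population
average = mean(values)
people_at_average = sum(1 for v in values if v == average)

print(f"increase: {percent_increase:.0f}%")
print(f"share of goals: {share_of_goals:.0%}")
print(f"average: {average}, people at the average: {people_at_average}")
```

Each number is technically true, and each one tells you very little until you see the counts behind it.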

Quantifying the world is just as much art as it is science.

Strathern’s Insight

Economist Charles Goodhart proposed the following rule in a 1984 article: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes."

The statement is applicable in many places beyond statistics, of course, which the anthropologist Marilyn Strathern pointed out in her 1997 paper: "When a measure becomes a target, it ceases to be a good measure."

This observation is worth keeping in mind every time we try to measure something with a goal in mind.

The number on the scale does not necessarily reflect your overall health.

The number of likes your brand has on social media isn't necessarily a measure of its success.

The number of books you read in a year doesn't necessarily represent how much you've learned.

Another example: When you put pressure on the police to reduce crime, they may discover that it's easier to simply downgrade the severity of reported crimes than to address the systemic problems that lead to them in the first place.

It's often hard to know if we're making progress towards something without measuring it. But as Strathern so insightfully noted, once our goal is a number, the strategies we have for reaching that number may distract us from our actual goal.
