Sit in enough business meetings and you will hear it regularly.
"We see a correlation between customer tenure and upsell rate." "There's a strong correlation between delivery time and churn." "The data shows correlation between ad spend and revenue."
Everyone nods. The conversation moves on.
What almost nobody in the room asks — and what matters enormously — is what kind of relationship is actually there. Because "correlation" in the statistical sense means something very specific, and that specific meaning is often not what the speaker intends.
The gap between the two is not a minor technicality. It leads to wrong model choices, misleading analyses, and confident conclusions built on shaky ground.
What Pearson's r actually measures
When statisticians say "correlation," they usually mean Pearson's correlation coefficient — a number between -1 and +1.
Here is what that number actually captures: the strength and direction of a linear relationship between two variables.
That is all.
r = 1 means a perfect positive line. r = -1 means a perfect negative line. r = 0 means no linear relationship.
It says nothing about:
- Whether the relationship is nonlinear.
- Whether the result is driven by outliers.
- Whether the two variables are causally related.
- Whether the relationship is stable across different subgroups or conditions.
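To make the definition concrete, here is a minimal sketch (plain NumPy, with made-up illustrative data) of what r does and does not register:

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# A perfect increasing line: r = +1, regardless of slope or intercept.
r_pos = np.corrcoef(x, 2 * x + 1)[0, 1]

# A perfect decreasing line: r = -1.
r_neg = np.corrcoef(x, -3 * x)[0, 1]

# A perfect parabola on symmetric x: y is completely determined by x,
# yet r = 0, because the relationship is not linear.
r_quad = np.corrcoef(x, x ** 2)[0, 1]

print(r_pos, r_neg, r_quad)
```

The third case is the one to remember: y = x² is as strong a relationship as they come, and r reports zero.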
Yet in most business usage, "correlation" is shorthand for something much broader: there is a pattern worth paying attention to. This is not wrong — but it is imprecise in ways that quietly cause problems downstream.
Anscombe's Quartet
In 1973, statistician Francis Anscombe constructed four small datasets specifically to illustrate this problem. He called them Anscombe's Quartet.
All four datasets have nearly identical summary statistics: the same mean and variance of x, nearly the same mean and variance of y, the same Pearson correlation coefficient (approximately 0.816), and even the same fitted regression line (y = 3.00 + 0.50x).
Plot them, and the similarity disappears entirely.
- Dataset I — a clean linear relationship with moderate scatter. Pearson's r is exactly the right tool here.
- Dataset II — a perfect nonlinear curve. The relationship is deterministic and smooth, but not linear. The correlation coefficient still says 0.816 — which is not just uninformative, it is actively misleading.
- Dataset III — a near-perfect line, except for one influential outlier. The r value is being driven almost entirely by a single point. Remove it, and the correlation collapses.
- Dataset IV — a vertical cluster of points at x = 8, plus one isolated point at x = 19. There is no relationship in any meaningful sense. The non-zero correlation exists only because of that one point.
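The quartet is small enough to check directly. This snippet (NumPy, using Anscombe's published values) confirms that all four datasets share the same x mean, essentially the same y mean, and essentially the same r, despite their radically different shapes:

```python
import numpy as np

# Anscombe's quartet, as published in Anscombe (1973).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

stats = {}
for name, (x, y) in quartet.items():
    x, y = np.asarray(x, float), np.asarray(y, float)
    stats[name] = (x.mean(), y.mean(), np.corrcoef(x, y)[0, 1])
    print(f"{name:>3}: mean_x={stats[name][0]:.2f}  "
          f"mean_y={stats[name][1]:.2f}  r={stats[name][2]:.3f}")
```

Identical numbers, four completely different stories. Only the plot can tell them apart.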
The quartet has been reproduced, extended, and animated many times since — most memorably as the Datasaurus Dozen, in which twelve wildly different datasets, generated from a starting dataset shaped like a dinosaur, all share the same summary statistics to two decimal places.
The point is always the same:
The number hides what the plot reveals.
The nonlinear trap
Dataset II from Anscombe's Quartet deserves special attention, because the failure mode it represents is not exotic. It is common.
Consider an inverted-U relationship. Performance as a function of stress, for example. At low stress, performance is poor — not enough pressure to focus. At moderate stress, performance peaks. At high stress, it collapses.
The relationship is real. It is strong. It is deterministic. And Pearson's r on this data would be approximately zero — because on a roughly symmetric inverted U, the upward slope on one side and the downward slope on the other cancel each other out.
Price and demand often follow a similar shape at extremes. Quality and speed frequently do too. Many human behavioral and organizational variables live in this territory.
If you compute Pearson's r on an inverted-U relationship and get something close to zero, the natural interpretation is: no relationship. But the correct interpretation is: the wrong tool.
The variable may be highly predictive. The linear measure simply cannot see it.
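A short sketch makes this concrete. The data below is synthetic — a quadratic "stress versus performance" curve, assumed purely for illustration — but the failure mode is exactly the one described: y is fully determined by x, Pearson's r reports nothing, and correlating against the right transformed feature recovers the relationship perfectly:

```python
import numpy as np

stress = np.linspace(-3, 3, 61)        # symmetric, centered "stress" scale
performance = -(stress ** 2) + 9       # inverted U: peaks in the middle, collapses at extremes

# Linear correlation sees almost nothing: the two arms cancel.
r_linear = np.corrcoef(stress, performance)[0, 1]

# But performance is perfectly predictable from stress. Correlating
# against the squared term (the correct functional form) reveals it:
r_squared_term = np.corrcoef(stress ** 2, performance)[0, 1]

print(r_linear, r_squared_term)
```

Here r_linear is essentially zero while r_squared_term is exactly -1 — the same variable, invisible or perfectly predictive depending on whether the tool matches the shape.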
What to do instead
The fix is not complicated. It just requires changing the default workflow.
Plot first, always. A scatter plot takes thirty seconds and reveals structure that no single number can summarize. Nonlinearity, outliers, clusters, truncated ranges — all visible to the eye, all invisible to r. If you are making decisions based on correlation coefficients without having looked at the scatter plot, you are skipping the most important step.
Use Spearman's rank correlation for monotonic relationships. Spearman measures whether the relationship is consistently increasing or decreasing, without requiring that it be linear. It is more robust to outliers and works on ordinal data. For many business relationships — where you care about direction more than precise shape — Spearman is simply the better default.
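A small comparison (SciPy, with an illustrative exponential curve chosen for this sketch) shows the difference: on smooth monotonic data, Spearman reports a perfect rank relationship while Pearson understates it because the curve is not a line:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.arange(1, 21, dtype=float)
y = np.exp(x / 3.0)            # strictly increasing, but far from linear

r_pearson, _ = pearsonr(x, y)    # penalized for the curve's shape
r_spearman, _ = spearmanr(x, y)  # 1.0: the ranks agree perfectly

print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Because Spearman works on ranks rather than raw values, a single extreme outlier also moves it far less than it moves Pearson's r.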
Use mutual information for general associations. Mutual information captures any statistical dependence — linear, nonlinear, symmetric, asymmetric — without assuming a specific shape. It is harder to interpret and requires more data, but it is the right choice when you genuinely do not know what form the relationship takes.
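To show the idea without pulling in a heavier dependency, here is a deliberately simple plug-in estimator (histogram binning in plain NumPy; the function name and binning choice are my own, and real analyses would typically reach for a k-NN estimator such as scikit-learn's mutual_info_regression). On the inverted-U data from earlier, Pearson's r is near zero while mutual information clearly detects the dependence:

```python
import numpy as np

def mutual_info(x, y, bins=10):
    """Crude histogram-based estimate of mutual information, in nats.

    Bins both variables, then computes sum p(x,y) * log(p(x,y) / (p(x)p(y))).
    A plug-in estimator like this is biased upward on small samples.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x * p_y)[nz])))

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 5000)
y_dependent = x ** 2                        # strong, purely nonlinear dependence
y_shuffled = rng.permutation(y_dependent)   # same values, dependence destroyed

mi_dep = mutual_info(x, y_dependent)   # clearly positive
mi_ind = mutual_info(x, y_shuffled)    # near zero (small positive bias)
r_dep = np.corrcoef(x, y_dependent)[0, 1]  # near zero: linear measure is blind here

print(mi_dep, mi_ind, r_dep)
```

The contrast is the point: the same relationship that Pearson's r cannot see at all produces a mutual information score an order of magnitude above the independent baseline.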
Be precise about what you mean. When you say "correlation," specify: linear association? Monotonic association? A pattern worth investigating? The word can mean all of these, and the distinction matters when choosing a model.
Back to the boardroom
None of this is an argument against using the word "correlation" in non-technical settings. Language simplifies, and that is fine.
It is an argument for knowing what question you are actually answering — and for not letting a convenient shorthand smuggle in assumptions you have not made deliberately.
When someone reports a correlation of 0.7 between two business variables, the right follow-up questions are:
- Have you looked at the scatter plot?
- Is the relationship actually linear, or does it bend?
- Is a small number of observations driving the result?
- Does the relationship hold across subgroups, or only in aggregate?
These are not pedantic questions. They are the difference between a finding that holds up and one that dissolves on closer inspection.
And when a model is later built on that finding — a linear regression, a feature engineering decision, a product hypothesis — the assumptions baked into the correlation coefficient come along for the ride, whether anyone intended them to or not.
Closing thoughts
Correlation is not a bad concept. It is a precise one.
The problem is not that people use it — it is that the word has been stretched to cover a much wider range of ideas than it can technically support. And when that stretched meaning quietly shapes an analysis or a model, the resulting errors are hard to spot because they were never made explicit.
Plot your data. Choose your measure deliberately. And when someone says "there's a correlation," ask what kind.
The answer will tell you more than the number ever could.