◆   Field Dispatch Series — Revenue Operations   ◆   Downloaded, Not Hired   ◆

Customer Health Scores: Build One That Predicts Churn

The customer sent their churn notice on a Tuesday. Your health score showed them green on Monday. That is not a data problem. That is a model problem. Most customer health scores in use today do not predict churn — they describe recent activity and label it health. There is a significant difference between a customer who is active and a customer who is safe. Conflating the two is how you get ambushed by churns that the data said were not coming.

This is not a marginal failure. In subscription businesses, one undetected high-value churn can cancel out the revenue from ten successful expansions. The health score exists precisely to surface risk early enough that intervention is still possible. If it is surfacing risk the same week the customer decides to leave, it is decorative. You need a model that catches churn signals at least 60–90 days before the decision crystallises. That requires measuring different things than most health scores measure.

Why Most Health Scores Fail

The standard health score architecture looks like this: take a handful of product usage metrics (logins, feature usage, sessions), weight them roughly equally, add a support ticket volume component, maybe a survey score if you have one, and compute an aggregate. Anything above 70 is green. Below 50 is red. The band in the middle is yellow and nobody quite knows what to do with it.

This model fails for a predictable reason: it measures activity, not engagement depth or risk trajectory. A customer logging in daily is not necessarily healthy. They might be logging in daily because they are trying to extract value from a product that keeps failing them. High login frequency with low feature adoption is a warning sign, not a green flag. A customer who has not logged in for three weeks might be on holiday, or might be piloting a competitor. The activity metric cannot tell you which.

The second failure: health scores that are designed to be explained to customers, rather than to predict internal risk. When the CS team uses the health score in QBRs to show customers how "healthy" their account is, the incentive structure changes. Scores get rounded up. Reds get manually overridden. The model becomes a relationship management tool rather than a risk prediction instrument. These two functions need to be architecturally separated — what you show customers and what you use internally to trigger intervention should not be the same number.

The Vanity Metric Trap

The third failure: optimising your health score for correlation with renewal rather than predictive lead time. If you build a model by taking all churned customers and identifying which metrics were low in the 30 days before they churned, you will build a model that identifies customers who are already churning — not customers who are going to churn. You need signals that appear 60–90 days before the churn decision. The set of signals that does this is significantly different from the set of signals that describe current usage patterns.

The Signals That Actually Predict Churn

After you strip out the activity metrics that describe rather than predict, a shorter and more powerful set of signals emerges. These are the variables that, historically, show meaningful degradation well before a customer decides to leave.

Engagement Depth, Not Breadth

Feature adoption velocity is one of the strongest predictors of long-term retention: the rate at which a customer is expanding their use of core product features over time. A customer who adopted three features in month one, four in month three, and five in month six is on a healthy trajectory. A customer who adopted three features in month one and has used the same three features for twelve months is stagnant — they have found their ceiling, and your product is not growing with their needs.

The diagnostic question is not "are they using the product?" but "are they using more of the product over time?" Stagnant feature adoption is an early warning signal that typically appears 4–6 months before a churn decision. It is invisible to a health score that measures session count.
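The stagnation check above can be sketched directly. A minimal version, assuming a cumulative monthly count of distinct core features per account (the function and field names are illustrative, not a product schema):

```python
from typing import List

def adoption_velocity(monthly_features: List[int], window: int = 4) -> dict:
    """Given a cumulative per-month count of distinct core features a
    customer has adopted, report the recent adoption rate and flag
    stagnation: no new feature adopted across the trailing window."""
    if len(monthly_features) < 2:
        return {"velocity": 0.0, "stagnant": False}
    recent = monthly_features[-window:]
    # Features added per month over the trailing window.
    velocity = (recent[-1] - recent[0]) / max(len(recent) - 1, 1)
    stagnant = len(monthly_features) >= window and recent[-1] == recent[0]
    return {"velocity": velocity, "stagnant": stagnant}

# Healthy trajectory from the text: 3 features in month one, 4 by month
# three, 5 by month six.
healthy = adoption_velocity([3, 3, 4, 4, 4, 5])
# Stagnant: three features adopted in month one, unchanged for a year.
stalled = adoption_velocity([3] * 12)
```

The point of the window is lead time: the stagnant flag fires months before a session-count metric notices anything.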

Stakeholder engagement breadth is equally important. A customer where your CS relationship exists with one person is fragile. If that person leaves, you lose the account. A customer where you have active relationships with three to five people across the organisation — including at least one economic buyer — is structurally safer. Track the number of unique named users actively engaging with your team, not just product users. Declining stakeholder count, or heavy reliance on a single champion, is a churn signal that has nothing to do with product usage.

Stakeholder Changes as Risk Events

Economic buyer or champion turnover is the single highest-correlation churn predictor across most B2B SaaS verticals. When the person who bought your product leaves the company, the evaluation clock resets. Their replacement has no sunk cost in your product, no relationship with your team, and a natural incentive to make a decision that marks them as independent from their predecessor.

Every stakeholder change in a customer account should automatically trigger an intervention workflow — not a polite check-in email, but a deliberate re-engagement with the incoming buyer: understanding their priorities, demonstrating value in their terms, and establishing a new relationship before a competitor does. The companies that systematically do this retain at rates 15–20 percentage points higher than those that do not, on the same base of accounts.

Support Ticket Patterns

Support volume alone is not a useful signal — it can indicate heavy usage as easily as dissatisfaction. The signal is in the nature and trajectory of support tickets. A customer whose tickets shift from "how do I do X" (configuration and capability questions) to "why isn't X working" (repeated failure and frustration) has entered a different phase of the relationship. Escalating severity, repeated tickets on the same issue, and tickets that reference competitive alternatives in their text are all signals that require immediate CS attention regardless of what the aggregate health score shows.

Also watch for the inverse: a customer who stops raising support tickets entirely after a period of heavy volume. This can mean their issues were resolved. It can also mean they have stopped investing in making the product work and are quietly building the business case to leave.
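A crude version of both ticket-trajectory checks can be automated. The sketch below uses naive keyword matching as a stand-in for real ticket classification; the keyword list, thresholds, and function names are all placeholder assumptions:

```python
def ticket_risk_flags(monthly_tickets):
    """Flag the two risk patterns described above. Input is a list of
    per-month lists of ticket text. Keyword matching here is a toy
    stand-in for a real classifier."""
    FAILURE = ("why isn't", "not working", "broken", "again", "still")

    def failure_share(texts):
        # Fraction of a month's tickets that read as failure/frustration.
        if not texts:
            return 0.0
        hits = sum(any(k in t.lower() for k in FAILURE) for t in texts)
        return hits / len(texts)

    flags = []
    shares = [failure_share(m) for m in monthly_tickets]
    # Shift from capability questions to repeated-failure tickets.
    if len(shares) >= 2 and shares[-1] > shares[0] and shares[-1] > 0.5:
        flags.append("capability_to_failure_shift")
    # The inverse: heavy volume followed by silence.
    counts = [len(m) for m in monthly_tickets]
    if len(counts) >= 2 and counts[-1] == 0 and max(counts[:-1], default=0) >= 5:
        flags.append("gone_quiet_after_heavy_volume")
    return flags
```

Either flag should route the account to a human regardless of the aggregate score, per the section above.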

THE FRAMEWORK

The full interrogation framework is Dispatch #004 — Churn Early Warning Framework. 38 questions across four sections that expose whether your retention system catches risk before customers decide to leave — or after. $97. Instant download.

See the full framework →

How to Weight the Signals

Signal weighting is where most health score projects get either over-engineered or under-analysed. The right approach is empirical: take your churned accounts from the last 18–24 months, pull the signal values at 30, 60, and 90 days before churn, and identify which signals showed the most consistent degradation at each time horizon. That gives you a data-driven basis for weighting rather than an opinion-based one.
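The empirical pull described above, signal values at 30, 60, and 90 days before churn compared against a healthy baseline, can be sketched as follows. The data shape and signal names are illustrative assumptions, not a real schema:

```python
from statistics import mean

def degradation_by_horizon(churned_accounts, baseline):
    """For each signal, measure how far churned accounts sat below the
    healthy baseline at 90, 60 and 30 days before churn. Signals with
    the largest gap at the 90-day horizon earn the most weight.
    `churned_accounts`: account -> {horizon_days: {signal: value}};
    `baseline`: signal -> typical value across healthy accounts."""
    out = {}
    for h in (90, 60, 30):
        gaps = {}
        for signal, base in baseline.items():
            vals = [acct[h][signal] for acct in churned_accounts.values()]
            # Relative shortfall vs baseline (0.0 = no degradation).
            gaps[signal] = (base - mean(vals)) / base if base else 0.0
        # Rank signals by degradation, worst first.
        out[h] = dict(sorted(gaps.items(), key=lambda kv: -kv[1]))
    return out
```

A signal that only degrades at the 30-day horizon describes churn; one that already shows a gap at 90 days predicts it.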

In the absence of sufficient historical data — which is common for smaller companies or those building their first health score — use a tiered weighting approach. Divide signals into three tiers:

Tier 1 (highest weight): signals that indicate relationship and strategic alignment. These include executive engagement frequency, stakeholder count, NPS or sentiment survey trends, and renewal conversation initiation. These signals predict churn most reliably because they reflect whether the customer sees strategic value in the relationship — not just whether the product is functional.

Tier 2 (medium weight): signals that indicate product value realisation. Feature adoption velocity, core use case completion rates, and outcome metrics (if you have them configured) sit here. These indicate whether the customer is getting value from the product, which is a prerequisite for renewal but not sufficient on its own.

Tier 3 (lower weight): activity signals. Login frequency, session duration, and general product engagement. These are useful as tie-breakers and for flagging sudden drops in usage, but should not dominate the score because they mistake activity for health.

The total score should be weighted approximately 50/30/20 across these tiers. If your current health score is weighted 70% or more on Tier 3 signals, you have built a usage dashboard with a health label on it. It will not predict churn.
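As a minimal sketch of the 50/30/20 tiered weighting, assuming each signal has already been normalised to a 0 to 100 scale (the signal names are placeholders, not a fixed schema):

```python
def health_score(signals: dict) -> float:
    """Tiered weighting sketch: average the available signals within
    each tier, then combine tiers at 50/30/20."""
    tiers = {
        0.50: ["exec_engagement", "stakeholder_count", "sentiment_trend"],   # Tier 1
        0.30: ["adoption_velocity", "use_case_completion"],                  # Tier 2
        0.20: ["login_frequency", "session_duration"],                       # Tier 3
    }
    score = 0.0
    for weight, names in tiers.items():
        present = [signals[n] for n in names if n in signals]
        if present:
            score += weight * (sum(present) / len(present))
    return round(score, 1)
```

An activity-heavy account (Tier 3 signals near 95, Tier 1 signals near 20) lands below 40 under this weighting, which is the point: activity alone cannot carry the score.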

Building the Health Score in Practice

The mechanics of implementation depend on your tooling, but the architecture is consistent regardless of whether you are using Gainsight, ChurnZero, Totango, or a custom build in your CRM.

Start with a signal inventory. List every data point you can capture about each customer account: product events, support system data, CRM activity logs, survey responses, finance data (invoice payment patterns, usage-based billing trends), and stakeholder change events from LinkedIn or HRIS integration where available. Not all of these will be usable in your first version. Pick the eight to twelve most reliable and automatable signals and build with those.

Establish baseline benchmarks by customer segment before applying the scoring model. A customer in month three of onboarding should be scored against a different baseline than a customer in year three of their contract. A 50-user enterprise account has different expected engagement patterns than a 5-user SMB. Applying uniform thresholds across segments produces false negatives in early-stage accounts and false positives in mature ones.
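Segment-aware thresholds can be as simple as a lookup keyed on segment and lifecycle stage. The cut-off values below are illustrative only, not recommendations:

```python
def risk_band(score: float, segment: str, tenure_months: int) -> str:
    """Band a score against its segment-and-stage baseline. An
    onboarding-stage account is judged against a lower green line
    than a mature one."""
    stage = "onboarding" if tenure_months <= 3 else "mature"
    # (green floor, red ceiling) per segment and lifecycle stage.
    thresholds = {
        ("smb", "onboarding"): (60, 40),
        ("smb", "mature"): (70, 50),
        ("enterprise", "onboarding"): (55, 35),
        ("enterprise", "mature"): (75, 55),
    }
    green, red = thresholds[(segment, stage)]
    if score >= green:
        return "green"
    if score < red:
        return "red"
    return "yellow"
```

Note that the same score of 65 reads green for an onboarding SMB but yellow for a year-three enterprise account, which is exactly the false-negative/false-positive asymmetry uniform thresholds produce.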

Build in a manual override mechanism with mandatory documentation. CS managers will sometimes know things the model does not — an upcoming expansion conversation, a known internal reorganisation at the customer, a relationship context that explains an anomalous metric. Allow overrides, but require a logged reason and a review date. This creates accountability and generates data you can use to improve the model over time.
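A minimal override record that enforces the documented-reason and review-date requirements might look like this (the field names are assumptions, not a tool's schema):

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Override:
    """Manual override record: allowed, but never silent."""
    account_id: str
    model_score: float
    overridden_score: float
    reason: str
    entered_by: str
    review_date: date

def apply_override(account_id, model_score, new_score, reason, entered_by,
                   review_days: int = 30) -> Override:
    # Overrides without a documented reason are rejected outright.
    if not reason.strip():
        raise ValueError("An override requires a documented reason.")
    return Override(account_id, model_score, new_score, reason, entered_by,
                    review_date=date.today() + timedelta(days=review_days))
```

Because both the model score and the overridden score are stored, the override log doubles as training data: every expired review date is a chance to check whether the human or the model was right.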

How to Use the Health Score Operationally

A health score that does not trigger action is a metric in a dashboard that nobody reads. The operational value comes from automating the workflow that the score feeds.

Define clear intervention protocols for each risk tier. A customer dropping from green to yellow should automatically trigger a CS check-in within five business days and a stakeholder mapping review. A customer dropping from yellow to red should trigger an escalation to the CS manager, a risk review in the weekly CS team meeting, and an executive sponsor outreach within ten business days. These workflows should be automated wherever possible — not to remove human judgment, but to ensure that risk does not fall through the cracks when the CS team is carrying a heavy book of business.
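The tier-transition workflows above reduce to a protocol table keyed on band changes. A sketch, using calendar days as a simplifying stand-in for business days (task names are illustrative):

```python
from datetime import date, timedelta

# Intervention protocols keyed by (previous band, new band), following
# the timelines described above.
PROTOCOLS = {
    ("green", "yellow"): [
        ("cs_check_in", 5),             # check-in within 5 days
        ("stakeholder_map_review", 5),
    ],
    ("yellow", "red"): [
        ("cs_manager_escalation", 1),
        ("weekly_risk_review", 7),
        ("exec_sponsor_outreach", 10),  # outreach within 10 days
    ],
}

def on_band_change(account_id: str, old: str, new: str, today: date):
    """Emit the tasks a band transition should open. Returns an empty
    list for transitions with no protocol (e.g. improvements)."""
    tasks = PROTOCOLS.get((old, new), [])
    return [
        {"account": account_id, "task": name, "due": today + timedelta(days=days)}
        for name, days in tasks
    ]
```

Wiring this table into task creation is what keeps a heavy book of business from silently swallowing a red account.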

Use the health score as the primary agenda driver for QBRs. The lowest-health accounts in your book should receive the most QBR preparation time and the most senior internal representation. High-health accounts should receive lighter-touch reviews that focus on expansion opportunity rather than retention risk. Allocating QBR resources uniformly across all accounts regardless of health is a resource efficiency failure with direct revenue consequences.

Track the score's predictive accuracy quarterly. What percentage of accounts that hit red status churned within 90 days? What percentage of churns were green or yellow at 90 days prior? This is how you know whether the model is working and how you improve it over time. The goal is to maximise the first number (red should predict churn reliably) and minimise the second (churn should not be invisible to the model). See Win/Loss Analysis: Run One That Actually Changes Behaviour for how to build the feedback loops that improve this over time. For the revenue-level impact, Net Revenue Retention is the metric that consolidates everything your health score is trying to protect.
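Both accuracy questions reduce to two ratios over last quarter's accounts. A sketch, assuming one simple record per account (the field names are illustrative):

```python
def score_accuracy(accounts):
    """Quarterly model check. Each record: {'band_90d_prior': str,
    'churned': bool}. Returns (red precision, share of churns the
    model missed at the 90-day mark)."""
    reds = [a for a in accounts if a["band_90d_prior"] == "red"]
    churns = [a for a in accounts if a["churned"]]
    # Of the accounts flagged red 90 days out, how many churned?
    red_precision = (
        sum(a["churned"] for a in reds) / len(reds) if reds else 0.0
    )
    # Of the accounts that churned, how many were NOT red 90 days out?
    missed = (
        sum(a["band_90d_prior"] != "red" for a in churns) / len(churns)
        if churns else 0.0
    )
    return red_precision, missed
```

Push the first number up and the second down each quarter; a model where both stall is one whose signal set needs revisiting.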

A health score that measures activity is a rearview mirror. A health score that measures risk trajectory is a windscreen. Build accordingly.

Most customer health scores are built to reassure CS teams, not to surface risk. They measure what is easy to capture — logins, sessions, tickets — and label the result "health" because the word sounds good in a board presentation. The companies that build predictive models, weight signals empirically, and connect score outputs to automated intervention workflows catch churn before it becomes inevitable. The companies that build activity dashboards and call them health scores get surprised every quarter. Build the model that catches risk at 90 days, not one that confirms it at 30. Your net revenue retention rate will tell you which one you have.

DISPATCH #004

Churn Early Warning Framework

38 questions that expose whether your retention system is catching risk before customers decide to leave — or confirming churn after the decision has already been made. $97. Instant download.

Download the Framework — $97
See the framework →