Across 50 schools and 16,188 students, the interventions are running. The software is tracking. But about 4 in 10 students (40.6%) still won't hit their growth target, not because of the tool, but because of what happens around it.
Imagine standing at the top of a funnel watching students enter their intervention cycles with hope and structured support. The tools are running. The sessions are logged. And yet, system-wide, only 59.4% will meet their growth target — and a mere 18% will move up an entire tier.
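To make the funnel concrete, the quoted rates can be turned into approximate headcounts. This is a minimal sketch, and it assumes both percentages use all 16,188 students as their denominator, which the text does not confirm. The function and variable names are illustrative.

```python
# Headline figures quoted in the text.
TOTAL_STUDENTS = 16_188
GROWTH_TARGET_MET_RATE = 0.594  # 59.4% meet their growth target
TIER_MOVEMENT_RATE = 0.18       # 18% move up an entire tier


def approx_headcount(rate: float, total: int = TOTAL_STUDENTS) -> int:
    """Approximate number of students a rate corresponds to.

    Assumption: the rate is measured against all students system-wide.
    """
    return round(rate * total)


met = approx_headcount(GROWTH_TARGET_MET_RATE)
missed = TOTAL_STUDENTS - met
moved_up = approx_headcount(TIER_MOVEMENT_RATE)
missed_share = 1 - GROWTH_TARGET_MET_RATE

print(f"met target:   ~{met:,}")
print(f"missed target: ~{missed:,} ({missed_share:.1%})")
print(f"moved up tier: ~{moved_up:,}")
```

Under that assumption, roughly 6,600 students sit in the "missed" part of the funnel, which is the group the rest of the analysis is about.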
This is not a failure of effort. Teachers are logging in. Coaches are checking in. The platform pulse is steady. The miss is structural — hidden in what the data reveals when you look past raw minutes.
Before the analysis began, five bets were made about what drives student growth. Every single one held up under statistical testing.
Here is the most counterintuitive finding in this entire dataset — and the one most likely to be ignored: students spending 120+ minutes per week on the tool perform worse than those spending 45-90 minutes. Drastically worse.
It's not the minutes that matter. It's the mastery happening inside them. Schools optimizing for time-on-tool are chasing a metric that has decoupled from the outcome it was meant to predict.
The mechanism: overloaded minutes usually mean students are being assigned to the tool as a compliance placeholder — a way to fill time — rather than to reinforce genuine instructional progress. When mastery isn't tracked as the primary signal, time becomes a proxy that misleads everyone looking at the dashboard.
The 38.1 percentage point gap between the best and worst cells in this matrix is the largest observed spread in the entire dataset. This is lever H3 — and it's Priority Zero.
Schools where intervention groups are aligned to classroom instruction see a 71.5% growth-target-met rate. Schools with low alignment: 41.1%. That is a 30.4 percentage point gap, larger than the effect of most staffing interventions, and fixable with a weekly planning routine.
Alignment means the student's intervention session addresses the same content their classroom teacher is about to teach — or just taught. It turns a parallel track into an integrated one. The research is clear; the implementation rarely is.
Every dot is a school. Color shows its archetype. The pattern is impossible to ignore.
The fix isn't sophisticated. It's an anchor meeting each week: look at what the classroom teachers are teaching next, regroup intervention students accordingly, and verify in the platform. Schools that do this consistently appear in the upper-right of that scatter. They're not special. They just didn't let the routine slip.
Cluster the 50 schools by their behavioral signatures — execution, alignment, tool usage, caseload — and five distinct archetypes emerge. Each one has a different problem and needs a different intervention.
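The text does not say which clustering method produced the archetypes. As one possible illustration, the sketch below standardizes the four behavioral features and runs a plain k-means. The choice of k-means, the helper names, and the six toy school rows are all assumptions made for illustration, not values from the dataset.

```python
import math
from typing import List, Optional, Sequence


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points: Sequence[Sequence[float]], k: int, max_iter: int = 50) -> List[int]:
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels: Optional[List[int]] = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels or []


# Hypothetical toy rows: [execution, alignment, tool minutes/week, caseload].
toy_schools = [
    [0.80, 0.75, 60, 20],
    [0.50, 0.40, 130, 45],
    [0.82, 0.78, 65, 22],
    [0.48, 0.42, 125, 48],
    [0.79, 0.74, 58, 21],
    [0.52, 0.38, 135, 44],
]
toy_labels = kmeans(standardize(toy_schools), k=2)
print(toy_labels)
```

With the real 50-school table, the same call with `k=5` would correspond to the five archetypes described here, though a different method or seeding could group schools differently.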
The flat fidelity line (~72%) masks a deeper problem: execution is consistent but low. Schools are going through the motions of the weekly cycle — data reviews, regroupings, tool checks — but not with sufficient depth for those routines to move the needle. The 13.6-point gap between top- and bottom-quartile execution schools on tier movement tells the real story.
Teachers supported by top-quartile coaches gain 8.48 percentage points on state tests. Those with bottom-quartile coaches: 6.28 pp. That's a 2.2 pp gap, or a 35% relative difference in teacher growth, compounding year over year.
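The coaching comparison mixes two units: the gains are in percentage points, while the 35% figure is a relative difference. A small sketch of the arithmetic, with illustrative helper names:

```python
def pp_gap(high: float, low: float) -> float:
    """Absolute gap between two percentage-point values."""
    return high - low


def relative_difference_pct(high: float, low: float) -> float:
    """How much larger `high` is than `low`, as a percent of `low`."""
    return (high - low) / low * 100


# Figures quoted in the text.
TOP_QUARTILE_GAIN_PP = 8.48
BOTTOM_QUARTILE_GAIN_PP = 6.28

coaching_gap = pp_gap(TOP_QUARTILE_GAIN_PP, BOTTOM_QUARTILE_GAIN_PP)
coaching_rel = relative_difference_pct(TOP_QUARTILE_GAIN_PP, BOTTOM_QUARTILE_GAIN_PP)

print(f"{coaching_gap:.2f} pp gap, {coaching_rel:.0f}% relative difference")
# prints: 2.20 pp gap, 35% relative difference
```

The same helpers applied to the alignment figures (71.5% vs. 41.1%) give the 30.4 pp gap quoted earlier, which is roughly a 74% relative difference.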
The effect is strongest in high-need schools — exactly where the best coaches are least often deployed. This creates a compounding disadvantage: the schools that most need exceptional coaching are receiving average coaching, while better-resourced schools lock in the strongest coaches and pull further ahead.
82.8% of intervention cycles involve an overloaded staff member. This isn't a personnel failure — it's a structural one.
All coaching action types score similarly on quality (~86-87%). The real differentiator is frequency and follow-through — not the type of action taken.
These aren't hypotheses anymore — they're confirmed findings, statistically significant and practically large. The question is sequencing: what do you fix first to unlock the rest?
Adjust the alignment improvement and overload reduction levers to see projected changes in growth outcomes. These projections are grounded in the observed effect sizes from the data.
These projections use conservative effect sizes derived from observed differences in the dataset: an alignment effect of 0.9× per pp improvement, an overload effect of 0.45×, and a mastery floor effect of 0.5×. Actual gains could be larger in schools that are currently furthest from their potential.
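The text gives the multipliers but not how they combine. One plausible reading is sketched below. It assumes each multiplier converts a percentage-point change in its lever into a percentage-point change in the growth-target-met rate, that the effects add linearly on top of the 59.4% system-wide baseline, and that the result is capped at 100%. The function name and parameters are hypothetical.

```python
# Multipliers quoted in the text; how they combine is an assumption.
BASELINE_GROWTH_MET_PCT = 59.4  # system-wide growth-target-met rate
ALIGNMENT_EFFECT = 0.9          # per pp of alignment improvement
OVERLOAD_EFFECT = 0.45          # assumed: per pp of overload reduction
MASTERY_FLOOR_EFFECT = 0.5      # assumed: per pp of mastery floor improvement


def project_growth_met(
    alignment_gain_pp: float = 0.0,
    overload_reduction_pp: float = 0.0,
    mastery_floor_gain_pp: float = 0.0,
    baseline_pct: float = BASELINE_GROWTH_MET_PCT,
) -> float:
    """Projected growth-target-met rate (%) under an additive linear model."""
    levers = (alignment_gain_pp, overload_reduction_pp, mastery_floor_gain_pp)
    if any(v < 0 for v in levers):
        raise ValueError("lever changes are expected to be non-negative")
    lift = (
        ALIGNMENT_EFFECT * alignment_gain_pp
        + OVERLOAD_EFFECT * overload_reduction_pp
        + MASTERY_FLOOR_EFFECT * mastery_floor_gain_pp
    )
    return min(100.0, baseline_pct + lift)


projected = project_growth_met(alignment_gain_pp=10, overload_reduction_pp=10)
print(f"{projected:.1f}%")
```

Under these assumptions, a 10 pp alignment improvement combined with a 10 pp overload reduction lifts the projected rate from 59.4% to about 72.9%. A non-additive model, or different units for the overload and mastery multipliers, would give different numbers.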