Fixing a real sprint-predictability problem.
A product squad kept missing sprint commitments while everyone was working at full capacity. Adding people would not have fixed it. This is how I found the real constraint, aligned the team around it, built the fix into the tooling, and proved it held: sprint predictability moved from 58% to 89% over two quarters, with no capacity added.
Find the true constraint.
Before fixing anything, prove what is actually limiting delivery, and that it is systemic, not a local symptom. I walked the value stream stage by stage, tagging each activity as value-add or non-value-add, and marking where the primary constraint sat.
Prove it's a constraint, not a symptom.
Work was both waiting and being redone. High rework (~25% of effort) plus low flow efficiency (~28%) means work was being built before it was right, then done again. That points upstream, to requirements not being set properly before work entered the sprint, not to a slow team or a downstream queue.
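As a rough illustration of how those two signals can be derived, here is a minimal sketch. It assumes flow efficiency means hands-on time divided by total elapsed time, and rework share means redo time divided by hands-on time; the `WorkItem` fields and the sample hours are hypothetical, chosen only so the output lands near the figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    # All three fields are hypothetical names for illustration.
    active_hours: float   # hands-on time spent on the item
    waiting_hours: float  # time the item sat in queues or awaited decisions
    rework_hours: float   # portion of active_hours spent redoing finished work


def flow_efficiency(items):
    """Share of total elapsed time that was hands-on work (assumed definition)."""
    active = sum(i.active_hours for i in items)
    total = sum(i.active_hours + i.waiting_hours for i in items)
    return active / total if total else 0.0


def rework_share(items):
    """Share of hands-on effort spent redoing work (assumed definition)."""
    active = sum(i.active_hours for i in items)
    rework = sum(i.rework_hours for i in items)
    return rework / active if active else 0.0


# Illustrative sample data, not measured values.
items = [
    WorkItem(active_hours=10, waiting_hours=30, rework_hours=3),
    WorkItem(active_hours=18, waiting_hours=42, rework_hours=4),
]

print(f"flow efficiency: {flow_efficiency(items):.0%}")
print(f"rework share: {rework_share(items):.0%}")
```

Reading the two numbers together is the point: low flow efficiency alone could mean a downstream queue, but low flow efficiency with a high rework share suggests the work was not ready when it started.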
The hypothesis and the fix.
If requirements and key decisions are resolved before work enters the sprint, rework and decision latency fall, and predictability, flow efficiency and carry-over improve without adding capacity.
- 01 Definition of Ready
A story cannot enter a sprint until acceptance criteria, dependencies, stakeholder alignment, story points and a product owner are in place.
- 02 Decisions moved upstream
Product and design decisions resolved before commitment, not mid-build.
- 03 Stakeholder alignment first
On high-impact work, senior leadership reviews before effort is invested.
- 04 Rework tracking
Turned the one-off diagnosis estimate into a standing metric, flagged automatically at sprint start.
Bringing people with it.
High-performing teams win by removing friction, not by working harder. I borrowed the marginal-gains idea associated with Dave Brailsford and British Cycling, lots of tiny improvements compounding, and pointed it at our own friction: rebuilding work that was already "finished". That framing landed because every function recognised the frustration of redoing finished work. The vision was one line: decide before we build, so we build it once.
The change was shopped round, not announced once: prove it on one team, then let the results pull the rest.
We were not making any one function more efficient. We were improving the whole value stream.
Built into the tooling.
A process that relies on memory decays, so the gate was enforced in the work-management system, not left to discipline.
The Definition of Ready items were required fields, and a workflow rule blocked the move into a sprint until every one was complete. Every team inherited one shared scheme, with a tracked exception path for genuinely urgent cases.
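The logic of such a gate can be sketched in a few lines. This is not the actual workflow rule or any particular tool's API: the `Story` class, its field names and the `exception_approved_by` field are all hypothetical, standing in for the required fields and the tracked exception path described above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical field names mirroring the Definition of Ready items.
REQUIRED_FIELDS = (
    "acceptance_criteria",
    "dependencies",
    "stakeholder_alignment",
    "story_points",
    "product_owner",
)


@dataclass
class Story:
    key: str
    acceptance_criteria: Optional[str] = None
    dependencies: Optional[str] = None  # "none" counts once it has been assessed
    stakeholder_alignment: Optional[str] = None
    story_points: Optional[int] = None
    product_owner: Optional[str] = None
    exception_approved_by: Optional[str] = None  # assumed exception-path field


def missing_fields(story):
    """Return the required fields that are still empty."""
    return [f for f in REQUIRED_FIELDS if getattr(story, f) in (None, "")]


def can_enter_sprint(story):
    """Return (allowed, reason). Exceptions are allowed but stay visible."""
    missing = missing_fields(story)
    if not missing:
        return True, "ready"
    if story.exception_approved_by:
        return True, (
            f"exception approved by {story.exception_approved_by}; "
            f"missing: {', '.join(missing)}"
        )
    return False, f"blocked; missing: {', '.join(missing)}"


draft = Story(key="S-2", acceptance_criteria="Given/When/Then agreed")
print(can_enter_sprint(draft))
```

The exception branch returns a reason string rather than silently passing, so every bypass leaves a record that can be counted later.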
The result.
No capacity added. The gate held because the system enforced it, and the metrics showed whether it was sticking.
Reporting ran on three tiers: team-level dashboards inside each squad for daily flow signal, a senior-leadership rollup that compared squads on the same measures, and a constraint-level view that tracked readiness violations and rework trends across the whole product organisation.
| Squad | Predictability | Carry-over | Cycle time |
|---|---|---|---|
| Reading | 82% | 12% | 5.1d |
| Writing | 58% | 34% | 8.4d |
| Workflow | 74% | 18% | 7.2d |
| Platform | 88% | 9% | 4.8d |
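For readers who want to reproduce measures like these, here is a minimal sketch under stated assumptions. The case study does not define the metrics, so the definitions here are guesses: predictability as delivered points over committed points, and carry-over as the share of committed stories not finished in the sprint. The `CommittedStory` class and the sample sprint are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CommittedStory:
    points: int  # estimate at sprint commitment
    done: bool   # finished inside the sprint


def predictability(stories):
    """Delivered points divided by committed points (assumed definition)."""
    committed = sum(s.points for s in stories)
    delivered = sum(s.points for s in stories if s.done)
    return delivered / committed if committed else 0.0


def carry_over(stories):
    """Share of committed stories that rolled into the next sprint (assumed)."""
    if not stories:
        return 0.0
    return sum(1 for s in stories if not s.done) / len(stories)


# Illustrative sprint, not real data.
sprint = [
    CommittedStory(points=5, done=True),
    CommittedStory(points=3, done=True),
    CommittedStory(points=8, done=False),
    CommittedStory(points=2, done=True),
    CommittedStory(points=2, done=True),
]

print(f"predictability: {predictability(sprint):.0%}")
print(f"carry-over: {carry_over(sprint):.0%}")
```

Because one measure is weighted by points and the other counts stories, the two do not have to sum to 100%, which is consistent with how the table reads.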
Note on figures: the 58% to 89% predictability gain was measured over two quarters and is accurate. Some supporting numbers (rework rate, the ~85% pilot figure, and the per-squad table above) are approximate or illustrative, included to show how the reporting reads rather than as exact reported values.