Case · logistics SaaS · 2023
A work-stealing scheduler replaced a 40k-line cron monolith.
Shape of the problem
The platform coordinated physical pickup and delivery windows for several hundred enterprise shippers. Scheduling logic had accreted for nine years inside a single service whose nominal job was "run cron-like tasks" and whose actual job had become "orchestrate every workflow in the company." Two classes of bug, duplicate dispatch and silently dropped reattempts, had become structural rather than incidental.
What we did
We resisted the urge to propose a queue broker. The client's operations team trusted their relational store in a way they would not have trusted any new system, and correctly. So we built a small queueing substrate directly on top of it: an outbox table with a strict lease protocol, a narrow worker library (under 1,100 lines), and a single idempotency discipline applied everywhere tasks could be produced. The monolith was not rewritten; it was drained, function by function, into workers that were boring to read.
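To make the shape of that substrate concrete, here is a minimal sketch of an outbox table with a lease protocol and an idempotency key. It is an illustration under stated assumptions, not the client's code: the table name, column names, and the functions `enqueue`, `claim`, and `complete` are all hypothetical, and SQLite stands in for the relational store only so the example runs on its own. On a multi-writer database the claim step would more likely lean on row locking (in PostgreSQL, for instance, `SELECT ... FOR UPDATE SKIP LOCKED`).

```python
import sqlite3
import time

# Hypothetical schema: one row per task, deduplicated by idempotency_key.
SCHEMA = """
CREATE TABLE IF NOT EXISTS outbox (
    id              INTEGER PRIMARY KEY,
    idempotency_key TEXT NOT NULL UNIQUE,
    payload         TEXT NOT NULL,
    lease_owner     TEXT,
    lease_expires   REAL,
    done            INTEGER NOT NULL DEFAULT 0
)
"""


def enqueue(db, idempotency_key, payload):
    """Insert a task. A repeated key is ignored. Returns True if a row was added."""
    cur = db.execute(
        "INSERT OR IGNORE INTO outbox (idempotency_key, payload) VALUES (?, ?)",
        (idempotency_key, payload),
    )
    db.commit()
    return cur.rowcount == 1


def claim(db, worker_id, lease_seconds, now=None):
    """Lease the oldest unfinished task whose lease is absent or expired.

    Returns (task_id, payload), or None if nothing is claimable.
    """
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT id, payload FROM outbox "
        "WHERE done = 0 AND (lease_expires IS NULL OR lease_expires <= ?) "
        "ORDER BY id LIMIT 1",
        (now,),
    ).fetchone()
    if row is None:
        return None
    task_id, payload = row
    # Conditional update: succeeds only if the row is still claimable, so a
    # worker that lost a race between the SELECT and the UPDATE gets nothing.
    cur = db.execute(
        "UPDATE outbox SET lease_owner = ?, lease_expires = ? "
        "WHERE id = ? AND done = 0 "
        "AND (lease_expires IS NULL OR lease_expires <= ?)",
        (worker_id, now + lease_seconds, task_id, now),
    )
    db.commit()
    return (task_id, payload) if cur.rowcount == 1 else None


def complete(db, worker_id, task_id, now=None):
    """Mark a task done, but only for the worker holding an unexpired lease."""
    now = time.time() if now is None else now
    cur = db.execute(
        "UPDATE outbox SET done = 1 "
        "WHERE id = ? AND lease_owner = ? AND done = 0 AND lease_expires > ?",
        (task_id, worker_id, now),
    )
    db.commit()
    return cur.rowcount == 1
```

In this sketch the two bug classes map onto two constraints. The unique key makes a second `enqueue` of the same logical task a no-op, which addresses duplicate dispatch at the producer. The lease expiry makes a task held by a crashed worker claimable again, so a reattempt is never silently dropped; the ownership check in `complete` keeps a worker that has lost its lease from marking the task done.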
Outcome
- Duplicate-dispatch incidents: three per quarter → zero observed in the eighteen months since completion.
- Mean task latency from trigger to execution: 47 s → 1.6 s.
- The cron monolith still exists, now at roughly 6k lines, and is expected to die of attrition.