Case · marketplace · 2023
A relevance fix without the nine-month relevance rewrite.
Shape of the problem
A vertical marketplace was serving search over roughly 26 million active listings from a single inverted index on one very large machine. Query latency was acceptable at the median and catastrophic at the tail. The team had spent eight months on a rewrite that had produced a new system with measurably worse relevance than the old one, and morale had collapsed along with the timeline.
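The median/tail split above is the classic failure mode that averages hide. A minimal sketch, with invented numbers (not the case's data), of how a distribution can look healthy at p50 while a small slow slice wrecks p99:

```python
import random

# Hypothetical latency distribution: 96% of queries are fast, 4% are slow.
# All values here are illustrative, not measurements from the case study.
random.seed(7)
samples_ms = ([random.gauss(120, 30) for _ in range(9600)] +   # fast majority
              [random.gauss(1400, 300) for _ in range(400)])   # slow 4% tail

def percentile(values, p):
    """Nearest-rank percentile over a list of numbers."""
    ordered = sorted(values)
    k = round(p / 100 * (len(ordered) - 1))
    return ordered[max(0, min(len(ordered) - 1, k))]

p50 = percentile(samples_ms, 50)
p99 = percentile(samples_ms, 99)
# p50 sits near the fast cluster; p99 lands deep inside the slow slice.
```

Because the slow slice is only 4% of traffic, it never touches the median, but at the 99th percentile you are sampling almost entirely from it.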
What we did
We declared the rewrite paused and instrumented the existing system instead. It turned out that 4% of queries — almost all tied to three specific UI entry points — were responsible for 78% of the long tail. We introduced a retrieval tier in front of the existing index: a cheap candidate generator that served those specific shapes of query from a narrower, memory-resident index, falling through to the original system for everything else. Relevance was preserved because the candidate set was a strict superset of what the old system would have ranked.
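The routing described above can be sketched as a thin function in front of two indexes. Everything here is illustrative: the shape names, the `hot_index`/`full_index` callables, and the miss-handling policy are assumptions, not the team's actual code.

```python
from typing import Callable, List

# Hypothetical entry-point shapes identified by instrumentation.
HOT_SHAPES = {"category_browse", "autosuggest", "saved_search"}

def make_tiered_search(hot_index: Callable[[str], List[str]],
                       full_index: Callable[[str], List[str]]):
    """Route hot query shapes to a narrow memory-resident tier.

    Relevance is preserved only under the invariant in the text: for hot
    shapes, hot_index must return a strict superset of the candidates the
    original index would have ranked."""
    def search(query: str, shape: str) -> List[str]:
        if shape in HOT_SHAPES:
            candidates = hot_index(query)
            if candidates:          # on a miss, fall through rather than fail
                return candidates
        return full_index(query)    # everything else: the original system
    return search

# Toy usage with stub indexes standing in for the real retrieval backends.
search = make_tiered_search(
    hot_index=lambda q: [f"hot:{q}"],
    full_index=lambda q: [f"full:{q}"],
)
hot_result = search("red shoes", "autosuggest")       # memory-resident tier
cold_result = search("red shoes", "free_text")        # original index
```

The design choice worth noting is the fall-through on an empty candidate set: the cheap tier is an optimization layered in front of the existing system, never a replacement for it, which is what made the change low-risk.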
Only after that stabilized did we return to the sharding question, and by then the team had the latency headroom to do it calmly.
Outcome
- p99 query latency on the hot paths: 1.4 s → 180 ms.
- Recall@20 held within measurement noise of the baseline.
- Infrastructure cost for the search path fell 61% over two quarters.
- The paused rewrite was formally cancelled. Nobody missed it.