A simple scoring formula for better search reranking

Search results can be reordered by giving extra weight to documents that are linked from other documents. The formula first caps the link count at a chosen maximum, then uses a logarithm so the boost grows slowly.

This gives a useful lift even when a document has only one link, while stopping heavily linked documents from taking over the whole result list. The method was used in a legal retrieval-augmented generation (RAG) app to factor document popularity into search ranking.
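The cap-then-logarithm idea can be sketched in a few lines. This is a minimal illustration, not the exact formula from the original project: the use of log(1 + n), the cap of 50, the weight of 0.1, and the multiplicative way the boost is combined with the base score are all assumptions, as are the function names.

```python
import math


def link_boost(link_count: int, cap: int = 50) -> float:
    """Capped logarithmic boost (assumed form: log(1 + min(count, cap)))."""
    # Clamp to [0, cap] so negative inputs and very popular documents are bounded.
    capped = min(max(link_count, 0), cap)
    # log1p(0) is 0, so unlinked documents get no boost at all.
    return math.log1p(capped)


def boosted_score(base_score: float, link_count: int,
                  weight: float = 0.1, cap: int = 50) -> float:
    """Scale the base relevance score by the boost (hypothetical combination)."""
    return base_score * (1.0 + weight * link_boost(link_count, cap))


if __name__ == "__main__":
    for n in (0, 1, 10, 50, 5000):
        print(n, round(link_boost(n), 3))
```

Under these assumptions a single link already adds log(2), roughly 0.69, to the boost, while any count above the cap gets the same boost as the cap itself.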

The same formula later worked in a different tool that found similar code with embeddings, where closeness inside the codebase served as the boost signal. The main idea is that search quality can sometimes improve with a cheap, predictable rule instead of another complex AI step.
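Because the rule only needs a non-negative number per result, the same reranking step can be reused with different signals. The sketch below is hypothetical: the `Hit` type, the `rerank` function, the cap of 20, and the weight of 0.1 are illustrative choices, and how a link count or a codebase-closeness value is computed is left to the caller.

```python
import math
from typing import List, NamedTuple


class Hit(NamedTuple):
    doc_id: str
    similarity: float  # base score from the first-pass search
    signal: float      # non-negative boost signal, e.g. link count or closeness


def rerank(hits: List[Hit], cap: float = 20.0, weight: float = 0.1) -> List[Hit]:
    """Re-sort hits by similarity scaled with a capped logarithmic boost."""
    def score(hit: Hit) -> float:
        capped = min(max(hit.signal, 0.0), cap)
        return hit.similarity * (1.0 + weight * math.log1p(capped))

    # No model call and no randomness: the same input gives the same order.
    return sorted(hits, key=score, reverse=True)


hits = [
    Hit("a", 0.80, 0),    # best raw match, no boost signal
    Hit("b", 0.78, 3),    # slightly weaker match with a small signal
    Hit("c", 0.60, 500),  # weak match with a very large signal
]
print([h.doc_id for h in rerank(hits)])  # ['b', 'a', 'c']
```

In this example the small signal is enough to lift "b" above "a", while the cap keeps the heavily boosted but weakly matching "c" from overtaking either of them.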

Key points

  • The formula boosts results based on a signal such as link count or closeness within a codebase.
  • The boost grows slowly, so very popular items do not dominate everything.
  • It was useful in both a legal RAG app and a similar-code search tool.
  • It is cheap and deterministic because it does not require another AI model call.
  • For AI agents, better reranking can reduce irrelevant material sent to the model.

Quick term guide

embeddings
A way of converting the meaning of text into numbers so that similar text can be found and compared mathematically.
codebase
The full set of files and code that make an app or product work.
AI agents
AI programs that can inspect information and carry out steps toward a goal, rather than just answering once.
reranking
A second pass that re-sorts search results by relevance so the best ones come first and the rest can be dropped.
deterministic
Giving the same result every time when the input is the same.
model call
One request sent to an AI model to get an answer.