Introduction to Google’s Ranking Systems
A Google engineer’s testimony, published online by the U.S. Justice Department, has provided a rare glimpse into Google’s ranking systems. The document offers a general overview of how Google ranks websites, including the use of hand-crafted signals, quality scores, and a mysterious popularity signal that uses Chrome data.
Hand-Crafted Signals
The document begins by explaining how signals are "hand-crafted": data from quality raters, clicks, and other sources is run through mathematical and statistical formulas to generate a ranking score. Three kinds of signals are created this way, referred to as the ABC signals, which stand for anchors, body, and clicks. These signals are used to determine the relevance of a document to a search query.
The ABC Signals
The ABC signals are a key component of Google’s ranking algorithm, and are used to determine the topicality of a document. The signals are combined in a "relatively hand-crafted way" to create a base score, which is then used to judge the relevance of the document. The document notes that the ABC signals are just one part of the ranking process, and that hundreds or thousands of additional algorithms are used at every step of the process.
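To make the idea of a "relatively hand-crafted" combination concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the testimony does not disclose the formula, the weights, or the scale of the signals, and the names AbcSignals, WEIGHTS, topicality_score, and explain are hypothetical. The point is only that a fixed, human-written formula lets an engineer see exactly how much each signal contributed to the base score.

```python
from dataclasses import dataclass


@dataclass
class AbcSignals:
    """Hypothetical per-(query, document) signals, each assumed to lie in [0, 1]."""
    anchors: float  # signal derived from anchor text pointing at the page
    body: float     # signal derived from the page's own text
    clicks: float   # signal derived from click data


# Illustrative weights only; the testimony does not disclose any values.
WEIGHTS = {"anchors": 0.3, "body": 0.4, "clicks": 0.3}


def topicality_score(signals: AbcSignals) -> float:
    """Combine the three signals with a fixed, human-readable weighted sum."""
    return sum(WEIGHTS[name] * getattr(signals, name) for name in WEIGHTS)


def explain(signals: AbcSignals) -> dict[str, float]:
    """Return each signal's contribution to the base score."""
    return {name: WEIGHTS[name] * getattr(signals, name) for name in WEIGHTS}


if __name__ == "__main__":
    doc = AbcSignals(anchors=0.5, body=1.0, clicks=0.0)
    print(topicality_score(doc))
    print(explain(doc))
```

A weighted sum is the simplest possible stand-in here. Whatever the real formula looks like, the property being illustrated is that each term can be inspected and adjusted by hand.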
Interplay Between Page Quality and Relevance
The document also reveals that page quality is independent of query, meaning that a page that is determined to be high-quality and trustworthy is regarded as such across all related queries. However, relevance-related signals can be used to calculate the final rankings, showing how relevance plays a decisive role in determining what gets ranked. The document notes that page quality is "incredibly important" and that people often complain about the quality of search results.
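The relationship between a query-independent quality score and query-dependent relevance can be sketched as follows. This is a toy model under stated assumptions, not Google's method: the URLs, the scores, and the choice to multiply the two values are all hypothetical. It shows one quality score per page being reused across queries while relevance changes which page ranks first.

```python
# Hypothetical query-independent quality scores: one value per page,
# reused unchanged for every query.
PAGE_QUALITY = {
    "a.example/guide": 0.9,
    "b.example/post": 0.4,
}

# Hypothetical query-dependent relevance scores.
RELEVANCE = {
    "fix bike chain": {"a.example/guide": 0.2, "b.example/post": 0.8},
    "bike maintenance": {"a.example/guide": 0.7, "b.example/post": 0.6},
}


def rank(query: str) -> list[str]:
    """Order pages by relevance * quality (the product is an assumption)."""
    scores = {
        url: relevance * PAGE_QUALITY[url]
        for url, relevance in RELEVANCE[query].items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    for query in RELEVANCE:
        print(query, rank(query))
```

In this toy data the higher-quality page wins the query where both pages are similarly relevant, but loses the query where the other page is far more relevant, which mirrors the point that relevance can still decide the final ranking.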
The Role of AI
The engineer notes that people still complain about the quality of search results and that AI can sometimes make the situation worse. However, the document also notes that AI is used to improve the ranking algorithm and is an important part of the process.
Other Ranking Signals
The document mentions several other ranking signals, including eDeepRank, an LLM-based system that uses BERT and decomposes LLM-based signals into components. The decomposition makes those signals more transparent, so search engineers can understand why the LLM ranks something the way it does. The document also mentions PageRank, Google’s original ranking innovation, and notes that it is used as an input to the quality score.
Cryptic Chrome-Based Popularity Signal
The document also mentions a mysterious popularity signal that uses Chrome data, although the name of the signal is redacted. This has led to speculation about the nature of the signal, and whether it is related to the Chrome API leak.
Conclusion
The document provides a rare insight into Google’s ranking systems. While it does not reveal the specifics of the algorithm, it conveys the complexity and nuance of the ranking process and highlights the importance of page quality, relevance, and transparency in determining search rankings.