Introduction to Google Discover
Google Discover is not well understood by publishers or the search marketing community, and official guidance from Google has not fully explained how it works. It can, however, be classified as a recommender system: a system that suggests content to users based on their interests.
What are Recommender Systems?
Recommender systems are not new. A classic example is MovieLens, launched in 1997, which let users rate movies and then used those ratings to recommend other movies they might like. These early systems, however, had limitations that made them unsuitable for large-scale applications like YouTube or Google Discover.
The Two-Tower Recommender System Model
The modern approach to recommender systems is known as the Two-Tower architecture or model. It was developed for YouTube, but it has implications for Google Discover as well. The Two-Tower model uses two separate representations, or "towers," to match users with content. One tower processes user information, while the other represents content items. Each tower outputs a vector, and the two vectors are compared with a similarity score, such as a dot product, so that the highest-scoring items become candidates for recommendation.
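To make the matching step concrete, here is a minimal sketch of similarity scoring between one user vector and two item vectors. The three-dimensional vectors and the item names are hypothetical, chosen only for illustration; a production system would learn vectors with many more dimensions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

# Hypothetical output of the user tower.
user_vector = [0.9, 0.1, 0.0]

# Hypothetical output of the item tower for two pieces of content.
item_vectors = {
    "cooking_video": [0.8, 0.2, 0.1],
    "car_review": [0.0, 0.1, 0.9],
}

# Score every item against the user; a higher score means a closer match.
scores = {name: dot(user_vector, vec) for name, vec in item_vectors.items()}
best_match = max(scores, key=scores.get)
print(best_match, scores)
```

The item whose vector points in roughly the same direction as the user vector receives the higher score, which is the sense in which the two towers are "matched."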
User Tower
The User Tower processes user data, such as watch history, search tokens, location, and demographics. It turns this data into a vector that places the user’s interests at a point in a mathematical space.
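One simple way to picture the input side of the User Tower is to average the embeddings of the items a user watched and the tokens they searched for, then join the results into a single feature vector. The sketch below assumes exactly that; the two-dimensional embeddings and the helper name `mean_pool` are hypothetical, and the trained neural network layers that would turn these features into the final user vector are left out.

```python
def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors element by element."""
    if not vectors:
        raise ValueError("need at least one vector")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

# Hypothetical embeddings for three watched items and two search tokens.
watch_history = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
search_tokens = [[0.0, 1.0], [0.5, 0.5]]

# Concatenate the pooled signals into one input vector for the tower.
user_features = mean_pool(watch_history) + mean_pool(search_tokens)
print(user_features)
```

Pooling gives a fixed-length input no matter how long the history is, which is why a variable number of past interactions can feed a model that expects a fixed-size vector.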
Item Tower
The Item Tower represents content items using learned embedding vectors. These vectors are trained alongside the user model and stored for fast retrieval. Because the item vectors are computed ahead of time, the system can compare a user’s "coordinates" with millions of content "coordinates" very quickly.
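The retrieval step can be sketched as a lookup against a table of item vectors computed in advance. This is a brute-force illustration with a hypothetical three-item index and a hypothetical `top_k` helper; a system working at the scale described here would typically rely on an approximate nearest-neighbor index rather than scoring every item.

```python
import heapq

def top_k(user_vector, item_index, k):
    """Return the ids of the k items whose stored vectors score highest."""
    def score(item_id):
        return sum(a * b for a, b in zip(user_vector, item_index[item_id]))
    # nlargest returns the ids sorted from highest to lowest score.
    return heapq.nlargest(k, item_index, key=score)

# Hypothetical precomputed item vectors, keyed by item id.
item_index = {
    "article_a": [1.0, 0.0],
    "article_b": [0.7, 0.7],
    "article_c": [0.0, 1.0],
}

user_vector = [1.0, 0.2]
candidates = top_k(user_vector, item_index, k=2)
print(candidates)
```

Only the user vector has to be computed at request time; the item side is a stored table, which is what makes the comparison fast.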
The Fresh Content Problem
Google’s research paper on YouTube recommendations highlights the importance of fresh content. The system has to balance showing users content that is already known to be popular against exposing them to new, unproven content. The paper notes that users prefer fresh content, and this preference is likely to carry over to Google Discover.
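The trade-off between proven and fresh content can be illustrated with a hand-written scoring rule that adds a bonus which fades as an item ages. This is not the method from the paper, only a sketch under assumed values: the function name, the 24-hour half-life, and the bonus weight are all hypothetical.

```python
def freshness_boosted_score(relevance, age_hours,
                            half_life_hours=24.0, weight=0.5):
    """Add a freshness bonus that halves every half_life_hours."""
    freshness = 0.5 ** (age_hours / half_life_hours)
    return relevance + weight * freshness

# An established item with slightly higher relevance, published 3 days ago.
older_item = freshness_boosted_score(relevance=1.0, age_hours=72.0)

# A new item with lower relevance, published just now.
newer_item = freshness_boosted_score(relevance=0.8, age_hours=0.0)

print(older_item, newer_item)
```

With these assumed values the new item outranks the older one despite its lower relevance; a larger half-life or a smaller weight would tilt the balance back toward established content.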
Accuracy of Click Data
The research paper also addresses how reliable click data is as a measure of user satisfaction. The authors note that click data is noisy and does not accurately reflect satisfaction, because users may click on content for reasons other than interest or satisfaction.
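To see why a raw click count can mislead, consider a sketch that separates clicks followed by meaningful time on the content from clicks that were abandoned almost immediately. The event fields, the `satisfied_clicks` helper, and the 30-second threshold are assumptions made for illustration and do not come from the paper.

```python
def satisfied_clicks(events, min_dwell_seconds=30):
    """Keep only clicks where the user stayed at least min_dwell_seconds."""
    return [
        event for event in events
        if event["clicked"] and event["dwell_seconds"] >= min_dwell_seconds
    ]

# Hypothetical interaction log for one article.
events = [
    {"user": "u1", "clicked": True, "dwell_seconds": 4},    # quick bounce
    {"user": "u2", "clicked": True, "dwell_seconds": 180},  # read the piece
    {"user": "u3", "clicked": False, "dwell_seconds": 0},   # never clicked
    {"user": "u4", "clicked": True, "dwell_seconds": 45},   # read part of it
]

total_clicks = sum(1 for event in events if event["clicked"])
engaged = satisfied_clicks(events)
print(total_clicks, len(engaged))
```

Here three clicks shrink to two once the quick bounce is set aside, which shows how a click total can overstate how many users were actually satisfied.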
Conclusion
Google Discover is a recommender system that uses a Two-Tower architecture to match users with content. The system prioritizes fresh content and uses user data to create a vector representation of each user’s interests. Although the research paper on YouTube recommendations is ten years old, it still offers valuable insights into how recommender systems work. By understanding these systems, publishers and marketers can optimize their content to improve its chances of being shown to users. The key takeaways are to produce high-quality, fresh content that is relevant to user interests, and to use data and analytics to refine that content over time.

