Sunday, March 15, 2026

Google’s New User Intent Extraction Method

Introduction to Google’s Research on User Intent

Google has published a research paper on a new method for extracting user intent from interactions on mobile devices and browsers. The goal is to enable autonomous agents to understand what a user is trying to do, without compromising their privacy. The method uses small models that run on the device, eliminating the need to send data back to Google.

How the Method Works

The researchers found that splitting the problem into two tasks lets small models outperform large language models on this task. The first stage summarizes each of the user’s actions on the device, and the second stage identifies the user’s intent from those summaries. Because both stages run on the device, the user’s data stays private.

Smaller Models on Browsers and Devices

The research focuses on identifying user intent from a series of actions on a mobile device or browser. One model on the device summarizes what the user is doing at each step, and the sequence of summaries is then passed to a second model that identifies the user’s intent. The researchers report that this approach outperforms both smaller models and large language models, regardless of dataset or model type.

Intent Extraction from UI Interactions

The researchers build on the task of extracting intent from screenshots and text descriptions of user interactions, which was proposed in 2025. They improve on it with a two-stage method: the first stage summarizes the user’s actions, and the second stage identifies the user’s intent. The user journey, or trajectory, is represented as a sequence of interactions, each made up of an observation (a screenshot) and an action (what the user did on that screen).
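
To make the trajectory idea concrete, here is a minimal Python sketch of how such a sequence might be represented. The class and field names (Interaction, Trajectory, screenshot_path, action) are assumptions chosen for illustration; they do not come from the research paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interaction:
    """One step in a trajectory: an observation plus the user's action."""
    screenshot_path: str  # observation (hypothetical field name)
    action: str           # text description of what the user did


@dataclass
class Trajectory:
    """An ordered user journey made of interactions."""
    steps: List[Interaction]

    def __len__(self) -> int:
        return len(self.steps)


# Illustrative data only; not taken from the paper's datasets.
trajectory = Trajectory(steps=[
    Interaction("step_0.png", "tap search box"),
    Interaction("step_1.png", "type 'running shoes'"),
    Interaction("step_2.png", "tap first result"),
])
```

Keeping the steps in order matters, because the intent is inferred from the whole sequence rather than from any single screen.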

Challenges in Evaluating Extracted Intents

Evaluating extracted intents is difficult because user intents contain complex details and are inherently subjective. The researchers explain that this ambiguity makes grading a hard problem. For example, did a user choose a product because of the price or the features? The actions are visible, but the motivations are not.

Two-Stage Approach

The researchers chose a two-stage approach that emulates Chain of Thought reasoning. The first stage uses prompting to generate a summary of each interaction. The second stage uses a fine-tuned model to generate an overall intent description from those summaries.
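
The two-stage flow can be sketched in a few lines of Python. This is only an outline under stated assumptions: extract_intent, stub_summarize, and stub_infer_intent are hypothetical names, and the stubs stand in for the prompted summarizer and the fine-tuned intent model, which are not reproduced here.

```python
from typing import Callable, List, Tuple

# (screen description, action description); a simplified stand-in for
# the screenshot-plus-action pairs described in the article.
Step = Tuple[str, str]


def extract_intent(
    steps: List[Step],
    summarize: Callable[[str, str], str],
    infer_intent: Callable[[List[str]], str],
) -> str:
    """Stage 1 summarizes each step; stage 2 reads all summaries at once."""
    summaries = [summarize(screen, action) for screen, action in steps]
    return infer_intent(summaries)


def stub_summarize(screen: str, action: str) -> str:
    # Placeholder for the prompted stage-one model (assumption).
    return f"On {screen}, the user did: {action}."


def stub_infer_intent(summaries: List[str]) -> str:
    # Placeholder for the fine-tuned stage-two model (assumption).
    return f"Intent inferred from {len(summaries)} summaries"


result = extract_intent(
    [("a search page", "typed a query"), ("a results page", "tapped a result")],
    stub_summarize,
    stub_infer_intent,
)
```

Passing the two models in as functions keeps the stages independent, so either one could be swapped out without touching the other.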

The First Stage: Screenshot Summary

In the first stage, each summary is divided into two parts: a description of what’s on the screen and a description of the user’s action. The researchers also asked the model for a third component, speculative intent, which gives the model a place to put its guesses about what the user is trying to do. That component is then discarded. Surprisingly, allowing the model to speculate and then getting rid of that speculation leads to a higher quality result.
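
A rough Python sketch of that "speculate, then discard" step follows. The field names and the strip_speculation helper are assumptions for illustration, and the example values are hand-written rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass
class RawSummary:
    """Three-part output requested from the stage-one model (names assumed)."""
    screen_context: str      # what is on the screen
    user_action: str         # what the user did
    speculative_intent: str  # the model's guess, discarded before stage two


def strip_speculation(raw: RawSummary) -> str:
    """Keep only the factual parts; the speculative field is dropped."""
    return f"{raw.screen_context} {raw.user_action}"


raw = RawSummary(
    screen_context="A product page for running shoes is shown.",
    user_action="The user taps the size selector.",
    speculative_intent="The user probably wants to buy shoes for a race.",
)
cleaned = strip_speculation(raw)
```

Only the cleaned string would be handed to the second stage, so the guess never becomes part of the evidence used to infer the overall intent.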

The Second Stage: Generating Overall Intent Description

The second stage involves fine-tuning a model to generate an overall intent description. The model is trained on pairs: the summaries of all interactions in a trajectory, and the matching ground truth that describes the overall intent of that trajectory. To keep the model from hallucinating, the researchers refined the target intents to remove details that aren’t reflected in the input summaries.
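
The article does not say how the target intents were refined, so the following Python sketch uses a deliberately simple stand-in: a word-overlap check that drops any target detail whose words do not all appear in the summaries. The function names and the example data are assumptions, and a real refinement step would likely be more sophisticated.

```python
import re
from typing import List, Set


def _words(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def refine_target(details: List[str], summaries: List[str]) -> List[str]:
    """Keep only the details fully supported by words in the summaries.

    Toy heuristic (assumption): a detail is 'reflected' if every one of
    its words occurs somewhere in the input summaries.
    """
    seen: Set[str] = set()
    for summary in summaries:
        seen |= _words(summary)
    return [detail for detail in details if _words(detail) <= seen]


summaries = [
    "A flight search page is shown. The user enters Boston as the destination.",
    "A results list is shown. The user sorts flights by price.",
]
details = ["Boston as the destination", "sorts flights by price", "window seat"]
refined = refine_target(details, summaries)
```

Here "window seat" is removed because nothing in the summaries supports it. Training on targets like this means the model is never rewarded for stating details it could not have seen.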

Ethical Considerations and Limitations

The research paper ends by summarizing potential ethical issues, such as an autonomous agent taking actions that are not in the user’s interest. The authors also acknowledge limitations in the research. Testing was done only on Android and web environments, so the results may not generalize to Apple devices. The research was also limited to English-language users in the United States.

Conclusion

The research paper demonstrates a new method for extracting user intent from interactions on mobile devices and browsers without compromising user privacy. The two-stage approach outperforms large language models, and the method could be used in applications such as proactive assistance and personalized memory. While the research is still in its early stages, it shows the direction Google is heading: small models on devices will watch user interactions and sometimes step in to assist users based on their intent.
