Tuesday, October 1, 2024

What is Ferret-UI

Ferret-UI is a model designed to understand user interactions with a mobile screen.



Mobile UI Understanding

The paradigm shift is from natural language understanding to mobile UI understanding.

In other words, the focus moves from understanding conversational context to understanding the current context on a mobile screen.


Visual Understanding


Natural language understanding (NLU) aims to enable machines to comprehend human language and respond appropriately. It focuses on structuring unstructured conversational input to understand the user’s meaning.

This structuring is essential for making sense of human conversation across various mediums, ensuring a Conversational UI can effectively process unstructured data.

With Ferret-UI, the move is from only understanding conversations to also understanding screens.

Context is gleaned from screens and user interactions, as opposed to conversations only.

Ferret-UI can be considered a retrieval-augmented generation (RAG) implementation in which augmentation is performed not via retrieved documents, but via retrieved screens.

Conversations are unstructured data, and part of the job of a Conversational UI is to create structure around this unstructured data. In a similar fashion, Ferret-UI creates a structure around what is displayed on the screen.
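To make the idea of "structure around the screen" concrete, here is a minimal Python sketch. It is not Ferret-UI's actual output format or API: the `ScreenElement` schema and the `describe_screen` and `build_prompt` helpers are hypothetical names invented for illustration. The sketch assumes a screen has already been parsed into labelled elements with bounding boxes, and shows how that structure could be turned into text that augments a user question, much as retrieved documents would in RAG.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScreenElement:
    """One UI element on a screen (hypothetical schema, not Ferret-UI's own format)."""
    kind: str                        # e.g. "button", "text", "icon"
    label: str                       # visible text or an inferred description
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels


def describe_screen(elements: List[ScreenElement]) -> str:
    """Render elements as text lines, ordered top-to-bottom, then left-to-right."""
    ordered = sorted(elements, key=lambda e: (e.box[1], e.box[0]))
    return "\n".join(f"{e.kind}: '{e.label}' at {e.box}" for e in ordered)


def build_prompt(question: str, elements: List[ScreenElement]) -> str:
    """Augment a user question with the screen description as context."""
    return (
        f"Screen contents:\n{describe_screen(elements)}\n\n"
        f"User question: {question}"
    )


# Example: a made-up messaging screen with two elements.
screen = [
    ScreenElement("button", "Send", (300, 900, 380, 940)),
    ScreenElement("text", "New message", (20, 40, 200, 70)),
]
prompt = build_prompt("How do I send this message?", screen)
print(prompt)
```

The resulting prompt could then be passed to any language model. The point is only that the screen, once structured, plays the role that retrieved documents play in a conventional RAG pipeline.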

Ferret-UI adds language to what is mapped on the screen, allowing context-rich, accurate, multi-turn conversations. This is a significant step up from the “single dialog-turn command and control” scenario.

Not only does Ferret-UI add a language layer to devices, but other functionality can be added as well, such as task orchestration based on user behaviour, anticipation of the next interaction, user guidance, and more.

references:

https://cobusgreyling.medium.com/moving-from-natural-language-understanding-to-mobile-ui-understanding-18cd775c11b3
