{"id":61,"date":"2024-01-16T02:51:22","date_gmt":"2024-01-16T02:51:22","guid":{"rendered":"https:\/\/khozouei.com\/?p=61"},"modified":"2024-01-16T02:51:22","modified_gmt":"2024-01-16T02:51:22","slug":"large-action-agenic-models","status":"publish","type":"post","link":"https:\/\/khozouei.com\/index.php\/2024\/01\/16\/large-action-agenic-models\/","title":{"rendered":"Large Action (Agentic) Models"},"content":{"rendered":"\n<p>Wow, these are pretty cool. The idea is basically to take multi-modal models like ChatGPT and use their combined image and language capabilities to navigate software interfaces.<\/p>\n\n\n\n<p>Rabbit R1 (<a href=\"https:\/\/www.rabbit.tech\/\">link<\/a>) is the latest example, which uses a Large Action Model to learn how you use your mobile phone and learn how to navigate and use apps, and then perform tasks on your behalf.<\/p>\n\n\n\n<p>It's kinda like combining RPA with AI, where the model interprets the interface rather than just pressing buttons blindly. There is a lot of opportunity here around automated troubleshooting.<\/p>\n\n\n\n<p>I can imagine a use case where complex proprietary software, combined with hardware, perhaps in medicine, might require highly skilled troubleshooting support. Rather than depending on people in a contact center, first-level support could be delivered by an AI agent.<\/p>\n\n\n\n<p>If we can give the AI agent access to the interface, it can automatically navigate and troubleshoot on its own, given a description of the problem. Otherwise, it can describe the steps incrementally via a chat\/speech interface.<\/p>\n\n\n\n<p>Ultimately, there is a business model where troubleshooting support is outsourced to a third party that runs and manages AI bots for that purpose. Alternatively, the AI troubleshooting model is bundled with the technology for a kind of self-healing app.
<\/p>\n\n\n\n<p>You could even have the AI generate training packs based on the specific troubleshooting issue that the individual was having.<\/p>\n\n\n\n<p>You could also identify troubleshooting trends and issues that affect large numbers of people, and even come up with a list of recommended improvements to increase reliability and user satisfaction.<\/p>\n\n\n\n<p>However, I can only see this working as an outsourced model, because running it in-house would require an AI\/ML Ops capability, which is very hard and potentially costly to build.<\/p>\n\n\n\n<p>I found this interesting link on one approach to using prompts to solve this problem (<a href=\"https:\/\/dagster.io\/blog\/chatgpt-langchain\">link<\/a>).<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Wow, these are pretty cool. The idea is basically to take multi-modal models like ChatGPT and use their combined image and language capabilities to navigate software interfaces. Rabbit R1 (link) is the latest example, which uses a Large Action Model to learn how you use your mobile phone and learn how to navigate and use apps, 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/posts\/61"}],"collection":[{"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/comments?post=61"}],"version-history":[{"count":1,"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/posts\/61\/revisions"}],"predecessor-version":[{"id":62,"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/posts\/61\/revisions\/62"}],"wp:attachment":[{"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/media?parent=61"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/categories?post=61"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/khozouei.com\/index.php\/wp-json\/wp\/v2\/tags?post=61"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}