Our browser agent executes workflows. You can think of a workflow as a container for all the context our agent needs to automate a business process reliably. A workflow stores several pieces of structured and unstructured data:
  • High-level workflow description
  • Directed automation graph specifying the happy path
  • Natural-language descriptions of every action and of the UI state at the time of that action
  • Input and output JSON schemas: the data needed to fulfill the task and the data the browser agent is expected to return
  • Edge cases, each with an associated error code
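To make the list above concrete, here is a minimal sketch of how such a workflow could be modeled in Python. The class names, field names, error code, and the invoice example are all hypothetical and chosen for illustration; this is not our actual storage format. It also assumes the happy path is a simple chain of actions.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step in the automation graph (hypothetical shape)."""
    id: str
    description: str  # natural-language description of the action
    ui_state: str     # natural-language description of the UI at that moment


@dataclass
class EdgeCase:
    """A known failure mode and the error code reported for it."""
    description: str
    error_code: str


@dataclass
class Workflow:
    description: str               # high-level workflow description
    actions: dict[str, Action]     # graph nodes, keyed by action id
    edges: list[tuple[str, str]]   # directed (from_id, to_id) pairs
    input_schema: dict             # JSON schema for data the task needs
    output_schema: dict            # JSON schema for data the agent returns
    edge_cases: list[EdgeCase] = field(default_factory=list)

    def happy_path(self) -> list[str]:
        """Return action ids in order, assuming the graph is a simple chain."""
        successors = dict(self.edges)
        targets = {dst for _, dst in self.edges}
        # The start action is the only one with no incoming edge.
        starts = [a for a in self.actions if a not in targets]
        if len(starts) != 1:
            raise ValueError("expected exactly one start action")
        path = [starts[0]]
        while path[-1] in successors:
            nxt = successors[path[-1]]
            if nxt in path:
                raise ValueError("cycle detected in automation graph")
            path.append(nxt)
        return path


# Illustrative data only; the workflow and its values are made up.
workflow = Workflow(
    description="Download the latest invoice from a vendor portal",
    actions={
        "login": Action("login", "Sign in with stored credentials", "Login form is visible"),
        "open_invoices": Action("open_invoices", "Open the Invoices tab", "Main dashboard is visible"),
        "download": Action("download", "Download the newest invoice", "Invoice table is visible"),
    },
    edges=[("login", "open_invoices"), ("open_invoices", "download")],
    input_schema={
        "type": "object",
        "properties": {"vendor_id": {"type": "string"}},
        "required": ["vendor_id"],
    },
    output_schema={
        "type": "object",
        "properties": {"invoice_url": {"type": "string"}},
        "required": ["invoice_url"],
    },
    edge_cases=[EdgeCase("Login credentials rejected", "AUTH_FAILED")],
)

print(workflow.happy_path())
```

Keeping the edges separate from the actions means the same action descriptions can be reused if the graph is later extended with branches, although `happy_path` as written would need to change to handle them.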
Here’s an example of an automation graph:
[Figure: Main dashboard interface]