Many enterprise applications run exclusively inside Citrix virtual desktops—locked behind VPNs, proprietary networks, or compliance boundaries. CloudCruise can automate these applications by connecting to Citrix through its HTML5 browser interface, treating the remote desktop as just another web page.

How It Works

Citrix Virtual Apps and Desktops (formerly XenApp/XenDesktop) offer an HTML5-based web client that streams the remote session directly into a browser tab. CloudCruise connects to this stream and interacts with the virtualized desktop using the same click, type, and extract primitives you’d use on any web application. Because the entire session lives inside a standard browser, CloudCruise can:
  • Navigate to your organization’s Citrix portal
  • Authenticate with stored credentials and handle 2FA
  • Launch specific virtual desktops or published applications
  • Interact with applications inside the virtual session
  • Extract data from screens using AI vision

Prerequisites

Requirement | Details
Citrix StoreFront or Workspace URL | Obtain from your IT team (e.g., https://citrix.yourcompany.com or https://yourcompany.cloud.com)
Login credentials | Username and password with access to the required virtual desktops
HTML5 receiver enabled | Your Citrix admin must enable HTML5 Receiver / Workspace for Web
MFA configuration | If required, set up in your Vault entry
Ask your Citrix administrator to confirm that HTML5 Receiver is enabled. Some organizations disable browser access and require the native Citrix Workspace app.

Setting Up a Citrix Workflow

Step 1: Store Credentials in the Vault

Create a Vault entry with your Citrix credentials:
{
  "username": "domain\\your_username",
  "password": "your_password"
}
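The doubled backslash in the username is JSON escaping, so the stored value contains a single backslash (DOMAIN\user). A small Python sketch of the round trip, using a made-up domain `CORP` and user `jdoe` purely for illustration:

```python
import json

# Hypothetical values for illustration; substitute your own domain and user.
entry = {"username": "CORP\\jdoe", "password": "your_password"}

# Serializing escapes the backslash, producing the doubled form shown above.
serialized = json.dumps(entry, indent=2)
print(serialized)

# Parsing the JSON restores the single backslash in DOMAIN\user.
restored = json.loads(serialized)
assert restored["username"] == "CORP\\jdoe"
assert len(restored["username"]) == len("CORP") + 1 + len("jdoe")
```

If you generate the entry programmatically, let the JSON serializer do the escaping rather than doubling the backslash by hand.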
If your organization uses 2FA, add the appropriate configuration:
{
  "username": "domain\\your_username",
  "password": "your_password",
  "tfa_method": "AUTHENTICATOR",
  "tfa_secret": "JBSWY3DPEHPK3PXP"
}
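To check a secret locally before storing it, you can compute the current code yourself and compare it with your authenticator app. The sketch below assumes the AUTHENTICATOR method is standard TOTP (RFC 6238 with HMAC-SHA1, 6 digits, 30-second steps); the source does not state this, and the function `totp` is illustrative, not part of CloudCruise.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) from a base32 secret."""
    # Authenticator secrets are base32; pad to a multiple of 8 characters.
    normalized = secret_b32.replace(" ", "").upper()
    normalized += "=" * (-len(normalized) % 8)
    key = base64.b32decode(normalized)

    # The moving factor is the number of time steps since the Unix epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks
    # a 4-byte window, from which the top bit is masked off.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Placeholder secret from the example entry above; never hard-code a real one.
print(totp("JBSWY3DPEHPK3PXP"))
```

If the printed code matches your authenticator app at the same moment, the secret was copied correctly. A mismatch usually means a typo in the secret or clock drift on the machine running the check.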

Step 2: Build the Connection Workflow

When building your workflow with the Builder Agent, the typical flow is:
  1. Navigate to your Citrix portal URL
  2. Authenticate with vault credentials
  3. Handle 2FA if prompted
  4. Select the target desktop or application from the launcher
  5. Choose HTML5 mode when prompted (select “Use light version” or “Use browser”)
  6. Wait for the virtual desktop to fully load
  7. Perform your target automation inside the session

Step 3: Interact with the Virtual Desktop

Once inside the Citrix session, CloudCruise uses AI (Screenshot) execution mode to interact with the remote desktop. Since the desktop is rendered as a canvas element, standard DOM selectors don’t work—the agent uses vision-based interaction instead.
Set the execution type to AI (Screenshot) for all click and input actions inside the Citrix session; in the node examples below, this appears as "execution": "LLM_VISION". This allows the agent to visually locate buttons, text fields, and other UI elements within the remote desktop.
Example node for clicking inside a Citrix session:
{
  "id": "click-citrix-app",
  "name": "Open application in Citrix",
  "action": "CLICK",
  "parameters": {
    "execution": "LLM_VISION",
    "element_description": "The icon labeled 'Patient Records' on the desktop"
  }
}
Example node for typing text:
{
  "id": "type-patient-name",
  "name": "Enter patient name in search field",
  "action": "INPUT_TEXT",
  "parameters": {
    "execution": "LLM_VISION",
    "element_description": "The search input field in the top navigation bar",
    "text": "{{context.inputs.patient_name}}"
  }
}
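If you generate or review node definitions in bulk, a small helper can keep the vision setting consistent. The following Python sketch only mirrors the fields shown in the two examples above (`id`, `name`, `action`, `parameters`); the helper names `vision_node` and `non_vision_nodes` are hypothetical, and the real node schema may include fields not shown here.

```python
from typing import Any

VISION = "LLM_VISION"  # execution value used in the examples above


def vision_node(node_id: str, name: str, action: str, description: str, **extra: Any) -> dict:
    """Build a node dict in the shape of the examples above."""
    return {
        "id": node_id,
        "name": name,
        "action": action,
        "parameters": {"execution": VISION, "element_description": description, **extra},
    }


def non_vision_nodes(nodes: list) -> list:
    """Return ids of CLICK/INPUT_TEXT nodes that are not set to vision execution."""
    return [
        n["id"]
        for n in nodes
        if n.get("action") in {"CLICK", "INPUT_TEXT"}
        and n.get("parameters", {}).get("execution") != VISION
    ]


nodes = [
    vision_node("click-citrix-app", "Open application in Citrix", "CLICK",
                "The icon labeled 'Patient Records' on the desktop"),
    vision_node("type-patient-name", "Enter patient name in search field", "INPUT_TEXT",
                "The search input field in the top navigation bar",
                text="{{context.inputs.patient_name}}"),
]
print(non_vision_nodes(nodes))  # prints []
```

Running the check over the nodes that execute inside the Citrix session gives a quick list of any click or input action that would otherwise try DOM-based execution against the canvas.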

Best Practices

Practice | Rationale
Leverage session persistence | Enable Cookies and Local Storage in your Vault entry to reuse sessions
Use descriptive element names | Helps the vision model locate targets accurately
Avoid window overlaps | Keep the target application in the foreground

File Transfers

File downloads and uploads through Citrix require additional consideration:

Downloads from Virtual Desktop

  1. Files download to the virtual desktop’s local storage first
  2. Use Citrix’s file transfer feature to move files to the browser
  3. CloudCruise can then capture the file using the File Download node

Uploads to Virtual Desktop

  1. Use the File Upload node to upload to the Citrix client
  2. Navigate to the file in the virtual session using the Citrix file browser

Interaction Failures

Symptom | Solution
Clicks miss their target | Add delays so the UI can stabilize; refine action prompts using each run's AI traces and reasoning output
Text input appears garbled | Slow down typing speed; some Citrix clients need character-by-character input
Actions go to wrong window | Ensure the target window is focused before interacting
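The idea behind "delays for UI stabilization" can be illustrated as waiting until consecutive screenshots stop changing, rather than sleeping for a fixed time. This is a generic Python sketch, not a CloudCruise feature: `capture` is a hypothetical callable that returns the current screenshot as bytes, and exact byte equality is a simplifying assumption (a blinking cursor or clock in the remote session would need a tolerance-based comparison instead).

```python
import hashlib
import time
from typing import Callable


def wait_until_stable(
    capture: Callable[[], bytes],
    stable_frames: int = 3,
    interval: float = 0.5,
    timeout: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once `stable_frames` consecutive captures are identical."""
    deadline = clock() + timeout
    last_hash = None
    run = 0  # length of the current run of identical frames
    while clock() < deadline:
        frame_hash = hashlib.sha256(capture()).hexdigest()
        run = run + 1 if frame_hash == last_hash else 1
        if run >= stable_frames:
            return True
        last_hash = frame_hash
        sleep(interval)
    return False  # the screen never settled within the timeout


# Demo with fake frames: the screen changes twice, then settles.
frames = iter([b"loading-1", b"loading-2", b"ready", b"ready", b"ready"])
print(wait_until_stable(lambda: next(frames), interval=0.0))  # prints True
```

The `sleep` and `clock` parameters are injectable only so the function can be exercised without real waiting.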

Limitations

Compared to automating native web applications, Citrix workflows have some constraints:
Limitation | Impact
No DOM access | Must use AI vision for all interactions
Higher latency | Each action incurs a network round trip plus AI inference latency
Session dependencies | Virtual desktop must remain stable throughout the run

Need Help?

Citrix environments vary significantly between organizations. If you encounter configuration issues or need assistance building a Citrix workflow, contact us at [email protected].