Many enterprise applications run exclusively inside Citrix virtual desktops—locked behind VPNs, proprietary networks, or compliance boundaries. CloudCruise can automate these applications by connecting to Citrix through its HTML5 browser interface, treating the remote desktop as just another web page.
How It Works
Citrix Virtual Apps and Desktops (formerly XenApp/XenDesktop) offer an HTML5-based web client that streams the remote session directly into a browser tab. CloudCruise connects to this stream and interacts with the virtualized desktop using the same click, type, and extract primitives you’d use on any web application.
Because the entire session lives inside a standard browser, CloudCruise can:
- Navigate to your organization’s Citrix portal
- Authenticate with stored credentials and handle 2FA
- Launch specific virtual desktops or published applications
- Interact with applications inside the virtual session
- Extract data from screens using AI vision
Prerequisites
| Requirement | Details |
|---|---|
| Citrix StoreFront or Workspace URL | Obtain from your IT team (e.g., https://citrix.yourcompany.com or https://yourcompany.cloud.com) |
| Login credentials | Username and password with access to the required virtual desktops |
| HTML5 receiver enabled | Your Citrix admin must enable browser access through Citrix Workspace app for HTML5 (formerly Receiver for HTML5) |
| MFA configuration | If your organization requires MFA, add the 2FA fields to your Vault entry (see Step 1) |
Ask your Citrix administrator to confirm that HTML5 Receiver is enabled. Some organizations disable browser access and require the native Citrix Workspace app.
Setting Up a Citrix Workflow
Step 1: Store Credentials in the Vault
Create a Vault entry with your Citrix credentials:
{
  "username": "domain\\your_username",
  "password": "your_password"
}
If your organization uses 2FA, add the appropriate configuration:
{
  "username": "domain\\your_username",
  "password": "your_password",
  "tfa_method": "AUTHENTICATOR",
  "tfa_secret": "JBSWY3DPEHPK3PXP"
}
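For context on what the tfa_secret value represents: authenticator apps typically derive their six-digit codes from a base32 secret using the TOTP algorithm (RFC 6238). The sketch below assumes that is what the AUTHENTICATOR method refers to. How CloudCruise generates codes internally is not described here, and the function name totp is illustrative only.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Return the RFC 6238 one-time password for a base32-encoded secret.

    Assumes the common authenticator defaults: HMAC-SHA1, 30-second steps.
    """
    # Base32 input must be padded to a multiple of 8 characters.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)

    # The moving factor is the number of whole time steps since the Unix epoch.
    now = time.time() if at is None else at
    counter = int(now // step)

    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks
    # a 4-byte window, and the top bit of that window is masked off.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Sample secret from the Vault entry above; the output changes every 30 seconds.
    print(totp("JBSWY3DPEHPK3PXP"))
```

If your authenticator app uses these default settings, comparing its current code with the output of totp for the same secret is a quick way to check that the secret was copied correctly before saving it to the Vault.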
Step 2: Build the Connection Workflow
When building your workflow with the Builder Agent, the typical flow is:
- Navigate to your Citrix portal URL
- Authenticate with vault credentials
- Handle 2FA if prompted
- Select the target desktop or application from the launcher
- Choose HTML5 mode when prompted (select “Use light version” or “Use browser”)
- Wait for the virtual desktop to fully load
- Perform your target automation inside the session
Step 3: Interact with the Virtual Desktop
Once inside the Citrix session, CloudCruise uses AI (Screenshot) execution mode to interact with the remote desktop. Since the desktop is rendered as a canvas element, standard DOM selectors don’t work—the agent uses vision-based interaction instead.
Set the execution type to AI (Screenshot) for every click and input action inside the Citrix session; the example nodes below express this as "execution": "LLM_VISION". This lets the agent visually locate buttons, text fields, and other UI elements within the remote desktop.
Example node for clicking inside a Citrix session:
{
  "id": "click-citrix-app",
  "name": "Open application in Citrix",
  "action": "CLICK",
  "parameters": {
    "execution": "LLM_VISION",
    "element_description": "The icon labeled 'Patient Records' on the desktop"
  }
}
Example node for typing text:
{
  "id": "type-patient-name",
  "name": "Enter patient name in search field",
  "action": "INPUT_TEXT",
  "parameters": {
    "execution": "LLM_VISION",
    "element_description": "The search input field in the top navigation bar",
    "text": "{{context.inputs.patient_name}}"
  }
}
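When a workflow needs many nodes of this shape, generating them from a small helper keeps the fields consistent. The sketch below is a hypothetical Python helper, not part of any CloudCruise SDK. It only reproduces the fields shown in the two examples above and assumes CLICK and INPUT_TEXT are the only actions you need.

```python
import json
from typing import Optional

# Only the two actions that appear in the examples above are accepted here.
SUPPORTED_ACTIONS = {"CLICK", "INPUT_TEXT"}


def vision_node(
    node_id: str,
    name: str,
    action: str,
    element_description: str,
    text: Optional[str] = None,
) -> dict:
    """Build a node dict that uses vision-based execution (hypothetical helper)."""
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if action == "INPUT_TEXT" and text is None:
        raise ValueError("INPUT_TEXT nodes need a 'text' value")
    if action == "CLICK" and text is not None:
        raise ValueError("CLICK nodes do not take a 'text' value")

    parameters = {
        "execution": "LLM_VISION",
        "element_description": element_description,
    }
    if text is not None:
        parameters["text"] = text
    return {"id": node_id, "name": name, "action": action, "parameters": parameters}


if __name__ == "__main__":
    node = vision_node(
        "type-patient-name",
        "Enter patient name in search field",
        "INPUT_TEXT",
        "The search input field in the top navigation bar",
        text="{{context.inputs.patient_name}}",
    )
    print(json.dumps(node, indent=2))
```

Rejecting malformed combinations up front, such as an INPUT_TEXT node without text, surfaces mistakes before a run starts inside the Citrix session.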
Best Practices
| Practice | Rationale |
|---|---|
| Leverage session persistence | Enable Cookies and Local Storage in your Vault entry to reuse sessions |
| Use descriptive element names | Helps the vision model locate targets accurately |
| Avoid window overlaps | Keep the target application in the foreground |
File Transfers
File downloads and uploads through Citrix require additional consideration:
Downloads from Virtual Desktop
- Files download to the virtual desktop’s local storage first
- Use Citrix’s file transfer feature to move files to the browser
- CloudCruise can then capture the file using the File Download node
Uploads to Virtual Desktop
- Use the File Upload node to upload to the Citrix client
- Navigate to the file in the virtual session using the Citrix file browser
Interaction Failures
| Symptom | Solution |
|---|---|
| Clicks miss their target | Add delays so the UI can stabilize, and refine the action prompts using each run’s AI traces and reasoning output |
| Text input appears garbled | Slow down typing speed; some Citrix clients need character-by-character input |
| Actions go to wrong window | Ensure the target window is focused before interacting |
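One way to think about adding delays for UI stabilization is to wait until the streamed desktop stops changing, instead of sleeping for a fixed time. The following Python sketch illustrates that idea only. Here capture is a hypothetical callable that returns the current screenshot as bytes, and nothing in it is a CloudCruise API.

```python
import hashlib
import time
from typing import Callable


def wait_until_stable(
    capture: Callable[[], bytes],
    interval: float = 0.5,
    required_matches: int = 2,
    timeout: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `required_matches` consecutive frames equal the frame before them.

    Returns False if the screen is still changing when `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    previous = hashlib.sha256(capture()).hexdigest()
    matches = 0
    while time.monotonic() < deadline:
        sleep(interval)
        current = hashlib.sha256(capture()).hexdigest()
        # Any change resets the streak; an identical frame extends it.
        matches = matches + 1 if current == previous else 0
        if matches >= required_matches:
            return True
        previous = current
    return False
```

Exact byte comparison is a simplification: a blinking cursor, a clock, or lossy stream compression would keep frames from ever matching, so a practical version would likely compare a cropped region or tolerate small differences.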
Limitations
Compared to automating native web applications, Citrix workflows have some constraints:
| Limitation | Impact |
|---|---|
| No DOM access | Must use AI vision for all interactions |
| Higher latency | Each action incurs a network round trip plus added AI inference latency |
| Session dependencies | Virtual desktop must remain stable throughout the run |
Need Help?
Citrix environments vary significantly between organizations. If you encounter configuration issues or need assistance building a Citrix workflow, contact us at [email protected].