OmniMCP

Automate Any UI Effortlessly

Agent-Native Interface for Vision-Language UI Automation

Unlock powerful automation through scene graph tracking, rich visual context, persistent memory, and intuitive interactions powered by OmniParser and the Model Context Protocol (MCP).

Core Features:

Agent-Native Interface
Rich Visual Context
Scene Graph Tracking
Memory Persistence
Natural Language UI
Comprehensive Actions
Structured Types
Robust Error Handling

from omnimcp import Omni

omni = Omni(endpoint="localhost:1024")  # or omni.api

# Log in and get the applicant's latest underwriting date
@omni.publish
def extract_underwriting_date(o):
    # `is` is a reserved word in Python, so `o.is(...)` does not parse;
    # look the method up by name instead.
    if getattr(o, "is")("Login form ready"):
        o.do(f"Enter {o.recall('credentials')}")
        o.do("Submit login")
        o.observe("latest underwriting date")
        o.store("applicant.last_underwriting_date")

omni.session("extract_underwriting_date").run()

Simple, powerful interface for UI automation

Technical Deep Dive

Understand the architecture and capabilities of OmniMCP in depth with our comprehensive technical whitepaper.

Read Whitepaper

View on GitHub

Core Features

OmniMCP delivers powerful features to enable deep UI understanding and reliable automation.

Rich Visual Context

Deep understanding of UI elements and their relationships for accurate interaction.

Natural Language Interface

Target and analyze elements using natural descriptions without complex selectors.
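
To make the idea concrete, here is a minimal sketch of resolving a plain-language description to an element. It uses simple word overlap as a stand-in for whatever matching OmniMCP actually performs; the `find_element` helper and the element dictionaries are assumptions for illustration, not part of the published API.

```python
import re


def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_element(description, elements):
    """Return the element sharing the most words with the description, or None."""
    query = tokens(description)

    def score(el):
        return len(query & tokens(f"{el['type']} {el['content']}"))

    best = max(elements, key=score)
    return best if score(best) > 0 else None


# Hypothetical parsed elements; the field names are illustrative only.
elements = [
    {"id": 0, "type": "text_field", "content": "Username"},
    {"id": 1, "type": "text_field", "content": "Password"},
    {"id": 2, "type": "button", "content": "Log in"},
]

match = find_element("the password field", elements)
print(match)
```

No selector appears anywhere in the call: the caller describes the target, and the lookup decides which element fits best.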

Comprehensive Interactions

Full range of UI operations with verification and robust error handling.

Structured Types

Clean, typed responses using dataclasses for reliable integration.
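
As a rough illustration of what a typed response could look like, the sketch below models a UI element and an action result as dataclasses. The class and field names (`Bounds`, `UIElement`, `ActionResult`) are assumptions made for this example, not OmniMCP's published types.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Bounds:
    """Bounding box in normalized screen coordinates (hypothetical layout)."""
    x: float
    y: float
    width: float
    height: float

    def center(self) -> Tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


@dataclass
class UIElement:
    """One parsed on-screen element (hypothetical field names)."""
    id: int
    type: str
    content: str
    bounds: Bounds


@dataclass
class ActionResult:
    """Outcome of an interaction, with the element it touched if any."""
    success: bool
    element: Optional[UIElement] = None
    error: Optional[str] = None


button = UIElement(id=1, type="button", content="Submit",
                   bounds=Bounds(0.25, 0.5, 0.5, 0.25))
result = ActionResult(success=True, element=button)

# asdict() recursively converts nested dataclasses, which makes
# the result straightforward to serialize.
payload = asdict(result)
print(payload)
```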

Robust Error Handling

Detailed error context and recovery strategies for reliable automation.
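
One way such error context and recovery could fit together is sketched below: an exception that records what was on screen, plus a retry helper that runs a recovery step between attempts. `ElementNotFoundError` and `with_retry` are hypothetical names for this example, and the "screen" is simulated with a dictionary.

```python
class ElementNotFoundError(Exception):
    """Hypothetical error type that carries context useful for recovery."""

    def __init__(self, description, candidates):
        super().__init__(f"No element matches {description!r}")
        self.description = description
        self.candidates = candidates  # labels that were visible at failure time


def with_retry(action, recover, attempts=3):
    """Run `action`; on failure call `recover(error)` and try again."""
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except ElementNotFoundError as err:
            last_error = err
            recover(err)
    raise last_error


# Simulated screen state: the Submit button is not visible at first.
screen = {"labels": ["Cancel"]}


def click_submit():
    if "Submit" not in screen["labels"]:
        raise ElementNotFoundError("Submit button", list(screen["labels"]))
    return "clicked Submit"


def rescan(err):
    # Stand-in for re-parsing the screen after a failed attempt.
    screen["labels"].append("Submit")


outcome = with_retry(click_submit, rescan)
print(outcome)
```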

MCP Integration

Standardized interface for AI model interaction with UI automation.

How OmniMCP Works

Our four-step process creates rich UI understanding for AI models

1. Spatial Feature Understanding

OmniMCP begins by developing a deep understanding of the user interface's visual layout. Using OmniParser, it performs detailed visual parsing, segmenting the screen and identifying all interactive and informational elements.
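
The sketch below shows one simple way parsed elements could be turned into a scene graph: each element is linked to the elements its bounding box visually contains. The `Element` type and the pixel `bbox` format are assumptions for illustration and do not reflect OmniParser's actual output schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    id: int
    type: str
    content: str
    bbox: tuple  # (x1, y1, x2, y2) in pixels; hypothetical format


def contains(outer, inner):
    """True if inner's box lies entirely within outer's box."""
    ox1, oy1, ox2, oy2 = outer.bbox
    ix1, iy1, ix2, iy2 = inner.bbox
    return (outer.id != inner.id
            and ox1 <= ix1 and oy1 <= iy1
            and ox2 >= ix2 and oy2 >= iy2)


def build_scene_graph(elements):
    """Map each element id to the ids of the elements it visually contains."""
    return {e.id: [c.id for c in elements if contains(e, c)] for e in elements}


screen = [
    Element(0, "panel", "Login form", (100, 100, 500, 400)),
    Element(1, "text_field", "Username", (120, 150, 480, 190)),
    Element(2, "button", "Submit", (120, 320, 240, 360)),
    Element(3, "link", "Help", (520, 20, 580, 40)),
]

graph = build_scene_graph(screen)
print(graph)
```

Here the login panel contains the username field and the submit button, while the help link sits outside it.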

2. Temporal Feature Understanding

To capture the dynamic aspects of the UI, OmniMCP tracks user interactions and the resulting state transitions. It builds a Process Graph that represents how user workflows move from one UI state to the next.
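
A process graph of this kind can be sketched as a directed graph whose nodes are UI states and whose edges are actions. The `ProcessGraph` class below is a hypothetical, minimal version: it records observed transitions and uses breadth-first search to find the shortest action sequence between two states.

```python
from collections import defaultdict, deque


class ProcessGraph:
    """Directed graph of UI states connected by actions (illustrative only)."""

    def __init__(self):
        self.edges = defaultdict(dict)  # state -> {action: next_state}

    def record(self, state, action, next_state):
        self.edges[state][action] = next_state

    def plan(self, start, goal):
        """Return the shortest list of actions from start to goal, or None."""
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            state, actions = queue.popleft()
            if state == goal:
                return actions
            for action, nxt in self.edges.get(state, {}).items():
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, actions + [action]))
        return None


graph = ProcessGraph()
graph.record("login_form", "submit credentials", "dashboard")
graph.record("dashboard", "open applicant", "applicant_page")
graph.record("dashboard", "log out", "login_form")

steps = graph.plan("login_form", "applicant_page")
print(steps)
```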

3. Internal API Generation

Utilizing the rich spatial and temporal context, OmniMCP leverages a Large Language Model to generate an internal, context-specific API through In-Context Learning.
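
In-context learning means the model is steered by what is placed in its prompt rather than by retraining. The sketch below assembles such a prompt from parsed elements, observed transitions, and a few worked examples. The prompt wording, the `build_prompt` helper, and the input shapes are all assumptions; no model is called, and OmniMCP's real prompt format is not described here.

```python
def build_prompt(elements, transitions, examples):
    """Assemble a few-shot prompt from spatial and temporal context."""
    lines = ["You are generating a Python API for the UI described below.",
             "", "Elements:"]
    lines += [f"- [{e['id']}] {e['type']}: {e['content']}" for e in elements]
    lines += ["", "Observed transitions:"]
    lines += [f"- {s} --{a}--> {n}" for s, a, n in transitions]
    lines += ["", "Examples:"]
    for task, code in examples:
        lines += [f"Task: {task}", f"Code: {code}"]
    lines += ["", "Write one function per workflow."]
    return "\n".join(lines)


# Hypothetical context for a login screen.
prompt = build_prompt(
    elements=[{"id": 0, "type": "text_field", "content": "Username"}],
    transitions=[("login_form", "submit credentials", "dashboard")],
    examples=[("Log in", 'o.do("Submit login")')],
)
print(prompt)
```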

4. External API Publication (MCP)

Finally, OmniMCP exposes this dynamically generated internal API through the Model Context Protocol (MCP), providing a consistent interface for both humans and AI models.
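
MCP is built on JSON-RPC 2.0, and clients discover and invoke tools through the `tools/list` and `tools/call` methods. The sketch below is a simplified in-process dispatcher for those two methods only; it omits the initialization handshake and any transport, and it is not OmniMCP's server. The `tool` decorator, the `handle` function, and the `applicant_id` argument are hypothetical.

```python
import json

TOOLS = {}


def tool(name, description, input_schema):
    """Register a function as a callable tool (illustrative registry)."""
    def register(fn):
        TOOLS[name] = {"fn": fn, "description": description,
                       "inputSchema": input_schema}
        return fn
    return register


@tool(
    "extract_underwriting_date",
    "Return the applicant's latest underwriting date.",
    {"type": "object",
     "properties": {"applicant_id": {"type": "string"}},
     "required": ["applicant_id"]},
)
def extract_underwriting_date(applicant_id):
    # Placeholder: a real workflow would drive the UI here.
    return f"no date recorded for {applicant_id}"


def handle(request_json):
    """Answer a single JSON-RPC request for tools/list or tools/call."""
    req = json.loads(request_json)
    if req["method"] == "tools/list":
        result = {"tools": [
            {"name": n, "description": t["description"],
             "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]}
    elif req["method"] == "tools/call":
        params = req["params"]
        text = TOOLS[params["name"]]["fn"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"),
                           "error": {"code": -32601, "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "extract_underwriting_date",
                      "arguments": {"applicant_id": "A-1"}}}
print(handle(json.dumps(request)))
```

Because every workflow is exposed through the same two methods, a client needs no workflow-specific integration code.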

Simple, Transparent Pricing

Self-host our open source solution, or let us handle everything with one of our managed plans.

Community

Free forever

Full open source access
Self-hosted deployment
Community support
MIT license

Get Started

Developer Plan

$49/month

Fully managed cloud hosting
Unlimited automation workflows
Email support
Regular updates and enhancements

Start Free Trial

Team Plan (Recommended)

$199/month

Up to 5 team members
Collaboration tools and shared workspaces
Priority email support
Advanced analytics and usage insights

Start Free Trial

Enterprise

Custom pricing

Unlimited users
Dedicated infrastructure
24/7 premium support
Personalized onboarding and training

Contact Sales

All paid plans include:

Free 14-day trial, cancel anytime

Comprehensive documentation

Secure, reliable cloud infrastructure

Regular feature updates

Join the Waitlist

Be the first to access our managed OmniMCP service when it launches.