Single Rust binary (~3 MB) that reads the OS accessibility tree and gives every UI element a REST endpoint. Click buttons, type text, toggle checkboxes — all via JSON.
Works as an MCP server too, so Claude/Cursor/Windsurf can control any desktop app out of the box.
Windows + Linux + macOS. MIT licensed.
- Screenshot capture: GET /windows/{pid}/screenshot → returns PNG
- Batch operations: POST /interact/batch → multiple actions per request
- Wait/poll: GET /windows/{pid}/wait?q=Submit&timeout=5000
- Python & TypeScript SDKs (local install, PyPI/npm coming soon)
- OpenAPI spec, Dockerfile, 7 example scripts
- Demo GIF in README showing Calculator automation via Claude Code
Thanks for the feedback everyone!