Open Models Inference for coding
The best open coding models, at the best price.
Use them from the tool you already love, or from a cloud agent we host. Reach that agent from any device.
Use it with the tools you already use
Running frontier open models
Hosted Kimi K2.6 and GLM 5.1. Generous coding plans, on infrastructure we own.
Why teams choose Umans AI
Frontier open coding models at the best price
Open models are getting good enough
Open coding models have caught up for real agentic work. You can ship with them today, and the lineup only gets better.
Open models, open stack, no lock-in
Open weights you can audit. An endpoint you can swap. We do not keep your prompts or code.
Inference is hard. That's our job.
Running frontier open models on recent architectures takes real ops to do well. We handle the GPUs, the SLOs, and the model lineup so you do not have to.
Reach your agent from anywhere
Open a cloud agent on our infra and reach it from your laptop or your phone. Our users asked for it, so we built it.
How it works
Use it in the tool you already use
Pick your tool. Setup is one command or two env vars.
OpenCode
Zero config. Listed as a provider on models.dev. Pick Umans AI Coding Plan, paste your key.
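For tools that take the "two env vars" route, setup might look like the sketch below. The page does not name the variables, so `UMANS_API_KEY` and `UMANS_BASE_URL` are hypothetical placeholders, as is the assumption that the two values are an API key and an endpoint URL. Check your tool's setup instructions for the real names.

```python
import os


def load_umans_config(env=os.environ):
    """Read the two settings a tool would need from the environment.

    The variable names are hypothetical placeholders, not documented names.
    """
    names = ("UMANS_API_KEY", "UMANS_BASE_URL")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail early with a readable message instead of a confusing auth error later.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["UMANS_API_KEY"],
        # Strip a trailing slash so paths can be appended consistently.
        "base_url": env["UMANS_BASE_URL"].rstrip("/"),
    }
```

Calling `load_umans_config()` with no arguments reads the real process environment. Passing a plain dict makes the function easy to test.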
Remote
Cloud agents
Early release. Spin up a coding agent on our infra in one click. It lands wired up to our inference, with the umans CLI installed and configured. Reach it from your laptop or phone. Same session, any device.
Same session on your laptop, phone, or tablet. Nothing to re-set up.
Each agent runs in its own box. No access to your files, repos, or configs.
The agent runs on our servers, not yours.
Pricing at a glance
Plans
- 200 effective requests per rolling 5h window
- Up to 3 concurrent requests
- Full open model lineup
- Unlimited tokens
- Up to 4 concurrent sessions
- Full open model lineup
- Seats for team members
- Per-token service keys for automations
- Centralized usage, billing, and roles
Plans for individuals. Service keys for automations.
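To see how a rolling-window quota and a concurrency cap interact, here is a minimal client-side sketch using the figures from the first plan above (200 requests per rolling 5h window, 3 concurrent requests). It is an illustration, not how the service enforces limits. The class name `PlanLimiter` is hypothetical, and it assumes every request counts as exactly one "effective request", which the page does not define.

```python
import time
from collections import deque


class PlanLimiter:
    """Client-side guard for a rolling-window quota plus a concurrency cap.

    Assumption: each started request counts as one "effective request".
    """

    def __init__(self, max_requests=200, window_seconds=5 * 3600,
                 max_concurrent=3, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.max_concurrent = max_concurrent
        self.clock = clock
        self._started = deque()  # start times of requests still inside the window
        self._in_flight = 0      # requests started but not yet finished

    def _evict(self, now):
        # Drop start times that have aged out of the rolling window.
        while self._started and now - self._started[0] >= self.window_seconds:
            self._started.popleft()

    def remaining(self):
        """Requests still available in the current rolling window."""
        self._evict(self.clock())
        return self.max_requests - len(self._started)

    def try_start(self):
        """Record a request and return True if both limits allow it."""
        now = self.clock()
        self._evict(now)
        if self._in_flight >= self.max_concurrent:
            return False
        if len(self._started) >= self.max_requests:
            return False
        self._started.append(now)
        self._in_flight += 1
        return True

    def finish(self):
        """Mark one in-flight request as done, freeing a concurrency slot."""
        self._in_flight = max(0, self._in_flight - 1)
```

Because the window rolls, capacity comes back gradually: each request stops counting 5 hours after it started, rather than everything resetting at a fixed time.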