Secure, OpenAI-compatible model serving at the perimeter — auth, routing, and token-aware load balancing across your edge backends.
Drop-in endpoint for any compatible client.
OIDC/JWT with per-route role rules.
Routes by token-aware load across vLLM and Ollama backends.
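As a rough illustration of what token-aware load balancing can look like, here is a minimal sketch that sends each request to the backend with the fewest tokens currently in flight. Everything in it is an assumption made for illustration and not the product's actual implementation: the `Backend` class, the backend names, and the 4-characters-per-token estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str                 # hypothetical label, e.g. "vllm-0"
    inflight_tokens: int = 0  # tokens currently reserved on this backend


def estimate_tokens(prompt: str, max_tokens: int) -> int:
    # Crude assumption: ~4 characters per prompt token, plus the completion budget.
    return len(prompt) // 4 + max_tokens


def pick_backend(backends: list[Backend], prompt: str, max_tokens: int) -> tuple[Backend, int]:
    # Choose the backend with the least outstanding token load and reserve capacity on it.
    cost = estimate_tokens(prompt, max_tokens)
    chosen = min(backends, key=lambda b: b.inflight_tokens)
    chosen.inflight_tokens += cost
    return chosen, cost


def release(backend: Backend, cost: int) -> None:
    # Give the reserved tokens back once the request has finished.
    backend.inflight_tokens = max(0, backend.inflight_tokens - cost)
```

Counting tokens rather than requests keeps one long generation from being treated the same as one short completion; a real gateway would likely replace the character-based estimate with the model's tokenizer.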
Validate identity & role.
Pick the backend.
NVFP4 fast path.
Meter & log.
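The role check and the metering step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the gateway's real code: it assumes the JWT signature has already been verified upstream and the claims decoded into a dict, that roles arrive in a `roles` claim, and that the route-to-role table `ROUTE_ROLES` is a hypothetical example. The NVFP4 fast path is not modeled here.

```python
import logging
import time

log = logging.getLogger("gateway")  # hypothetical logger name

# Hypothetical per-route role rules: route path -> roles allowed to call it.
ROUTE_ROLES = {
    "/v1/chat/completions": {"user", "admin"},
    "/v1/embeddings": {"admin"},
}


def is_authorized(claims: dict, route: str) -> bool:
    # Reject expired tokens; "exp" is the standard JWT expiry claim (Unix seconds).
    if claims.get("exp", 0) <= time.time():
        return False
    # Unknown routes have no allowed roles, so they are denied by default.
    allowed = ROUTE_ROLES.get(route, set())
    return bool(allowed & set(claims.get("roles", [])))


def meter(claims: dict, route: str, prompt_tokens: int, completion_tokens: int) -> dict:
    # Build one usage record per request and write it to the log.
    record = {
        "sub": claims.get("sub"),
        "route": route,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
    log.info("usage %s", record)
    return record
```

Denying routes that are missing from the table is a deliberate fail-closed choice in this sketch.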
Turnkey Edge-AI — fixed time, fixed cost, full responsibility.