Sionic AI · 2024.09 — now

OpenGateway Development/Operations

  • Extended an internal model routing engine into an OpenAI-compatible public API Gateway product
  • Serving about 180 RPM and 250K requests per day across 10+ Providers and 100+ models

Background

OpenGateway is an AI Gateway that serves multiple Model Providers through an OpenAI-compatible API.
It began as an internal-only feature, but once external customers started asking for model routing, it had to be extended into a standalone API product.

  • Parameter compatibility and routing quality had to hold even as models and Providers changed across SaaS and airgap environments
  • To support zone separation (such as kr and jp) and airgap deployments, the internal components had to be flexibly composable while preserving the OpenAI API spec

Outcomes

  • opengateway.ai — live service
  • Extended the existing internal model serving feature into an OpenAI-compatible public API Gateway product, building out every flow including API Key, Authn/Authz, Billing, Logs, and the frontend UI
  • Designed availability-first routing, Prompt Cache stickiness, parameter/error normalization, and a delegation structure built on a shared compatibility Mapper, so that 10+ Providers could be added consistently under a single OpenAI spec
  • Stably serving about 180 RPM and 250K requests per day across 10+ Providers and 100+ models
  • Added Redeem Codes, Admin features, Grafana dashboards, and model smoke/CI/daily tests to improve operational visibility and live stability

Design and Implementation

Planning and developing two backends and one frontend in parallel with a single junior developer required heavy use of AI.

  • Kept policies and work standards as a single source of truth in Skills, so that humans and AI could work in the same context
  • Separated the system's controlled flows from its dynamic decision-making, making it clear which areas to review directly and which to delegate to AI
  • More details are documented in "Thoughts on development that actively leverages AI"

BE server release log 1

Backend release history showing frequent, versioned deployments.

BE server release log 2

Continued backend release log entries with change summaries per version.

OpenGateway end-to-end flow — separation of engine and back office

Core design — Facade · routing · error normalization
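
As a rough illustration of how availability-first routing and Prompt Cache stickiness can coexist, the sketch below filters to healthy Providers first and then uses rendezvous hashing so that requests sharing a key keep landing on the same Provider. This is a minimal sketch under stated assumptions, not the production implementation: `Provider`, `pick_provider`, the `healthy` flag, and the idea of a `sticky_key` (for example a hash of the prompt prefix) are all hypothetical names introduced here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    healthy: bool  # assumption: fed by health checks or recent error rates


def _score(provider_name: str, sticky_key: str) -> int:
    # Rendezvous (highest-random-weight) hashing: every (provider, key)
    # pair gets a stable pseudo-random score.
    digest = hashlib.sha256(f"{provider_name}:{sticky_key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_provider(providers: list[Provider], sticky_key: str) -> Provider:
    # Availability first: unhealthy Providers are never candidates.
    healthy = [p for p in providers if p.healthy]
    if not healthy:
        raise RuntimeError("no healthy provider for this model")
    # Stickiness second: the same key maps to the same healthy Provider,
    # which keeps that Provider's prompt cache warm.
    return max(healthy, key=lambda p: _score(p.name, sticky_key))
```

With this shape, a key only moves to a different Provider when its current one becomes unhealthy; adding or removing other Providers does not disturb it.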

Parameter normalization and transparency
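
One way to picture parameter normalization with transparency is a function that adapts OpenAI-style parameters to a Provider's limits and returns a list of what it changed, so the adjustment can be surfaced in logs or responses. The sketch below is illustrative only: the `CAPABILITIES` table, the Provider names, and the limits in it are made-up assumptions, not real Provider constraints or the actual Mapper.

```python
from typing import Any

# Hypothetical capability table; real Provider limits differ and change over time.
CAPABILITIES: dict[str, dict[str, Any]] = {
    "provider_a": {"max_temperature": 2.0, "unsupported": set()},
    "provider_b": {"max_temperature": 1.0, "unsupported": {"logit_bias"}},
}


def normalize_params(
    provider: str, params: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Return Provider-safe params plus notes describing every adjustment."""
    caps = CAPABILITIES[provider]
    normalized: dict[str, Any] = {}
    notes: list[str] = []
    for key, value in params.items():
        if key in caps["unsupported"]:
            # Drop instead of failing, but record it so the caller can see it.
            notes.append(f"dropped {key}: not supported by {provider}")
            continue
        if key == "temperature" and value > caps["max_temperature"]:
            notes.append(f"clamped temperature {value} -> {caps['max_temperature']}")
            value = caps["max_temperature"]
        normalized[key] = value
    return normalized, notes
```

Returning the notes alongside the parameters keeps the gateway from silently changing request behavior.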

Core design — Provider extension structure

Billing pipeline

Front Office

OpenGateway Dashboard — API usage and performance monitoring

Front Office dashboard with API usage, traffic, and performance metrics.

OpenGateway Logs

Request logs list with per-call status, model, and timing.

OpenGateway Log Detail

Single log detail view showing request, routing, and response data.

Docs

OpenGateway Documentation — docs and API reference

Developer docs site with guides and OpenAI-compatible API reference.

Operational Stabilization

  • Monitored traffic, cost, response time, and Provider distribution in Grafana to track operational status
  • Standardized recurring operational tasks such as model addition/removal, SDK upgrades, releases, and weekly reports into Claude Skills, making routine operations easier
  • Applied smoke tests, CI tests, and daily jobs to 10+ Providers and 100+ models to continuously verify live environment stability
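
A smoke run over many Provider/model pairs can be sketched as a loop that sends one tiny request per pair and collects failures instead of stopping at the first one. This is a minimal sketch, assuming a hypothetical `send(provider, model)` callable that returns the reply text; the real test harness, its targets, and its pass criteria are not described in the source.

```python
from typing import Callable


def run_smoke(
    targets: list[tuple[str, str]],
    send: Callable[[str, str], str],
) -> dict[tuple[str, str], str]:
    """Call every (provider, model) pair once and return a map of failures."""
    failures: dict[tuple[str, str], str] = {}
    for provider, model in targets:
        try:
            reply = send(provider, model)
            if not reply.strip():
                failures[(provider, model)] = "empty response"
        except Exception as exc:  # one broken model must not stop the run
            failures[(provider, model)] = f"{type(exc).__name__}: {exc}"
    return failures
```

An empty result means every pair answered; a CI or daily job could fail the build or raise an alert whenever the map is non-empty.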

OpenGateway Grafana Dashboard

Grafana dashboard tracking traffic, cost, latency, and Provider distribution.

OpenGateway live smoke test results

Live smoke-test results verifying Providers and models in production.