Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Gateway Server

The Sipp Gateway Server is the first-party HTTP application for teams that want one inference boundary for local GGUF targets and provider-backed targets. It lives in apps/gateway-server.

This page covers source checkout and generated executable operation. Use Docker for container workflows and Configuration for the TOML schema.

The current release workflow does not publish a standalone binary, public container image, or cargo install target. Build it from the source checkout.

Source Workflow

Use sipp for source checkout workflows. sipp is the setup-installed launcher for cargo xtask; when the launcher is unavailable, use cargo xtask with the same arguments.

cp apps/gateway-server/config/local.toml.example apps/gateway-server/config/local.toml
cp apps/gateway-server/.env.example apps/gateway-server/.env
set -a
. apps/gateway-server/.env
set +a
sipp run gateway-server check --config apps/gateway-server/config/local.toml --backend vulkan
sipp run gateway-server serve --config apps/gateway-server/config/local.toml --backend vulkan

Before running real on-board inference tests, update the ignored local TOML with the token env names, admin password env name, and model path. Update only secret values in the secrets env file.

sipp run gateway-server check builds the staged gateway distribution for the selected backend, then runs sipp-gateway check. The binary check command parses and validates TOML only. It does not read bearer-token environment variables, load model files, contact providers, or bind ports.

sipp run gateway-server serve builds the staged gateway distribution, then runs the generated sipp-gateway executable from the workspace root. It reads secret environment variables named by TOML, loads targets, binds both listeners, and exits cleanly on Ctrl-C.

Use --backend cpu|vulkan|cuda|metal|all to select the backend compiled into the staged gateway distribution.

Provider-Only Source Workflow

Provider-only gateways route to upstream APIs and do not load a local GGUF model. Use a CPU gateway build because inference happens at the provider:

cp apps/gateway-server/config/provider-only.toml.example apps/gateway-server/config/provider-only.toml
cp apps/gateway-server/.env.example apps/gateway-server/.env
set -a
. apps/gateway-server/.env
set +a
sipp run gateway-server check --config apps/gateway-server/config/provider-only.toml --backend cpu
sipp run gateway-server serve --config apps/gateway-server/config/provider-only.toml --backend cpu

Use Configuration for Anthropic and OpenAI-compatible target snippets.

Generated Executable

sipp build gateway-server --backend <backend> stages a runnable distribution in .build/artifacts/gateway-server. The directory contains the sipp-gateway executable, base runtime libraries, and selected GGML backend plugins. The build also compiles the React Admin Dashboard from apps/gateway-server/admin-ui and copies its Vite output to .build/artifacts/gateway-server/admin-ui. Keep the executable, dashboard asset directory, and runtime libraries together.

Direct execution must put the artifact directory on the dynamic loader path. The executable reads dashboard assets from admin-ui beside the binary unless SIPP_GATEWAY_ADMIN_ASSETS_DIR points at another Vite dist directory.

Linux:

set -a
. apps/gateway-server/.env
set +a
export LD_LIBRARY_PATH="$(pwd)/.build/artifacts/gateway-server${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
.build/artifacts/gateway-server/sipp-gateway check --config apps/gateway-server/config/local.toml
.build/artifacts/gateway-server/sipp-gateway serve --config apps/gateway-server/config/local.toml

macOS:

set -a
. apps/gateway-server/.env
set +a
export DYLD_LIBRARY_PATH="$(pwd)/.build/artifacts/gateway-server${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"
.build/artifacts/gateway-server/sipp-gateway check --config apps/gateway-server/config/local.toml
.build/artifacts/gateway-server/sipp-gateway serve --config apps/gateway-server/config/local.toml

Windows PowerShell:

Get-Content apps\gateway-server\.env | ForEach-Object {
    if ($_ -and -not $_.StartsWith("#")) {
        $name, $value = $_.Split("=", 2)
        Set-Item -Path "Env:$name" -Value $value
    }
}
$dist = Join-Path (Get-Location) ".build\artifacts\gateway-server"
$env:PATH = "$dist;$env:PATH"
.\.build\artifacts\gateway-server\sipp-gateway.exe check --config apps\gateway-server\config\local.toml
.\.build\artifacts\gateway-server\sipp-gateway.exe serve --config apps\gateway-server\config\local.toml

Relative model paths in TOML are resolved from the process working directory. The sipp run gateway-server ... workflow runs from the workspace root. When running the executable from another directory, use absolute model paths or start the process from the workspace root.

Backends

The gateway server supports the same native backend names as other native targets:

  • cpu: provider-only router build or local-inference diagnostic backend.
  • cuda: NVIDIA CUDA backend.
  • metal: Apple Metal backend on macOS.
  • vulkan: Vulkan backend.
  • all: host-supported backend set for build commands.

For on-board local target TOML, backend = "auto" selects the best compiled and available backend in this order: CUDA, Metal, Vulkan, then CPU. Production model-serving configs should use auto or an explicit GPU backend. Explicit cpu disables GPU offload and is intended only for diagnostics. Explicit GPU backends fail if that backend was not compiled or is unavailable.

Admin Dashboard

The Admin Dashboard password is read from the env var named by TOML:

admin_password_env = "SIPP_GATEWAY_ADMIN_PASSWORD"

Keep the real value in a secrets env file or production secret manager.