Config: support IPv4 bind host for inference servers#757

Open
dustinrubin5050 wants to merge 4 commits into main from add-ipv4-branch

Conversation

@dustinrubin5050
Collaborator

Summary

  • Add a configuration option to control which host inference servers bind to (IPv4 vs. IPv6).
  • Intended for EKS IPv4 pod networking, where binding to :: can leave the server unreachable.
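
The bind-host selection described above could be sketched as follows. The setting name prefer_ipv4 and the helper name are assumptions for illustration; the PR's actual config key is not shown here.

```python
# Minimal sketch: choose the inference server's bind host from a config flag.
# "prefer_ipv4" is a hypothetical setting name, not taken from this PR.
def resolve_bind_host(prefer_ipv4: bool) -> str:
    # "0.0.0.0" binds all IPv4 interfaces; "::" binds IPv6 and, on
    # dual-stack hosts, often IPv4-mapped addresses as well. On EKS
    # IPv4-only pod networking, binding "::" can make the server
    # unreachable, which is what motivates the option.
    return "0.0.0.0" if prefer_ipv4 else "::"
```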

Test plan

  • Stand up a fresh environment and confirm that launch-endpoint (vLLM) becomes reachable via the service alias.

Made with Cursor

dustinrubin5050 and others added 4 commits February 10, 2026 10:50
Route OpenAI-compatible endpoint Services to the user container port (5005) to avoid fragile forwarder dependencies for /v2 -> /v1 proxying. Also remove an extra positional argument from the vLLM command so --served-model-name is applied reliably.

Co-authored-by: Cursor <cursoragent@cursor.com>
When plugins auth is unavailable, bearer tokens flow through FakeAuthenticationRepository. Map long tokens to deterministic 24-character ids so that model endpoint creation does not fail on VARCHAR(24) created_by/owner columns.

Co-authored-by: Cursor <cursoragent@cursor.com>
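
The token shortening described in this commit could look like the following sketch. The helper name and the choice of SHA-256 are assumptions; the PR's actual implementation is not shown here.

```python
import hashlib

# Sketch: map an arbitrary-length bearer token to a deterministic id
# that fits a VARCHAR(24) column. Hashing keeps the mapping stable, so
# the same caller always produces the same created_by/owner value.
def short_token_id(token: str, max_len: int = 24) -> str:
    if len(token) <= max_len:
        return token
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:max_len]
```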
Coerce quoted boolean fields (e.g. "false") when loading service_config.yaml so that istio_enabled and similar flags aren't accidentally enabled by truthy strings.

Co-authored-by: Cursor <cursoragent@cursor.com>
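
The coercion described in this commit could be sketched like this. The function name is hypothetical; istio_enabled is one of the flags named in the commit message.

```python
# Sketch: coerce quoted YAML booleans such as "false" into real bools.
# Needed because bool("false") is True in Python, so a quoted string
# loaded from service_config.yaml would silently enable flags like
# istio_enabled.
def coerce_bool(value):
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "yes", "1"):
            return True
        if lowered in ("false", "no", "0"):
            return False
    return bool(value)
```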
