Chat Completions APIs

HexGrid exposes OpenAI-compatible Chat Completions APIs for hosted LLMs, letting you use a familiar messages-based interface across different model families.

Use these guides to send standard chat requests, stream responses, configure generation parameters, and work with model-specific capabilities such as reasoning, tool calling, JSON output, and structured responses.

Each model page includes copy-pasteable cURL examples, expected response shapes, and notes for using that model through HexGrid’s vLLM engine.

Was this page helpful?