AI Workflow MCP


Llama 4 Maverick 17B 128E Instruct FP8

azure_ai model — 1M-token context window.

Specs

Model ID: azure-ai-llama-4-maverick-17b-128e-instruct-fp8
Provider: azure_ai
Family: llama-4
Status: active
Input $/MTok: $1.41
Output $/MTok: $0.35
Cache read $/MTok: —
Cache write $/MTok: —
Context window: 1,000,000 tokens (1M)
Max output: 16,384 tokens (16K)
Input modalities: text
Output modalities: text
Capabilities: —
Supported parameters: max_tokens, temperature, top_p
Knowledge cutoff: 1970-01
Release date: 1970-01-01
Last synced: 2026-05-13
Deprecated: —
Official docs: Open docs
Intelligence index: not measured
Throughput: —
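
The per-MTok rates above can be turned into a per-request cost estimate by scaling each rate by the token counts involved. A minimal sketch, using the rates from the specs table (the token counts in the example are hypothetical):

```python
# Estimate request cost from the per-MTok rates in the specs table above.
INPUT_RATE = 1.41   # $ per 1M input tokens
OUTPUT_RATE = 0.35  # $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request."""
    return (input_tokens / 1_000_000) * INPUT_RATE + \
           (output_tokens / 1_000_000) * OUTPUT_RATE

# e.g. a 200K-token prompt with a 4K-token completion:
print(f"${estimate_cost(200_000, 4_000):.4f}")
```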

Call via MCP

// claude-desktop / cursor / windsurf
{
  "jsonrpc": "2.0",
  "method": "tools/call",
  "id": 1,
  "params": {
    "name": "get_model_pricing",
    "arguments": { "model_id": "azure-ai-llama-4-maverick-17b-128e-instruct-fp8" }
  }
}

Call via REST

curl -X POST https://mcp.aiworkflowmcp.com/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"tools/call","id":1,"params":{"name":"get_model_pricing","arguments":{"model_id":"azure-ai-llama-4-maverick-17b-128e-instruct-fp8"}}}'