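33.1 LLM Gateway Overview

33.1.1 What Is an LLM Gateway

An LLM gateway is a middle layer between Claude Code and the model providers that centralizes management of model access. It acts as a proxy for every interaction with the LLM APIs, which gives an enterprise additional control and management capabilities.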
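Because the gateway is a proxy, pointing a client at it mostly means changing the endpoint the client talks to. The following is a minimal sketch, assuming the gateway exposes an Anthropic-compatible endpoint and that Claude Code is configured through the `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables. The URL, the token, and the helper name `gateway_env` are placeholders chosen for illustration.

```python
import os
from typing import Dict, Optional


def gateway_env(base_url: str, token: str,
                base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of an environment with the client pointed at a gateway."""
    # Copy so the caller's environment is never mutated.
    env = dict(os.environ if base is None else base)
    # Assumption: the client reads these two variables for endpoint and credential.
    env["ANTHROPIC_BASE_URL"] = base_url.rstrip("/")
    env["ANTHROPIC_AUTH_TOKEN"] = token
    return env


# Placeholder values; a real deployment would use its own gateway URL and key.
env = gateway_env("https://llm-gateway.example.com/", "team-a-gateway-key", base={})
print(env["ANTHROPIC_BASE_URL"])  # https://llm-gateway.example.com
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when launching the client, so the gateway settings apply to that process only.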

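Core functions of an LLM gateway

Centralized authentication: manages API keys and credentials for all users in one place
Usage tracking: monitors usage across teams and projects
Cost control: enforces budget limits and rate limits
Audit logging: records all model interactions to meet compliance requirements
Model routing: switches dynamically between providers and models
Load balancing: distributes requests across multiple API endpoints
Cache optimization: caches common queries to reduce cost and latency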
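The cost-control item in the list above combines two checks: a rate limit and a budget limit. One common way to implement the first is a token bucket. The sketch below is illustrative only; `TeamQuota`, its fields, and the dollar figures are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class TeamQuota:
    """Per-team monthly budget plus a token-bucket rate limit (illustrative)."""
    monthly_budget_usd: float
    rate_per_sec: float              # tokens refilled per second
    burst: int                       # bucket capacity
    spent_usd: float = 0.0
    tokens: float = field(init=False, default=0.0)
    last_refill: float = 0.0

    def __post_init__(self):
        # Start with a full bucket.
        self.tokens = float(self.burst)

    def allow(self, now: float, est_cost_usd: float) -> bool:
        """Admit a request only if both the rate limit and the budget permit it."""
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate_per_sec)
        self.last_refill = now
        if self.tokens < 1.0:
            return False             # rate limit exceeded
        if self.spent_usd + est_cost_usd > self.monthly_budget_usd:
            return False             # budget would be exceeded
        self.tokens -= 1.0
        self.spent_usd += est_cost_usd
        return True


quota = TeamQuota(monthly_budget_usd=1.0, rate_per_sec=1.0, burst=2)
results = [
    quota.allow(0.0, 0.4),   # admitted
    quota.allow(0.0, 0.4),   # admitted, bucket now empty
    quota.allow(0.0, 0.1),   # rejected by the rate limit
    quota.allow(1.0, 0.4),   # rejected by the budget
    quota.allow(1.0, 0.1),   # admitted
]
print(results)  # [True, True, False, False, True]
```

In a running gateway, `now` would typically come from `time.monotonic()`, and the estimated cost from the request's token counts.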

LLM gateway architecture

    ┌───────────────────────────────────────────────────┐
    │                Claude Code clients                │
    │        (multiple users, multiple sessions)        │
    └───────────────────────────────────────────────────┘
                              ↓
    ┌───────────────────────────────────────────────────┐
    │                    LLM gateway                    │
    │  (authentication, routing, caching, monitoring)   │
    │    ┌────────────────────────────────────────┐     │
    │    │          Authentication layer          │     │
    │    │          (API keys, SSO, MFA)          │     │
    │    └────────────────────────────────────────┘     │
    │    ┌────────────────────────────────────────┐     │
    │    │             Routing layer              │     │
    │    │   (model selection, load balancing)    │     │
    │    └────────────────────────────────────────┘     │
    │    ┌────────────────────────────────────────┐     │
    │    │             Caching layer              │     │
    │    │     (query cache, response cache)      │     │
    │    └────────────────────────────────────────┘     │
    │    ┌────────────────────────────────────────┐     │
    │    │            Monitoring layer            │     │
    │    │        (logs, metrics, alerts)         │     │
    │    └────────────────────────────────────────┘     │
    └───────────────────────────────────────────────────┘
                              ↓
    ┌───────────────────────────────────────────────────┐
    │                  Model providers                  │
    │          (Anthropic, Bedrock, Vertex AI)          │
    └───────────────────────────────────────────────────┘
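The four layers in the diagram can be read as a pipeline that each request passes through in order. The toy sketch below makes that order concrete. It is not how any particular gateway is implemented: `MiniGateway`, the key and model names, and the fake upstream function are all hypothetical, and the "provider" is a local callable rather than a network call.

```python
from typing import Callable, Dict, List, Tuple


class MiniGateway:
    """Toy gateway: authentication -> routing -> caching -> monitoring."""

    def __init__(self, api_keys: Dict[str, str],
                 providers: Dict[str, Callable[[str], str]]):
        self.api_keys = api_keys      # gateway key -> user name
        self.providers = providers    # model name -> upstream callable
        self.cache: Dict[Tuple[str, str], str] = {}
        self.log: List[dict] = []

    def handle(self, key: str, model: str, prompt: str) -> str:
        # Authentication layer: reject unknown gateway keys.
        user = self.api_keys.get(key)
        if user is None:
            self.log.append({"user": None, "model": model, "status": "denied"})
            raise PermissionError("unknown gateway key")
        # Routing layer: pick the upstream for the requested model.
        upstream = self.providers.get(model)
        if upstream is None:
            self.log.append({"user": user, "model": model, "status": "no_route"})
            raise LookupError(f"no provider configured for {model!r}")
        # Caching layer: identical (model, prompt) pairs reuse the stored response.
        cache_key = (model, prompt)
        hit = cache_key in self.cache
        if not hit:
            self.cache[cache_key] = upstream(prompt)
        # Monitoring layer: record who asked for what and whether it was a hit.
        self.log.append({"user": user, "model": model,
                         "status": "hit" if hit else "miss"})
        return self.cache[cache_key]


calls: List[str] = []


def fake_upstream(prompt: str) -> str:
    """Stand-in for a provider call; records each invocation."""
    calls.append(prompt)
    return prompt.upper()


gateway = MiniGateway({"key-alice": "alice"}, {"model-a": fake_upstream})
first = gateway.handle("key-alice", "model-a", "hello")
second = gateway.handle("key-alice", "model-a", "hello")  # served from the cache
print(first, second, len(calls))  # HELLO HELLO 1
```

The second call never reaches the upstream, which is where the cost and latency savings of the caching layer come from.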

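33.1.2 Advantages of an LLM Gateway

Comparison with direct access

| Feature | Direct access | LLM gateway |
|---------|---------------|-------------|
| Authentication | Per API key | Centrally managed |
| Cost tracking | Scattered | Unified |
| Rate limiting | Per user | Per team/project |
| Audit logging | Limited | Complete |
| Model switching | Manual | Automatic |
| Caching | No | Yes |
| Load balancing | No | Yes |
| Failover | No | Yes |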
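The last two rows of the table, load balancing and failover, are often implemented together: requests rotate across endpoints, and a failed endpoint is skipped in favor of the next one. The sketch below shows one simple way to do that. `EndpointPool` and the endpoint functions are hypothetical stand-ins, and only `ConnectionError` is treated as a reason to fail over.

```python
from typing import Callable, List, Sequence


class EndpointPool:
    """Round-robin load balancing with failover across endpoints."""

    def __init__(self, endpoints: Sequence[Callable[[str], str]]):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self.endpoints = list(endpoints)
        self._next = 0  # index of the endpoint to try first on the next request

    def send(self, prompt: str) -> str:
        errors: List[Exception] = []
        count = len(self.endpoints)
        for offset in range(count):
            index = (self._next + offset) % count
            try:
                result = self.endpoints[index](prompt)
            except ConnectionError as exc:
                errors.append(exc)   # failover: try the next endpoint
                continue
            self._next = (index + 1) % count
            return result
        raise RuntimeError(f"all {count} endpoints failed: {errors}")


def endpoint_a(prompt: str) -> str:
    return "a:" + prompt


def endpoint_down(prompt: str) -> str:
    raise ConnectionError("endpoint unavailable")


def endpoint_b(prompt: str) -> str:
    return "b:" + prompt


pool = EndpointPool([endpoint_a, endpoint_down, endpoint_b])
replies = [pool.send("x") for _ in range(3)]
print(replies)  # ['a:x', 'b:x', 'a:x']
```

The second request skips the unavailable endpoint and lands on the third, which is the behavior a client talking directly to a single API endpoint does not get.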

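Enterprise advantages

Cost optimization

Centralized billing and budget control
Smart caching that cuts repeated queries
Model selection tuned to usage patterns

Security enhancements

Unified authentication and authorization
A complete audit trail
Data masking and filtering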
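Data masking, the last item above, usually means rewriting a prompt before it leaves the gateway. A minimal sketch using regular expressions follows. The two patterns are assumptions chosen for illustration (email addresses, and keys that start with `sk-`); a real deployment would maintain its own pattern set.

```python
import re

# Illustrative patterns only; both are assumptions, not a complete PII catalog.
_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    (re.compile(r"\bsk-[A-Za-z0-9_-]{8,}\b"), "[API_KEY]"),
]


def mask(text: str) -> str:
    """Replace each sensitive match with a fixed placeholder."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


masked = mask("mail bob@example.com key sk-abcdef123456")
print(masked)  # mail [EMAIL] key [API_KEY]
```

Applying the same function to text before it is written to the audit log keeps secrets out of the audit trail as well.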

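Operational efficiency

Centralized configuration management
Unified monitoring and alerting
Simplified user management

Compliance support

Detailed audit logs
Data residency controls
Access control policies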

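33.1.3 Types of LLM Gateways

1. Open-source gateways

LiteLLM

LiteLLM is a popular open-source LLM gateway that supports multiple providers.

Features:

Supports 100+ LLM providers
Unified API interface
Built-in caching and rate limiting
Cost tracking and budget control
Easy to deploy and configure

Use cases:

Multi-provider support is required
Fast deployment is a priority
The budget is limited

LangServe

LangServe is the server component of LangChain.

Features:

Deep integration with LangChain
Support for custom chains and agents
Real-time streaming responses
Flexible deployment options

Use cases:

The team already uses the LangChain ecosystem
Custom processing logic is needed
The application being built is complex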

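2. Commercial gateways

Azure AI Gateway

A managed gateway service from Microsoft.

Features:

Fully managed
Enterprise-grade SLA
Integration with the Azure ecosystem
Advanced security features

Use cases:

The infrastructure runs on Azure
A managed service is preferred
High availability is required

AWS Bedrock Gateway

A gateway service from AWS.

Features:

Integration with AWS services
Native IAM support
CloudWatch monitoring
Auto scaling

Use cases:

The infrastructure runs on AWS
Integration with AWS is needed
Enterprise-grade features are required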

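3. Self-built gateways

An enterprise can build its own LLM gateway.

Advantages:

Full control
Custom features
No vendor lock-in

Challenges:

Requires development and maintenance
Requires specialized expertise
Carries ongoing update costs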

33.1.4 Choosing a Gateway

Decision factors

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Ordinal scale shared by the 'cost' and 'complexity' ratings.
LEVELS = {'low': 1, 'medium': 2, 'high': 3}


@dataclass
class GatewayRecommendation:
    """Outcome of a gateway selection."""
    gateway: str
    score: float
    reasoning: str
    alternatives: List[str] = field(default_factory=list)


class GatewaySelector:
    """Gateway selector."""

    def __init__(self):
        self.gateways = {
            'litellm': {
                'type': 'open_source',
                'cost': 'low',
                'complexity': 'low',
                'features': ['caching', 'rate_limiting', 'cost_tracking'],
                'providers': ['anthropic', 'bedrock', 'vertex', 'openai', 'cohere']
            },
            'langserve': {
                'type': 'open_source',
                'cost': 'low',
                'complexity': 'medium',
                'features': ['streaming', 'custom_chains', 'langchain_integration'],
                'providers': ['anthropic', 'openai', 'cohere']
            },
            'azure_gateway': {
                'type': 'commercial',
                'cost': 'high',
                'complexity': 'low',
                'features': ['managed', 'sla', 'azure_integration'],
                'providers': ['anthropic', 'openai', 'azure_openai']
            },
            'aws_gateway': {
                'type': 'commercial',
                'cost': 'high',
                'complexity': 'low',
                'features': ['managed', 'iam_integration', 'cloudwatch'],
                'providers': ['anthropic', 'bedrock', 'ai21']
            },
            'custom': {
                'type': 'custom',
                'cost': 'medium',
                'complexity': 'high',
                'features': ['full_control', 'custom_features'],
                'providers': ['all']
            }
        }

    def select(self, requirements: Dict) -> GatewayRecommendation:
        """Select a gateway."""
        scores = {}

        # Score every gateway
        for gateway, metadata in self.gateways.items():
            scores[gateway] = self._evaluate_gateway(gateway, metadata, requirements)

        # Pick the highest-scoring gateway
        best_gateway = max(scores, key=scores.get)

        return GatewayRecommendation(
            gateway=best_gateway,
            score=scores[best_gateway],
            reasoning=self._generate_reasoning(best_gateway, requirements),
            alternatives=self._get_alternatives(scores, best_gateway)
        )

    def _evaluate_gateway(self,
                          gateway: str,
                          metadata: Dict,
                          requirements: Dict) -> float:
        """Evaluate one gateway against the requirements."""
        score = 0.0

        # Cost: 3 points for matching the preference, fewer the further away.
        # Defaults to 'low' so that, absent a preference, cheaper scores higher.
        cost_preference = requirements.get('cost_preference', 'low')
        score += 3 - abs(LEVELS[metadata['cost']] - LEVELS.get(cost_preference, 1))

        # Complexity: same scheme as cost
        complexity_preference = requirements.get('complexity_preference', 'low')
        score += 3 - abs(LEVELS[metadata['complexity']]
                         - LEVELS.get(complexity_preference, 1))

        # Feature match: fraction of required features the gateway offers
        required_features = requirements.get('required_features', [])
        feature_match = len(
            set(required_features) & set(metadata['features'])
        ) / len(required_features) if required_features else 1.0
        score += feature_match * 2

        # Provider support: fraction of required providers the gateway supports
        required_providers = requirements.get('required_providers', [])
        if required_providers:
            supported = set(metadata['providers'])
            if 'all' in supported:
                provider_match = 1.0  # 'all' means any provider can be targeted
            else:
                provider_match = len(
                    set(required_providers) & supported
                ) / len(required_providers)
            score += provider_match * 2

        return score

    def _generate_reasoning(self,
                            gateway: str,
                            requirements: Dict) -> str:
        """Build a human-readable explanation for the choice."""
        metadata = self.gateways[gateway]

        reasons = []

        if metadata['cost'] == requirements.get('cost_preference'):
            reasons.append(f"Matches cost preference ({metadata['cost']})")

        if metadata['complexity'] == requirements.get('complexity_preference'):
            reasons.append(f"Matches complexity preference ({metadata['complexity']})")

        if 'full_control' in metadata['features']:
            reasons.append("Provides full control over functionality")

        if 'managed' in metadata['features']:
            reasons.append("Fully managed service with SLA")

        return '; '.join(reasons) if reasons else "Best overall match"

    def _get_alternatives(self,
                          scores: Dict[str, float],
                          best_gateway: str,
                          limit: int = 2) -> List[str]:
        """Return the next-best gateways, highest score first."""
        ranked = sorted(scores, key=scores.get, reverse=True)
        return [name for name in ranked if name != best_gateway][:limit]
```

Selection matrix

| Requirement | LiteLLM | LangServe | Azure Gateway | AWS Gateway | Self-built |
|-------------|---------|-----------|---------------|-------------|------------|
| Low cost | ✓ | ✓ | ✗ | ✗ | ✓ |
| Fast deployment | ✓ | ✓ | ✓ | ✓ | ✗ |
| Multiple providers | ✓ | ✓ | ✗ | ✗ | ✓ |
| Fully managed | ✗ | ✗ | ✓ | ✓ | ✗ |
| Custom features | ✗ | ✓ | ✗ | ✗ | ✓ |
| Low maintenance | ✓ | ✓ | ✓ | ✓ | ✗ |
| Enterprise SLA | ✗ | ✗ | ✓ | ✓ | ✗ |
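33.1.5 Pre-deployment Preparation

Requirements assessment

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InfrastructureNeeds:
    """Sizing estimate for the gateway host."""
    cpu: int = 0        # CPU cores
    memory: int = 0     # GB of RAM
    storage: int = 0    # GB of disk
    bandwidth: int = 0  # Mbps


@dataclass
class CostEstimate:
    """Monthly cost estimate in USD."""
    infrastructure_cost: float = 0.0
    api_cost: float = 0.0
    license_cost: float = 0.0
    total_cost: float = 0.0


@dataclass
class RequirementsReport:
    """Collected requirements plus the estimates derived from them."""
    users: int = 0
    requests_per_day: int = 0
    providers: List[str] = field(default_factory=list)
    features: List[str] = field(default_factory=list)
    budget: float = 0.0
    sla_requirement: Optional[str] = None
    infrastructure: Optional[InfrastructureNeeds] = None
    estimated_cost: Optional[CostEstimate] = None


class GatewayRequirements:
    """Gateway requirements assessment."""

    def __init__(self):
        self.requirements = {
            'users': 0,
            'requests_per_day': 0,
            'providers': [],
            'features': [],
            'budget': 0.0,
            'sla_requirement': None
        }

    def assess(self, deployment_data: Dict) -> RequirementsReport:
        """Assess requirements."""
        report = RequirementsReport()
        # Number of users
        report.users = deployment_data.get('users', 10)
        # Request volume
        report.requests_per_day = deployment_data.get('requests_per_day', 1000)
        # Required providers
        report.providers = deployment_data.get('providers', ['anthropic'])
        # Required features
        report.features = deployment_data.get('features', [])
        # Budget
        report.budget = deployment_data.get('budget', 1000.0)
        # SLA requirement
        report.sla_requirement = deployment_data.get('sla_requirement', '99.9%')
        # Derive infrastructure needs
        report.infrastructure = self._calculate_infrastructure_needs(report)
        # Derive a cost estimate
        report.estimated_cost = self._estimate_cost(report)
        return report

    def _calculate_infrastructure_needs(self,
                                        report: RequirementsReport) -> InfrastructureNeeds:
        """Calculate infrastructure needs."""
        needs = InfrastructureNeeds()
        # CPU cores
        needs.cpu = max(2, report.users // 50)
        # Memory (GB)
        needs.memory = max(4, report.users // 25)
        # Storage (GB)
        needs.storage = max(20, report.requests_per_day // 100)
        # Network bandwidth (Mbps)
        needs.bandwidth = max(10, report.requests_per_day // 100)
        return needs

    def _estimate_cost(self,
                       report: RequirementsReport) -> CostEstimate:
        """Estimate monthly cost."""
        estimate = CostEstimate()
        # Sizing computed earlier in assess()
        needs = report.infrastructure
        # Infrastructure cost
        estimate.infrastructure_cost = (
            needs.cpu * 20 +        # $20 per CPU core per month
            needs.memory * 5 +      # $5 per GB RAM per month
            needs.storage * 0.1 +   # $0.1 per GB per month
            needs.bandwidth * 10    # $10 per Mbps per month
        )
        # API cost
        estimate.api_cost = report.requests_per_day * 30 * 0.001  # $0.001 per request
        # Gateway license cost (if applicable)
        estimate.license_cost = 0.0
        # Total cost
        estimate.total_cost = (
            estimate.infrastructure_cost +
            estimate.api_cost +
            estimate.license_cost
        )
        return estimate
```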
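The assessment collects a budget, but the code above never compares it with the estimate. A natural follow-up is a check of whether the estimated monthly cost fits. The sketch below restates the same sizing rules and the same illustrative unit prices in two standalone functions; `monthly_cost` and `fits_budget` are hypothetical names, and the prices are the sample figures from the comments above rather than real quotes.

```python
def monthly_cost(users: int, requests_per_day: int) -> float:
    """Estimated monthly cost in USD, using the sizing rules shown above."""
    cpu = max(2, users // 50)                      # cores
    memory = max(4, users // 25)                   # GB RAM
    storage = max(20, requests_per_day // 100)     # GB
    bandwidth = max(10, requests_per_day // 100)   # Mbps
    infrastructure = cpu * 20 + memory * 5 + storage * 0.1 + bandwidth * 10
    api = requests_per_day * 30 * 0.001            # $0.001 per request, 30 days
    return infrastructure + api


def fits_budget(users: int, requests_per_day: int, budget: float) -> bool:
    """True when the estimate does not exceed the monthly budget."""
    return monthly_cost(users, requests_per_day) <= budget


# Worked example: 200 users and 10,000 requests per day.
# cpu=4, memory=8, storage=100, bandwidth=100
# infrastructure = 80 + 40 + 10 + 1000 = 1130; API = 300; total = 1430
print(round(monthly_cost(200, 10_000), 2))   # 1430.0
print(fits_budget(200, 10_000, 1000.0))      # False
print(fits_budget(10, 1_000, 1000.0))        # True
```

In this example bandwidth dominates the estimate, so the request volume matters far more than the user count when checking a deployment against its budget.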

