33.1 LLM 閘道器概述

33.1.1 什麼是 LLM 閘道器

LLM 閘道器是位於 Claude Code 和模型提供商之間的中間層，提供集中式的模型訪問管理。它充當代理，處理所有與 LLM API 的互動，為企業提供額外的控制和管理能力。

LLM 閘道器的核心功能

集中身份驗證 ：統一管理所有使用者的 API 金鑰和憑證
使用情況跟蹤 ：監控跨團隊和專案的使用情況
成本控制 ：實施預算限制和速率限制
審計日誌 ：記錄所有模型互動以滿足合規要求
模型路由 ：在不同提供商和模型之間動態切換
負載均衡 ：在多個 API 端點之間分配請求
快取最佳化 ：快取常見查詢以減少成本和延遲

LLM 閘道器架構

┌─────────────────────────────────────────┐ │ Claude Code 客戶端 │ │ (多個使用者、多個會話) │ └─────────────────────────────────────────┘ ↓ ┌─────────────────────────────────────────┐ │ LLM 閘道器 │ │ (身份驗證、路由、快取、監控) │ │ ┌──────────────────────────────┐ │ │ │ 認證層 │ │ │ │ (API 金鑰、SSO、MFA) │ │ │ └──────────────────────────────┘ │ │ ┌──────────────────────────────┐ │ │ │ 路由層 │ │ │ │ (模型選擇、負載均衡) │ │ │ └──────────────────────────────┘ │ │ ┌──────────────────────────────┐ │ │ │ 快取層 │ │ │ │ (查詢快取、響應快取) │ │ │ └──────────────────────────────┘ │ │ ┌──────────────────────────────┐ │ │ │ 監控層 │ │ │ │ (日誌、指標、告警) │ │ │ └──────────────────────────────┘ │ └─────────────────────────────────────────┘ ↓ ┌─────────────────────────────────────────┐ │ 模型提供商 │ │ (Anthropic、Bedrock、Vertex AI) │ └─────────────────────────────────────────┘

33.1.2 LLM 网关的优势

与直接访问对比

特性	直接訪問	LLM 閘道器

身份驗證| 每個 API 金鑰| 集中管理成本跟蹤| 分散| 統一速率限制| 按使用者| 按團隊/專案審計日誌| 有限| 完整模型切換| 手動| 自動快取| 無| 有負載均衡| 無| 有故障轉移| 無| 有

企業級優勢

成本最佳化

集中計費和預算控制
智慧快取減少重複查詢
按使用模式最佳化模型選擇

安全增強

統一身份驗證和授權
完整的審計追蹤
資料脫敏和過濾

運營效率

集中配置管理
統一監控和告警
簡化的使用者管理

合規支援

詳細的審計日誌
資料駐留控制
訪問控制策略

33.1.3 LLM 网关类型

1. 开源网关

LiteLLM

LiteLLM 是一個流行的開源 LLM 閘道器，支援多個提供商：特點：

支援 100+ LLM 提供商
統一的 API 介面
內建快取和速率限制
成本跟蹤和預算控制
易於部署和配置 適用場景 ：
需要多提供商支援
希望快速部署
預算有限

LangServe

LangServe 是 LangChain 的伺服器元件：特點：

與 LangChain 深度整合
支援自定義鏈和代理
實時流式響應
靈活的部署選項 適用場景 ：
使用 LangChain 生態
需要自定義處理邏輯
構建複雜的應用

2. 商业网关

Azure AI Gateway

微軟提供的託管閘道器服務：特點：

完全託管
企業級 SLA
整合 Azure 生態
高階安全功能 適用場景 ：
使用 Azure 基礎設施
需要託管服務
要求高可用性

AWS Bedrock Gateway

AWS 提供的閘道器服務：特點：

與 AWS 服務整合
原生 IAM 支援
CloudWatch 監控
自動擴充套件 適用場景 ：
使用 AWS 基礎設施
需要與 AWS 整合
要求企業級功能

3. 自建閘道器

企業可以構建自己的 LLM 閘道器：優勢：

完全控制
自定義功能
無供應商鎖定挑戰：
需要開發和維護
需要專業知識
持續更新成本

python

## 33.1.4 网关选择决策

### 决策因素

    python


    python

    class GatewaySelector:
        """网关选择器"""

        def __init__(self):
            self.gateways = {
                'litellm': {
                    'type': 'open_source',
                    'cost': 'low',
                    'complexity': 'low',
                    'features': ['caching', 'rate_limiting', 'cost_tracking'],
                    'providers': ['anthropic', 'bedrock', 'vertex', 'openai', 'cohere']
                },
                'langserve': {
                    'type': 'open_source',
                    'cost': 'low',
                    'complexity': 'medium',
                    'features': ['streaming', 'custom_chains', 'langchain_integration'],
                    'providers': ['anthropic', 'openai', 'cohere']
                },
                'azure_gateway': {
                    'type': 'commercial',
                    'cost': 'high',
                    'complexity': 'low',
                    'features': ['managed', 'sla', 'azure_integration'],
                    'providers': ['anthropic', 'openai', 'azure_openai']
                },
                'aws_gateway': {
                    'type': 'commercial',
                    'cost': 'high',
                    'complexity': 'low',
                    'features': ['managed', 'iam_integration', 'cloudwatch'],
                    'providers': ['anthropic', 'bedrock', 'ai21']
                },
                'custom': {
                    'type': 'custom',
                    'cost': 'medium',
                    'complexity': 'high',
                    'features': ['full_control', 'custom_features'],
                    'providers': ['all']
                }
            }

        def select(self, requirements: Dict) -> GatewayRecommendation:
            """选择网关"""
            scores = {}

            # 评估每个网关
            for gateway, metadata in self.gateways.items():
                score = self._evaluate_gateway(gateway, metadata, requirements)
                scores[gateway] = score

            # 选择最佳网关
            best_gateway = max(scores, key=scores.get)

            return GatewayRecommendation(
                gateway=best_gateway,
                score=scores[best_gateway],
                reasoning=self._generate_reasoning(best_gateway, requirements),
                alternatives=self._get_alternatives(scores, best_gateway)
            )

        def _evaluate_gateway(self,
                            gateway: str,
                            metadata: Dict,
                            requirements: Dict) -> float:
            """评估网关"""
            score = 0.0

            # 成本因素
            cost_preference = requirements.get('cost_preference', 'medium')
            cost_scores = {'low': 3, 'medium': 2, 'high': 1}
            score += cost_scores.get(metadata['cost'], 2)

            # 复杂度因素
            complexity_preference = requirements.get('complexity_preference', 'medium')
            complexity_scores = {'low': 3, 'medium': 2, 'high': 1}
            score += complexity_scores.get(metadata['complexity'], 2)

            # 功能匹配
            required_features = requirements.get('required_features', [])
            feature_match = len(
                set(required_features) & set(metadata['features'])
            ) / len(required_features) if required_features else 1.0
            score += feature_match * 2

            # 提供商支持
            required_providers = requirements.get('required_providers', [])
            if required_providers:
                provider_match = len(
                    set(required_providers) & set(metadata['providers'])
                ) / len(required_providers)
                score += provider_match * 2

            return score

        def _generate_reasoning(self,
                               gateway: str,
                               requirements: Dict) -> str:
            """生成选择理由"""
            metadata = self.gateways[gateway]

            reasons = []

            if metadata['cost'] == requirements.get('cost_preference'):
                reasons.append(f"Matches cost preference ({metadata['cost']})")

            if metadata['complexity'] == requirements.get('complexity_preference'):
                reasons.append(f"Matches complexity preference ({metadata['complexity']})")

            if 'full_control' in metadata['features']:
                reasons.append("Provides full control over functionality")

            if 'managed' in metadata['features']:
                reasons.append("Fully managed service with SLA")

            return '; '.join(reasons) if reasons else "Best overall match"

    ```### 選擇矩陣

```python
    | 需求 | LiteLLM | LangServe | Azure Gateway | AWS Gateway | 自建 |
    |-------|----------|-----------|---------------|--------------|-------|
    | 低成本 | ✓ | ✓ | ✗ | ✗ | ✓ |
    | 快速部署 | ✓ | ✓ | ✓ | ✓ | ✗ |
    | 多提供商 | ✓ | ✓ | ✗ | ✗ | ✓ |
    | 完全托管 | ✗ | ✗ | ✓ | ✓ | ✗ |
    | 自定义功能 | ✗ | ✓ | ✗ | ✗ | ✓ |
    | 低维护 | ✓ | ✓ | ✓ | ✓ | ✗ |
    | 企业 SLA | ✗ | ✗ | ✓ | ✓ | ✗ |

    ## 33.1.5 部署前准备

    ### 需求评估

    class GatewayRequirements:
    """网关需求评估"""
    def __init__(self):
    self.requirements = {
    'users': 0,
    'requests_per_day': 0,
    'providers': [],
    'features': [],
    'budget': 0.0,
    'sla_requirement': None
    }
    def assess(self, deployment_data: Dict) -> RequirementsReport:
    """评估需求"""
    report = RequirementsReport()
    # 评估用户数量
    report.users = deployment_data.get('users', 10)
    # 评估请求量
    report.requests_per_day = deployment_data.get('requests_per_day', 1000)
    # 评估提供商需求
    report.providers = deployment_data.get('providers', ['anthropic'])
    # 评估功能需求
    report.features = deployment_data.get('features', [])
    # 评估预算
    report.budget = deployment_data.get('budget', 1000.0)
    # 评估 SLA 需求
    report.sla_requirement = deployment_data.get('sla_requirement', '99.9%')
    # 生成基础设施需求
    report.infrastructure = self._calculate_infrastructure_needs(report)
    # 生成成本估算
    report.estimated_cost = self._estimate_cost(report)
    return report
    def _calculate_infrastructure_needs(self,
    report: RequirementsReport) -> InfrastructureNeeds:
    """计算基础设施需求"""
    needs = InfrastructureNeeds()
    # CPU 需求
    needs.cpu = max(2, report.users // 50)
    # 内存需求
    needs.memory = max(4, report.users // 25)
    # 存储需求
    needs.storage = max(20, report.requests_per_day // 100)
    # 网络带宽
    needs.bandwidth = max(10, report.requests_per_day // 100)
    return needs
    def _estimate_cost(self,
    report: RequirementsReport) -> CostEstimate:
    """估算成本"""
    estimate = CostEstimate()
    # 基础设施成本
    estimate.infrastructure_cost = (
    needs.cpu * 20 +  # $20 per CPU core per month
    needs.memory * 5 +  # $5 per GB RAM per month
    needs.storage * 0.1 +  # $0.1 per GB per month
    needs.bandwidth * 10  # $10 per Mbps per month
    )
    # API 成本
    estimate.api_cost = report.requests_per_day * 30 * 0.001  # $0.001 per request
    # 网关许可成本（如适用）
    estimate.license_cost = 0.0
    # 总成本
    estimate.total_cost = (
    estimate.infrastructure_cost +
    estimate.api_cost +
    estimate.license_cost
    )
    return estimate

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209

33.1 LLM 閘道器概述 ​

33.1.1 什麼是 LLM 閘道器 ​

LLM 閘道器的核心功能 ​

LLM 閘道器架構 ​

33.1.2 LLM 网关的优势 ​

与直接访问对比 ​

企業級優勢 ​

33.1.3 LLM 网关类型 ​

1. 开源网关 ​

LiteLLM ​

LangServe ​

2. 商业网关 ​

Azure AI Gateway ​

AWS Bedrock Gateway ​

3. 自建閘道器 ​