31.2 Google Vertex AI 集成

31.2.1 Vertex AI 概述

Google Vertex AI 是 Google Cloud 提供的统一机器学习平台，支持通过 API 访问多种基础模型，包括 Anthropic 的 Claude 模型。通过 Vertex AI 使用 Claude Code 可以为企业带来以下优势：

Vertex AI 的优势

GCP 原生集成 ：与 Google Cloud IAM、Cloud Audit Logs、Cloud Monitoring 等服务无缝集成
全球端点 ：支持全球访问，提供更好的延迟和可用性
100 万令牌上下文窗口 ：支持超长上下文，适合大型代码库分析
企业级安全 ：符合 Google Cloud 安全标准和合规要求
灵活的部署 ：支持区域和全局端点，满足不同需求

适用场景

已经使用 Google Cloud Platform 的企业
需要超长上下文窗口的应用
要求使用 Google Cloud IAM 进行身份验证的场景
需要全球访问和低延迟的环境

python

## 31.2.2 Vertex AI 配置步骤

### 1\. 前置条件检查

class VertexAIPrerequisitesChecker: """Vertex AI 前置条件检查器"""

def **init**(self): self.checks = { 'gcp_project': False, 'vertex_enabled': False, 'model_access': False, 'iam_permissions': False, 'gcloud_configured': False }

def check_all(self) -> PrerequisiteReport: """检查所有前置条件""" report = PrerequisiteReport()

# 检查 GCP 项目

self.checks['gcp_project'] = self._check_gcp_project()

# 检查 Vertex AI 是否启用

self.checks['vertex_enabled'] = self._check_vertex_enabled()

# 检查模型访问权限

self.checks['model_access'] = self._check_model_access()

# 检查 IAM 权限

self.checks['iam_permissions'] = self._check_iam_permissions()

# 检查 gcloud 配置

self.checks['gcloud_configured'] = self._check_gcloud_configured()

# 生成报告

report.checks = self.checks report.all_passed = all(self.checks.values()) report.missing = [ check for check, passed in self.checks.items() if not passed ]

return report

python

def _check_gcp_project(self) -> bool: """检查 GCP 项目""" try: result = subprocess.run( ['gcloud', 'config', 'get-value', 'project'], capture_output=True, text=True ) return result.returncode == 0 and result.stdout.strip() except Exception: return False

def _check_vertex_enabled(self) -> bool: """检查 Vertex AI 是否启用""" try: result = subprocess.run( ['gcloud', 'services', 'list', '--enabled'], capture_output=True, text=True ) return 'aiplatform.googleapis.com' in result.stdout except Exception: return False

### 2\. 启用 Vertex AI API

    bash


    bash

    # 设置项目 ID
    gcloud config set project YOUR-PROJECT-ID

    # 启用 Vertex AI API
    gcloud services enable aiplatform.googleapis.com

    # 验证启用
    gcloud services list --enabled | grep aiplatform

    ### 3. 请求模型访问权限

    # 通过 gcloud 请求访问
    gcloud ai models list \
    --region=global \
    --filter="displayName~'Claude'"
    # 或通过控制台访问
    # https://console.cloud.google.com/vertex-ai/model-garden

### 4\. 配置 GCP 凭证

#### 选项 A：服务账户密钥

    bash


    bash

    # 创建服务账户
    gcloud iam service-accounts create claude-code-sa \
      --display-name="Claude Code Service Account"

    # 授予必要权限
    gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
      --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
      --role="roles/aiplatform.user"

    # 创建密钥
    gcloud iam service-accounts keys create claude-code-key.json \
      --iam-account=claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com

    # 设置环境变量
    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/claude-code-key.json

    #### 选项 B：ADC (Application Default Credentials)

    # 使用用户凭证
    gcloud auth application-default login
    # 或使用服务账户
    gcloud auth application-default login \
    --impersonate-service-account=claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com

#### 选项 C：工作负载身份联邦

    bash


    bash

    # 配置工作负载身份联邦
    gcloud iam workload-identity-pools create-cred-config \
      projects/YOUR-PROJECT-ID/locations/global/workloadIdentityPools/POOL-NAME/providers/PROVIDER-NAME \
      --credential-source-file=credential-config.json \
      --output-file=adc.json

    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/adc.json

    ### 5. 启用 Claude Code Vertex AI 集成

    # 启用 Vertex AI
    export CLAUDE_CODE_USE_VERTEX=1
    # 设置项目 ID
    export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID
    # 设置区域（全局或特定区域）
    export CLOUD_ML_REGION=global
    # 可选：为不同模型设置不同区域
    export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-east5
    export VERTEX_REGION_CLAUDE_3_5_SONNET=us-east5
    export VERTEX_REGION_CLAUDE_4_0_OPUS=europe-west1
    export VERTEX_REGION_CLAUDE_4_0_SONNET=us-east5
    export VERTEX_REGION_CLAUDE_4_1_OPUS=europe-west1

### 6\. 配置模型

    bash


    bash

    # 主模型
    export ANTHROPIC_MODEL='claude-sonnet-4-5@20250929'

    # 小型/快速模型
    export ANTHROPIC_SMALL_FAST_MODEL='claude-haiku-4-5@20251001'

    # 使用 100 万令牌上下文窗口
    export ANTHROPIC_MODEL='claude-sonnet-4-5@20250929'
    export ANTHROPIC_VERTEX_ENABLE_1M_CONTEXT=1

    ## 31.2.3 IAM 权限配置

    ### 基础 IAM 角色

    # 使用预定义角色
    gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
    --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

### 自定义 IAM 角色

    bash


    yaml

    # custom-role.yaml
    title: "Claude Code Vertex AI Role"
    description: "Custom role for Claude Code Vertex AI access"
    stage: "GA"
    includedPermissions:
      - aiplatform.endpoints.predict
      - aiplatform.endpoints.streamPredict
      - aiplatform.models.list
      - aiplatform.models.get

    # 创建自定义角色
    gcloud iam roles create claude-code-vertex-role \
    --project=YOUR-PROJECT-ID \
    --file=custom-role.yaml
    # 授予自定义角色
    gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
    --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
    --role="projects/YOUR-PROJECT-ID/roles/claude-code-vertex-role"

### 组织策略配置

    bash


    bash

    # 创建组织策略以限制访问
    gcloud resource-manager org-policies create \
      --organization=YOUR-ORG-ID \
      --name=restrict-vertex-ai-models \
      --policy-file=vertex-ai-policy.yaml

    # vertex-ai-policy.yaml
    name: organizations/YOUR-ORG-ID/policies/restrict-vertex-ai-models
    spec:
      rules:
      - enforce: true
        values:
          allowedValues:
          - "claude-sonnet-4-5@20250929"
          - "claude-haiku-4-5@20251001"

    ## 31.2.4 高级配置

    ### 100 万令牌上下文窗口

    class ContextWindowManager:
    """上下文窗口管理器"""
    def __init__(self):
    self.max_tokens = 1_000_000
    self.context_headers = {
    'context-1m-2025-08-07': 'true'
    }
    def enable_extended_context(self) -> Dict[str, str]:
    """启用扩展上下文窗口"""
    return {
    'model': 'claude-sonnet-4-5@20250929',
    'headers': self.context_headers,
    'env_vars': {
    'ANTHROPIC_VERTEX_ENABLE_1M_CONTEXT': '1'
    }
    }
    def estimate_token_count(self, text: str) -> int:
    """估算令牌数量"""
    # 粗略估算：1 token ≈ 4 characters
    return len(text) // 4
    def check_context_fit(self, text: str) -> bool:
    """检查文本是否适合上下文窗口"""
    token_count = self.estimate_token_count(text)
    return token_count <= self.max_tokens

### 区域配置优化

    bash


    python

    class RegionOptimizer:
        """区域优化器"""

        def __init__(self):
            self.regions = {
                'us-east5': {
                    'latency': 50,
                    'cost_factor': 1.0,
                    'availability': 0.999
                },
                'europe-west1': {
                    'latency': 80,
                    'cost_factor': 1.1,
                    'availability': 0.999
                },
                'asia-southeast1': {
                    'latency': 100,
                    'cost_factor': 0.9,
                    'availability': 0.998
                }
            }

        def select_optimal_region(self,
                                 user_location: str,
                                 requirements: Dict) -> str:
            """选择最优区域"""
            # 基于用户位置选择
            if 'US' in user_location:
                primary_region = 'us-east5'
            elif 'EU' in user_location:
                primary_region = 'europe-west1'
            else:
                primary_region = 'asia-southeast1'

            # 根据需求调整
            if requirements.get('low_latency', False):
                return primary_region
            elif requirements.get('low_cost', False):
                return min(
                    self.regions.items(),
                    key=lambda x: x[1]['cost_factor']
                )[0]
            else:
                return primary_region

    ### 提示缓存配置

    # 启用提示缓存（默认启用）
    # 在请求中包含 cache_control 标志
    # 禁用提示缓存
    export DISABLE_PROMPT_CACHING=1

## 31.2.5 监控和故障排除

### Cloud Monitoring 配置

    bash


    python

    class VertexAIMonitor:
        """Vertex AI 监控器"""

        def __init__(self):
            self.monitoring_client = monitoring_v3.MetricServiceClient()
            self.project_name = f"projects/{os.getenv('GOOGLE_CLOUD_PROJECT')}"

        def create_dashboard(self):
            """创建监控仪表板"""
            dashboard = {
                "displayName": "Claude Code Vertex AI Dashboard",
                "gridLayout": {
                    "widgets": [
                        {
                            "title": "Request Count",
                            "xyChart": {
                                "dataSets": [{
                                    "timeSeriesQuery": {
                                        "timeSeriesFilter": {
                                            "filter": "resource.type=\"aiplatform.googleapis.com/Endpoint\"",
                                            "aggregation": {
                                                "alignmentPeriod": "60s",
                                                "perSeriesAligner": "ALIGN_RATE"
                                            }
                                        }
                                    }
                                }]
                            }
                        },
                        {
                            "title": "Latency",
                            "xyChart": {
                                "dataSets": [{
                                    "timeSeriesQuery": {
                                        "timeSeriesFilter": {
                                            "filter": "metric.type=\"aiplatform.googleapis.com/prediction_latency\"",
                                            "aggregation": {
                                                "alignmentPeriod": "60s",
                                                "perSeriesAligner": "ALIGN_PERCENTILE_99"
                                            }
                                        }
                                    }
                                }]
                            }
                        }
                    ]
                }
            }

            self.monitoring_client.create_dashboard(
                name=f"{self.project_name}/dashboards/claude-code",
                body=dashboard
            )

    ### 常见问题解决

    class VertexAITroubleshooter:
    """Vertex AI 故障排除器"""
    def diagnose(self, error: str) -> DiagnosisResult:
    """诊断问题"""
    if 'PermissionDenied' in error:
    return self._diagnose_permission_denied()
    elif 'ModelNotFound' in error:
    return self._diagnose_model_not_found()
    elif 'QuotaExceeded' in error:
    return self._diagnose_quota_exceeded()
    elif 'InvalidArgument' in error:
    return self._diagnose_invalid_argument()
    else:
    return DiagnosisResult(
    issue='Unknown',
    solution='Check Cloud Logging for details'
    )
    def _diagnose_permission_denied(self) -> DiagnosisResult:
    """诊断权限拒绝错误"""
    return DiagnosisResult(
    issue='IAM Permission Denied',
    solution='''1. Verify service account has aiplatform.user role

    commands=[
    'gcloud projects get-iam-policy YOUR-PROJECT-ID',
    'gcloud ai models list --region=global'
    ]
    )
    def _diagnose_quota_exceeded(self) -> DiagnosisResult:
    """诊断配额超限错误"""
    return DiagnosisResult(
    issue='Quota Exceeded',
    solution='''1. Check current quota in Cloud Console

    commands=[
    'gcloud compute project-info describe --project=YOUR-PROJECT-ID',
    'gcloud ai models list --region=global --filter="displayName~\'Claude\'"'
    ]
    )

### 日志配置

    bash


    bash

    # 启用详细日志
    export GOOGLE_CLOUD_LOGGING_LEVEL=debug

    # 查看日志
    gcloud logging read "resource.type=aiplatform.googleapis.com/Endpoint" \
      --project=YOUR-PROJECT-ID \
      --limit=50 \
      --format="table(timestamp,protoPayload.requestId,protoPayload.error)"

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375

通过正确配置 Google Vertex AI，企业可以利用 Google Cloud 的强大基础设施，安全、高效地部署 Claude Code，并享受超长上下文窗口带来的优势。

31.2 Google Vertex AI 集成 ​

31.2.1 Vertex AI 概述 ​

Vertex AI 的优势 ​

适用场景 ​

31.2 Google Vertex AI 集成

31.2.1 Vertex AI 概述

Vertex AI 的优势

适用场景