Skip to content

31.2 Google Vertex AI 集成

31.2.1 Vertex AI 概述

Google Vertex AI 是 Google Cloud 提供的统一机器学习平台,支持通过 API 访问多种基础模型,包括 Anthropic 的 Claude 模型。通过 Vertex AI 使用 Claude Code 可以为企业带来以下优势:

Vertex AI 的优势

  1. GCP 原生集成 :与 Google Cloud IAM、Cloud Audit Logs、Cloud Monitoring 等服务无缝集成
  2. 全球端点 :支持全球访问,提供更好的延迟和可用性
  3. 100 万令牌上下文窗口 :支持超长上下文,适合大型代码库分析
  4. 企业级安全 :符合 Google Cloud 安全标准和合规要求
  5. 灵活的部署 :支持区域和全局端点,满足不同需求

适用场景

  • 已经使用 Google Cloud Platform 的企业
  • 需要超长上下文窗口的应用
  • 要求使用 Google Cloud IAM 进行身份验证的场景
  • 需要全球访问和低延迟的环境
python
## 31.2.2 Vertex AI 配置步骤

### 1\. 前置条件检查

class VertexAIPrerequisitesChecker: """Vertex AI 前置条件检查器"""

def **init**(self): self.checks = { 'gcp_project': False, 'vertex_enabled': False, 'model_access': False, 'iam_permissions': False, 'gcloud_configured': False }

def check_all(self) -> PrerequisiteReport: """检查所有前置条件""" report = PrerequisiteReport()

# 检查 GCP 项目

self.checks['gcp_project'] = self._check_gcp_project()

# 检查 Vertex AI 是否启用

self.checks['vertex_enabled'] = self._check_vertex_enabled()

# 检查模型访问权限

self.checks['model_access'] = self._check_model_access()

# 检查 IAM 权限

self.checks['iam_permissions'] = self._check_iam_permissions()

# 检查 gcloud 配置

self.checks['gcloud_configured'] = self._check_gcloud_configured()

# 生成报告

report.checks = self.checks report.all_passed = all(self.checks.values()) report.missing = [ check for check, passed in self.checks.items() if not passed ]

return report

python
def _check_gcp_project(self) -> bool: """检查 GCP 项目""" try: result = subprocess.run( ['gcloud', 'config', 'get-value', 'project'], capture_output=True, text=True ) return result.returncode == 0 and result.stdout.strip() except Exception: return False

def _check_vertex_enabled(self) -> bool: """检查 Vertex AI 是否启用""" try: result = subprocess.run( ['gcloud', 'services', 'list', '--enabled'], capture_output=True, text=True ) return 'aiplatform.googleapis.com' in result.stdout except Exception: return False

### 2\. 启用 Vertex AI API

    bash


    bash

    # 设置项目 ID
    gcloud config set project YOUR-PROJECT-ID

    # 启用 Vertex AI API
    gcloud services enable aiplatform.googleapis.com

    # 验证启用
    gcloud services list --enabled | grep aiplatform

    ### 3. 请求模型访问权限

    # 通过 gcloud 请求访问
    gcloud ai models list \
    --region=global \
    --filter="displayName~'Claude'"
    # 或通过控制台访问
    # https://console.cloud.google.com/vertex-ai/model-garden

### 4\. 配置 GCP 凭证

#### 选项 A:服务账户密钥

    bash


    bash

    # 创建服务账户
    gcloud iam service-accounts create claude-code-sa \
      --display-name="Claude Code Service Account"

    # 授予必要权限
    gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
      --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
      --role="roles/aiplatform.user"

    # 创建密钥
    gcloud iam service-accounts keys create claude-code-key.json \
      --iam-account=claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com

    # 设置环境变量
    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/claude-code-key.json

    #### 选项 B:ADC (Application Default Credentials)

    # 使用用户凭证
    gcloud auth application-default login
    # 或使用服务账户
    gcloud auth application-default login \
    --impersonate-service-account=claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com

#### 选项 C:工作负载身份联邦

    bash


    bash

    # 配置工作负载身份联邦
    gcloud iam workload-identity-pools create-cred-config \
      projects/YOUR-PROJECT-ID/locations/global/workloadIdentityPools/POOL-NAME/providers/PROVIDER-NAME \
      --credential-source-file=credential-config.json \
      --output-file=adc.json

    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/adc.json

    ### 5. 启用 Claude Code Vertex AI 集成

    # 启用 Vertex AI
    export CLAUDE_CODE_USE_VERTEX=1
    # 设置项目 ID
    export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID
    # 设置区域(全局或特定区域)
    export CLOUD_ML_REGION=global
    # 可选:为不同模型设置不同区域
    export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-east5
    export VERTEX_REGION_CLAUDE_3_5_SONNET=us-east5
    export VERTEX_REGION_CLAUDE_4_0_OPUS=europe-west1
    export VERTEX_REGION_CLAUDE_4_0_SONNET=us-east5
    export VERTEX_REGION_CLAUDE_4_1_OPUS=europe-west1

### 6\. 配置模型

    bash


    bash

    # 主模型
    export ANTHROPIC_MODEL='claude-sonnet-4-5@20250929'

    # 小型/快速模型
    export ANTHROPIC_SMALL_FAST_MODEL='claude-haiku-4-5@20251001'

    # 使用 100 万令牌上下文窗口
    export ANTHROPIC_MODEL='claude-sonnet-4-5@20250929'
    export ANTHROPIC_VERTEX_ENABLE_1M_CONTEXT=1

    ## 31.2.3 IAM 权限配置

    ### 基础 IAM 角色

    # 使用预定义角色
    gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
    --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

### 自定义 IAM 角色

    bash


    yaml

    # custom-role.yaml
    title: "Claude Code Vertex AI Role"
    description: "Custom role for Claude Code Vertex AI access"
    stage: "GA"
    includedPermissions:
      - aiplatform.endpoints.predict
      - aiplatform.endpoints.streamPredict
      - aiplatform.models.list
      - aiplatform.models.get

    # 创建自定义角色
    gcloud iam roles create claude-code-vertex-role \
    --project=YOUR-PROJECT-ID \
    --file=custom-role.yaml
    # 授予自定义角色
    gcloud projects add-iam-policy-binding YOUR-PROJECT-ID \
    --member="serviceAccount:claude-code-sa@YOUR-PROJECT-ID.iam.gserviceaccount.com" \
    --role="projects/YOUR-PROJECT-ID/roles/claude-code-vertex-role"

### 组织策略配置

    bash


    bash

    # 创建组织策略以限制访问
    gcloud resource-manager org-policies create \
      --organization=YOUR-ORG-ID \
      --name=restrict-vertex-ai-models \
      --policy-file=vertex-ai-policy.yaml

    # vertex-ai-policy.yaml
    name: organizations/YOUR-ORG-ID/policies/restrict-vertex-ai-models
    spec:
      rules:
      - enforce: true
        values:
          allowedValues:
          - "claude-sonnet-4-5@20250929"
          - "claude-haiku-4-5@20251001"

    ## 31.2.4 高级配置

    ### 100 万令牌上下文窗口

    class ContextWindowManager:
    """上下文窗口管理器"""
    def __init__(self):
    self.max_tokens = 1_000_000
    self.context_headers = {
    'context-1m-2025-08-07': 'true'
    }
    def enable_extended_context(self) -> Dict[str, str]:
    """启用扩展上下文窗口"""
    return {
    'model': 'claude-sonnet-4-5@20250929',
    'headers': self.context_headers,
    'env_vars': {
    'ANTHROPIC_VERTEX_ENABLE_1M_CONTEXT': '1'
    }
    }
    def estimate_token_count(self, text: str) -> int:
    """估算令牌数量"""
    # 粗略估算:1 token ≈ 4 characters
    return len(text) // 4
    def check_context_fit(self, text: str) -> bool:
    """检查文本是否适合上下文窗口"""
    token_count = self.estimate_token_count(text)
    return token_count <= self.max_tokens

### 区域配置优化

    bash


    python

    class RegionOptimizer:
        """区域优化器"""

        def __init__(self):
            self.regions = {
                'us-east5': {
                    'latency': 50,
                    'cost_factor': 1.0,
                    'availability': 0.999
                },
                'europe-west1': {
                    'latency': 80,
                    'cost_factor': 1.1,
                    'availability': 0.999
                },
                'asia-southeast1': {
                    'latency': 100,
                    'cost_factor': 0.9,
                    'availability': 0.998
                }
            }

        def select_optimal_region(self,
                                 user_location: str,
                                 requirements: Dict) -> str:
            """选择最优区域"""
            # 基于用户位置选择
            if 'US' in user_location:
                primary_region = 'us-east5'
            elif 'EU' in user_location:
                primary_region = 'europe-west1'
            else:
                primary_region = 'asia-southeast1'

            # 根据需求调整
            if requirements.get('low_latency', False):
                return primary_region
            elif requirements.get('low_cost', False):
                return min(
                    self.regions.items(),
                    key=lambda x: x[1]['cost_factor']
                )[0]
            else:
                return primary_region

    ### 提示缓存配置

    # 启用提示缓存(默认启用)
    # 在请求中包含 cache_control 标志
    # 禁用提示缓存
    export DISABLE_PROMPT_CACHING=1

## 31.2.5 监控和故障排除

### Cloud Monitoring 配置

    bash


    python

    class VertexAIMonitor:
        """Vertex AI 监控器"""

        def __init__(self):
            self.monitoring_client = monitoring_v3.MetricServiceClient()
            self.project_name = f"projects/{os.getenv('GOOGLE_CLOUD_PROJECT')}"

        def create_dashboard(self):
            """创建监控仪表板"""
            dashboard = {
                "displayName": "Claude Code Vertex AI Dashboard",
                "gridLayout": {
                    "widgets": [
                        {
                            "title": "Request Count",
                            "xyChart": {
                                "dataSets": [{
                                    "timeSeriesQuery": {
                                        "timeSeriesFilter": {
                                            "filter": "resource.type=\"aiplatform.googleapis.com/Endpoint\"",
                                            "aggregation": {
                                                "alignmentPeriod": "60s",
                                                "perSeriesAligner": "ALIGN_RATE"
                                            }
                                        }
                                    }
                                }]
                            }
                        },
                        {
                            "title": "Latency",
                            "xyChart": {
                                "dataSets": [{
                                    "timeSeriesQuery": {
                                        "timeSeriesFilter": {
                                            "filter": "metric.type=\"aiplatform.googleapis.com/prediction_latency\"",
                                            "aggregation": {
                                                "alignmentPeriod": "60s",
                                                "perSeriesAligner": "ALIGN_PERCENTILE_99"
                                            }
                                        }
                                    }
                                }]
                            }
                        }
                    ]
                }
            }

            self.monitoring_client.create_dashboard(
                name=f"{self.project_name}/dashboards/claude-code",
                body=dashboard
            )

    ### 常见问题解决

    class VertexAITroubleshooter:
    """Vertex AI 故障排除器"""
    def diagnose(self, error: str) -> DiagnosisResult:
    """诊断问题"""
    if 'PermissionDenied' in error:
    return self._diagnose_permission_denied()
    elif 'ModelNotFound' in error:
    return self._diagnose_model_not_found()
    elif 'QuotaExceeded' in error:
    return self._diagnose_quota_exceeded()
    elif 'InvalidArgument' in error:
    return self._diagnose_invalid_argument()
    else:
    return DiagnosisResult(
    issue='Unknown',
    solution='Check Cloud Logging for details'
    )
    def _diagnose_permission_denied(self) -> DiagnosisResult:
    """诊断权限拒绝错误"""
    return DiagnosisResult(
    issue='IAM Permission Denied',
    solution='''1. Verify service account has aiplatform.user role

    commands=[
    'gcloud projects get-iam-policy YOUR-PROJECT-ID',
    'gcloud ai models list --region=global'
    ]
    )
    def _diagnose_quota_exceeded(self) -> DiagnosisResult:
    """诊断配额超限错误"""
    return DiagnosisResult(
    issue='Quota Exceeded',
    solution='''1. Check current quota in Cloud Console

    commands=[
    'gcloud compute project-info describe --project=YOUR-PROJECT-ID',
    'gcloud ai models list --region=global --filter="displayName~\'Claude\'"'
    ]
    )

### 日志配置

    bash


    bash

    # 启用详细日志
    export GOOGLE_CLOUD_LOGGING_LEVEL=debug

    # 查看日志
    gcloud logging read "resource.type=aiplatform.googleapis.com/Endpoint" \
      --project=YOUR-PROJECT-ID \
      --limit=50 \
      --format="table(timestamp,protoPayload.requestId,protoPayload.error)"

通过正确配置 Google Vertex AI,企业可以利用 Google Cloud 的强大基础设施,安全、高效地部署 Claude Code,并享受超长上下文窗口带来的优势。

基于 MIT 许可发布 | 永久导航