Vertex AI Audit Logging with Terraform: Track Every AI Call from Prompt to Response


Google Cloud does not record Vertex AI Data Access events by default, which leaves a blind spot in production environments. A small set of Terraform resources enables the audit logs that capture call metadata and routes them to durable storage; full prompt and response bodies additionally require request-response logging, which is configured per endpoint. Together, these record ‘generateContent’ and ‘predict’ calls for compliance and security auditing.

Why This Matters

In production AI systems, technical compliance requirements often demand proof of who accessed a model and exactly what data was exchanged. While standard Admin Activity logs are free and always on, they only track resource lifecycle events like creation or deletion, failing to capture the actual inference calls that constitute the bulk of AI operations. Without enabling Data Access logs and configuring persistent sinks, organizations risk losing all visibility into model usage, making it impossible to audit security incidents, perform anomaly detection, or track per-model costs effectively in high-volume environments.

Key Insights

Cloud Audit Logs capture the caller identity, model ID, and method of each call once Data Access logging is enabled through the google_project_iam_audit_config resource.
Vertex AI Data Access logs are disabled by default and are required to record generateContent, predict, and streamGenerateContent methods.
A dual-sink architecture uses GCS for long-term retention—up to 7 years for regulated industries—and BigQuery for SQL-based usage analytics.
Request-response logging captures the full JSON bodies of prompts and responses but must be configured per-endpoint via the API or SDK with a specific sampling rate.
GCP separates metadata logging from content logging, which gives finer cost control than AWS Bedrock, where the two are configured together.

Working Examples

Enables Data Access audit logs for Vertex AI to capture read/write operations.

resource "google_project_iam_audit_config" "vertex_ai" {
  project = var.project_id
  service = "aiplatform.googleapis.com"

  # Reads of resource metadata and configuration
  audit_log_config {
    log_type = "ADMIN_READ"
  }

  # Reads of user-provided data
  audit_log_config {
    log_type = "DATA_READ"
  }

  # Writes of user-provided data
  audit_log_config {
    log_type = "DATA_WRITE"
  }
}

Configures a log sink to export Vertex AI audit logs to Cloud Storage for long-term retention.

resource "google_logging_project_sink" "vertex_ai_gcs" {
  name        = "${var.environment}-vertex-ai-audit-to-gcs"
  project     = var.project_id
  destination = "storage.googleapis.com/${google_storage_bucket.vertex_ai_logs.name}"

  # Match only Cloud Audit Logs entries emitted by Vertex AI
  filter = <<-EOT
    protoPayload.serviceName="aiplatform.googleapis.com"
    AND logName:"cloudaudit.googleapis.com"
  EOT

  # Create a dedicated service account for this sink to write as
  unique_writer_identity = true
}
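The second half of the dual-sink architecture described under Key Insights could look roughly like the following. This is a minimal sketch, not the original author's configuration: the dataset name vertex_ai_audit, its location, and the resource labels are hypothetical, and the filter is simply copied from the GCS sink above. It enables use_partitioned_tables so analytical queries can prune by date, and grants the sink's writer identity permission to write into the dataset.

```hcl
# Hypothetical dataset that receives the audit log tables
resource "google_bigquery_dataset" "vertex_ai_audit" {
  project    = var.project_id
  dataset_id = "vertex_ai_audit" # assumed name
  location   = "US"              # assumed location
}

resource "google_logging_project_sink" "vertex_ai_bq" {
  name        = "${var.environment}-vertex-ai-audit-to-bq"
  project     = var.project_id
  destination = "bigquery.googleapis.com/projects/${var.project_id}/datasets/${google_bigquery_dataset.vertex_ai_audit.dataset_id}"

  # Same filter as the GCS sink
  filter = <<-EOT
    protoPayload.serviceName="aiplatform.googleapis.com"
    AND logName:"cloudaudit.googleapis.com"
  EOT

  unique_writer_identity = true

  # Write to date-partitioned tables instead of one table per day
  bigquery_options {
    use_partitioned_tables = true
  }
}

# Let the sink's service account create and append to tables in the dataset
resource "google_bigquery_dataset_iam_member" "vertex_ai_bq_writer" {
  project    = var.project_id
  dataset_id = google_bigquery_dataset.vertex_ai_audit.dataset_id
  role       = "roles/bigquery.dataEditor"
  member     = google_logging_project_sink.vertex_ai_bq.writer_identity
}
```

With partitioned tables enabled, Data Access entries land in a single table named cloudaudit_googleapis_com_data_access rather than in date-suffixed shards, which is the table name the query below assumes.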

Query to identify the top models by invocation count over the last 7 days.

SELECT
  protopayload_auditlog.resourceName AS model,
  COUNT(*) AS call_count
FROM `PROJECT.DATASET.cloudaudit_googleapis_com_data_access`
WHERE timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY model
ORDER BY call_count DESC;

Enabling full request-response body logging via the Python SDK.

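from google.cloud import aiplatform

# project_id and dataset_name are placeholders; define them for your environment.
# Keyword names follow Endpoint.create in the google-cloud-aiplatform SDK; the
# equivalent REST field is predictRequestResponseLoggingConfig.
endpoint = aiplatform.Endpoint.create(
    display_name="my-endpoint",
    enable_request_response_logging=True,
    request_response_logging_sampling_rate=1.0,  # 1.0 logs every request
    request_response_logging_bq_destination_table=(
        f"bq://{project_id}.{dataset_name}.request_response_logging"
    ),
)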

Practical Applications

Use Case: Regulated industries implement GCS sinks with 2555-day retention policies and lifecycle rules to meet 7-year legal audit requirements. Pitfall: Relying on the _Default bucket which deletes logs after 30 days.
Use Case: Engineering teams use BigQuery partitioned tables to monitor daily token usage trends and detect cost anomalies per service account. Pitfall: Failing to use partitioned tables in BigQuery sinks, resulting in expensive full-table scans during analysis.
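The 7-year retention use case above can be sketched in Terraform as follows. This is an illustrative configuration under stated assumptions rather than the original author's code: the bucket name pattern, var.location, and the 90-day Coldline transition are hypothetical choices. The resource label vertex_ai_logs matches the bucket referenced by the GCS sink in Working Examples, and the retention period is 2555 days expressed in seconds (2555 × 86400 = 220752000).

```hcl
resource "google_storage_bucket" "vertex_ai_logs" {
  name     = "${var.project_id}-${var.environment}-vertex-ai-audit-logs" # assumed naming scheme
  project  = var.project_id
  location = var.location # hypothetical variable, e.g. "US"

  uniform_bucket_level_access = true

  # Objects cannot be deleted or overwritten until they are 2555 days old
  retention_policy {
    retention_period = 220752000 # 2555 days in seconds
  }

  # Assumed cost optimization: move older logs to a colder storage class
  lifecycle_rule {
    condition {
      age = 90
    }
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
  }

  # Delete logs once the 7-year window has passed
  lifecycle_rule {
    condition {
      age = 2555
    }
    action {
      type = "Delete"
    }
  }
}

# The sink writes as its own service account, which needs object-create access
resource "google_storage_bucket_iam_member" "vertex_ai_gcs_writer" {
  bucket = google_storage_bucket.vertex_ai_logs.name
  role   = "roles/storage.objectCreator"
  member = google_logging_project_sink.vertex_ai_gcs.writer_identity
}
```

The retention policy is left unlocked in this sketch so it can still be changed; locking it is irreversible and is a decision to make deliberately once the period has been confirmed with compliance.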

References:

https://dev.to/suhas_mallesh/vertex-ai-audit-logging-with-terraform-track-every-ai-call-from-prompt-to-response-4c9k
