Feat reads regional benchmark by chandra-siri · Pull Request #16795 · googleapis/google-cloud-python

chandra-siri · 2026-04-24T10:33:17Z

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
Ensure the tests and linter pass
Code coverage does not decrease (if any source code was changed)
Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

gemini-code-assist

Code Review

This pull request implements time-based microbenchmarks for regional bucket read operations using JSON and gRPC APIs, and updates the AsyncGrpcClient to support custom API endpoints and quota project IDs. Review feedback highlights several improvement opportunities: moving authentication token retrieval out of the module level to optimize multiprocessing startup and prevent test collection failures, specifying explicit file encoding when opening configuration files, removing an unused import, and adding HTTP status checks to ensure benchmark accuracy by failing fast on unauthorized or missing resource errors.

gemini-code-assist · 2026-04-24T10:35:35Z

+token = subprocess.run(
+    ["gcloud", "auth", "print-access-token"],
+    capture_output=True,
+    text=True,
+    check=True,
+).stdout.strip()


Executing subprocess.run at the module level to fetch an authentication token is problematic for several reasons:

Test Collection: This command runs whenever the module is imported, which can fail the entire test suite during collection if gcloud is not installed or the user is not authenticated.

Multiprocessing Overhead: Since the benchmark uses the spawn start method (line 214), this module is re-imported in every worker process. For a workload with many processes (e.g., 96), this results in 96 concurrent, redundant calls to gcloud, causing significant startup delay and potential rate limiting.

Consider fetching the token once in the parent process and passing it to the workers via the initializer or as an argument to the worker function.

gemini-code-assist · 2026-04-24T10:35:35Z

+    """Generates a dictionary of benchmark parameters for time based read operations."""
+    params: Dict[str, List[TimeBasedReadParameters]] = {}
+    config_path = os.path.join(os.path.dirname(__file__), "config.yaml")
+    with open(config_path, "r") as f:


It is recommended to specify an explicit encoding (e.g., encoding="utf-8") when opening files to ensure consistent behavior across different platforms and locales.

Suggested change

-    with open(config_path, "r") as f:
+    with open(config_path, "r", encoding="utf-8") as f:
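
To illustrate why the explicit encoding matters, here is a small sketch. The helper names and the sample config content are made up for the example. Without `encoding=`, `open()` uses a locale-dependent default, so a file containing non-ASCII characters can decode differently from one machine to another.

```python
import os
import tempfile


def write_sample_config(directory: str) -> str:
    """Write a small hypothetical config that contains a non-ASCII character."""
    path = os.path.join(directory, "config.yaml")
    with open(path, "w", encoding="utf-8") as f:
        f.write("description: r\u00e9gional read benchmark\n")
    return path


def read_config_text(config_path: str) -> str:
    """Read the file as UTF-8 regardless of the platform's default locale."""
    with open(config_path, "r", encoding="utf-8") as f:
        return f.read()


def demo_round_trip() -> str:
    """Write the sample config to a temp directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        return read_config_text(write_sample_config(tmp_dir))
```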

gemini-code-assist · 2026-04-24T10:35:35Z

+import pytest
+import aiohttp
+import subprocess
+import math


The math module is imported but not used in this file. Removing unused imports helps maintain code cleanliness.

gemini-code-assist · 2026-04-24T10:35:35Z

+            async with session.get(url, headers=headers) as response:
+                data = await response.read()
+                bytes_in_iteration += len(data)


The code reads the response body without checking the HTTP status code. If the request fails (e.g., 401 Unauthorized or 404 Not Found), response.read() returns the error body, which is then counted as downloaded data and skews the benchmark results. Calling response.raise_for_status() before reading makes the request fail fast instead.

Suggested change

-            async with session.get(url, headers=headers) as response:
-                data = await response.read()
-                bytes_in_iteration += len(data)
+            async with session.get(url, headers=headers) as response:
+                response.raise_for_status()
+                data = await response.read()
+                bytes_in_iteration += len(data)

References

When a function encounters an unexpected state or unsupported input, it should raise an error instead of silently continuing to ensure fail-fast behavior.
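
The status check can be factored into a small helper, sketched below under stated assumptions. `download_and_count` is a hypothetical name, and `session` is assumed to behave like an `aiohttp.ClientSession`, whose responses raise `ClientResponseError` from `raise_for_status()` when the status is 400 or higher. The helper takes the session as a parameter, so it does not import aiohttp itself.

```python
import asyncio
from typing import Any, Mapping


async def download_and_count(session: Any, url: str, headers: Mapping[str, str]) -> int:
    """Return the number of body bytes read, raising on any HTTP error status.

    `session` is assumed to expose the aiohttp.ClientSession interface.
    """
    async with session.get(url, headers=headers) as response:
        # Raise before reading, so an error body is never counted as payload.
        response.raise_for_status()
        data = await response.read()
        return len(data)
```

In the benchmark loop, the return value would be added to `bytes_in_iteration`. A failed request then aborts the iteration instead of inflating the measured throughput.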

chandra-siri added 3 commits April 20, 2026 17:12

fix: propagate quota_project_id and api_endpoint in AsyncGrpcClient

04825ec

test: update expected calls in test_async_grpc_client.py after refactor

b8f8828

feat: Add reads_regional microbenchmark comparing JSON, gRPC DP, and gRPC CP

2a68c53

gemini-code-assist Bot reviewed Apr 24, 2026

feat: Add 'whole' pattern and update config to bytes in reads_regional benchmark

40b1d75

chandra-siri wants to merge 4 commits into googleapis:main from chandra-siri:feat-reads-regional-benchmark
