feat(BigQuery): Add support for Stateless Queries: Stateless query #9022

Open
Hectorhammett wants to merge 4 commits into main from stateless-query

Conversation

@Hectorhammett
Collaborator

Add support for stateless queries.

b/331273989

@Hectorhammett requested a review from a team as a code owner March 17, 2026 21:20
product-auto-label bot added the "api: bigquery" label (Issues related to the BigQuery API.) Mar 17, 2026
@Hectorhammett added the "next release" label (PRs to be included in the next release) Mar 18, 2026
Contributor

@bshaffer left a comment

Looks great! A couple nits, but my main concern is with the QueryResults::fromStatelessQuery method. I think there may be an opportunity for improvement there.

Also, could we add a SystemTest for this? Even if it's not run on CI it would be nice to have it as a sanity check.

isset($queryConfig['priority']) &&
$queryConfig['priority'] !== 'INTERACTIVE'
) ||
(isset($queryConfig['useLegacySql']) && $queryConfig['useLegacySql']) ||
Contributor

@bshaffer Mar 18, 2026

nit: Consider this more concise variant?

Suggested change
- (isset($queryConfig['useLegacySql']) && $queryConfig['useLegacySql']) ||
+ ($queryConfig['useLegacySql'] ?? false) ||

return false;
}

if (isset($config['configuration']['dryRun']) && $config['configuration']['dryRun']) {
Contributor

Same here

Suggested change
- if (isset($config['configuration']['dryRun']) && $config['configuration']['dryRun']) {
+ if ($config['configuration']['dryRun'] ?? false) {

Collaborator Author

I know you're right, haha, but my brain has always preferred the other form, which reads as:
"Hey, is this set? Yes? OK, so... is it true?"

The suggested one reads like "This true ?? FALSE" to me, haha.

But I agree that your way is more readable.
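
For readers comparing the two spellings, here is a small sketch showing that they agree in a boolean context. The config arrays are made up for illustration and are not taken from the PR.

```php
<?php
// Hypothetical config arrays covering the interesting cases.
$cases = [
    'missing' => [],
    'null'    => ['useLegacySql' => null],
    'false'   => ['useLegacySql' => false],
    'true'    => ['useLegacySql' => true],
];

$results = [];
foreach ($cases as $label => $queryConfig) {
    // Verbose form: "is it set, and is it truthy?"
    $verbose = isset($queryConfig['useLegacySql']) && $queryConfig['useLegacySql'];
    // Concise form: "use the value, or false when it is missing or null".
    $concise = (bool) ($queryConfig['useLegacySql'] ?? false);
    $results[$label] = [$verbose, $concise];
}
```

One difference worth keeping in mind: the verbose form always yields a bool, while `??` yields the stored value itself, so the two are interchangeable only where the result is used as a condition (or cast, as above).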

Comment on lines +638 to +639
if (isset($config['jobReference']['jobId']) && !$this->isJobIdGenerated()
) {
Contributor

@bshaffer Mar 18, 2026

nit: the closing parenthesis should only be on a new line if the if statement spans multiple lines

Suggested change
- if (isset($config['jobReference']['jobId']) && !$this->isJobIdGenerated()
- ) {
+ if (isset($config['jobReference']['jobId']) && !$this->isJobIdGenerated()) {

Comment on lines +615 to +618
(
isset($queryConfig['priority']) &&
$queryConfig['priority'] !== 'INTERACTIVE'
) ||
Contributor

@bshaffer Mar 18, 2026

nit: I find it a tiny bit hard to read that we have a bunch of isset calls, but then we have this block, and the block below, that are different. Also, we have dryRun, which follows the same logic as useLegacySql, but isn't in this block.

My preference would be that we separate out the more complex checks from this large if statement, for readability.
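
One possible shape for that refactor is sketched below. All function names are hypothetical, the assumption that `$queryConfig` comes from `$config['configuration']['query']` is mine, and only the three checks visible in this thread are covered; the PR's actual condition may include more.

```php
<?php
// Hypothetical helpers; none of these names come from the PR.
function isNonInteractivePriority(array $queryConfig): bool
{
    return isset($queryConfig['priority'])
        && $queryConfig['priority'] !== 'INTERACTIVE';
}

function usesLegacySql(array $queryConfig): bool
{
    return (bool) ($queryConfig['useLegacySql'] ?? false);
}

function isDryRun(array $config): bool
{
    return (bool) ($config['configuration']['dryRun'] ?? false);
}

// Each complex check gets its own named early return.
function canRunStateless(array $config): bool
{
    // Assumption: the query settings live under configuration.query.
    $queryConfig = $config['configuration']['query'] ?? [];

    if (isNonInteractivePriority($queryConfig)) {
        return false;
    }
    if (usesLegacySql($queryConfig)) {
        return false;
    }
    if (isDryRun($config)) {
        return false;
    }
    return true;
}
```

Named predicates keep the dryRun and useLegacySql checks side by side, which addresses the point about them following the same logic but living in different places.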

if ($query instanceof QueryJobConfiguration && $query->isStateless()) {
$queryRequest = $query->toQueryRequest();

if (isset($queryResultsOptions['formatOptions.useInt64Timestamp'])) {
Contributor

@bshaffer Mar 18, 2026

nit: This is for backwards compatibility, correct? Maybe add a comment saying that, for clarity

Collaborator Author

This is because, for some reason, the RequestBuilder handles this option in dot notation for GET requests but not for POST requests, and a stateless query is a POST request. The jobs.query API expects it in the request body rather than in the query parameters, so I am normalizing the regular options passed by the user into a JSON body for the request.

This lives in cloud-core::RequestBuilder::build, line 115.
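
To make the normalization concrete, here is a minimal sketch of expanding dot-notation keys into a nested body. The helper name `expandDotKeys` is hypothetical and this is not the RequestBuilder's code; it only illustrates the transformation described above.

```php
<?php
// Hypothetical helper: turns ['a.b' => 1] into ['a' => ['b' => 1]].
function expandDotKeys(array $options): array
{
    $body = [];
    foreach ($options as $key => $value) {
        $cursor = &$body;
        foreach (explode('.', (string) $key) as $segment) {
            // Create (or overwrite a scalar with) an array at each level.
            if (!isset($cursor[$segment]) || !is_array($cursor[$segment])) {
                $cursor[$segment] = [];
            }
            $cursor = &$cursor[$segment];
        }
        // The innermost slot receives the actual value.
        $cursor = $value;
        unset($cursor); // break the reference before the next key
    }
    return $body;
}

$body = expandDotKeys([
    'formatOptions.useInt64Timestamp' => true,
    'maxResults' => 10,
]);
```

With this input, `$body` nests `useInt64Timestamp` under `formatOptions`, which is the shape a JSON request body would need, while `maxResults` stays at the top level.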

Comment on lines +434 to +436
$statelessArgs = $queryRequest + $queryResultsOptions + [
'projectId' => $this->projectId
] + $options;
Contributor

I don't love that we are just adding $options at the end like this... can we validate the keys before or after doing this, so we know for sure what values are being added?

$this->projectId,
$statelessResponse,
$this->mapper,
$queryResultsOptions + $options
Contributor

Same here... we did all the work to validate $queryResultsOptions and $queryRequest, so it would be nice to do the same for these $options and the ones above, so we know what keys are being passed in.
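
A rough sketch of what such validation could look like follows. The helper `assertKnownKeys` and the allowlist are assumptions for illustration; the thread does not say which keys should be accepted. The sketch also shows why the order of the `+` operands matters.

```php
<?php
// Hypothetical helper, not part of the library: rejects keys outside an allowlist.
function assertKnownKeys(array $options, array $allowedKeys): array
{
    $unknown = array_diff(array_keys($options), $allowedKeys);
    if ($unknown !== []) {
        throw new InvalidArgumentException(
            'Unexpected option keys: ' . implode(', ', $unknown)
        );
    }
    return $options;
}

// Array union keeps the left-hand value when a key exists on both sides,
// so a stray key in $options is silently ignored or silently added.
$queryRequest = ['query' => 'SELECT 1', 'maxResults' => 10];
$options = ['maxResults' => 500, 'retries' => 3];
$merged = $queryRequest + $options;

// Assumed allowlist, for illustration only.
$allowed = ['retries', 'requestTimeout'];
try {
    assertKnownKeys($options, $allowed);
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true; // 'maxResults' is not in the allowlist
}
```

Validating before the union makes the merged array's contents predictable, instead of depending on which operand happened to define a key first.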

*/
private $config = [];

private bool $isJobIdGenerated = false;
Contributor

Since this is only used in QueryJobConfiguration, I'd advocate for moving it out of the trait and keeping it entirely in QueryJobConfiguration. I don't see any need to have it in the trait if it isn't used by the other classes that use the trait.

Comment on lines +295 to +297
if (!isset($this->info['jobReference'])) {
return $this->info;
}
Contributor

This doesn't seem right

Collaborator Author

With a stateless query, the response is stored in the info field. If it has no jobReference, the query was in fact completely stateless, and we cannot "reload" by sending a request because there is no jobId to call getQueryResults with.

Maybe we could throw an error if you try to reload stateless query results? But that means the user would have to be aware of the internal changes, and it would be a breaking change for some people.
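
The behaviour described here can be sketched with a simplified stand-in class. `QueryResultsSketch` and its `$fetch` callable are hypothetical and do not mirror the real QueryResults constructor; the point is only the branch on `jobReference`.

```php
<?php
// Simplified stand-in for QueryResults; not the library class.
final class QueryResultsSketch
{
    private array $info;
    /** @var callable(string): array fetches fresh results for a job ID */
    private $fetch;

    public function __construct(array $info, callable $fetch)
    {
        $this->info = $info;
        $this->fetch = $fetch;
    }

    public function reload(): array
    {
        // No job was created, so there is nothing to re-fetch:
        // return the response captured when the query ran.
        if (!isset($this->info['jobReference'])) {
            return $this->info;
        }

        // A job exists, so refresh from the (injected) results endpoint.
        $this->info = ($this->fetch)($this->info['jobReference']['jobId']);
        return $this->info;
    }
}
```

The alternative raised above would replace the early return with a thrown exception, trading silent reuse of the cached response for an explicit, but potentially breaking, failure.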

* [documentation](https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults#query-parameters)
* for available options.
*/
public static function fromStatelessQuery(
Contributor

I have to say, I don't love this implementation. It seems like we are creating a Job for no reason, where we should make the Job nullable (Stateless queries don't have jobs, right? You run them immediately?).

This may require further discussion, but IMHO it seems like it would be better to instantiate QueryResults in the runQuery method, and then have waitUntilComplete be a no-op (or throw an exception) for Stateless Queries, since they're not applicable.

Collaborator Author

@Hectorhammett Mar 18, 2026

A stateless query may or may not create a job. If we take a look at QueryJobConfiguration::toQueryRequest(), line 678, the request has the field:
jobCreationMode => self::JOB_CREATION_MODE_OPTIONAL enum here

So even though we run the stateless query path, BigQuery itself may decide that the best course of action for this query is to actually create a job, meaning that we do have a job reference and jobId when the jobs.query endpoint responds.

We cannot force BigQuery to run it stateless.
