-
Notifications
You must be signed in to change notification settings - Fork 168
Updated tracking / transitions of the task's version state. #9127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
b9c6f9e to
01d362e
Compare
| 'attempts: $attempts', | ||
| 'zone: $zone', | ||
| 'instance: $instance', | ||
| 'secretToken: $secretToken', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add new properties :D
| ].join(', ') + | ||
| ')'; | ||
|
|
||
| // Remove instanceName, zone, secretToken, and set attempts = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Remove instanceName, zone, secretToken, and set attempts = 0 | |
| /// Remove instanceName, zone, secretToken, and set attempts = 0 |
| finished: true, | ||
| zone: null, | ||
| instance: null, // version is no-longer running on this instance | ||
| secretToken: null, // TODO: Consider retaining this for idempotency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't oppose the idea of retaining this for idempotency, that's a great idea.
| s.versions!.addAll({ | ||
| for (final v in pendingVersions.map((v) => v.toString())) | ||
| v: PackageVersionStateInfo( | ||
| scheduled: now, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using a different now after this change, is that a good idea?
| // suppose to run on the instance we just failed to create. | ||
| // If this doesn't work, we'll eventually retry. Hence, correctness | ||
| // does not hinge on this transaction being successful. | ||
| await db.tasks.restorePreviousVersionsState( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure we shouldn't have a method for this, should be inline logic.
There is never a reason to call restorePreviousVersionsState from anywhere else is there?
Also you cannot look at restorePreviousVersionsState and not understand the context within which it belongs.
if we want it as a method, then it should be a helper method in this file, like updatePackageStateWithPendingVersions (which IMO could also be inline); but I do think updatePackageStateWithPendingVersions is a bit more specific in what it does and worth documenting separately. Also just to keep the code short.
Why can't we simply store previousScheduled as a local variable in scope of scheduleInstance?
| PackageVersionStateInfo scheduleNew({ | ||
| required String zone, | ||
| required String instanceName, | ||
| }) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't dislike these methods. It moves a lot of invariants into PackageVersionStateInfo -- and ensures that we're maintaining the invariants here.
But let's expand the documentation. You contract with the world is no longer than the world outside must understand all the invariants of this class, but that the world can only use method in this class.
| }) { | |
| /// Create updated [PackageVersionStateInfo] for when a new instance have been | |
| /// scheduled. | |
| /// | |
| /// You must supply: | |
| /// * [scheduled] when the instance was scheduled. | |
| /// * [zone] within which the instance was scheduled. | |
| /// * [instanceName] as name of the of the isntance scheduled. | |
| /// * [secretToken] passed to the instance for authentication when | |
| /// the instance wants to callback. | |
| PackageVersionStateInfo scheduleNew({ | |
| required DateTime scheduled, | |
| required String zone, | |
| required String instanceName, | |
| required String secretToken, | |
| }) { |
I think we should leave deciding values to the scheduler. It is the schedulers responsibility to pick the values. This class mostly manages the invariants (it becomes a lot less pure if we ask this class to generate random tokens).
Yes, adding more parameters is just a nit 🙈
But expanding the documentation is important.
| /// Reverts the status of the last scheduling attempt, which has presumably failed. | ||
| PackageVersionStateInfo resetAfterFailedAttempt() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Reverts the status of the last scheduling attempt, which has presumably failed. | |
| PackageVersionStateInfo resetAfterFailedAttempt() { | |
| /// Reverts the status of the last scheduling attempt, when it is known that scheduling failed. | |
| /// | |
| /// > [!WARNING] | |
| /// > This state transition **may only** be used if it's | |
| /// > **known with certainty** that scheduling failed. | |
| /// > | |
| /// > If an instance _may_ have been scheduled, but we suspect | |
| /// > scheduling failed, we have to wait for a retry. | |
| /// > As we otherwise risk leaving an instance unable to call back, | |
| /// > which will leave the instance logging errors that indicate | |
| /// > internal errors in our system. | |
| PackageVersionStateInfo resetAfterFailedAttempt({required DateTime previousScheduled}) { |
We only do this is if we know for sure that scheduling the instance failed.
We shouldn't do this if the scheduling request failed with a IOException or other transport error. Only if we get an error from the GCE API, then we know for sure that the instance wasn't scheduled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we might actually do this on an IOException, we probably shouldn't but I'm also okay with that minor issue.
I'd just prefer that we clearly document what the correct behavior is.
docsandpanaflags after scheduling (otherwise there is a time window where a new schedule has been started but not completed, where a previous run's results are not available).previousScheduledfield, but we could skip it and just useinitialTimestampwhen restoring the previous state.updatePackageStateWithPendingVersionsandrestorePreviousVersionsStatesince they no longer need to keep all the previous version state information.