-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Add API liveness probes #35752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add API liveness probes #35752
Conversation
RFC. As described in ppy#35736. Some ground rules: - I'm not going to attempt to decide where the probe should be hosted, this PR is mostly agnostic to that anyway. - When API is in online state, the probe is periodically queried every minute (interval up for debate). - If the probe URL is not set, nothing happens. - If the client can't reach the probe, nothing happens (or at least that's the intended design). This is such that client-local outages or inabilities to reach the probe don't knock the rest of the game offline *even though everything can be fine with it*. - If any other deserialisation error or anything else happens, nothing happens. - The only time where anything happens is when the probe is up and actively returns a result that shows that there is an active outage going on. In this case, API is instantly knocked to `Failing` state. - When API is in failing state, the probe is queried more actively (every 5 seconds as written, but this is also up for discussion). - If the probe actively returns a result that shows that there continues to be an active outage going on, the client does not attempt to get back online. - In *any other circumstance* (read: the probe being disabled, unreachable, or returning a result that indicates that there is no active outage), the client will try to get back online. The reason this is designed this way is to be failsafe. In particular, my concern here is that e.g. the probe being potentially unavailable to Russia users due to network blocks does not completely delete online functions for those users.
Speeds up gameplay loading when loading is more or less doomed to fail anyway. Will also spam notifications about submission not happening. This could be probably made smarter if we want to (e.g. notify once until API comes back online), but I'm not super sure we *want* that, and maybe being super loud about submission failures is the preferred way to go instead.
| if (!api.IsLoggedIn || api.State.Value == APIState.Failing) | ||
| { | ||
| handleTokenFailure(new InvalidOperationException("API is not online.")); | ||
| handleTokenFailure(new InvalidOperationException("Online functionality is not available."), displayNotification: api.State.Value == APIState.Failing); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably also do
diff --git a/osu.Game/Screens/Play/SubmittingPlayer.cs b/osu.Game/Screens/Play/SubmittingPlayer.cs
index 78e16c0f4b..62a2fc6f56 100644
--- a/osu.Game/Screens/Play/SubmittingPlayer.cs
+++ b/osu.Game/Screens/Play/SubmittingPlayer.cs
@@ -162,7 +162,7 @@ void handleTokenFailure(Exception exception, bool displayNotification = false)
break;
default:
- Logger.Log($"{whatWillHappen} {exception.Message}", level: LogLevel.Important);
+ Logger.Log($"{exception.Message}\n\n{whatWillHappen}", level: LogLevel.Important);
break;
}
}
to make the notification read more correctly. Bonus points if we have a custom styled notification for this.
I was also thinking, rather than a notification for this state, it might be nice to put informational text in the place where the epilepsy warning etc. is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bonus points if we have a custom styled notification for this.
See what you think of 28efcf7.
I was also thinking, rather than a notification for this state, it might be nice to put informational text in the place where the epilepsy warning etc. is.
Not trivially doable because those are in PlayerLoader and this is Player. Would need to either lift token retrieval out to player loader or figure out some awkward flow to shove the information into it from player.
|
General structure of this looks correct. Stable also does this neat thing where when connectivity is unavailable, it shows a global overlay (broken chain link) in the bottom-right of the screen, making it very clear that things are offline. We already kinda do this with the avatar in the toolbar, so maybe that's enough to call a replacement. I would say that the notification which appears as a result of the API going offline should have a custom style at very least (colour and icon) to make it stand out and look like a global broadcast. |
Attempt was made in 33398b9. See what you think. Screen.Recording.2025-11-21.at.14.43.57.movNotably the score submission notifications become much more spammy because they are no longer debounced via the mechanism forwarding |
| { | ||
| Text = this.message = message; | ||
|
|
||
| IsCritical = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disputable, but seemed like a good idea.
| [BackgroundDependencyLoader] | ||
| private void load() | ||
| { | ||
| Icon = FontAwesome.Solid.FireExtinguisher; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was very difficult to resist using FontAwesome.Solid.DumpsterFire here (which exists for some reason), but I don't think anyone else would have found it as funny and it's kind of a bad icon anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤣
i'd agree with using that, except the fire is in the wrong place.
I forget you had to do this in the old days...
This is a very loose RFC that needs more testing, but first and foremost I would like to confirm that I am at least in the vaguely correct wheelhouse before proceeding further.
Video demo: https://drive.google.com/file/d/1BWlrohNgfKWDLDEENDunhsNnl6W4sDmY/view?usp=share_link (too large to live on GitHub).
Add liveness probes to API
As described in #35736.
Some ground rules:
I'm not going to attempt to decide where the probe should be hosted, this PR is mostly agnostic to that anyway.
When API is in online state, the probe is periodically queried every minute (interval up for debate).
Failingstate.When API is in failing state, the probe is queried more actively (every 5 seconds as written, but this is also up for discussion).
The reason this is designed this way is to be fail-safe. In particular, my concern here is that e.g. the probe being potentially unavailable to Russia users due to network blocks does not completely delete online functions for those users.
Skip attempting submission when API is in failing state
Speeds up gameplay loading when loading is more or less doomed to fail anyway.
Will also spam notifications about submission not happening. This could be probably made smarter if we want to (e.g. notify once until API comes back online), but I'm not super sure we want that, and maybe being super loud about submission failures is the preferred way to go instead.