feat(web): add classes for use in multi-token correction 🚂 #15818
jahorton wants to merge 11 commits into change/web/suggestion-range from
Conversation
User Test Results: User tests are not required.
This was referenced Apr 9, 2026
Build-bot: skip build:web
Test-bot: skip
…ly for queue

This, plus the upcoming QuotientNodeFinalizer class, will support result forwarding in multi-tokenization contexts during multi-token & multi-tokenization correction.
As mentioned in the description of #15814, up until now, we've always searched for corrections for just one token at a time - the current token. To better support whitespace fat-fingering, we'll need the ability to answer the following question: "Which tokenization pattern is the closest to matching the current text after corrections?" In essence, we need the ability to correct word boundaries, or perhaps to correct the engine's tokenization of the context.
One notable way forward is to consider what valid corrections we can find for each token in the tokenized context. This allows us to search for phrase-level corrections not focused on a specific word, prioritizing the most likely combination of tokenization-pattern and correction-cost for one of the pattern's tokens.
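The idea above can be sketched as follows. This is a toy illustration only, with hypothetical names (`Tokenization`, `bestCorrectionCost`, `cheapestPattern`) and a naive edit-distance scorer standing in for the engine's actual correction-cost machinery; it only shows how competing tokenization patterns of the same raw text might be compared by their combined per-token correction cost.

```typescript
// Hypothetical sketch: score each candidate tokenization of the same raw
// text by summing the cheapest correction cost for each of its tokens,
// then pick the cheapest pattern overall.

interface Tokenization {
  tokens: string[];
}

// Standard Levenshtein distance between two strings.
function editDistance(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,
        dp[i][j - 1] + 1,
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
      );
    }
  }
  return dp[a.length][b.length];
}

// Toy correction cost: the minimum edit distance from the token to any
// lexicon entry (0 if the token is already a valid word).
function bestCorrectionCost(token: string, lexicon: string[]): number {
  let best = Infinity;
  for (const word of lexicon) {
    best = Math.min(best, editDistance(token, word));
  }
  return best;
}

function cheapestPattern(patterns: Tokenization[], lexicon: string[]): Tokenization {
  let best = patterns[0];
  let bestCost = Infinity;
  for (const p of patterns) {
    const cost = p.tokens.reduce((sum, t) => sum + bestCorrectionCost(t, lexicon), 0);
    if (cost < bestCost) {
      bestCost = cost;
      best = p;
    }
  }
  return best;
}

// "thequick" could stay one token or be split at the missed space; the
// split tokenization needs no corrections at all, so it wins.
const lexicon = ["the", "quick", "brown", "fox"];
const chosen = cheapestPattern(
  [{ tokens: ["thequick"] }, { tokens: ["the", "quick"] }],
  lexicon
);
```

In a real engine the per-token costs would come from the correction search itself (and be weighted by language-model probability) rather than a bare edit distance, but the shape of the comparison is the same.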
Now, as to the implementation of the new type...
Predictive text generally should not extend words that are not adjacent to the caret; predictive text exists to predict what is yet to be typed, rather than to adjust what was typed. Corrections may validly adjust what was typed, but should not aim to add to it. To use a simple example, for text such as `the quicl brown fox`, correcting `quicl` to `quick` may well be valid, but correcting the same word to `quickly` would not make sense - those two extra characters would appear out of nowhere.

So, when doing correction across a range of context, rather than a single token, we generally will want to only correct tokens not adjacent to the caret, allowing prediction solely for the caret-adjacent token.
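That distinction between "correct" and "predict" can be sketched as a per-token candidate filter. This is an illustrative sketch only - `candidatesFor` and its edit-distance scorer are hypothetical stand-ins, not the engine's actual API: whole-word corrections are allowed for any token in the range, while completions (predictions) are accepted only for the caret-adjacent token.

```typescript
// Illustrative sketch, not the engine's actual API: only the caret-adjacent
// token may be *completed* (prediction); every other token may only be
// corrected as a whole word.

// Standard Levenshtein distance between two strings.
function editDistance(a: string, b: string): number {
  const dp: number[][] = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,
        dp[i][j - 1] + 1,
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
      );
    }
  }
  return dp[a.length][b.length];
}

function candidatesFor(
  typed: string,
  lexicon: string[],
  isCaretAdjacent: boolean,
  maxCost = 1
): string[] {
  return lexicon.filter((word) => {
    // Whole-word corrections are allowed for any token in the range.
    if (editDistance(typed, word) <= maxCost) {
      return true;
    }
    // Completions (prediction) apply only to the caret-adjacent token:
    // accept words whose same-length prefix is within maxCost edits.
    if (isCaretAdjacent) {
      return editDistance(typed, word.slice(0, typed.length)) <= maxCost;
    }
    return false;
  });
}

// For "the quicl brown fox" with the caret elsewhere, "quicl" may become
// "quick" but not "quickly"; at the caret, both would be offered.
const lex = ["quick", "quickly", "brown"];
const midToken = candidatesFor("quicl", lex, false);  // corrections only
const caretToken = candidatesFor("quicl", lex, true); // corrections + completions
```

A trie-based engine would express the same rule differently - stopping the traversal once the typed input is consumed for non-caret tokens, while letting it continue into descendants for the caret-adjacent one - but the accept/reject behavior is the same.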
Build-bot: skip build:web
Test-bot: skip