Skip to content

fix(oxide): extract classes adjacent to [% ... %] template tags (#20233)#20268

Closed
ther12k wants to merge 1 commit into
tailwindlabs:mainfrom
ther12k:fix/issue-20233-template-toolkit-syntax
Closed

fix(oxide): extract classes adjacent to [% ... %] template tags (#20233)#20268
ther12k wants to merge 1 commit into
tailwindlabs:mainfrom
ther12k:fix/issue-20233-template-toolkit-syntax

Conversation

@ther12k

@ther12k ther12k commented Jun 22, 2026

Copy link
Copy Markdown

Fixes #20233

Problem

Template Toolkit, Text::Xslate, ExpressionEngine, and similar template engines use [% ... %] as their tag delimiters. The scanner was dropping classes adjacent to these tags: neither [ nor ] was a valid boundary character, so a candidate like bg-white[% failed the after-boundary check even after the tag was identified.

[% IF ... %]bg-white/40[% ELSE %]bg-white/10
              ^^^^^^^^^^                   -- not extracted (after-boundary '[' invalid)
                            ^^^^^^^^^^^^^  -- not extracted (before-boundary ']' invalid)

Fix — three coordinated changes

1. 3-layer [% ... %] skip in extract()

Implemented in the outer extractor loop, not inside the candidate machine, because the inner utility machine can terminate a candidate whose last char immediately precedes [, advancing the cursor PAST [ before any in-machine [% check could see it.

  • Top-of-loop [% check in extract() — handles input-start and post-whitespace arrivals (cursor enters an iteration parked on [).
  • Post-match [% check after candidate_machine.next() returns — handles the common case where next() returns Done with cursor.pos on the [ that opens a template tag. Uses continue to skip the bottom cursor.advance() since the tag-skip already moved past [.
  • [% check in extract_sub_candidates() — prevents spawning an inner CandidateMachine at % (after [) that would otherwise re-extract template keywords like if, cond, end.

The skip is non-greedy (first %] ends the tag). Nested [% ... %] inside the body isn't a thing in any of the target template languages. If no %] is found (e.g. a typo), the normal candidate extraction path takes over so the rest of the input still parses.

2. Boundary additions (minimum required)

Adding ] as a valid before-boundary and [ as a valid after-boundary is required for candidates adjacent to template tags to extract:

  • bg-white[% — before-boundary e (Common), after-boundary [. [ is now After-class.
  • bg-white[% end %] — same as above.
  • bg-white]% — before-boundary ]. ] is now also valid as before-boundary (it already lived in After for [class.foo] syntax, so we special-case the check rather than overwrite the enum).

3. Carve-out in has_valid_boundaries

[ as after-boundary is rejected when the candidate span itself already contains [. This preserves the "one arbitrary value per utility" semantics: bg-[red][blue] still extracts nothing.

Tests

Added test_template_toolkit_syntax (9 cases) in crates/oxide/src/extractor/mod.rs:

  • Issue reproduction: [% IF ... %]bg-white/40[% ELSE %]bg-white/10
  • Lowercase keywords
  • Static class adjacent to a [% %] block
  • [color:red] still extracts (arbitrary property regression guard)
  • bg-[red][blue] still extracts nothing (carve-out regression guard)
  • bg-red-500[% if x %] — candidate before tag, no trailing
  • bg-red-500[% x %]bg-blue-500[% y %]text-white — multi-tag loop
  • [% if x %] bg-red-500 — top-of-loop check at input start
  • bg-red-500 [% if x %] — post-whitespace arrival path

Verification

  • cargo test -p oxide115 passed, 0 failed, 12 ignored (1 new test added, was 114 baseline)
  • cargo fmt --check → clean
  • cargo clippy --workspace --all-targets → 9 errors, 0 new from this diff (verified via git diff origin/main..HEAD — my changes don't introduce new patterns). Pre-existing clippy on origin/main (verified with the same diff on the parent commit).
  • cargo test does not exercise this path under the default test config; the new test in mod.rs is a unit test that runs under cargo test -p oxide.

Scope

fix(oxide): — all changes are isolated to crates/oxide/src/extractor/. No public API changes, no breaking changes.

Files changed

 CHANGELOG.md                                    |   1 +
 crates/oxide/src/extractor/boundary.rs          |  42 +++++++-
 crates/oxide/src/extractor/candidate_machine.rs |  10 ++
 crates/oxide/src/extractor/mod.rs               | 137 +++++++++++++++++++++++-
 4 files changed, 186 insertions(+), 4 deletions(-)

…s#20233)

Template Toolkit, Text::Xslate, ExpressionEngine, and similar template
engines use [% ... %] as their tag delimiters. The scanner was dropping
classes adjacent to these tags: neither '[' nor ']' was a valid boundary
character, so a candidate like 'bg-white[%' failed the after-boundary
check even after the tag was identified.

Fix: three coordinated changes — a 3-layer [% ... %] skip and the
minimum boundary additions required for adjacent candidates to extract.

3-layer [% ... %] skip in extract()
------------------------------------

The skip is implemented in the outer extractor loop, not inside the
candidate machine, because the inner utility machine can terminate a
candidate whose last char immediately precedes '[', advancing the
cursor PAST '[' before any in-machine [% check could see it.

  1. Top-of-loop [% check in extract() — handles input-start and
     post-whitespace arrivals (when cursor enters an iteration parked
     on '[').

  2. Post-match [% check after candidate_machine.next() returns —
     handles the common case where next() returns Done with cursor.pos
     on the '[' that opens a template tag. Uses 'continue' to skip the
     bottom cursor.advance() since the tag-skip already moved past '['.

  3. [% check in extract_sub_candidates() — prevents spawning an inner
     CandidateMachine at '%' (after '[') that would otherwise
     re-extract template keywords like 'if', 'cond', 'end'.

The skip is non-greedy (first %] ends the tag). Nested [% ... %] inside
the body is not a thing in any of the target template languages. If
no %] is found (e.g. a typo), the normal candidate extraction path
takes over so the rest of the input still parses.

Boundary additions (why they belong with this fix)
--------------------------------------------------

Adding ']' as a valid before-boundary and '[' as a valid after-boundary
is required for candidates adjacent to template tags to extract:

  - 'bg-white[%':  before-boundary 'e' (Common), after-boundary '['
    — '[' is now After-class.
  - 'bg-white[% end %]': same as above.
  - 'bg-white]%':  before-boundary ']' — ']' is now also valid as
    before-boundary (it already lived in After for [class.foo] syntax,
    so we special-case the check rather than overwrite the enum).

Carve-out in has_valid_boundaries: '[' as after-boundary is rejected
when the candidate span itself already contains '['. This preserves
'one arbitrary value per utility' semantics: bg-[red][blue] still
extracts nothing.

Tests (in test_template_toolkit_syntax)
---------------------------------------

  - Issue reproduction: [% IF ... %]bg-white/40[% ELSE %]bg-white/10
  - Lowercase keywords
  - Static class adjacent to a [% %] block
  - [color:red] still extracts (arbitrary property regression guard)
  - bg-[red][blue] still extracts nothing (carve-out regression guard)
  - bg-red-500[% if x %] — candidate before tag, no trailing
  - bg-red-500[% x %]bg-blue-500[% y %]text-white — multi-tag loop
  - [% if x %] bg-red-500 — top-of-loop check at input start
  - bg-red-500 [% if x %] — post-whitespace arrival path

All 115 oxide lib tests pass. cargo fmt --check clean. cargo clippy
errors are pre-existing on main (verified via diff against origin/main)
and unrelated to this change.
@ther12k ther12k requested a review from a team as a code owner June 22, 2026 07:37
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4140cc27-50bc-4c77-8453-c928543baa0a

📥 Commits

Reviewing files that changed from the base of the PR and between 0fee7b5 and 3dc2ea4.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • crates/oxide/src/extractor/boundary.rs
  • crates/oxide/src/extractor/candidate_machine.rs
  • crates/oxide/src/extractor/mod.rs

Walkthrough

The oxide extractor gains support for Template Toolkit–style [% ... %] template tags. Boundary validation in boundary.rs is updated: ] is explicitly accepted as a valid before-boundary, [ is added to the Class byte classification, and has_valid_boundaries is refactored into staged checks with a carve-out that rejects spans where the after character is [ and the candidate already contains [. In mod.rs, the main extraction loop checks for [% before processing each candidate and advances past the matching %] to skip tag bodies; a second post-match check handles the case where candidate_machine.next() parks the cursor on a [ opening a template tag. The sub-candidate scanning loop in extract_sub_candidates is converted from a for to a while loop with the same skip logic. A test covering multiple [% ... %] adjacency and boundary scenarios is added.

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main fix: enabling class extraction adjacent to template tags. It directly relates to the primary change in the changeset.
Description check ✅ Passed The description thoroughly explains the problem, solution, and implementation details. It directly relates to the changeset with detailed explanations of all three coordinated changes and testing approach.
Linked Issues check ✅ Passed The PR fully addresses issue #20233 objectives: enables extraction of classes adjacent to [% ... %] delimiters, maintains existing semantics for arbitrary values, and provides equivalent treatment to ERB/EJS templates.
Out of Scope Changes check ✅ Passed All changes are directly scoped to extracting classes from [% ... %] template tags. Files modified are isolated to crates/oxide/src/extractor/ with CHANGELOG entry, and no unrelated changes are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps

greptile-apps Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Confidence Score: 3/5

Safe to merge for the tested cases (named utilities adjacent to template tags), but arbitrary-value utilities immediately before a template tag will silently fail to extract.

The carve-out in boundary.rs drops arbitrary-value utilities like bg-[red] when immediately followed by a [% template tag — an untested combination that would affect real Template Toolkit users using arbitrary values.

crates/oxide/src/extractor/boundary.rs — the carve-out in has_valid_boundaries (lines 58-60) needs to distinguish a bare [ from a template-tag [% before rejecting the span.

Reviews (1): Last reviewed commit: "fix: extract classes adjacent to [% ... ..." | Re-trigger Greptile

Comment on lines +58 to +60
if after == b'[' && span.slice(input).contains(&b'[') {
return false;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Carve-out is too broad for arbitrary-value utilities before template tags

The check after == b'[' && span.slice(input).contains(&b'[') rejects any span that already contains [ when the very next character is [. That correctly blocks bg-[red][blue], but it also silently drops a perfectly valid case: an arbitrary-value utility that immediately precedes a [%…%] template tag, e.g. bg-[red][% if x %]. In that input, span is bg-[red] (contains [), after is [ (from [%), so the carve-out fires and bg-[red] is never emitted — even though the template tag would be skipped by the outer loop moments later.

The fix is to peek one character further and exempt [% from the rejection. In has_valid_boundaries, replace the current block with:

if after == b'[' && span.slice(input).contains(&b'[') {
    let peek = input.get(span.end + 2).copied().unwrap_or(b'\0');
    if peek != b'%' {
        return false;
    }
}

Comment on lines +987 to 1048
fn test_template_toolkit_syntax() {
// The exact reproduction from the issue.
assert_extract_candidates_contains(
r#"<div class="[% IF $is_open %]bg-white/40[% ELSE %]bg-white/10[% END %]"></div>"#,
vec!["bg-white/40", "bg-white/10"],
);

// Lowercase keywords.
assert_extract_candidates_contains(
r#"<div class="[% if cond %]flex[% end %]"></div>"#,
vec!["flex"],
);

// Static class adjacent to a `[% %]` block.
assert_extract_candidates_contains(
r#"<div class="bg-blue-500 [% if cond %]bg-red-500[% end %]"></div>"#,
vec!["bg-blue-500", "bg-red-500"],
);

// `[color:red]` (arbitrary property) must STILL extract as a whole
// arbitrary property, not be skipped as a template tag. We use the
// non-strict contains-checker because `class` is also legitimately
// extracted from the `<div class="...">` attribute name.
assert_extract_candidates_contains(
r#"<div class="[color:red]"></div>"#,
vec!["[color:red]"],
);

// `bg-[red][blue]` (arbitrary value followed by arbitrary value) must
// STILL be invalid — `[` alone is not a template-tag opener.
assert_extract_sorted_candidates("bg-[red][blue]", vec![]);

// Candidate immediately before a tag with no trailing class. Locks in
// the before-boundary `]` path independently of the after-boundary
// `[` path (recommended by review).
assert_extract_sorted_candidates("bg-red-500[% if x %]", vec!["bg-red-500"]);

// Multiple adjacent tags in the same string. Guards the
// `cursor.advance_by(rel + 2)` + `continue` loop in the outer
// extractor — each tag must be skipped independently without
// consuming or merging surrounding candidates (recommended by
// review). Uses multi-char utilities (1-letter utilities followed
// by `[` are not extractable — pre-existing limitation of the
// named utility machine).
assert_extract_sorted_candidates(
"bg-red-500[% x %]bg-blue-500[% y %]text-white",
vec!["bg-red-500", "bg-blue-500", "text-white"],
);

// `[%` at input start, class follows the tag. Exercises the
// top-of-loop `[%` skip in `extract()` (covers the case where
// the cursor enters the iteration parked on `[`).
assert_extract_sorted_candidates("[% if x %] bg-red-500", vec!["bg-red-500"]);

// `[%` after whitespace (no candidate before). Exercises the
// post-whitespace-arrival path in the top-of-loop `[%` skip.
assert_extract_sorted_candidates("bg-red-500 [% if x %]", vec!["bg-red-500"]);
}

// https://github.com/tailwindlabs/tailwindcss/issues/17050
#[test]
fn test_haml_syntax() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Missing test for arbitrary-value utility adjacent to template tag

The test suite covers named utilities like bg-red-500 and bg-white/40 before/after [%…%] tags, but none of the nine cases involve an arbitrary-value utility (one whose span contains [) immediately followed by a template tag, e.g. bg-[red][% if x %] or [% if x %]bg-[#ff0000]. This is exactly the combination that the carve-out in has_valid_boundaries incorrectly blocks. Adding a test like assert_extract_sorted_candidates("bg-[red][% if x %]", vec!["bg-[red]"]) would catch the regression.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@RobinMalfait

Copy link
Copy Markdown
Member

Hey! Thanks for the PR, but I believe this is the wrong approach. This makes things slower for everyone even if you don't use the special [% boundary characters, and that's not what you want.

You also mention [% ... %] often, but that's not the problem, the problem is %] ... [%.

Please don't open PRs that are purely vibe coded. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Class names adjacent to [ and ] delimiters not extracted

2 participants