fix(dsl): prevent lexer panic on string ending in trailing backslash #3007

Open
ManthanNimodiya wants to merge 1 commit into Permify:master from ManthanNimodiya:fix/lexer-string-panic-on-trailing-backslash

ManthanNimodiya commented Jun 19, 2026 (Contributor)

Bug

The schema DSL lexer panics with a slice-out-of-range error when lexing an unterminated string that ends in a trailing backslash.
Minimal repro:

NewParser("\"\\").Parse()
panic: runtime error: slice bounds out of range [:3] with length 2
pkg/dsl/lexer/lexer.go:231

The panic is reachable from the SchemaWrite API via Parse(). The gRPC recovery interceptor catches it, so the request fails with an error instead of crashing the server, but it is still an uncontrolled panic on untrusted input.

Root cause

In lexString, when a \ is the last character of the input, l.readChar() (skipping the backslash) sets l.ch = 0 but the code still executes position = l.position + 1, advancing position one past len(input).
The loop then exits on l.ch == 0, and the final str += l.input[position:l.position] slices with a start index greater than the string length, panicking.

Fix

Detect l.ch == 0 immediately after skipping the backslash and break early with position = l.position, treating it as an unterminated string instead of continuing to advance past the input bounds.
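
To make the root cause and the fix concrete, here is a minimal sketch in Go. It is not the code in pkg/dsl/lexer/lexer.go: miniLexer, newMiniLexer, the eofGuard flag, and the lex helper are hypothetical names, and the way escaped characters are kept is an assumption. Only the overall shape (readChar, position, the l.ch == 0 check after the backslash) follows the description above.

```go
package main

import "fmt"

// miniLexer is a hypothetical, stripped-down stand-in for the DSL lexer.
// Field names follow the PR description; everything else is assumed.
type miniLexer struct {
	input        string
	position     int  // index of the character currently held in ch
	readPosition int  // index of the next character to read
	ch           byte // current character; 0 means end of input
}

func newMiniLexer(input string) *miniLexer {
	l := &miniLexer{input: input}
	l.readChar() // load the first character (the opening quote)
	return l
}

// readChar advances one character. It keeps advancing position at EOF,
// which is what allows position to run past len(input).
func (l *miniLexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition++
}

// lexString reads a string literal whose opening quote is the current
// character. eofGuard toggles the check described in the Fix section.
func (l *miniLexer) lexString(eofGuard bool) string {
	position := l.position + 1 // start of the pending unescaped run
	var str string
	for {
		l.readChar()
		if l.ch == '\\' {
			str += l.input[position:l.position] // flush text before the backslash
			l.readChar()                        // skip the backslash
			if eofGuard && l.ch == 0 {
				// Nothing left to escape: treat as an unterminated string.
				position = l.position
				break
			}
			str += string(l.ch)       // keep the escaped character literally
			position = l.position + 1 // without the guard this can exceed len(input)
			continue
		}
		if l.ch == '"' || l.ch == 0 {
			break
		}
	}
	str += l.input[position:l.position] // panics if position ran past the input
	return str
}

// lex runs lexString and reports whether it panicked.
func lex(input string, eofGuard bool) (literal string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return newMiniLexer(input).lexString(eofGuard), false
}

func main() {
	for _, guard := range []bool{false, true} {
		lit, panicked := lex("\"\\", guard)
		fmt.Printf("guard=%v literal=%q panicked=%v\n", guard, lit, panicked)
	}
}
```

Under these assumptions, the unguarded call on the two-byte input should report a panic, while the guarded call should return an empty literal without one.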

Testing

Added a regression test in lexer_test.go asserting that the lexer does not panic on the two-byte input "\ (Go literal "\"\\"), and verified that the exact reproducer, NewParser("\"\\").Parse(), now returns cleanly. The full pkg/dsl/... suite still passes.

Closes #3004

Summary by CodeRabbit

  • Bug Fixes

    • Fixed a crash in the DSL parser when processing string literals with unterminated escape sequences.
  • Tests

    • Added test coverage to verify the parser handles malformed strings gracefully without panicking.

coderabbitai Bot commented Jun 19, 2026

📝 Walkthrough

The fix adds an end-of-input guard inside lexString in the DSL lexer: when a backslash is encountered and the following l.ch is 0 (EOF), the lexer updates its position and exits the string loop instead of trying to read an escape character that does not exist. A Ginkgo regression test confirms that no panic occurs on the two-byte input "\ (a double quote followed by a backslash).

Changes

Lexer panic fix for trailing backslash

Layer / File(s) | Summary
lexString EOF guard and regression test (pkg/dsl/lexer/lexer.go, pkg/dsl/lexer/lexer_test.go) | lexString checks l.ch == 0 after a \ and breaks out of the loop instead of slicing past end-of-input; a Ginkgo test verifies the fix by iterating tokens on the input "\ until EOF and asserting no panic.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~5 minutes

Poem

A backslash sat at the end of the line,
No char came after, a panic by design!
But the rabbit hopped in with a nil-check so neat,
Now the lexer just shrugs and accepts the defeat.
No more slice-out-of-range, no more fright 🐇
Unterminated strings handled just right!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title 'fix(dsl): prevent lexer panic on string ending in trailing backslash' clearly and specifically describes the main change: preventing a panic in the DSL lexer for strings with trailing backslashes.
Linked Issues check | ✅ Passed | The changes fully address issue #3004: the fix detects unterminated strings ending with a backslash and breaks early to prevent out-of-bounds slicing, while the regression test verifies the lexer no longer panics on the problematic input.
Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the reported bug: modifications to lexString function logic and addition of a regression test, with no extraneous changes detected.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
πŸ§ͺ Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❀️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

ManthanNimodiya commented (Contributor, Author)

@omer-topal @ucatbas please have a look and lmk for any required changes

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/dsl/lexer/lexer_test.go (1)

1193-1207: πŸ’€ Low value

Consider asserting on token properties for more comprehensive validation.

The test correctly verifies that no panic occurs, which addresses the primary objective. However, it could be enhanced by also asserting:

  • The token type (expected: token.STRING)
  • The literal value (expected: empty string "")
  • The number of tokens before EOF (expected: 1)

These additions would document the expected behavior more explicitly and catch any regression in how unterminated strings are tokenized.

📋 Example enhancement
 	It("Case 11 - Unterminated string ending in a trailing backslash should not panic", func() {
 		// Regression test: a string literal that ends right after a backslash
 		// (no character left to escape) used to overshoot the input bounds
 		// and panic with a slice-out-of-range error.
 		l := NewLexer("\"\\")
 
-		Expect(func() {
-			for {
-				tok := l.NextToken()
-				if tok.Type == token.EOF {
-					break
-				}
-			}
-		}).ShouldNot(Panic())
+		var tokens []token.Token
+		Expect(func() {
+			for {
+				tok := l.NextToken()
+				tokens = append(tokens, tok)
+				if tok.Type == token.EOF {
+					break
+				}
+			}
+		}).ShouldNot(Panic())
+		
+		Expect(tokens).To(HaveLen(2)) // STRING token + EOF token
+		Expect(tokens[0].Type).To(Equal(token.STRING))
+		Expect(tokens[0].Literal).To(Equal("")) // Unterminated string yields empty literal
 	})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/dsl/lexer/lexer_test.go` around lines 1193 - 1207, The test for the
unterminated string with trailing backslash currently only validates that no
panic occurs, but should also verify the correct tokenization behavior. Add
assertions within the token iteration loop to capture and verify: the token type
should be token.STRING before reaching token.EOF, the token literal value should
be an empty string, and there should be exactly one token before the EOF token.
Store the tokens in a slice or counter variable during the loop, then assert
these properties after the loop completes to document the expected behavior and
catch any regression in how unterminated strings are tokenized by the Lexer.

ℹ️ Review info
βš™οΈ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5a2c45ea-e34e-4ca5-a6ac-619f66a5facb

📥 Commits

Reviewing files that changed from the base of the PR and between aa3a7c6 and 08a3dd0.

📒 Files selected for processing (2)
  • pkg/dsl/lexer/lexer.go
  • pkg/dsl/lexer/lexer_test.go

Comment on lines +1193 to +1207
It("Case 11 - Unterminated string ending in a trailing backslash should not panic", func() {
	// Regression test: a string literal that ends right after a backslash
	// (no character left to escape) used to overshoot the input bounds
	// and panic with a slice-out-of-range error.
	l := NewLexer("\"\\")

	Expect(func() {
		for {
			tok := l.NextToken()
			if tok.Type == token.EOF {
				break
			}
		}
	}).ShouldNot(Panic())
})

⚠️ Potential issue | 🟑 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for existing tests covering escape sequences and unterminated strings

# Search for test cases involving escape sequences in strings
rg -n -A 5 -B 2 'NewLexer.*\\\\' pkg/dsl/lexer/lexer_test.go

# Search for test cases with unterminated strings
rg -n -A 5 -B 2 'unterminated|Unterminated' pkg/dsl/lexer/lexer_test.go

Repository: Permify/permify

Length of output: 680


🏁 Script executed:

# Get all test cases in the file to see the scope
rg -n 'It\(' pkg/dsl/lexer/lexer_test.go | head -20

Repository: Permify/permify

Length of output: 615


🏁 Script executed:

# Search for tests with actual escape sequences in test inputs
rg -n '\\\\n|\\\\t|\\\\x|\\\\z|\\\\\\\\' pkg/dsl/lexer/lexer_test.go | head -20

Repository: Permify/permify

Length of output: 41


🏁 Script executed:

# Get context around Case 11 and nearby test cases to see what else is tested
sed -n '1150,1250p' pkg/dsl/lexer/lexer_test.go

Repository: Permify/permify

Length of output: 1448


🏁 Script executed:

# Find and examine the lexer implementation to understand escape sequence handling
fd lexer.go pkg/dsl/lexer/ --type f

Repository: Permify/permify

Length of output: 82


🏁 Script executed:

# Search for escape sequence handling in the lexer
rg -n -B 3 -A 10 'escape|backslash' pkg/dsl/lexer/lexer.go | head -40

Repository: Permify/permify

Length of output: 823


🏁 Script executed:

# Get more of the lexString function to see complete escape handling
sed -n '205,280p' pkg/dsl/lexer/lexer.go

Repository: Permify/permify

Length of output: 1836


Add regression tests for other escape-sequence edge cases in unterminated strings.

The trailing backslash case (Case 11) is properly tested, but similar edge cases with escape sequences lack coverage. The lexer implementation has a safe if l.ch == 0 check that prevents panics, but the following scenarios should be explicitly tested:

  • Valid escape sequences at string end: "\n, "\t
  • Invalid escape sequences at string end: "\x, "\z
  • Multiple backslashes at string end: "\\, "\\\

Adding these tests ensures regression protection and documents the expected behavior for all unterminated string patterns.
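
As a rough illustration of what such coverage could look like, the sketch below runs the inputs listed above through a table-driven no-panic check. It is self-contained and does not use the Permify lexer: scanString is a hypothetical stand-in with assumed escape handling (escaped characters are kept literally), and panics is a hypothetical helper. In the real suite the same table would presumably drive NewLexer and NextToken inside Ginkgo cases.

```go
package main

import "fmt"

// scanString is a hypothetical stand-in for the lexer's string scanning.
// It returns the text between the opening quote and the closing quote or
// end of input, keeping escaped characters literally (an assumption).
func scanString(input string) string {
	var out []byte
	for i := 1; i < len(input); i++ { // index 0 is the opening quote
		c := input[i]
		if c == '"' {
			break // closing quote
		}
		if c == '\\' {
			if i+1 >= len(input) {
				break // trailing backslash: nothing left to escape
			}
			i++
			c = input[i]
		}
		out = append(out, c)
	}
	return string(out)
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	// Unterminated inputs from the review comment, written as Go literals.
	inputs := []string{
		"\"\\",     // trailing backslash
		"\"\\n",    // valid escape at string end
		"\"\\t",    // valid escape at string end
		"\"\\x",    // invalid escape at string end
		"\"\\z",    // invalid escape at string end
		"\"\\\\",   // two backslashes
		"\"\\\\\\", // three backslashes
	}
	for _, in := range inputs {
		var lit string
		p := panics(func() { lit = scanString(in) })
		fmt.Printf("input=%q literal=%q panicked=%v\n", in, lit, p)
	}
}
```

Keeping the inputs in one table makes it cheap to add further unterminated patterns later without duplicating the loop-until-EOF boilerplate.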

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/dsl/lexer/lexer_test.go` around lines 1193 - 1207, Add additional
regression test cases to the test file following the same pattern as the
existing "Case 11" test. Create separate test cases for valid escape sequences
at string end (such as "\n and "\t), invalid escape sequences at string end
(such as "\x and "\z), and multiple backslashes at string end (such as "\\ and
"\\\). Each new test case should follow the same structure as Case 11: create a
NewLexer with the unterminated string input, iterate through tokens with
NextToken() until reaching token.EOF, and verify that the operation does not
panic. This ensures comprehensive regression coverage for all unterminated
string patterns with escape sequences.

codecov Bot commented Jun 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.69%. Comparing base (70b1110) to head (08a3dd0).
⚠️ Report is 26 commits behind head on master.

❌ Your project check has failed because the head coverage (74.69%) is below the target coverage (75.00%). You can increase the head coverage or adjust the target coverage.

❗ There is a different number of reports uploaded between BASE (70b1110) and HEAD (08a3dd0).

HEAD has 2 uploads less than BASE
Flag | BASE (70b1110) | HEAD (08a3dd0)
- | 3 | 1
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3007      +/-   ##
==========================================
- Coverage   82.58%   74.69%   -7.89%     
==========================================
  Files          74       83       +9     
  Lines        8300     9215     +915     
==========================================
+ Hits         6854     6882      +28     
- Misses        910     1798     +888     
+ Partials      536      535       -1     

Development

Successfully merging this pull request may close these issues.

[BUG] DSL lexer panics (slice out of range) on an unterminated string

1 participant