
Conversation

@sawka (Member) commented Jan 21, 2026

lots of stuff here. way too much for a single PR :)

This PR introduces a streaming framework for the RPC system with flow control, plus new authentication primitives for the RPC system. These are used to create a persistent "job manager" process (via wsh) that can survive disconnects, and a jobcontroller in the main server that can create, reconnect to, and manage these new persistent jobs.

The code is not yet hooked up to anything, aside from some new debugging wsh commands and a switch in the term block that lets me test viewing the output.

After this PR lands, the next steps are more testing and then integrating this functionality into the product.
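
Conceptually, "create, reconnect, and manage" means each persistent job is addressed by a stable ID that outlives any single client connection. Below is a purely hypothetical Go sketch of that shape; none of these names exist in this PR's jobcontroller, it only illustrates the reconnect-by-ID idea:

```go
package main

import (
	"fmt"
	"sync"
)

// job is a placeholder for a persistent remote session; in the real PR this
// would be backed by a wsh-launched job manager process, not an in-memory struct.
type job struct {
	id     string
	output []byte
}

// controller sketches the create/reconnect/manage idea: jobs are keyed by a
// stable ID, so a client that disconnects can later reattach by that ID.
type controller struct {
	mu   sync.Mutex
	jobs map[string]*job
}

func newController() *controller {
	return &controller{jobs: make(map[string]*job)}
}

func (c *controller) Create(id string) *job {
	c.mu.Lock()
	defer c.mu.Unlock()
	j := &job{id: id}
	c.jobs[id] = j
	return j
}

func (c *controller) Reconnect(id string) (*job, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	j, ok := c.jobs[id]
	return j, ok
}

func main() {
	c := newController()
	c.Create("job-1")
	// ... client disconnects; later it reattaches by ID ...
	if j, ok := c.Reconnect("job-1"); ok {
		fmt.Println("reattached to", j.id)
	}
}
```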

coderabbitai bot (Contributor) commented Jan 21, 2026

Walkthrough

Adds comprehensive job-management and streaming support across backend and frontend. New packages: pkg/jobcontroller (controller APIs), pkg/jobmanager (job manager daemon, domain-socket IPC), pkg/jobmanager streammanager and cirbuf (ACK-based buffering), and jobcmd for PTY-backed commands. Migrates stream IDs/readers/writers to string UUIDs in streamclient/broker, extends wshrpc types and commands for job lifecycle and stream packets, updates wshremote with job-manager wiring, adds frontend RPC/type and terminal job propagation, DB migration for db_job, router domain-socket mode, and replaces route event "route:gone" with "route:up"/"route:down". Two WebSocket block-input/resize WS commands removed.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 25.44%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title 'new job manager / framework for creating persistent remove sessions' clearly summarizes the main feature being added: a job manager and framework for persistent sessions.
  • Description check: ✅ Passed. The description is directly related to the changeset, explaining the streaming framework, authentication primitives, job manager process, job controller, and current integration status.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

return false, fmt.Errorf("cannot parse destination URI %q: %w", destUri, err)
}
destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
destinfo, err := os.Stat(destPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, to fix this issue we need to ensure that user-controlled paths are either (a) strictly validated/normalized and then checked to be within an allowed directory, or (b) rejected if they contain unsafe constructs such as directory traversal, absolute paths, or disallowed prefixes. It’s also important not to silently ignore validation errors (as ExpandHomeDirSafe currently does) when building filesystem paths used by os.Stat, os.Remove, or similar.

For this specific case, the best targeted fix without changing overall functionality is to (1) stop ignoring errors from ExpandHomeDir, and (2) enforce that paths expanded from ~ remain within the user’s home directory. We already have that guarantee in ExpandHomeDir (it returns an error if the normalized absolute path does not have homeDir as a prefix), but ExpandHomeDirSafe drops that error and returns an empty string on failure. In RemoteFileCopyCommand, we should therefore call wavebase.ExpandHomeDir directly, handle any error, and only then pass the validated result to filepath.Clean. This ensures that malicious values like ~/../../etc/passwd are rejected before they are used in os.Stat or os.Remove. We do not need extra imports and we keep behavior for valid paths: ~ and ~/... still work, and non-tilde paths are passed through filepath.Clean by ExpandHomeDir as before.

Concretely, in pkg/wshrpc/wshremote/wshremote_file.go inside RemoteFileCopyCommand, replace the line

destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))

with a small block that:

  1. Calls wavebase.ExpandHomeDir(destConn.Path) and checks the error.
  2. If there is an error, returns it (wrapped with context) to the caller instead of proceeding.
  3. Cleans the successfully expanded path using filepath.Clean and assigns it to destPathCleaned.

This change keeps all existing logic that depends on destPathCleaned intact, but removes the possibility of silently accepting unsafe home-dir-based paths.
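
As a general illustration of the containment check described above, here is a minimal sketch; the helper name isWithinBase is hypothetical and is not part of the suggested patch. It uses filepath.Rel so that a sibling directory such as /home/user2 cannot slip past a naive string-prefix check against /home/user:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isWithinBase reports whether target, after normalization, stays inside
// baseDir. filepath.Rel avoids the prefix pitfall where "/home/user2" would
// pass a naive strings.HasPrefix("/home/user") check.
func isWithinBase(baseDir, target string) (bool, error) {
	absBase, err := filepath.Abs(filepath.Clean(baseDir))
	if err != nil {
		return false, err
	}
	absTarget, err := filepath.Abs(filepath.Clean(target))
	if err != nil {
		return false, err
	}
	rel, err := filepath.Rel(absBase, absTarget)
	if err != nil {
		return false, err
	}
	// A relative path that starts with ".." escapes the base directory.
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)), nil
}

func main() {
	fmt.Println(isWithinBase("/home/user", "/home/user/docs/a.txt"))    // true
	fmt.Println(isWithinBase("/home/user", "/home/user/../etc/passwd")) // false
	fmt.Println(isWithinBase("/home/user", "/home/user2/a.txt"))        // false
}
```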

Suggested changeset 1: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -281,7 +281,11 @@
 	if err != nil {
 		return false, fmt.Errorf("cannot parse destination URI %q: %w", destUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	expandedDestPath, err := wavebase.ExpandHomeDir(destConn.Path)
+	if err != nil {
+		return false, fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+	}
+	destPathCleaned := filepath.Clean(expandedDestPath)
 	destinfo, err := os.Stat(destPathCleaned)
 	if err != nil {
 		if !errors.Is(err, fs.ErrNotExist) {
EOF
if !overwrite {
return false, fmt.Errorf(fstype.OverwriteRequiredError, destPathCleaned)
} else {
err := os.Remove(destPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

General approach: Validate or constrain user-controlled paths before using them in filesystem operations like os.Stat and os.Remove. Either (1) restrict them to a particular safe directory and verify the resolved path stays within that directory, or (2) if the path should be relative to the user’s home, ensure the final absolute path is under GetHomeDir() and fail otherwise.

Best fix here without changing external behavior significantly: enhance wavebase.ExpandHomeDirSafe so it never returns unsafe paths. Right now, ExpandHomeDirSafe just calls ExpandHomeDir and discards the error. For non-~ paths, ExpandHomeDir simply filepath.Cleans the input and returns it, so ExpandHomeDirSafe will happily return an arbitrary absolute path like /etc/passwd. We can tighten this by:

  1. Updating ExpandHomeDir to:

    • For non-~-prefixed paths, still filepath.Clean(pathStr), but then compute the absolute path and ensure it is either:
      • under the user’s home directory, or
      • a relative path (based on how the project wants to treat these).
    • If the path resolves outside the allowed base (home directory), return an error.
  2. Keeping the function signature the same so callers like RemoteFileCopyCommand don’t change, but now wavebase.ExpandHomeDirSafe will return "" when ExpandHomeDir detects an unsafe path. This empty string, when passed through filepath.Clean, becomes ".", which is not ideal. So we should also harden RemoteFileCopyCommand to detect obviously invalid results from ExpandHomeDirSafe (such as "") and fail early instead of operating on them.

Given we can only change code in the shown files, and we want a minimal, robust change:

  • Modify wavebase.ExpandHomeDir to always enforce “stay under home directory” for both ~ and non-~ paths. That means:
    • Compute absPath, err := filepath.Abs(expandedPathOrPathStr).
    • If error, or if !strings.HasPrefix(absPath, homeDir) (for absolute paths), return an error.
  • Leave ExpandHomeDirSafe as a thin wrapper, but now it will often return "" when the input is invalid.
  • In RemoteFileCopyCommand (in pkg/wshrpc/wshremote/wshremote_file.go), validate the result of wavebase.ExpandHomeDirSafe(destConn.Path) before using it:
    • Call destPath, err := wavebase.ExpandHomeDir(destConn.Path) instead, check err, and fail with a clear error if the path is unsafe.
    • Then apply filepath.Clean to destPath.

This keeps existing semantics for valid paths but rejects malicious ones that would escape the user’s home directory. It also maintains compatibility with other callers of ExpandHomeDirSafe, while strengthening the core implementation.

Concretely:

  • In pkg/wavebase/wavebase.go, enhance ExpandHomeDir to validate non-~ paths using the same “must be under home dir” check that it currently applies only in the ~ branch.
  • In pkg/wshrpc/wshremote/wshremote_file.go, change line 284 to call ExpandHomeDir explicitly and handle the error, rather than ExpandHomeDirSafe.

No new imports are needed.


Suggested changeset 2: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -281,7 +281,11 @@
 	if err != nil {
 		return false, fmt.Errorf("cannot parse destination URI %q: %w", destUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	destPath, err := wavebase.ExpandHomeDir(destConn.Path)
+	if err != nil {
+		return false, fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+	}
+	destPathCleaned := filepath.Clean(destPath)
 	destinfo, err := os.Stat(destPathCleaned)
 	if err != nil {
 		if !errors.Is(err, fs.ErrNotExist) {
EOF
pkg/wavebase/wavebase.go (outside changed files)

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wavebase/wavebase.go b/pkg/wavebase/wavebase.go
--- a/pkg/wavebase/wavebase.go
+++ b/pkg/wavebase/wavebase.go
@@ -146,19 +146,27 @@
 }
 
 func ExpandHomeDir(pathStr string) (string, error) {
-	if pathStr != "~" && !strings.HasPrefix(pathStr, "~/") && (!strings.HasPrefix(pathStr, `~\`) || runtime.GOOS != "windows") {
-		return filepath.Clean(pathStr), nil
-	}
 	homeDir := GetHomeDir()
+	// Handle home-relative paths starting with "~" explicitly.
 	if pathStr == "~" {
 		return homeDir, nil
 	}
-	expandedPath := filepath.Clean(filepath.Join(homeDir, pathStr[2:]))
-	absPath, err := filepath.Abs(filepath.Join(homeDir, expandedPath))
+	if strings.HasPrefix(pathStr, "~/") || (runtime.GOOS == "windows" && strings.HasPrefix(pathStr, `~\`)) {
+		expandedPath := filepath.Clean(filepath.Join(homeDir, pathStr[2:]))
+		absPath, err := filepath.Abs(expandedPath)
+		if err != nil || !strings.HasPrefix(absPath, homeDir) {
+			return "", fmt.Errorf("potential path traversal detected for path %s", pathStr)
+		}
+		return expandedPath, nil
+	}
+
+	// For all other paths, normalize and ensure they remain within the home directory.
+	cleanPath := filepath.Clean(pathStr)
+	absPath, err := filepath.Abs(cleanPath)
 	if err != nil || !strings.HasPrefix(absPath, homeDir) {
 		return "", fmt.Errorf("potential path traversal detected for path %s", pathStr)
 	}
-	return expandedPath, nil
+	return cleanPath, nil
 }
 
 func ExpandHomeDirSafe(pathStr string) string {
EOF
}

copyFileFunc := func(path string, finfo fs.FileInfo, srcFile io.Reader) (int64, error) {
nextinfo, err := os.Stat(path)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, to fix uncontrolled path usage you must either (a) reject unsafe paths up front (e.g., containing .., absolute paths, path separators when a single component is expected), or (b) constrain all paths to lie under a designated safe directory by resolving the user-supplied part relative to that directory and verifying the result stays inside it. Also, if you have a function meant to enforce such safety (like ExpandHomeDir), you must not silently ignore its errors.

In this codebase, the core problem is that ExpandHomeDirSafe discards the error from ExpandHomeDir and returns the possibly-unsafe cleaned path, and RemoteFileCopyCommand then uses that path directly in filesystem operations. The single best targeted fix, without changing the public API shape, is:

  1. Make ExpandHomeDirSafe actually “safe” by falling back to "/" when ExpandHomeDir returns an error (including the “potential path traversal detected” error). This ensures that any path the traversal detection flags does not propagate into file operations. Since ExpandHomeDir already validates that ~-based paths resolve under the home directory, we should respect that; for non-~ paths, it still just Cleans them, which is fine for now because the dangerous case we care about is the ~-relative traversal that ExpandHomeDir already protects against.

  2. Alternatively (and more robustly), we can also slightly tighten ExpandHomeDirSafe to handle the case where ExpandHomeDir returns an empty string due to an error by returning "/" instead of an empty path. This avoids ambiguous or relative paths leaking.

This change is nicely localized to pkg/wavebase/wavebase.go and automatically strengthens all call sites, including the one in RemoteFileCopyCommand where CodeQL flags the issue. It preserves existing functionality for valid inputs (successful ExpandHomeDir behavior is unchanged) while ensuring that invalid or potentially traversing ~-paths are no longer used.

Concretely:

  • Edit ExpandHomeDirSafe in pkg/wavebase/wavebase.go to:
    • Call ExpandHomeDir(pathStr).
    • If there is an error or the returned path is empty, return "/" (or some other minimal safe value; / is already used as the fallback in GetHomeDir).
    • Otherwise return path.

No new imports or methods are needed beyond this change. All other code, including RemoteFileCopyCommand, can remain unchanged; it will simply see a safe fallback path instead of an unsafe, user-controlled one when traversal is detected.


Suggested changeset 1: pkg/wavebase/wavebase.go (outside changed files)

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wavebase/wavebase.go b/pkg/wavebase/wavebase.go
--- a/pkg/wavebase/wavebase.go
+++ b/pkg/wavebase/wavebase.go
@@ -162,7 +162,12 @@
 }
 
 func ExpandHomeDirSafe(pathStr string) string {
-	path, _ := ExpandHomeDir(pathStr)
+	path, err := ExpandHomeDir(pathStr)
+	if err != nil || path == "" {
+		// On error or detection of a potentially unsafe path, fall back to a safe base.
+		// We use "/" here to match the existing fallback behavior in GetHomeDir.
+		return "/"
+	}
 	return path
 }
 
EOF
if !finfo.IsDir() {
// try to create file in directory
path = filepath.Join(path, filepath.Base(finfo.Name()))
newdestinfo, err := os.Stat(path)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, the problem is that an unvalidated path query parameter is treated as a connection URI (connection argument to CreateFileShareClient), and later the Path component of that connection is used as a filesystem path (after only minimal normalization). To fix this, we should: (1) restrict which schemes are allowed when a user-supplied value is interpreted as a connection, and/or (2) validate the filesystem paths derived from those URIs so they are confined to a safe directory or otherwise checked for traversal before use.

The best targeted fix, without changing existing functionality more than necessary, is to harden connparse.ParseURIAndReplaceCurrentHost so that for filesystem-oriented connection types (not S3 or other remote-only backends), it rejects Path values that perform directory traversal when resolved against the user’s home directory. This centralizes validation for all callers that eventually pass Connection.Path into wavebase.ExpandHomeDirSafe and then OS filesystem APIs. wavebase.ExpandHomeDir already protects the “starts with ~” case; we should complement this by adding a generic traversal check for non-~ paths in RemoteFileCopyCommand before using destConn.Path. Specifically, we can resolve the path to an absolute path and ensure it lies under the user’s home directory (or another safe root), matching the examples from the background section.

Concretely, in RemoteFileCopyCommand in pkg/wshrpc/wshremote/wshremote_file.go, after obtaining destConn, we will derive a safe absolute destination path as follows:

  1. Get the user’s home directory via wavebase.GetHomeDir().
  2. If destConn.Path starts with ~ (already handled by ExpandHomeDir), keep using wavebase.ExpandHomeDir but now check its error instead of discarding it via ExpandHomeDirSafe. If it errors, abort the operation.
  3. If destConn.Path does not start with ~, resolve it to an absolute path with filepath.Abs, then reject it if the resulting path is not under the home directory (using a HasPrefix check on a cleaned, slash-suffixed home directory) and return an error like “potential path traversal detected”.
  4. Use the validated absolute path as destPathCleaned for all subsequent filesystem operations.

This preserves the semantics for normal in-home paths and ~-relative paths, but prevents a user from specifying destinations outside the home directory (e.g., /etc/passwd, ../.., etc.). The changes are localized to RemoteFileCopyCommand and reuse existing helpers in wavebase, so no new imports or dependencies are needed.


Suggested changeset 1: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -281,7 +281,32 @@
 	if err != nil {
 		return false, fmt.Errorf("cannot parse destination URI %q: %w", destUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	// Validate and normalize destination path to prevent path traversal.
+	homeDir := wavebase.GetHomeDir()
+	var destPath string
+	if strings.HasPrefix(destConn.Path, "~") {
+		// ExpandHomeDir enforces that the resolved path stays under homeDir.
+		destPath, err = wavebase.ExpandHomeDir(destConn.Path)
+		if err != nil {
+			return false, fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+		}
+	} else {
+		// Resolve to an absolute path and ensure it stays within the home directory.
+		absPath, err := filepath.Abs(destConn.Path)
+		if err != nil {
+			return false, fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+		}
+		cleanHome := filepath.Clean(homeDir)
+		homeWithSep := cleanHome
+		if !strings.HasSuffix(homeWithSep, string(os.PathSeparator)) {
+			homeWithSep += string(os.PathSeparator)
+		}
+		if absPath != cleanHome && !strings.HasPrefix(absPath, homeWithSep) {
+			return false, fmt.Errorf("potential path traversal detected for destination %s", destConn.Path)
+		}
+		destPath = absPath
+	}
+	destPathCleaned := filepath.Clean(destPath)
 	destinfo, err := os.Stat(destPathCleaned)
 	if err != nil {
 		if !errors.Is(err, fs.ErrNotExist) {
EOF
return 0, fmt.Errorf(fstype.OverwriteRequiredError, path)
}
} else if overwrite {
err := os.RemoveAll(path)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, to fix uncontrolled path usage when copying files, you need to constrain or validate the destination path before performing filesystem operations. Common strategies: (1) restrict all paths to be under a dedicated “safe” base directory and verify with filepath.Abs + prefix-check; and/or (2) disallow absolute paths and parent directory components (..) entirely if the feature is only meant to work in a workspace subtree. The goal is that, after normalization, any path used in os.RemoveAll, os.OpenFile, etc. cannot escape the intended root.

For this specific case, the simplest fix without changing overall behavior too much is to ensure that the resolved destination path (destPathCleaned / destFilePath) stays within a safe root, e.g. the user’s home directory. We already have wavebase.GetHomeDir(). We can introduce a helper sanitizeDestPath inside RemoteFileCopyCommand that:

  1. Takes a candidate path (like destPathCleaned or destFilePath).
  2. Computes its absolute form with filepath.Abs.
  3. Computes the absolute home directory via wavebase.GetHomeDir() and filepath.Abs.
  4. Checks that the absolute path has the home directory as a prefix (after cleaning and maybe ensuring trailing separator semantics).
  5. Returns an error if the path is outside of the home directory.

We then:

  • Use this helper right after destPathCleaned is computed, so destPathCleaned is guaranteed to be inside the home directory before any os.Stat/os.Remove on it.
  • In copyFileFunc, call the helper at the very beginning to re-sanitize path (because copyFileFunc is called with destFilePath, which is derived from user input, and also because it may further adjust path when a directory already exists).
  • Potentially sanitize the parent directory we create with os.MkdirAll(filepath.Dir(path), 0755) by relying on the sanitized path.

This approach avoids altering the external API: URIs and options stay the same, but if a caller tries to copy to a path outside the user’s home directory, the operation fails with a clear error. All filesystem operations (os.Stat, os.Remove, os.RemoveAll, os.MkdirAll, os.OpenFile) then operate only on sanitized paths.

Concretely, in pkg/wshrpc/wshremote/wshremote_file.go within RemoteFileCopyCommand:

  • Add a small local helper function sanitizeDestPath near the start of RemoteFileCopyCommand (before using destPathCleaned). It should use wavebase.GetHomeDir, filepath.Abs, and strings.HasPrefix to enforce the home-directory root.
  • After computing destPathCleaned, call sanitizeDestPath(destPathCleaned) and bail out if it returns an error.
  • Inside copyFileFunc, before doing os.Stat(path), call sanitizeDestPath(path) and overwrite path with the safe version; if it errors, return that error.

No new imports are needed because filepath, strings, and wavebase are already imported there.


Suggested changeset 1: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -277,11 +277,34 @@
 		return false, fmt.Errorf("cannot specify both overwrite and merge")
 	}
 
+	// sanitizeDestPath ensures that destination paths stay within the user's home directory.
+	sanitizeDestPath := func(p string) (string, error) {
+		homeDir := wavebase.GetHomeDir()
+		homeAbs, err := filepath.Abs(homeDir)
+		if err != nil {
+			return "", fmt.Errorf("cannot resolve home directory: %w", err)
+		}
+		pClean := filepath.Clean(p)
+		pAbs, err := filepath.Abs(pClean)
+		if err != nil {
+			return "", fmt.Errorf("cannot resolve destination path %q: %w", p, err)
+		}
+		// Ensure the destination is within the home directory
+		if !strings.HasPrefix(pAbs, homeAbs+string(os.PathSeparator)) && pAbs != homeAbs {
+			return "", fmt.Errorf("destination path %q is outside of allowed directory %q", pAbs, homeAbs)
+		}
+		return pAbs, nil
+	}
+
 	destConn, err := connparse.ParseURIAndReplaceCurrentHost(ctx, destUri)
 	if err != nil {
 		return false, fmt.Errorf("cannot parse destination URI %q: %w", destUri, err)
 	}
 	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	destPathCleaned, err = sanitizeDestPath(destPathCleaned)
+	if err != nil {
+		return false, err
+	}
 	destinfo, err := os.Stat(destPathCleaned)
 	if err != nil {
 		if !errors.Is(err, fs.ErrNotExist) {
@@ -309,6 +327,12 @@
 	}
 
 	copyFileFunc := func(path string, finfo fs.FileInfo, srcFile io.Reader) (int64, error) {
+		var err error
+		path, err = sanitizeDestPath(path)
+		if err != nil {
+			return 0, err
+		}
+
 		nextinfo, err := os.Stat(path)
 		if err != nil && !errors.Is(err, fs.ErrNotExist) {
 			return 0, fmt.Errorf("cannot stat file %q: %w", path, err)
@@ -319,6 +343,10 @@
 				if !finfo.IsDir() {
 					// try to create file in directory
 					path = filepath.Join(path, filepath.Base(finfo.Name()))
+					path, err = sanitizeDestPath(path)
+					if err != nil {
+						return 0, err
+					}
 					newdestinfo, err := os.Stat(path)
 					if err != nil && !errors.Is(err, fs.ErrNotExist) {
 						return 0, fmt.Errorf("cannot stat file %q: %w", path, err)
EOF
return fmt.Errorf("cannot parse destination URI %q: %w", srcUri, err)
}
destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
destinfo, err := os.Stat(destPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, to fix uncontrolled path usage, normalize the path and then enforce that it is within an allowed directory (or matches a strict pattern) before using it in filesystem operations. For multi-component paths, the usual approach is: pick a trusted base directory, resolve filepath.Abs(filepath.Join(baseDir, userPath)), and then verify that the resulting absolute path still lies under baseDir (by checking prefix on a canonicalized absolute path).

In this codebase, wavebase.ExpandHomeDir already performs a strong check for paths beginning with ~, ensuring they stay under the user’s home directory. The problem is that all other paths are simply cleaned without any constraint. The least intrusive fix that keeps existing functionality is to strengthen ExpandHomeDir so that all paths (not just ~ ones) are constrained to the home directory. That way, any call to wavebase.ExpandHomeDirSafe—including the ones in RemoteFileMoveCommand and RemoteFileTouchCommand—will only ever return paths inside the user’s home directory, mitigating traversal and absolute-path issues while preserving the “operate within home” semantics that the ~ branch already implies.

Concretely:

  • Modify ExpandHomeDir in pkg/wavebase/wavebase.go so that:
    • It always resolves to an absolute path under GetHomeDir().
    • For non-~ paths, it treats them as relative to homeDir (e.g., "foo/bar" resolves to $HOME/foo/bar, while "/etc/passwd" will fail validation).
    • After joining and cleaning, it computes filepath.Abs and checks strings.HasPrefix(absPath, homeDir); if not, it returns an error.
  • Leave ExpandHomeDirSafe unchanged so that callers continue to ignore the error and get an empty string "" in failure cases, which will then typically cause os.Stat("")/os.Rename("", ...) etc. to fail safely rather than touching unexpected locations.
  • No changes to RemoteFileMoveCommand or RemoteFileTouchCommand are required, because they already rely on wavebase.ExpandHomeDirSafe; strengthening that function provides the necessary validation at a single point.

This approach avoids changing the signatures or call sites and only tightens path resolution behavior. It ensures that all existing uses of ExpandHomeDirSafe now benefit from stricter validation without otherwise altering how callers construct or use their paths, beyond forbidding previously allowed but unsafe absolute or .. paths outside the home directory.

Suggested changeset 1: pkg/wavebase/wavebase.go (outside changed files)

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wavebase/wavebase.go b/pkg/wavebase/wavebase.go
--- a/pkg/wavebase/wavebase.go
+++ b/pkg/wavebase/wavebase.go
@@ -146,19 +146,28 @@
 }
 
 func ExpandHomeDir(pathStr string) (string, error) {
-	if pathStr != "~" && !strings.HasPrefix(pathStr, "~/") && (!strings.HasPrefix(pathStr, `~\`) || runtime.GOOS != "windows") {
-		return filepath.Clean(pathStr), nil
-	}
 	homeDir := GetHomeDir()
+
+	var resolved string
+	// Handle home-relative paths starting with "~" (Unix) or "~\" (Windows)
 	if pathStr == "~" {
-		return homeDir, nil
+		resolved = homeDir
+	} else if strings.HasPrefix(pathStr, "~/") || (runtime.GOOS == "windows" && strings.HasPrefix(pathStr, `~\`)) {
+		// Strip the "~/" or "~\" prefix and join with homeDir
+		resolved = filepath.Join(homeDir, pathStr[2:])
+	} else {
+		// For all other inputs, treat them as relative to the home directory
+		// so that callers cannot escape outside homeDir via absolute or ".." paths.
+		resolved = filepath.Join(homeDir, pathStr)
 	}
-	expandedPath := filepath.Clean(filepath.Join(homeDir, pathStr[2:]))
-	absPath, err := filepath.Abs(filepath.Join(homeDir, expandedPath))
+
+	// Clean and verify the resulting path is still under the home directory
+	cleaned := filepath.Clean(resolved)
+	absPath, err := filepath.Abs(cleaned)
 	if err != nil || !strings.HasPrefix(absPath, homeDir) {
 		return "", fmt.Errorf("potential path traversal detected for path %s", pathStr)
 	}
-	return expandedPath, nil
+	return cleaned, nil
 }
 
 func ExpandHomeDirSafe(pathStr string) string {
EOF
if !overwrite {
return fmt.Errorf("destination %q already exists, use overwrite option", destUri)
} else {
err := os.Remove(destPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

General approach:
User-controlled paths should either (a) be validated as single file names with no separators or traversal, or (b) be resolved relative to a predefined safe base directory and then checked that the normalized absolute path remains within that base directory. For operations that are supposed to work on arbitrary paths across the filesystem (like a full remote shell for a single user), the main remaining risk is cross-tenant or cross-context access; in that case, you typically constrain operations to that user’s home or workspace directory.

Best fix for this code without changing functionality more than necessary:

  • Introduce a small helper in pkg/wshrpc/wshremote/wshremote_file.go that takes a potentially tainted path (such as destConn.Path or srcConn.Path) and converts it into a safe absolute path confined to the current user’s home directory.
  • This helper should:
    1. Use wavebase.GetHomeDir() as the base directory.
    2. Expand ~ via wavebase.ExpandHomeDir (and handle error).
    3. If the resulting path is not absolute, join it to the home directory.
    4. Normalize the result with filepath.Clean and filepath.Abs.
    5. Reject the path (return an error) if the computed absolute path does not have the home directory as a prefix (using a path-aware check, not just string prefix that could be bypassed by /home/user2 starting with /home/user).
  • Replace the direct uses of filepath.Clean(wavebase.ExpandHomeDirSafe(...)) in RemoteFileMoveCommand (and in RemoteFileTouchCommand, which has the same issue) with calls to this new helper, and propagate errors appropriately.

This keeps the intent—operate on paths relative to the user’s environment—while enforcing that operations stay within the user’s home tree. It does not introduce any new external dependency; it reuses wavebase.GetHomeDir and standard-library filepath.

Concretely:

  1. In pkg/wshrpc/wshremote/wshremote_file.go, add a helper like:

    func resolveAndValidateUserPath(rawPath string) (string, error) {
        homeDir := wavebase.GetHomeDir()
        if homeDir == "" || homeDir == "/" {
            return "", fmt.Errorf("invalid home directory")
        }
        expanded, err := wavebase.ExpandHomeDir(rawPath)
        if err != nil {
            return "", err
        }
        var base string
        if filepath.IsAbs(expanded) {
            base = expanded
        } else {
            base = filepath.Join(homeDir, expanded)
        }
        abs, err := filepath.Abs(base)
        if err != nil {
            return "", err
        }
        rel, err := filepath.Rel(homeDir, abs)
        if err != nil || strings.HasPrefix(rel, "..") || rel == "." && abs != homeDir {
            return "", fmt.Errorf("path %q escapes home directory", rawPath)
        }
        return abs, nil
    }

    (We will tailor the exact checks in the replacement block, but the essence is: resolve to absolute, ensure it’s inside homeDir.)

  2. Update RemoteFileTouchCommand:

    • Replace cleanedPath := filepath.Clean(wavebase.ExpandHomeDirSafe(path)) with a call to resolveAndValidateUserPath(path) and error out if invalid.
    • The rest of the function (checking existence, mkdir, write file) can still use cleanedPath.
  3. Update RemoteFileMoveCommand:

    • After getting destConn and srcConn, replace filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path)) and filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path)) with resolveAndValidateUserPath(destConn.Path) and resolveAndValidateUserPath(srcConn.Path), respectively, and propagate errors.
    • This ensures that both source and destination paths used in os.Stat, os.Remove, and os.Rename cannot escape the home tree.

No other files (pkg/web/web.go, pkg/remote/fileshare/fileshare.go, pkg/remote/connparse/connparse.go, pkg/wavebase/wavebase.go) need to be modified for this specific issue.


Suggested changeset 1: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -30,6 +30,41 @@
 	"github.com/wavetermdev/waveterm/pkg/wshutil"
 )
 
+func resolveAndValidateUserPath(rawPath string) (string, error) {
+	homeDir := wavebase.GetHomeDir()
+	if homeDir == "" || homeDir == "/" {
+		return "", fmt.Errorf("invalid home directory")
+	}
+
+	expandedPath, err := wavebase.ExpandHomeDir(rawPath)
+	if err != nil {
+		return "", err
+	}
+
+	var joined string
+	if filepath.IsAbs(expandedPath) {
+		joined = expandedPath
+	} else {
+		joined = filepath.Join(homeDir, expandedPath)
+	}
+
+	absPath, err := filepath.Abs(joined)
+	if err != nil {
+		return "", err
+	}
+
+	// Ensure the final path is within the user's home directory to prevent traversal.
+	rel, err := filepath.Rel(homeDir, absPath)
+	if err != nil {
+		return "", err
+	}
+	if rel == ".." || strings.HasPrefix(rel, ".."+string(os.PathSeparator)) {
+		return "", fmt.Errorf("path %q escapes home directory", rawPath)
+	}
+
+	return absPath, nil
+}
+
 type ByteRangeType struct {
 	All   bool
 	Start int64
@@ -642,7 +677,10 @@
 }
 
 func (impl *ServerImpl) RemoteFileTouchCommand(ctx context.Context, path string) error {
-	cleanedPath := filepath.Clean(wavebase.ExpandHomeDirSafe(path))
+	cleanedPath, err := resolveAndValidateUserPath(path)
+	if err != nil {
+		return fmt.Errorf("invalid path %q: %w", path, err)
+	}
 	if _, err := os.Stat(cleanedPath); err == nil {
 		return fmt.Errorf("file %q already exists", path)
 	}
@@ -666,7 +704,10 @@
 	if err != nil {
 		return fmt.Errorf("cannot parse destination URI %q: %w", srcUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	destPathCleaned, err := resolveAndValidateUserPath(destConn.Path)
+	if err != nil {
+		return fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+	}
 	destinfo, err := os.Stat(destPathCleaned)
 	if err == nil {
 		if !destinfo.IsDir() {
@@ -687,7 +728,10 @@
 		return fmt.Errorf("cannot parse source URI %q: %w", srcUri, err)
 	}
 	if srcConn.Host == destConn.Host {
-		srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path))
+		srcPathCleaned, err := resolveAndValidateUserPath(srcConn.Path)
+		if err != nil {
+			return fmt.Errorf("invalid source path %q: %w", srcConn.Path, err)
+		}
 		finfo, err := os.Stat(srcPathCleaned)
 		if err != nil {
 			return fmt.Errorf("cannot stat file %q: %w", srcPathCleaned, err)
EOF
}
if srcConn.Host == destConn.Host {
srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path))
finfo, err := os.Stat(srcPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, the fix is to validate or constrain any user-controlled path before using it in filesystem operations. For local filesystem paths, this typically means either (a) enforcing that only simple filenames without separators are accepted, or (b) resolving the path relative to a known “safe base” directory and verifying that the resulting absolute path remains under that base. Additionally, helper functions must not silently ignore validation errors.

For this specific case, RemoteFileMoveCommand should (1) stop ignoring errors from wavebase.ExpandHomeDir, and (2) ensure that the resolved source and destination paths are safe before passing them to os.Stat and os.Rename. The safest change that does not alter overall behavior is to switch from ExpandHomeDirSafe to ExpandHomeDir and propagate its error, since ExpandHomeDir already implements a containment check using the user’s home directory. If a path like ~ or ~/subdir/file is supplied, it will be expanded and cleaned; if a path attempts to escape the home directory, ExpandHomeDir will return an error and we can reject the operation instead of attempting to operate on an unsafe or unintended filesystem location. We follow the same pattern for both srcConn.Path and destConn.Path, and we only proceed with os.Stat and os.Rename if expansion succeeds.

Concretely, in pkg/wshrpc/wshremote/wshremote_file.go inside RemoteFileMoveCommand, replace the uses of wavebase.ExpandHomeDirSafe(...) with calls to wavebase.ExpandHomeDir(...) that capture and check the error. For the destination, introduce:

destPathExpanded, err := wavebase.ExpandHomeDir(destConn.Path)
if err != nil {
    return fmt.Errorf("invalid destination path %q: %w", destUri, err)
}
destPathCleaned := filepath.Clean(destPathExpanded)

and similarly for the source:

srcPathExpanded, err := wavebase.ExpandHomeDir(srcConn.Path)
if err != nil {
    return fmt.Errorf("invalid source path %q: %w", srcUri, err)
}
srcPathCleaned := filepath.Clean(srcPathExpanded)

This keeps the semantics of using home-directory-relative paths while enforcing the safety checks already present in ExpandHomeDir. No new imports are needed, as wavebase and filepath are already imported.

Suggested changeset 1: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -666,7 +666,11 @@
 	if err != nil {
 		return fmt.Errorf("cannot parse destination URI %q: %w", srcUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	destPathExpanded, err := wavebase.ExpandHomeDir(destConn.Path)
+	if err != nil {
+		return fmt.Errorf("invalid destination path %q: %w", destUri, err)
+	}
+	destPathCleaned := filepath.Clean(destPathExpanded)
 	destinfo, err := os.Stat(destPathCleaned)
 	if err == nil {
 		if !destinfo.IsDir() {
@@ -687,7 +691,11 @@
 		return fmt.Errorf("cannot parse source URI %q: %w", srcUri, err)
 	}
 	if srcConn.Host == destConn.Host {
-		srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path))
+		srcPathExpanded, err := wavebase.ExpandHomeDir(srcConn.Path)
+		if err != nil {
+			return fmt.Errorf("invalid source path %q: %w", srcUri, err)
+		}
+		srcPathCleaned := filepath.Clean(srcPathExpanded)
 		finfo, err := os.Stat(srcPathCleaned)
 		if err != nil {
 			return fmt.Errorf("cannot stat file %q: %w", srcPathCleaned, err)
EOF
if finfo.IsDir() && !recursive {
return fmt.Errorf(fstype.RecursiveRequiredError)
}
err = os.Rename(srcPathCleaned, destPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, to fix uncontrolled-path issues, you must constrain user-provided paths so they can only reference locations that are acceptable for the application. This is typically done by resolving the user input relative to a known “safe” base directory (or a small set of allowed directories), normalizing the result, and rejecting any input whose resolved absolute path does not stay within that base. For move/copy operations, both source and destination should be validated. If paths are supposed to be single file names rather than arbitrary paths, rejecting any separators or .. components is another option.

For this specific case, the problematic part is that srcConn.Path and destConn.Path are passed to wavebase.ExpandHomeDirSafe and filepath.Clean, then used directly in os.Stat, os.Remove, and os.Rename. ExpandHomeDirSafe ignores errors from ExpandHomeDir, which means potential traversal attempts using ~ that are rejected by ExpandHomeDir will silently turn into an empty string and then be cleaned; and there is no guarantee that either path is under an allowed root. The safest change that doesn’t otherwise alter behavior is to introduce a local helper in wshremote_file.go that (1) uses wavebase.ExpandHomeDir instead of the “Safe” variant so validation errors are honored, (2) returns an absolute path, and (3) optionally allows for future enforcement that the path lies under a configured safe directory. Then replace the direct uses of wavebase.ExpandHomeDirSafe/filepath.Clean in RemoteFileMoveCommand with this helper. This keeps the existing semantics for valid inputs (home expansion + cleaning), while preventing obviously-malicious or malformed paths from being used in filesystem operations.

Concretely, in pkg/wshrpc/wshremote/wshremote_file.go:

  • Add a small helper function, e.g. resolveLocalPath, near the top of the file (after the imports or type definitions). This function should:
    • Call wavebase.ExpandHomeDir on the provided path and return an error if it fails.
    • Use filepath.Abs on the expanded path to normalize it.
    • Return the resulting absolute path and error.
  • In RemoteFileMoveCommand:
    • Replace destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path)) with a call to resolveLocalPath(destConn.Path) and handle the error (returning a formatted error including the destination URI).
    • Similarly, replace srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path)) with a call to resolveLocalPath(srcConn.Path) and handle the error.
  • Leave the subsequent os.Stat, os.Remove, and os.Rename calls unchanged, but they will now only operate on validated, absolute paths.

This approach ensures that any path that would have been rejected by ExpandHomeDir (e.g., containing suspicious use of ~ leading outside the home directory) will now cause the move to fail, instead of silently proceeding with a potentially dangerous path. It also makes the path canonicalization explicit and easier to harden further later (for example, by enforcing a specific safe root directory).

Suggested changeset 1: pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch. Run the following command in your local git repository to apply this patch:
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -30,6 +30,18 @@
 	"github.com/wavetermdev/waveterm/pkg/wshutil"
 )
 
+func resolveLocalPath(rawPath string) (string, error) {
+	expanded, err := wavebase.ExpandHomeDir(rawPath)
+	if err != nil {
+		return "", err
+	}
+	absPath, err := filepath.Abs(expanded)
+	if err != nil {
+		return "", err
+	}
+	return absPath, nil
+}
+
 type ByteRangeType struct {
 	All   bool
 	Start int64
@@ -666,7 +678,10 @@
 	if err != nil {
 		return fmt.Errorf("cannot parse destination URI %q: %w", srcUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	destPathCleaned, err := resolveLocalPath(destConn.Path)
+	if err != nil {
+		return fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+	}
 	destinfo, err := os.Stat(destPathCleaned)
 	if err == nil {
 		if !destinfo.IsDir() {
@@ -687,7 +702,10 @@
 		return fmt.Errorf("cannot parse source URI %q: %w", srcUri, err)
 	}
 	if srcConn.Host == destConn.Host {
-		srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path))
+		srcPathCleaned, err := resolveLocalPath(srcConn.Path)
+		if err != nil {
+			return fmt.Errorf("invalid source path %q: %w", srcConn.Path, err)
+		}
 		finfo, err := os.Stat(srcPathCleaned)
 		if err != nil {
 			return fmt.Errorf("cannot stat file %q: %w", srcPathCleaned, err)
EOF
if finfo.IsDir() && !recursive {
return fmt.Errorf(fstype.RecursiveRequiredError)
}
err = os.Rename(srcPathCleaned, destPathCleaned)

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.

Copilot Autofix (AI, 3 days ago)

In general, the fix is to ensure that any filesystem path derived from user-controlled input is both normalized and constrained to a defined safe base directory (or otherwise validated) before it is used in operations like os.Rename, os.WriteFile, etc. In this codebase, ExpandHomeDir already contains a safeguard against path traversal for ~/-style paths, but ExpandHomeDirSafe discards its error, effectively disabling that protection. We should either (a) stop using the “safe” wrapper in security‑sensitive operations and handle ExpandHomeDir errors properly, and/or (b) add explicit validation of the resulting path to ensure it does not escape an allowed root.

The minimal fix, without changing the current functionality more than necessary, is to change RemoteFileMoveCommand so that it (1) calls wavebase.ExpandHomeDir directly instead of ExpandHomeDirSafe, (2) checks and propagates any error from that function, and (3) optionally validates that the expanded path is absolute and does not contain path traversal components (though ExpandHomeDir already enforces this for ~/ paths). We should do this for both the destination and the source paths, since both are ultimately used in filesystem operations. This preserves the current semantics for valid inputs (paths that ExpandHomeDir accepts) but rejects clearly unsafe ones instead of silently “fixing” them to / or another unintended location.

Concretely, in pkg/wshrpc/wshremote/wshremote_file.go, inside RemoteFileMoveCommand, replace the use of:

destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
...
srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path))

with code that:

  1. Calls wavebase.ExpandHomeDir(destConn.Path) and wavebase.ExpandHomeDir(srcConn.Path).
  2. Checks the returned error and returns a formatted error if expansion fails (indicating a potentially malicious or malformed path).
  3. Applies filepath.Clean to the successfully expanded path (this is mostly harmless normalization).

This change requires no new imports (we already import wavebase and filepath) and keeps the rest of the logic (stat, overwrite checks, os.Rename) unchanged.


Suggested changeset 1
pkg/wshrpc/wshremote/wshremote_file.go

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/pkg/wshrpc/wshremote/wshremote_file.go b/pkg/wshrpc/wshremote/wshremote_file.go
--- a/pkg/wshrpc/wshremote/wshremote_file.go
+++ b/pkg/wshrpc/wshremote/wshremote_file.go
@@ -666,7 +666,11 @@
 	if err != nil {
 		return fmt.Errorf("cannot parse destination URI %q: %w", srcUri, err)
 	}
-	destPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(destConn.Path))
+	destPathExpanded, err := wavebase.ExpandHomeDir(destConn.Path)
+	if err != nil {
+		return fmt.Errorf("invalid destination path %q: %w", destConn.Path, err)
+	}
+	destPathCleaned := filepath.Clean(destPathExpanded)
 	destinfo, err := os.Stat(destPathCleaned)
 	if err == nil {
 		if !destinfo.IsDir() {
@@ -687,7 +691,11 @@
 		return fmt.Errorf("cannot parse source URI %q: %w", srcUri, err)
 	}
 	if srcConn.Host == destConn.Host {
-		srcPathCleaned := filepath.Clean(wavebase.ExpandHomeDirSafe(srcConn.Path))
+		srcPathExpanded, err := wavebase.ExpandHomeDir(srcConn.Path)
+		if err != nil {
+			return fmt.Errorf("invalid source path %q: %w", srcConn.Path, err)
+		}
+		srcPathCleaned := filepath.Clean(srcPathExpanded)
 		finfo, err := os.Stat(srcPathCleaned)
 		if err != nil {
 			return fmt.Errorf("cannot stat file %q: %w", srcPathCleaned, err)
EOF
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 13

🤖 Fix all issues with AI agents
In `@cmd/wsh/cmd/wshcmd-jobdebug.go`:
- Line 165: Fix the typo in the log message: update the log.Printf call that
prints connectedJobIds (the statement using connectedJobIds) to use "connected"
instead of "connnected" so the format string becomes "connected jobids: %v\n".
Ensure you update the exact log.Printf invocation that references
connectedJobIds.
- Around line 340-363: The jobDebugStartCmd command definition lacks the Args
field to enforce the minimum number of arguments. Locate the jobDebugStartCmd
Cobra command setup and add the field Args: cobra.MinimumNArgs(1) to ensure the
command requires at least one argument before running the jobDebugStartRun
handler.

In `@cmd/wsh/cmd/wshcmd-jobmanager.go`:
- Around line 81-118: The goroutine started in readJobAuthToken can remain
blocked on reader.ReadString if the context times out; modify the goroutine to
observe ctx and avoid blocking sends: pass ctx into the goroutine and after
ReadString (both success and error paths) use a select that either sends to
resultCh/errorCh or returns if ctx.Done() is closed (drop the value instead of
blocking); keep resultCh and errorCh buffered (they already are) and ensure
every send is guarded with a select { case <-ctx.Done(): return; case
resultCh<-token: } (and similarly for errorCh) so the goroutine will not block
trying to send when the caller has already returned.

In `@pkg/jobcontroller/jobcontroller.go`:
- Around line 236-239: Separate the error and nil cases when calling
wstore.DBGetSingleton[*waveobj.Client] so you don't wrap a nil error; first
check if err != nil and return fmt.Errorf("failed to get client: %w", err), then
check if clientId == nil and return a clear non-wrapped error like
fmt.Errorf("client not found") (referencing the clientId variable and the
wstore.DBGetSingleton call to locate the code).

In `@pkg/jobmanager/cirbuf.go`:
- Around line 98-126: writeAvailable's logic assumes cb.count <= cb.windowSize,
but SetEffectiveWindow can shrink windowSize leaving count > windowSize, causing
future writes to advance readPos (dropping data) instead of increasing count;
fix by ensuring SetEffectiveWindow (and any code that reduces cb.windowSize)
enforces the invariant cb.count <= cb.windowSize: when shrinking, compute excess
:= cb.count - newWindow; if excess > 0, advance cb.readPos by excess (modulo
buffer length) and decrement cb.count by excess so the oldest bytes are dropped
deterministically; keep the fields involved named exactly (CirBuf,
writeAvailable, SetEffectiveWindow, cb.count, cb.windowSize, cb.readPos,
cb.writePos, cb.totalSize) and update any docstring/comments to reflect this
behavior. (A sketch of this window-shrink handling appears after this list.)

In `@pkg/waveobj/wtype.go`:
- Around line 313-349: The Job struct currently includes the sensitive
JobAuthToken field which must not be persisted or broadcast; update the Job
declaration so JobAuthToken is not included in JSON/serialization (e.g., remove
JSON tag or mark it unexported/`json:"-"`) and move token storage to a
server-only location (such as a separate in-memory map keyed by Job.OID or a
JobManagerCredential store) and ensure any code paths that build WaveObj updates
strip or replace JobAuthToken with nil/empty before sending; locate usages by
searching for the Job type and the JobAuthToken identifier and update
serialization, marshalling, and any WaveObj update functions to never include
this field.

In `@pkg/wcore/block.go`:
- Around line 171-181: The goroutine that detaches a job uses the parent ctx
which may be cancelled before DetachJobFromBlock finishes; update the anonymous
goroutine to create and use an independent context (e.g., context.Background()
or context.WithTimeout(context.Background(), <reasonableDuration>)) instead of
the captured ctx when calling jobcontroller.DetachJobFromBlock, and ensure you
still call panichandler.PanicHandler("DetachJobFromBlock", recover()) and log
errors including blockId on failure.

In `@pkg/wshrpc/wshremote/wshremote_file.go`:
- Around line 1-2: Update the copyright header in wshremote_file.go to use the
consistent year used across the PR (change "2026" to "2025") so it matches other
files; edit the top-of-file comment block accordingly to ensure
SPDX-License-Identifier remains unchanged.
- Around line 776-783: The linter flags that the number of bytes written (n)
from file.WriteAt and file.Write is assigned but never used; change those
assignments to discard n and only capture the error (e.g., replace "n, err =
file.WriteAt(...)" and "n, err = file.Write(...)" with assignments that ignore n
and assign err only, such as "_, err = file.WriteAt(...)" and "_, err =
file.Write(...)" so the subsequent err check (return fmt.Errorf(..., err))
remains correct.

In `@pkg/wshrpc/wshremote/wshremote_job.go`:
- Around line 57-77: The goroutine defer and the returned cleanup function both
perform conn.Close(), impl.Router.UnregisterLink(linkId), and
impl.removeJobManagerConnection(jobId) causing potential double-cleanup;
introduce a sync.Once (e.g., once := &sync.Once{}) and wrap the shared cleanup
logic in once.Do(func(){ conn.Close(); impl.Router.UnregisterLink(linkId);
impl.removeJobManagerConnection(jobId); close(proxy.FromRemoteCh) }) then
replace the goroutine's defer body and the cleanup closure to call once.Do(...)
(or call a small single-call fn that invokes once.Do) so the cleanup runs
exactly once while still ensuring proxy.FromRemoteCh is closed by the owner.

In `@pkg/wshrpc/wshremote/wshremote.go`:
- Around line 95-104: The remote branch of ServerImpl.getWshPath currently
hardcodes "~/.waveterm/bin/wsh" which diverges from the local path logic; change
the remote branch to resolve the data directory using the same
wavebase.GetWaveDataDir() call (and then filepath.Join with "bin","wsh") instead
of wavebase.ExpandHomeDir so both local and remote use the same
WAVETERM_DATA_HOME/XDG resolution; update getWshPath (and remove the
ExpandHomeDir usage in that branch) accordingly.

In `@pkg/wshutil/wshrouter.go`:
- Around line 468-471: The code currently accepts RPC responses solely by
matching rpcMsg.ResId and router.getRouteInfo(rpcMsg.ResId), which allows
spoofing; update the guard to also verify that the stored route info's intended
destination/link matches the actual incoming link for this message. After
calling router.getRouteInfo(rpcMsg.ResId), compare the route info's
destination/link field (e.g., RouteInfo.Link or RouteInfo.DestinationID) against
the current incoming link identifier (the variable representing this connection
in the surrounding scope) and only proceed if they match; otherwise continue.
Ensure you reference rpcMsg.ResId, router.getRouteInfo, and the route info's
destination/link field when implementing the check.
- Around line 258-264: isBindableRouteId currently allows route IDs that start
with RoutePrefix_Link ("link:") while sendRoutedMessage treats any "link:*" as a
direct link ID, which can cause silent misrouting; update the binding check to
reserve the "link:" prefix by rejecting route IDs that start with
RoutePrefix_Link or by first consulting the router's route map before treating
"link:" as special. Concretely, modify isBindableRouteId (or the
binding-registration path) to return false for route IDs with RoutePrefix_Link,
and/or add a guard in sendRoutedMessage to check the router's route map (route
lookup) before parsing link IDs so explicitly bound routes named "link:*" are
preserved; reference RoutePrefix_Link, isBindableRouteId, sendRoutedMessage,
sendMessageToLink and the router route map when applying the change.
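
For the cirbuf.go item above, a minimal sketch of the window-shrink handling (field and method names are taken from the prompt; the real signature of SetEffectiveWindow may differ):

func (cb *CirBuf) SetEffectiveWindow(newWindow int) {
	cb.lock.Lock()
	defer cb.lock.Unlock()
	cb.windowSize = newWindow
	// restore the invariant cb.count <= cb.windowSize by dropping the oldest bytes
	excess := cb.count - cb.windowSize
	if excess > 0 {
		cb.readPos = (cb.readPos + excess) % len(cb.buf)
		cb.count -= excess
	}
}

With the invariant restored here, writeAvailable can keep assuming cb.count never exceeds cb.windowSize.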
🧹 Nitpick comments (15)
pkg/wcore/wcore.go (1)

53-60: LGTM! Follows the established pattern correctly.

The new InstallId initialization mirrors the existing TempOID block and integrates cleanly into the startup flow.

Optional: If both TempOID and InstallId are empty (e.g., on migration), this results in two separate DBUpdate calls. You could consolidate them:

♻️ Optional: Consolidate DB updates
 	if client.TempOID == "" {
 		log.Println("client.TempOID is empty")
 		client.TempOID = uuid.NewString()
-		err = wstore.DBUpdate(ctx, client)
-		if err != nil {
-			return firstLaunch, fmt.Errorf("error updating client: %w", err)
-		}
 	}
 	if client.InstallId == "" {
 		log.Println("client.InstallId is empty")
 		client.InstallId = uuid.NewString()
+	}
+	if client.TempOID != "" || client.InstallId != "" {
 		err = wstore.DBUpdate(ctx, client)
 		if err != nil {
 			return firstLaunch, fmt.Errorf("error updating client: %w", err)
 		}
 	}

Note: The condition logic would need adjustment to track whether either field was actually modified. A cleaner approach might be a needsUpdate flag similar to InitMainServer.

pkg/wstore/wstore_dbops.go (1)

320-329: Allow updateFn to return an error to abort the transaction cleanly.

Line 326 can only fail via panic; returning an error is safer and keeps rollback explicit.

♻️ Suggested refactor
-func DBUpdateFn[T waveobj.WaveObj](ctx context.Context, id string, updateFn func(T)) error {
+func DBUpdateFn[T waveobj.WaveObj](ctx context.Context, id string, updateFn func(T) error) error {
 	return WithTx(ctx, func(tx *TxWrap) error {
 		val, err := DBMustGet[T](tx.Context(), id)
 		if err != nil {
 			return err
 		}
-		updateFn(val)
+		if err := updateFn(val); err != nil {
+			return err
+		}
 		return DBUpdate(tx.Context(), val)
 	})
 }
cmd/wsh/cmd/wshcmd-connserver.go (2)

224-235: Sanity‑check the socket path extracted from the unverified JWT.

Line 225 uses unverified JWT data to dial a Unix socket. Consider constraining it to the expected wave home subdir to avoid connecting to arbitrary sockets if the token is tampered.

♻️ Example hardening
 	sockName = wavebase.ExpandHomeDirSafe(sockName)
+	sockName = filepath.Clean(sockName)
+	baseDir := filepath.Join(wavebase.GetHomeDir(), wavebase.RemoteWaveHomeDirName)
+	if !strings.HasPrefix(sockName, baseDir+string(os.PathSeparator)) {
+		return fmt.Errorf("invalid upstream socket path: %s", sockName)
+	}
 	conn, err := net.Dial("unix", sockName)

392-412: Enforce mutual exclusivity between --router and --router‑domainsocket.

If both flags are set, router mode wins silently. An explicit error avoids confusion.

♻️ Suggested guard
 	sigutil.InstallSIGUSR1Handler()
+	if connServerRouter && connServerRouterDomainSocket {
+		if logFile != nil {
+			fmt.Fprintf(logFile, "--router and --router-domainsocket are mutually exclusive\n")
+		}
+		return fmt.Errorf("--router and --router-domainsocket are mutually exclusive")
+	}
 	if connServerRouter {
 		err := serverRunRouter()
 		if err != nil && logFile != nil {
pkg/wshutil/wshrouter_controlimpl.go (1)

196-269: Consider extracting duplicated job validation logic.

Lines 208-217 in AuthenticateJobManagerVerifyCommand and lines 241-250 in AuthenticateJobManagerCommand contain identical job retrieval and token validation logic. This could be extracted into a helper function.

Proposed refactor
+func (impl *WshRouterControlImpl) validateJobAuthToken(ctx context.Context, jobId, jobAuthToken string) (*waveobj.Job, error) {
+	job, err := wstore.DBMustGet[*waveobj.Job](ctx, jobId)
+	if err != nil {
+		return nil, fmt.Errorf("failed to get job: %w", err)
+	}
+	if job.JobAuthToken != jobAuthToken {
+		return nil, fmt.Errorf("invalid jobauthtoken")
+	}
+	return job, nil
+}

 func (impl *WshRouterControlImpl) AuthenticateJobManagerVerifyCommand(ctx context.Context, data wshrpc.CommandAuthenticateJobManagerData) error {
 	if !impl.Router.IsRootRouter() {
 		return fmt.Errorf("authenticatejobmanagerverify can only be called on root router")
 	}
 	// ... validation ...
-	job, err := wstore.DBMustGet[*waveobj.Job](ctx, data.JobId)
-	if err != nil {
-		log.Printf("wshrouter authenticate-jobmanager-verify error jobid=%q: failed to get job: %v", data.JobId, err)
-		return fmt.Errorf("failed to get job: %w", err)
-	}
-	if job.JobAuthToken != data.JobAuthToken {
-		log.Printf("wshrouter authenticate-jobmanager-verify error jobid=%q: invalid jobauthtoken", data.JobId)
-		return fmt.Errorf("invalid jobauthtoken")
+	_, err := impl.validateJobAuthToken(ctx, data.JobId, data.JobAuthToken)
+	if err != nil {
+		log.Printf("wshrouter authenticate-jobmanager-verify error jobid=%q: %v", data.JobId, err)
+		return err
 	}
 	// ...
 }
cmd/wsh/cmd/wshcmd-jobdebug.go (1)

96-106: Consider scoping flag variables to avoid potential conflicts.

Multiple commands reuse jobIdFlag (delete, getoutput). While Cobra handles flag parsing per-command, if flags with the same name are registered on different commands pointing to the same variable, only the last registration wins for the flag definition.

Currently this works because jobIdFlag is registered separately for jobDebugDeleteCmd (line 124) and jobDebugGetOutputCmd (line 139). Just flagging for awareness—if more commands need a --jobid flag, consider command-local variables or prefixed names to avoid confusion.
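
A small sketch of the command-local pattern (the variable names here are hypothetical, not existing identifiers; jobDebugDeleteCmd and jobDebugGetOutputCmd are the commands mentioned above):

var (
	jobDebugDeleteJobId    string
	jobDebugGetOutputJobId string
)

func init() {
	// each command gets its own backing variable for --jobid
	jobDebugDeleteCmd.Flags().StringVar(&jobDebugDeleteJobId, "jobid", "", "job id to delete")
	jobDebugGetOutputCmd.Flags().StringVar(&jobDebugGetOutputJobId, "jobid", "", "job id to fetch output for")
}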

frontend/app/view/vdom/vdom-model.tsx (1)

164-174: LGTM! Event type updated to match new route lifecycle semantics.

The change from "route:gone" to "route:down" aligns with the PR's system-wide update to explicit up/down route events.

Consider renaming routeGoneUnsub to routeDownUnsub to match the new event naming convention:

Optional rename
-    routeGoneUnsub: () => void;
+    routeDownUnsub: () => void;

And update the usage at lines 164 and 197.

pkg/wshutil/wshstreamadapter.go (1)

14-20: Prefer wshrpc command constants to avoid string drift.

♻️ Suggested update
-	return a.rpc.SendCommand("streamdataack", data, opts)
+	return a.rpc.SendCommand(wshrpc.Command_StreamDataAck, data, opts)
-	return a.rpc.SendCommand("streamdata", data, opts)
+	return a.rpc.SendCommand(wshrpc.Command_StreamData, data, opts)
pkg/jobmanager/streammanager_test.go (3)

42-45: Consider handling decode errors in test helper.

The decodeData helper silently ignores base64 decoding errors. While this is a test helper, propagating or logging errors would help diagnose test failures if malformed data is ever encountered.

♻️ Suggested improvement
-func decodeData(data64 string) string {
-	decoded, _ := base64.StdEncoding.DecodeString(data64)
-	return string(decoded)
+func decodeData(t *testing.T, data64 string) string {
+	t.Helper()
+	decoded, err := base64.StdEncoding.DecodeString(data64)
+	if err != nil {
+		t.Fatalf("failed to decode base64 data: %v", err)
+	}
+	return string(decoded)
 }

57-62: Time-based synchronization may cause test flakiness.

Multiple tests rely on time.Sleep for synchronization with async goroutines. While this is a pragmatic approach for testing streaming behavior, consider using channels, condition variables, or polling with timeouts for more deterministic tests if flakiness occurs in CI.

Also applies to: 82-87, 132-139, 202-203


160-182: Test lacks assertions after disconnect.

This test runs through the connect/disconnect flow but doesn't verify any specific behavior after ClientDisconnected() is called. Consider adding assertions to verify expected state (e.g., that subsequent data isn't sent, or that reconnection works correctly).

pkg/jobmanager/jobcmd.go (1)

47-53: Redundant validation logic.

After setting default values when Rows == 0 || Cols == 0, the subsequent check for Rows <= 0 || Cols <= 0 will never be true for those defaulted fields. Consider simplifying or clarifying the intent.

♻️ Suggested simplification
-	if cmdDef.TermSize.Rows == 0 || cmdDef.TermSize.Cols == 0 {
-		cmdDef.TermSize.Rows = 25
-		cmdDef.TermSize.Cols = 80
-	}
-	if cmdDef.TermSize.Rows <= 0 || cmdDef.TermSize.Cols <= 0 {
-		return nil, fmt.Errorf("invalid term size: %v", cmdDef.TermSize)
-	}
+	if cmdDef.TermSize.Rows <= 0 {
+		cmdDef.TermSize.Rows = 25
+	}
+	if cmdDef.TermSize.Cols <= 0 {
+		cmdDef.TermSize.Cols = 80
+	}
pkg/jobmanager/mainserverconn.go (1)

1-2: Copyright year inconsistency.

This file has copyright year 2026 while other files in the PR use 2025. Consider aligning for consistency.

pkg/jobmanager/jobmanager.go (1)

245-245: Unused variable assignment.

_ = wshRpc appears to be a no-op. If wshRpc is needed to keep the RPC server running, consider documenting this. Otherwise, this line can be removed.

pkg/wshrpc/wshremote/wshremote_job.go (1)

245-252: Consider logging errors from cmd.Wait().

The goroutine at lines 245-247 reaps the child process but discards any error. While the job manager's exit may be intentional, logging the wait result would aid debugging if the job manager crashes unexpectedly.

Optional: Log wait errors
 	go func() {
-		cmd.Wait()
+		if err := cmd.Wait(); err != nil {
+			log.Printf("RemoteStartJobCommand: job manager exited: %v\n", err)
+		}
 	}()

Comment on lines +81 to +118
func readJobAuthToken(ctx context.Context) (string, error) {
resultCh := make(chan string, 1)
errorCh := make(chan error, 1)

go func() {
reader := bufio.NewReader(os.Stdin)
line, err := reader.ReadString('\n')
if err != nil {
errorCh <- fmt.Errorf("error reading from stdin: %v", err)
return
}

line = strings.TrimSpace(line)
prefix := jobmanager.JobAccessTokenLabel + ":"
if !strings.HasPrefix(line, prefix) {
errorCh <- fmt.Errorf("invalid token format: expected '%s'", prefix)
return
}

token := strings.TrimPrefix(line, prefix)
token = strings.TrimSpace(token)
if token == "" {
errorCh <- fmt.Errorf("empty job auth token")
return
}

resultCh <- token
}()

select {
case token := <-resultCh:
return token, nil
case err := <-errorCh:
return "", err
case <-ctx.Done():
return "", ctx.Err()
}
}
Contributor

⚠️ Potential issue | 🟡 Minor

Goroutine leak on context timeout.

When the context times out (line 115-116), the goroutine spawned at line 85 continues running and blocks on reader.ReadString('\n') at line 87. This goroutine will only exit when stdin is closed or provides input.

For a job manager process that may be long-running, this is a minor concern since the goroutine will eventually complete when the parent process closes stdin. However, if this function is called multiple times with timeouts, goroutines could accumulate.

Potential fix using a done channel
 func readJobAuthToken(ctx context.Context) (string, error) {
 	resultCh := make(chan string, 1)
 	errorCh := make(chan error, 1)
+	doneCh := make(chan struct{})
 
 	go func() {
+		defer close(doneCh)
 		reader := bufio.NewReader(os.Stdin)
 		line, err := reader.ReadString('\n')
 		if err != nil {
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
 			errorCh <- fmt.Errorf("error reading from stdin: %v", err)
 			return
 		}
 		// ... rest of the function
 	}()
 
 	select {
 	case token := <-resultCh:
 		return token, nil
 	case err := <-errorCh:
 		return "", err
 	case <-ctx.Done():
 		return "", ctx.Err()
 	}
 }
🤖 Prompt for AI Agents
In `@cmd/wsh/cmd/wshcmd-jobmanager.go` around lines 81 - 118, The goroutine
started in readJobAuthToken can remain blocked on reader.ReadString if the
context times out; modify the goroutine to observe ctx and avoid blocking sends:
pass ctx into the goroutine and after ReadString (both success and error paths)
use a select that either sends to resultCh/errorCh or returns if ctx.Done() is
closed (drop the value instead of blocking); keep resultCh and errorCh buffered
(they already are) and ensure every send is guarded with a select { case
<-ctx.Done(): return; case resultCh<-token: } (and similarly for errorCh) so the
goroutine will not block trying to send when the caller has already returned.
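
Putting the guarded-send pattern together, a self-contained sketch of the shape the prompt describes (the stdin read itself can still block until input or EOF, which the comment above already accepts as a minor concern):

func readJobAuthToken(ctx context.Context) (string, error) {
	resultCh := make(chan string, 1)
	errorCh := make(chan error, 1)
	// sendErr drops the error instead of blocking once the caller has returned
	sendErr := func(err error) {
		select {
		case errorCh <- err:
		case <-ctx.Done():
		}
	}
	go func() {
		reader := bufio.NewReader(os.Stdin)
		line, err := reader.ReadString('\n')
		if err != nil {
			sendErr(fmt.Errorf("error reading from stdin: %v", err))
			return
		}
		line = strings.TrimSpace(line)
		prefix := jobmanager.JobAccessTokenLabel + ":"
		if !strings.HasPrefix(line, prefix) {
			sendErr(fmt.Errorf("invalid token format: expected '%s'", prefix))
			return
		}
		token := strings.TrimSpace(strings.TrimPrefix(line, prefix))
		if token == "" {
			sendErr(fmt.Errorf("empty job auth token"))
			return
		}
		select {
		case resultCh <- token:
		case <-ctx.Done():
		}
	}()
	select {
	case token := <-resultCh:
		return token, nil
	case err := <-errorCh:
		return "", err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}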

Comment on lines +236 to +239
clientId, err := wstore.DBGetSingleton[*waveobj.Client](ctx)
if err != nil || clientId == nil {
return "", fmt.Errorf("failed to get client: %w", err)
}
Contributor

⚠️ Potential issue | 🟡 Minor

Improve nil check and error handling for client retrieval.

The condition err != nil || clientId == nil combines two different failure modes into one error message. If err is nil but clientId is nil, the error wrapping %w will produce a confusing message with <nil>.

Suggested fix
 	clientId, err := wstore.DBGetSingleton[*waveobj.Client](ctx)
-	if err != nil || clientId == nil {
-		return "", fmt.Errorf("failed to get client: %w", err)
+	if err != nil {
+		return "", fmt.Errorf("failed to get client: %w", err)
+	}
+	if clientId == nil {
+		return "", fmt.Errorf("client not found")
 	}
🤖 Prompt for AI Agents
In `@pkg/jobcontroller/jobcontroller.go` around lines 236 - 239, Separate the
error and nil cases when calling wstore.DBGetSingleton[*waveobj.Client] so you
don't wrap a nil error; first check if err != nil and return fmt.Errorf("failed
to get client: %w", err), then check if clientId == nil and return a clear
non-wrapped error like fmt.Errorf("client not found") (referencing the clientId
variable and the wstore.DBGetSingleton call to locate the code).

Comment on lines +776 to +783
if atOffset > 0 && !append {
n, err = file.WriteAt(dataBytes[:n], atOffset)
} else {
n, err = file.Write(dataBytes[:n])
}
if err != nil {
return fmt.Errorf("cannot write to file %q: %w", path, err)
}
Contributor

⚠️ Potential issue | 🟡 Minor

Ineffectual assignments to n flagged by static analysis.

The return value n from WriteAt and Write is assigned but never used. The linter correctly flags this.

Suggested fix
 	if atOffset > 0 && !append {
-		n, err = file.WriteAt(dataBytes[:n], atOffset)
+		_, err = file.WriteAt(dataBytes[:n], atOffset)
 	} else {
-		n, err = file.Write(dataBytes[:n])
+		_, err = file.Write(dataBytes[:n])
 	}
🧰 Tools
🪛 golangci-lint (2.5.0)

[error] 777-777: ineffectual assignment to n

(ineffassign)


[error] 779-779: ineffectual assignment to n

(ineffassign)

🤖 Prompt for AI Agents
In `@pkg/wshrpc/wshremote/wshremote_file.go` around lines 776 - 783, The linter
flags that the number of bytes written (n) from file.WriteAt and file.Write is
assigned but never used; change those assignments to discard n and only capture
the error (e.g., replace "n, err = file.WriteAt(...)" and "n, err =
file.Write(...)" with assignments that ignore n and assign err only, such as "_,
err = file.WriteAt(...)" and "_, err = file.Write(...)" so the subsequent err
check (return fmt.Errorf(..., err)) remains correct.

Comment on lines +95 to +104
func (impl *ServerImpl) getWshPath() (string, error) {
if impl.IsLocal {
return filepath.Join(wavebase.GetWaveDataDir(), "bin", "wsh"), nil
}
wshPath, err := wavebase.ExpandHomeDir("~/.waveterm/bin/wsh")
if err != nil {
return "", fmt.Errorf("cannot expand wsh path: %w", err)
}
return wshPath, nil
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find GetWaveDataDir implementation
rg -n "func.*GetWaveDataDir|const.*WaveDataDir" --type go -A 10

Repository: wavetermdev/waveterm

Length of output: 597


🏁 Script executed:

#!/bin/bash
# Search for how GetWaveDataDir is used throughout the codebase
rg -n "GetWaveDataDir" --type go | head -20

Repository: wavetermdev/waveterm

Length of output: 1853


🏁 Script executed:

#!/bin/bash
# Search for ExpandHomeDir to understand how the remote path is expanded
rg -n "func.*ExpandHomeDir" --type go -A 15

Repository: wavetermdev/waveterm

Length of output: 3653


🏁 Script executed:

#!/bin/bash
# Find DataHome_VarCache definition and initialization
rg -n "DataHome_VarCache" --type go -B 5 -A 5

Repository: wavetermdev/waveterm

Length of output: 2126


🏁 Script executed:

#!/bin/bash
# Check platform-specific implementations in wavebase
fd "wavebase.*\.go" --type f

Repository: wavetermdev/waveterm

Length of output: 148


🏁 Script executed:

#!/bin/bash
# Find where WAVETERM_DATA_HOME is set and what it defaults to
rg -n "WaveDataHomeEnvVar|WAVETERM_DATA_HOME" --type go --type ts --type js -B 3 -A 3 | head -60

Repository: wavetermdev/waveterm

Length of output: 2339


🏁 Script executed:

#!/bin/bash
# Search for initialization of wavebase variables in main/init
rg -n "SetWaveBaseVars|Init\(\)|wavebase.Init" --type go -B 2 -A 5 | head -80

Repository: wavetermdev/waveterm

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Look at the actual constant definitions
rg -n "WaveDataHomeEnvVar|WaveConfigHomeEnvVar" --type go

Repository: wavetermdev/waveterm

Length of output: 682


🏁 Script executed:

#!/bin/bash
# Find where WAVETERM_DATA_HOME is actually set
rg -n "WAVETERM_DATA_HOME\|process.env\[" emain/ --type ts -B 5 -A 5 | head -100

Repository: wavetermdev/waveterm

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Search for waveterm data home initialization
rg -n "\.waveterm|dataHome|data-home" emain/ --type ts | head -40

Repository: wavetermdev/waveterm

Length of output: 309


🏁 Script executed:

#!/bin/bash
# Look at the context around line 71 in emain-platform.ts
sed -n '65,90p' emain/emain-platform.ts

Repository: wavetermdev/waveterm

Length of output: 918


🏁 Script executed:

#!/bin/bash
# Search more broadly for where data home is initialized
rg -n "getWaveHomeDir|getDataHomeDir|getWaveDataDir" emain/ --type ts -A 10

Repository: wavetermdev/waveterm

Length of output: 8609


🏁 Script executed:

#!/bin/bash
# Look at the full context of getWaveDataDir in electron to see all fallbacks
sed -n '131,155p' emain/emain-platform.ts

Repository: wavetermdev/waveterm

Length of output: 771


Use consistent path resolution for remote wsh binary location.

The local path uses GetWaveDataDir() which respects the WAVETERM_DATA_HOME environment variable and follows platform-specific defaults (including XDG_DATA_HOME fallback). The remote path hardcodes ~/.waveterm/bin/wsh, which will diverge if the data directory is configured elsewhere—for example, when ~/.waveterm doesn't exist but XDG_DATA_HOME is set, causing the electron app to use $XDG_DATA_HOME/waveterm instead.

The remote path should use the same environment variable or path resolution logic as the local side.

🤖 Prompt for AI Agents
In `@pkg/wshrpc/wshremote/wshremote.go` around lines 95 - 104, The remote branch
of ServerImpl.getWshPath currently hardcodes "~/.waveterm/bin/wsh" which
diverges from the local path logic; change the remote branch to resolve the data
directory using the same wavebase.GetWaveDataDir() call (and then filepath.Join
with "bin","wsh") instead of wavebase.ExpandHomeDir so both local and remote use
the same WAVETERM_DATA_HOME/XDG resolution; update getWshPath (and remove the
ExpandHomeDir usage in that branch) accordingly.
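
A minimal sketch of that direction (this assumes wavebase.GetWaveDataDir() resolves WAVETERM_DATA_HOME/XDG correctly when called inside the remote connserver process as well):

func (impl *ServerImpl) getWshPath() (string, error) {
	// same resolution for local and remote; no more hardcoded ~/.waveterm
	return filepath.Join(wavebase.GetWaveDataDir(), "bin", "wsh"), nil
}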

Comment on lines +468 to +471
// allow responses to RPCs we initiated
if rpcMsg.ResId == "" || router.getRouteInfo(rpcMsg.ResId) == nil {
continue
}
Contributor

⚠️ Potential issue | 🟠 Major

Validate untrusted responses against the intended destination.

Allowing unauthenticated links to send responses solely based on ResId enables response spoofing if an ID is guessed or leaked. Constrain acceptance to responses that were routed to this specific link.

🔒 Suggested guard
-			// allow responses to RPCs we initiated
-			if rpcMsg.ResId == "" || router.getRouteInfo(rpcMsg.ResId) == nil {
-				continue
-			}
+			// allow responses only for RPCs that were routed to this link
+			if rpcMsg.ResId == "" {
+				continue
+			}
+			routeInfo := router.getRouteInfo(rpcMsg.ResId)
+			if routeInfo == nil {
+				continue
+			}
+			if strings.HasPrefix(routeInfo.destRouteId, RoutePrefix_Link) {
+				linkIdStr := strings.TrimPrefix(routeInfo.destRouteId, RoutePrefix_Link)
+				linkIdInt, err := strconv.ParseInt(linkIdStr, 10, 32)
+				if err != nil || baseds.LinkId(linkIdInt) != linkId {
+					continue
+				}
+			} else {
+				destLm := router.getLinkForRoute(routeInfo.destRouteId)
+				if destLm == nil || destLm.linkId != linkId {
+					continue
+				}
+			}
🤖 Prompt for AI Agents
In `@pkg/wshutil/wshrouter.go` around lines 468 - 471, The code currently accepts
RPC responses solely by matching rpcMsg.ResId and
router.getRouteInfo(rpcMsg.ResId), which allows spoofing; update the guard to
also verify that the stored route info's intended destination/link matches the
actual incoming link for this message. After calling
router.getRouteInfo(rpcMsg.ResId), compare the route info's destination/link
field (e.g., RouteInfo.Link or RouteInfo.DestinationID) against the current
incoming link identifier (the variable representing this connection in the
surrounding scope) and only proceed if they match; otherwise continue. Ensure
you reference rpcMsg.ResId, router.getRouteInfo, and the route info's
destination/link field when implementing the check.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 13

🤖 Fix all issues with AI agents
In `@cmd/wsh/cmd/wshcmd-jobdebug.go`:
- Line 165: Fix the typographical error in the log message: change the
log.Printf call that prints connectedJobIds (the statement containing
"connnected jobids: %v") to use the correct spelling "connected" so the message
reads "connected jobids: %v"; locate the log.Printf referencing connectedJobIds
in wshcmd-jobdebug.go and update the format string accordingly.

In `@cmd/wsh/cmd/wshcmd-jobmanager.go`:
- Around line 68-71: The nil check on readyFile after calling os.NewFile(3,
"ready-pipe") will not catch invalid or unavailable file descriptors because
os.NewFile returns a non-nil *os.File regardless of whether the descriptor is
actually valid. Remove the nil check and instead validate the file descriptor by
attempting an actual operation on it such as calling Stat() or attempting to
write to it, which will return an error if fd 3 is not available or not valid. This
ensures the ready pipe was actually provided before proceeding.

In `@pkg/jobmanager/jobcmd.go`:
- Around line 135-151: GetExitInfo in JobCmd is populating JobId from the global
WshCmdJobManager.JobId which can return wrong IDs; change the assignment to use
the instance field jm.jobId when building exitData (in the GetExitInfo method)
so exitData.JobId = jm.jobId; ensure you only modify the JobId field and keep
the rest of the logic (exitCode, exitSignal, exitTs, exitErr) unchanged.
- Around line 43-53: The validation after defaulting in MakeJobCmd is
ineffective for negative term sizes; update MakeJobCmd to first reject negative
values (check if cmdDef.TermSize.Rows < 0 || cmdDef.TermSize.Cols < 0 and return
an error) and only then apply defaults for zero values (set Rows=25 and Cols=80
when ==0); this ensures negative values from waveobj.TermSize are caught while
preserving the intended defaults.

In `@pkg/jobmanager/jobmanager_unix.go`:
- Around line 71-83: The code opens logFile before calling unix.Dup2, but if
Dup2 fails, logFile is not closed, causing a file descriptor leak. To fix this,
close logFile before each return after a Dup2 failure in the function around the
log file handling code. This ensures no resource leaks when Dup2 calls fail.

In `@pkg/jobmanager/streammanager.go`:
- Around line 290-299: The handleReadData method has a lock inversion: it calls
sm.buf.Write(data) (which acquires CirBuf.lock) before taking sm.lock, while
other callers (e.g., ClientConnected, RecvAck) take sm.lock then call sm.buf.* —
causing potential deadlock. Fix by acquiring sm.lock first (sm.lock.Lock() /
defer sm.lock.Unlock()), then perform sm.buf.Write(data) and the checks/logging,
and only then call sm.drainCond.Signal() if sm.connected; ensure you retain the
logging and deferred unlock; note that if CirBuf.Write may block, confirm
holding sm.lock during the write is acceptable or refactor to avoid long-held
manager locks.

In `@pkg/remote/conncontroller/conncontroller.go`:
- Around line 876-889: The `IsConnected` function calls `GetConn` which creates
a new SSHConn entry in clientControllerMap if one doesn't exist, polluting the
map with phantom connections. Replace the `GetConn` call with a read-only lookup
that directly accesses clientControllerMap under globalLock protection. Acquire
the lock, check if the connection exists in the map, retrieve its status if
found, and return false if not found, all without creating any new entries.

In `@pkg/streamclient/stream_test.go`:
- Around line 306-317: The goroutine reading from reader currently calls
t.Errorf which may not properly fail the test; change the pattern to send any
read error (or EOF/unexpected short read) back on an error channel and close
readDone, then in the main test goroutine wait for readDone (or select on both
channels) and call t.Fatalf/t.Errorf there based on the received error or
mismatch in totalRead/expectedLen; reference the read goroutine using readDone,
the reader.Read call, buf, totalRead and expectedLen to locate and implement the
error channel and post-goroutine assertion.

In `@pkg/streamclient/streambroker.go`:
- Around line 79-100: DetachStreamWriter currently removes b.writers and
b.writerRoutes but leaves b.readerRoutes stale; update the DetachStreamWriter
method to also delete the reader route entry for the given streamId by calling
delete(b.readerRoutes, streamId) inside the same lock/unlock block so it mirrors
AttachStreamWriter (refer to AttachStreamWriter, DetachStreamWriter,
b.readerRoutes, b.writerRoutes, b.writers).

In `@pkg/waveobj/wtype.go`:
- Around line 313-354: The Job struct has a naming mismatch: the field
AttachedBlockId is tagged `json:"ownerblockid"`; update them to be consistent by
either renaming the field to OwnerBlockId (and keep `json:"ownerblockid"`) or
change the JSON tag to `json:"attachedblockid"` (for the existing
AttachedBlockId); modify the Job struct declaration (field AttachedBlockId) and
any callers/JSON serializers that rely on that name to match the chosen API
name.

In `@pkg/wshrpc/wshclient/barerpcclient.go`:
- Around line 24-31: RegisterTrustedLeaf's returned error is ignored inside
GetBareRpcClient's sync.Once block; capture its (linkId, err) return, set
waveSrvClient_RouteId from the returned linkId only on success, and handle err
(e.g., log the error via your logger and fail fast by returning nil/panic or
storing an error state) so the failure isn't silent. Update the block around
wshutil.DefaultRouter.RegisterTrustedLeaf to assign the two returns, check err,
and only call wps.Broker.SetClient(wshutil.DefaultRouter) after successful
registration; reference symbols: GetBareRpcClient, waveSrvClient_Once,
waveSrvClient_Singleton, waveSrvClient_RouteId,
wshutil.DefaultRouter.RegisterTrustedLeaf, and wps.Broker.SetClient.

In `@pkg/wshrpc/wshremote/wshremote_file.go`:
- Around line 777-780: The assignment to variable `n` from the calls to
`file.WriteAt`/`file.Write` in wshremote_file.go is ineffectual because `n` is
never used afterwards; change those assignments to discard the byte-count return
(use `_` for the first return) and only capture `err` (i.e., replace `n, err =
file.WriteAt(...)` and `n, err = file.Write(...)` with `_ , err = ...`) so the
linter warning is resolved while preserving error handling in the same
functions/methods.

In `@pkg/wshrpc/wshremote/wshremote_job.go`:
- Around line 57-77: The cleanup block and the deferred cleanup inside the
goroutine both close conn and call impl.Router.UnregisterLink(linkId) and
impl.removeJobManagerConnection(jobId), which can run twice; change the cleanup
logic to run exactly once by introducing a sync.Once (e.g., a once variable
captured by both the deferred goroutine closure and the cleanup function) and
move only non-idempotent operations (conn.Close, impl.Router.UnregisterLink,
impl.removeJobManagerConnection) into a single once.Do call; keep closing
proxy.FromRemoteCh and other safe operations outside or guard them similarly so
the goroutine that calls wshutil.AdaptStreamToMsgCh(conn, proxy.FromRemoteCh)
and the outer cleanup both invoke the same once.Do to prevent
double-close/unregister.
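
A minimal sketch of the once-guarded cleanup described in this item (variable names follow the prompt; whether the closure is returned or stored is left to the surrounding function):

var cleanupOnce sync.Once
cleanup := func() {
	cleanupOnce.Do(func() {
		conn.Close()
		impl.Router.UnregisterLink(linkId)
		impl.removeJobManagerConnection(jobId)
	})
}

go func() {
	defer func() {
		cleanup()
		// only the reader goroutine owns FromRemoteCh, so only it closes it
		close(proxy.FromRemoteCh)
	}()
	wshutil.AdaptStreamToMsgCh(conn, proxy.FromRemoteCh)
}()
// the same once-guarded closure can then be handed back to the caller as the cleanup function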
🧹 Nitpick comments (18)
pkg/wavejwt/wavejwt.go (1)

27-35: Add explicit validation for new auth-role claims.
MainServer and JobId are authorization-sensitive. Without centralized validation, it’s easy to mint or accept tokens with conflicting role fields (e.g., MainServer plus Router/Sock/RouteId, or JobId missing where required). Please ensure downstream auth logic enforces valid claim combinations, or add a small validator/constructor for WaveJwtClaims and use it at issuance/validation points.
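
A hypothetical shape for such a validator (illustrative only; the field types and the actual allowed combinations would need to match the real claims):

func (c *WaveJwtClaims) Validate() error {
	// assumption: MainServer is a bool claim and JobId is a string claim
	if c.MainServer && c.JobId != "" {
		return fmt.Errorf("mainserver and jobid claims are mutually exclusive")
	}
	return nil
}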

pkg/wcore/wcore.go (1)

53-60: LGTM! Follows established pattern.

The InstallId initialization mirrors the existing TempOID pattern correctly.

Minor efficiency consideration: if both TempOID and InstallId are empty, you could consolidate into a single DB update. However, since this is startup code that runs rarely, the current approach is fine.

♻️ Optional consolidation
+	needsUpdate := false
 	if client.TempOID == "" {
 		log.Println("client.TempOID is empty")
 		client.TempOID = uuid.NewString()
-		err = wstore.DBUpdate(ctx, client)
-		if err != nil {
-			return firstLaunch, fmt.Errorf("error updating client: %w", err)
-		}
+		needsUpdate = true
 	}
 	if client.InstallId == "" {
 		log.Println("client.InstallId is empty")
 		client.InstallId = uuid.NewString()
+		needsUpdate = true
+	}
+	if needsUpdate {
 		err = wstore.DBUpdate(ctx, client)
 		if err != nil {
 			return firstLaunch, fmt.Errorf("error updating client: %w", err)
 		}
 	}
pkg/jobmanager/jobcmd.go (1)

153-194: Holding lock during I/O operations.

HandleInput holds the mutex while performing potentially blocking I/O operations: jm.cmdPty.Write(), jm.cmd.Process.Signal(), and pty.Setsize(). While these are typically fast, holding a lock during I/O can cause contention if other goroutines need to access JobCmd state.

Consider releasing the lock after validation and re-acquiring only when needed, or accepting this as a design trade-off for simplicity given the expected low contention.
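
A sketch of the narrower lock scope (the surrounding signature is omitted; inputBytes and sig stand in for whatever decoded input and signal HandleInput receives):

// snapshot the state we need under the lock, then do blocking I/O outside it
jm.lock.Lock()
cmdPty := jm.cmdPty
proc := jm.cmd.Process
exited := jm.processExited
jm.lock.Unlock()

if exited || cmdPty == nil {
	return fmt.Errorf("job process is not running")
}
if len(inputBytes) > 0 {
	if _, err := cmdPty.Write(inputBytes); err != nil {
		return fmt.Errorf("cannot write to pty: %w", err)
	}
}
if sig != nil {
	if err := proc.Signal(sig); err != nil {
		return fmt.Errorf("cannot signal process: %w", err)
	}
}

The trade-off is that a concurrent exit can close the pty between the snapshot and the write, which then surfaces as a write error rather than blocking behind the lock.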

pkg/streamclient/streamreader.go (1)

99-130: Consider adding a cap on out-of-order packet buffer size.

The oooPackets slice can grow unbounded if a sender transmits many future-sequence packets before the expected one arrives. This could lead to memory pressure under adversarial or buggy conditions.

Consider adding a maximum buffer size and dropping/erroring when exceeded:

const maxOOOPackets = 100 // or configurable

func (r *Reader) addOOOPacketLocked(dataPk wshrpc.CommandStreamData) {
    if len(r.oooPackets) >= maxOOOPackets {
        r.err = fmt.Errorf("too many out-of-order packets")
        return
    }
    // ... existing logic
}
pkg/jobmanager/streammanager_test.go (1)

160-182: Consider adding assertions for the disconnect transition test.

TestConnectedToDisconnectedTransition exercises the disconnect path but has no assertions beyond checking for panics. Consider verifying expected state (e.g., that buffered data remains in the circular buffer, or that reconnection would resume correctly).

pkg/wshutil/wshrpc.go (1)

292-320: Consider logging decode errors for debugging stream issues.

Both handleStreamData and handleStreamAck silently return on ReUnmarshal errors. While this prevents crashes, it may make debugging stream protocol issues difficult in production.

♻️ Suggested improvement
 func (w *WshRpc) handleStreamData(req *RpcMessage) {
 	if w.StreamBroker == nil {
 		return
 	}
 	if req.Data == nil {
 		return
 	}
 	var dataPk wshrpc.CommandStreamData
 	err := utilfn.ReUnmarshal(&dataPk, req.Data)
 	if err != nil {
+		log.Printf("error unmarshaling stream data: %v", err)
 		return
 	}
 	w.StreamBroker.RecvData(dataPk)
 }

 func (w *WshRpc) handleStreamAck(req *RpcMessage) {
 	if w.StreamBroker == nil {
 		return
 	}
 	if req.Data == nil {
 		return
 	}
 	var ackPk wshrpc.CommandStreamAckData
 	err := utilfn.ReUnmarshal(&ackPk, req.Data)
 	if err != nil {
+		log.Printf("error unmarshaling stream ack: %v", err)
 		return
 	}
 	w.StreamBroker.RecvAck(ackPk)
 }
cmd/wsh/cmd/wshcmd-jobdebug.go (1)

340-363: Verify command argument handling with --.

The start command expects arguments after the command name, but there's no usage hint about using -- to separate flags from the command to run. Users might get confused.

♻️ Suggested improvement to command usage
 var jobDebugStartCmd = &cobra.Command{
 	Use:   "start",
-	Short: "start a new job",
+	Short: "start a new job (usage: start --conn=name -- cmd [args...])",
 	RunE:  jobDebugStartRun,
 }
pkg/wshutil/wshrouter.go (1)

123-125: Missing RoutePrefix_Job constant.

MakeJobRouteId uses a hardcoded "job:" prefix, but there's no corresponding RoutePrefix_Job constant defined alongside the other route prefixes (lines 32-38). This inconsistency could lead to maintenance issues if the prefix needs to change.

Proposed fix

Add the constant with the other route prefixes:

 	RoutePrefix_FeBlock    = "feblock:"
 	RoutePrefix_Builder    = "builder:"
 	RoutePrefix_Link       = "link:"
+	RoutePrefix_Job        = "job:"
 )

Then update MakeJobRouteId:

 func MakeJobRouteId(jobId string) string {
-	return "job:" + jobId
+	return RoutePrefix_Job + jobId
 }
pkg/jobmanager/cirbuf.go (3)

105-123: Byte-by-byte writing is inefficient; consider bulk copy.

The current implementation writes one byte at a time in a loop, which is significantly slower than using copy() for bulk transfers. For a 2MB buffer with large writes, this could become a performance bottleneck.

Sketch of bulk copy approach
func (cb *CirBuf) writeAvailable(data []byte) (int, chan struct{}) {
	cb.lock.Lock()
	defer cb.lock.Unlock()

	size := len(cb.buf)
	written := 0

	for written < len(data) {
		if cb.syncMode && cb.count >= cb.windowSize {
			spaceAvailable := make(chan struct{})
			if !tryWriteCh(cb.waiterChan, spaceAvailable) {
				panic("CirBuf: multiple concurrent blocked writes not allowed")
			}
			return written, spaceAvailable
		}

		// Calculate how many bytes we can write in this iteration
		availableSpace := cb.windowSize - cb.count
		if !cb.syncMode {
			availableSpace = size // Can overwrite in async mode
		}
		toWrite := len(data) - written
		if toWrite > availableSpace {
			toWrite = availableSpace
		}

		// Bulk copy (handling wrap-around)
		firstChunk := size - cb.writePos
		if firstChunk > toWrite {
			firstChunk = toWrite
		}
		copy(cb.buf[cb.writePos:], data[written:written+firstChunk])
		if toWrite > firstChunk {
			copy(cb.buf[0:], data[written+firstChunk:written+toWrite])
		}

		// Update positions...
		written += toWrite
		cb.writePos = (cb.writePos + toWrite) % size
		// ... (similar logic for count and readPos adjustment)
	}
	return written, nil
}

116-120: Clarify the semantics of windowSize vs buffer overwriting.

When cb.count >= cb.windowSize but syncMode is false, the code overwrites old data by advancing readPos. However, the condition uses windowSize for the threshold but compares against cb.count < cb.windowSize to decide whether to increment count. This means:

  • In async mode, once count reaches windowSize, new writes overwrite old data (readPos advances)
  • The actual buffer capacity (len(cb.buf)) may be larger than windowSize

This behavior is correct but subtle. Consider adding a comment explaining that windowSize acts as the "effective" buffer size for retention purposes, not just flow control.


145-149: Byte-by-byte read in PeekDataAt is similarly inefficient.

Same concern as the write path - consider using copy() with wrap-around handling for better performance when peeking large amounts of data.
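
A wrap-aware bulk copy for the peek path could look like this sketch (field names per the comments above; the real PeekDataAt signature and bounds handling may differ):

// caller is expected to hold cb.lock
func (cb *CirBuf) peekCopy(dst []byte, offset int) int {
	size := len(cb.buf)
	avail := cb.count - offset
	if avail <= 0 {
		return 0
	}
	n := len(dst)
	if n > avail {
		n = avail
	}
	start := (cb.readPos + offset) % size
	// first segment up to the end of the backing slice, then wrap to the front
	first := size - start
	if first > n {
		first = n
	}
	copy(dst, cb.buf[start:start+first])
	if n > first {
		copy(dst[first:], cb.buf[:n-first])
	}
	return n
}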

pkg/wshrpc/wshremote/wshremote_job.go (3)

143-148: Pipe write end not closed before waiting for ready signal.

readyPipeWrite is passed to the child process as ExtraFiles[0], but you defer close it at line 148. If the child process writes to the pipe and then reads from it expecting the parent to have closed it, this could cause a deadlock. However, since the child only writes "Wave-JobManagerStart" and the parent only reads, this should be fine.

More importantly: after cmd.Start() (line 169), the parent should close readyPipeWrite immediately so the child sees EOF if it tries to read from the pipe's other end.

Proposed fix
 	if err := cmd.Start(); err != nil {
 		return nil, fmt.Errorf("cannot start job manager: %w", err)
 	}
+	readyPipeWrite.Close() // Child has its own fd now
 	log.Printf("RemoteStartJobCommand: job manager process started\n")

245-247: Detached goroutine waiting on cmd.Wait() may leak if process never exits.

The goroutine at lines 245-247 calls cmd.Wait() but there's no mechanism to forcibly terminate it if the process hangs. Consider adding a timeout or ensuring the process is killed on context cancellation.


342-348: Consider returning SIGTERM error to caller.

The SIGTERM send error is logged but not returned. The caller may want to know if the signal failed to be delivered (e.g., permission issues). Consider returning the error:

Proposed fix
 	err = proc.SendSignal(syscall.SIGTERM)
 	if err != nil {
 		log.Printf("failed to send SIGTERM to job manager: %v", err)
+		return fmt.Errorf("failed to send SIGTERM to job manager: %w", err)
 	} else {
 		log.Printf("RemoteTerminateJobManagerCommand: sent SIGTERM to job manager process, jobid=%s, pid=%d\n", data.JobId, data.JobManagerPid)
 	}
 	return nil
pkg/jobmanager/jobmanager.go (3)

124-155: Consider logging the reason when disconnecting an existing stream client.

The helper method correctly manages stream connection state with proper locking (caller holds lock as indicated by _withlock suffix). However, when disconnecting an existing client at line 131, consider logging more context about why disconnection is happening (e.g., new client replacing old one).


178-211: Socket cleanup on panic may leave stale socket file.

The panic handler at line 196 runs before os.Remove(socketPath) at line 198, but if recover() returns a non-nil value, the function will exit early from the panic handler. This could leave the socket file behind.

Consider restructuring cleanup order
 	go func() {
 		defer func() {
-			panichandler.PanicHandler("MakeJobDomainSocket:accept", recover())
 			listener.Close()
 			os.Remove(socketPath)
+			panichandler.PanicHandler("MakeJobDomainSocket:accept", recover())
 		}()

Or ensure cleanup happens regardless:

 	go func() {
 		defer func() {
+			listener.Close()
+			os.Remove(socketPath)
 			panichandler.PanicHandler("MakeJobDomainSocket:accept", recover())
-			listener.Close()
-			os.Remove(socketPath)
 		}()

213-245: The _ = wshRpc assignment appears to be a placeholder.

Line 245 assigns wshRpc to the blank identifier, which has no effect at runtime; the RPC is already kept alive because the goroutines reference it via serverImpl.WshRpc. The blank assignment can simply be removed, or replaced with a short comment if it is meant to document that intent.

pkg/wshrpc/wshrpctypes.go (1)

778-784: Consider consistency in ExitCode handling.

CommandJobCmdExitedData.ExitCode is *int without omitempty, which means it serializes as null when nil. This is intentional to distinguish "exit code 0" from "terminated by signal" (where ExitSignal would be set instead).

However, this creates a potential inconsistency with the TypeScript type (line 355 in gotypes.d.ts) where exitcode: number is required. Signal-terminated processes would have nil ExitCode but the TS type expects a number.

Consider adding omitempty to align with similar optional patterns elsewhere:

♻️ Suggested change
 type CommandJobCmdExitedData struct {
 	JobId      string `json:"jobid"`
-	ExitCode   *int   `json:"exitcode"`
+	ExitCode   *int   `json:"exitcode,omitempty"`
 	ExitSignal string `json:"exitsignal,omitempty"`
 	ExitErr    string `json:"exiterr,omitempty"`
 	ExitTs     int64  `json:"exitts,omitempty"`
 }

Comment on lines +43 to +53
func MakeJobCmd(jobId string, cmdDef CmdDef) (*JobCmd, error) {
jm := &JobCmd{
jobId: jobId,
}
if cmdDef.TermSize.Rows == 0 || cmdDef.TermSize.Cols == 0 {
cmdDef.TermSize.Rows = 25
cmdDef.TermSize.Cols = 80
}
if cmdDef.TermSize.Rows <= 0 || cmdDef.TermSize.Cols <= 0 {
return nil, fmt.Errorf("invalid term size: %v", cmdDef.TermSize)
}
Contributor

⚠️ Potential issue | 🟡 Minor

Unreachable validation after defaulting.

Lines 51-53 check if cmdDef.TermSize.Rows <= 0 || cmdDef.TermSize.Cols <= 0 after lines 47-50 have already defaulted zero values to 25/80. This validation can only catch negative values passed in, but waveobj.TermSize uses int fields which could be negative.

Consider either:

  1. Checking for invalid values before applying defaults, or
  2. Removing the redundant check if negative values aren't expected.
Suggested fix
 func MakeJobCmd(jobId string, cmdDef CmdDef) (*JobCmd, error) {
 	jm := &JobCmd{
 		jobId: jobId,
 	}
+	if cmdDef.TermSize.Rows < 0 || cmdDef.TermSize.Cols < 0 {
+		return nil, fmt.Errorf("invalid term size: %v", cmdDef.TermSize)
+	}
 	if cmdDef.TermSize.Rows == 0 || cmdDef.TermSize.Cols == 0 {
 		cmdDef.TermSize.Rows = 25
 		cmdDef.TermSize.Cols = 80
 	}
-	if cmdDef.TermSize.Rows <= 0 || cmdDef.TermSize.Cols <= 0 {
-		return nil, fmt.Errorf("invalid term size: %v", cmdDef.TermSize)
-	}
🤖 Prompt for AI Agents
In `@pkg/jobmanager/jobcmd.go` around lines 43 - 53, The validation after
defaulting in MakeJobCmd is ineffective for negative term sizes; update
MakeJobCmd to first reject negative values (check if cmdDef.TermSize.Rows < 0 ||
cmdDef.TermSize.Cols < 0 and return an error) and only then apply defaults for
zero values (set Rows=25 and Cols=80 when ==0); this ensures negative values
from waveobj.TermSize are caught while preserving the intended defaults.

Comment on lines +135 to +151
func (jm *JobCmd) GetExitInfo() (bool, *wshrpc.CommandJobCmdExitedData) {
jm.lock.Lock()
defer jm.lock.Unlock()
if !jm.processExited {
return false, nil
}
exitData := &wshrpc.CommandJobCmdExitedData{
JobId: WshCmdJobManager.JobId,
ExitCode: jm.exitCode,
ExitSignal: jm.exitSignal,
ExitTs: jm.exitTs,
}
if jm.exitErr != nil {
exitData.ExitErr = jm.exitErr.Error()
}
return true, exitData
}
Contributor

⚠️ Potential issue | 🟠 Major

Potential bug: Uses global WshCmdJobManager.JobId instead of instance jm.jobId.

Line 142 uses WshCmdJobManager.JobId to populate the exit data, but the JobCmd instance has its own jm.jobId field set during construction. If multiple jobs were ever managed, this would return the wrong job ID.

Suggested fix
 	exitData := &wshrpc.CommandJobCmdExitedData{
-		JobId:      WshCmdJobManager.JobId,
+		JobId:      jm.jobId,
 		ExitCode:   jm.exitCode,
 		ExitSignal: jm.exitSignal,
 		ExitTs:     jm.exitTs,
 	}
🤖 Prompt for AI Agents
In `@pkg/jobmanager/jobcmd.go` around lines 135 - 151, GetExitInfo in JobCmd is
populating JobId from the global WshCmdJobManager.JobId which can return wrong
IDs; change the assignment to use the instance field jm.jobId when building
exitData (in the GetExitInfo method) so exitData.JobId = jm.jobId; ensure you
only modify the JobId field and keep the rest of the logic (exitCode,
exitSignal, exitTs, exitErr) unchanged.

Comment on lines +71 to +83
logPath := GetJobFilePath(clientId, jobId, "log")
logFile, err := os.OpenFile(logPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0600)
if err != nil {
return fmt.Errorf("failed to open log file: %w", err)
}
err = unix.Dup2(int(logFile.Fd()), int(os.Stdout.Fd()))
if err != nil {
return fmt.Errorf("failed to dup2 stdout: %w", err)
}
err = unix.Dup2(int(logFile.Fd()), int(os.Stderr.Fd()))
if err != nil {
return fmt.Errorf("failed to dup2 stderr: %w", err)
}

⚠️ Potential issue | 🟡 Minor

Close the log file on dup2 failures.

If dup2 fails, logFile is left open on the error path. Close it before returning to avoid FD leaks during repeated daemonization attempts.

🧩 Proposed fix
 logFile, err := os.OpenFile(logPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0600)
 if err != nil {
 	return fmt.Errorf("failed to open log file: %w", err)
 }
 err = unix.Dup2(int(logFile.Fd()), int(os.Stdout.Fd()))
 if err != nil {
+	logFile.Close()
 	return fmt.Errorf("failed to dup2 stdout: %w", err)
 }
 err = unix.Dup2(int(logFile.Fd()), int(os.Stderr.Fd()))
 if err != nil {
+	logFile.Close()
 	return fmt.Errorf("failed to dup2 stderr: %w", err)
 }
🤖 Prompt for AI Agents
In `@pkg/jobmanager/jobmanager_unix.go` around lines 71 - 83, The code opens
logFile before calling unix.Dup2, but if Dup2 fails, logFile is not closed,
causing a file descriptor leak. To fix this, close logFile before each return
after a Dup2 failure in the function around the log file handling code. This
ensures no resource leaks when Dup2 calls fail.

Comment on lines +79 to +100
func (b *Broker) AttachStreamWriter(meta *wshrpc.StreamMeta, writer StreamWriter) error {
b.lock.Lock()
defer b.lock.Unlock()

if _, exists := b.writers[meta.Id]; exists {
return nil, fmt.Errorf("writer already registered for stream id %d", meta.Id)
return fmt.Errorf("writer already registered for stream id %s", meta.Id)
}

writer := NewWriter(meta.Id, meta.RWnd, b)
b.writers[meta.Id] = writer
b.readerRoutes[meta.Id] = meta.ReaderRouteId
b.writerRoutes[meta.Id] = meta.WriterRouteId

return nil
}

func (b *Broker) DetachStreamWriter(streamId string) {
b.lock.Lock()
defer b.lock.Unlock()

delete(b.writers, streamId)
delete(b.writerRoutes, streamId)
}

⚠️ Potential issue | 🟡 Minor

DetachStreamWriter doesn't clean up readerRoutes.

AttachStreamWriter sets both b.readerRoutes[meta.Id] and b.writerRoutes[meta.Id], but DetachStreamWriter only cleans up writers and writerRoutes. The readerRoutes entry remains, which could cause stale route information.

Proposed fix
 func (b *Broker) DetachStreamWriter(streamId string) {
 	b.lock.Lock()
 	defer b.lock.Unlock()

 	delete(b.writers, streamId)
 	delete(b.writerRoutes, streamId)
+	delete(b.readerRoutes, streamId)
 }
🤖 Prompt for AI Agents
In `@pkg/streamclient/streambroker.go` around lines 79 - 100, DetachStreamWriter
currently removes b.writers and b.writerRoutes but leaves b.readerRoutes stale;
update the DetachStreamWriter method to also delete the reader route entry for
the given streamId by calling delete(b.readerRoutes, streamId) inside the same
lock/unlock block so it mirrors AttachStreamWriter (refer to AttachStreamWriter,
DetachStreamWriter, b.readerRoutes, b.writerRoutes, b.writers).

Comment on lines 24 to 31
var waveSrvClient_RouteId string

func GetBareRpcClient() *wshutil.WshRpc {
waveSrvClient_Once.Do(func() {
waveSrvClient_Singleton = wshutil.MakeWshRpc(wshrpc.RpcContext{}, &WshServerImpl, "bare-client")
wshutil.DefaultRouter.RegisterTrustedLeaf(waveSrvClient_Singleton, BareClientRoute)
waveSrvClient_RouteId = fmt.Sprintf("bare:%s", uuid.New().String())
wshutil.DefaultRouter.RegisterTrustedLeaf(waveSrvClient_Singleton, waveSrvClient_RouteId)
wps.Broker.SetClient(wshutil.DefaultRouter)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for RegisterTrustedLeaf method signature
rg -nP --type=go 'RegisterTrustedLeaf\s*\(' -A 2 -B 2

Repository: wavetermdev/waveterm

Length of output: 2477


🏁 Script executed:

#!/bin/bash
# Search for DefaultRouter declaration and initialization
rg -nP --type=go 'DefaultRouter\b' -B 3 -A 3

Repository: wavetermdev/waveterm

Length of output: 7309


🏁 Script executed:

#!/bin/bash
# Check for error handling patterns in the file under review
rg -nP --type=go 'var waveSrvClient' -A 20

Repository: wavetermdev/waveterm

Length of output: 2185


Error from RegisterTrustedLeaf is being silently ignored.

The RegisterTrustedLeaf method returns (baseds.LinkId, error) but the error is discarded at line 30. While DefaultRouter is safely initialized before this code runs, unchecked registration failures could leave the client unregistered without indication. Consider capturing and handling the error, especially since this occurs within a sync.Once block where failures cannot be retried.

(Note: This pattern is repeated in other call sites, suggesting a potential codebase-wide concern.)

🤖 Prompt for AI Agents
In `@pkg/wshrpc/wshclient/barerpcclient.go` around lines 24 - 31,
RegisterTrustedLeaf's returned error is ignored inside GetBareRpcClient's
sync.Once block; capture its (linkId, err) return, set waveSrvClient_RouteId
from the returned linkId only on success, and handle err (e.g., log the error
via your logger and fail fast by returning nil/panic or storing an error state)
so the failure isn't silent. Update the block around
wshutil.DefaultRouter.RegisterTrustedLeaf to assign the two returns, check err,
and only call wps.Broker.SetClient(wshutil.DefaultRouter) after successful
registration; reference symbols: GetBareRpcClient, waveSrvClient_Once,
waveSrvClient_Singleton, waveSrvClient_RouteId,
wshutil.DefaultRouter.RegisterTrustedLeaf, and wps.Broker.SetClient.

Comment on lines +777 to +780
n, err = file.WriteAt(dataBytes[:n], atOffset)
} else {
n, err = file.Write(dataBytes[:n])
}

⚠️ Potential issue | 🟡 Minor

Fix ineffectual assignment to n.

The variable n is assigned but never used after WriteAt/Write operations. The linter correctly identifies this as ineffectual.

Proposed fix
 	if atOffset > 0 && !append {
-		n, err = file.WriteAt(dataBytes[:n], atOffset)
+		_, err = file.WriteAt(dataBytes[:n], atOffset)
 	} else {
-		n, err = file.Write(dataBytes[:n])
+		_, err = file.Write(dataBytes[:n])
 	}
🧰 Tools
🪛 golangci-lint (2.5.0)

[error] 777-777: ineffectual assignment to n

(ineffassign)


[error] 779-779: ineffectual assignment to n

(ineffassign)

🤖 Prompt for AI Agents
In `@pkg/wshrpc/wshremote/wshremote_file.go` around lines 777 - 780, The
assignment to variable `n` from the calls to `file.WriteAt`/`file.Write` in
wshremote_file.go is ineffectual because `n` is never used afterwards; change
those assignments to discard the byte-count return (use `_` for the first
return) and only capture `err` (i.e., replace `n, err = file.WriteAt(...)` and
`n, err = file.Write(...)` with `_ , err = ...`) so the linter warning is
resolved while preserving error handling in the same functions/methods.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@frontend/types/gotypes.d.ts`:
- Around line 352-359: The TypeScript type CommandJobCmdExitedData declares
exitcode as required but the Go model uses a nullable *int, so update
CommandJobCmdExitedData to make exitcode optional (exitcode?: number) to allow
null/undefined from JSON; locate the CommandJobCmdExitedData type in
frontend/types/gotypes.d.ts and change the exitcode property to be optional to
match other optional exit code fields.

In `@pkg/wshrpc/wshremote/wshremote_job.go`:
- Around line 145-156: The parent currently defers closing readyPipeWrite which
keeps the write-end open even after passing it to the child; instead remove the
defer for readyPipeWrite and, after creating cmd (exec.Command(...)) and setting
cmd.ExtraFiles, call cmd.Start() and on success immediately close readyPipeWrite
in the parent so the child is the sole owner; if cmd.Start() fails, close
readyPipeWrite before returning the error to avoid leaks. Reference
readyPipeRead, readyPipeWrite, cmd, ExtraFiles and cmd.Start() when making the
change.
♻️ Duplicate comments (5)
cmd/wsh/cmd/wshcmd-jobmanager.go (1)

82-119: Goroutine leak on context timeout remains unaddressed.

When the context times out, the goroutine spawned at line 86 continues running and blocks on reader.ReadString('\n'). Since this function is only called once at startup, the impact is minimal, but the goroutine will remain blocked until stdin is closed.
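
A minimal sketch of one way to bound the wait without being able to interrupt the blocked read itself; readLineWithTimeout and the channel plumbing are illustrative, not code from the PR. The reading goroutine still lives until stdin closes, but the buffered channels let it exit cleanly once input arrives, and the caller stops waiting at the deadline.

package main

import (
	"bufio"
	"context"
	"fmt"
	"os"
	"time"
)

func readLineWithTimeout(ctx context.Context) (string, error) {
	lineCh := make(chan string, 1) // buffered so the goroutine never blocks on send
	errCh := make(chan error, 1)
	go func() {
		reader := bufio.NewReader(os.Stdin)
		line, err := reader.ReadString('\n')
		if err != nil {
			errCh <- err
			return
		}
		lineCh <- line
	}()
	select {
	case line := <-lineCh:
		return line, nil
	case err := <-errCh:
		return "", err
	case <-ctx.Done():
		// the goroutine stays parked on ReadString until stdin closes; acceptable
		// for a once-at-startup call, but worth a comment in the real code
		return "", ctx.Err()
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	line, err := readLineWithTimeout(ctx)
	fmt.Printf("line=%q err=%v\n", line, err)
}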

pkg/wshrpc/wshclient/barerpcclient.go (1)

29-31: Error from RegisterTrustedLeaf is still being ignored.

While the comment explains that the route format is valid, RegisterTrustedLeaf could still fail for other reasons (e.g., if the router is in an unexpected state). Since this runs inside sync.Once, failures cannot be retried, potentially leaving the client unregistered.

Consider at minimum logging the error for observability, even if the code continues:

-		// we can safely ignore the error from RegisterTrustedLeaf since the route is valid
-		wshutil.DefaultRouter.RegisterTrustedLeaf(waveSrvClient_Singleton, waveSrvClient_RouteId)
+		_, err := wshutil.DefaultRouter.RegisterTrustedLeaf(waveSrvClient_Singleton, waveSrvClient_RouteId)
+		if err != nil {
+			log.Printf("warning: failed to register bare rpc client route: %v", err)
+		}
pkg/wshutil/wshrouter.go (1)

468-471: Untrusted response validation concern remains.

The current check allows responses based solely on ResId existence in rpcMap. An attacker who guesses or learns a valid ResId could spoof responses. Consider validating that the response originates from the expected link.
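
A hypothetical sketch of the suggested tightening; rpcEntry, expectedLink, and handleResponse are illustrative names, not the router's actual types. The idea is simply to remember which link each outgoing request was routed over and drop any response whose ingress link does not match.

package example

type LinkId int

type rpcEntry struct {
	resCh        chan []byte // assumed buffered by the sender
	expectedLink LinkId      // link the original request was sent over
}

type router struct {
	rpcMap map[string]*rpcEntry // keyed by ResId
}

// handleResponse delivers a response only when the ResId is known and the
// response arrived on the same link the request went out on.
func (r *router) handleResponse(resId string, ingress LinkId, msg []byte) bool {
	entry, ok := r.rpcMap[resId]
	if !ok {
		return false // unknown ResId: drop
	}
	if entry.expectedLink != ingress {
		return false // misrouted or spoofed response: drop instead of delivering
	}
	entry.resCh <- msg
	return true
}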

pkg/waveobj/wtype.go (1)

323-323: Security concern: JobAuthToken persisted in WaveObj.

The JobAuthToken field is serialized to JSON and will be persisted to the database and potentially broadcast in WaveObj updates. This could leak the authentication token to clients. Consider using json:"-" to exclude it from serialization and store it separately on the server side.
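
A sketch of the direction suggested above, assuming the token does not need to travel with the object; only the json:"-" tag is the actual suggestion, and the in-memory map is an illustrative alternative for server-side storage.

package example

type Job struct {
	// ... persisted fields elided ...

	// json:"-" keeps the token out of DB writes and out of WaveObj update broadcasts.
	JobAuthToken string `json:"-"`
}

// Illustrative server-side storage, keyed by job id, if the token must be kept
// somewhere without ever being serialized with the object.
var jobAuthTokens = map[string]string{}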

pkg/jobcontroller/jobcontroller.go (1)

236-239: Improve error handling for client retrieval.

The condition err != nil || clientId == nil combines two failure modes. If err is nil but clientId is nil, the %w formatting with a nil error produces a confusing message.

Suggested fix
 	clientId, err := wstore.DBGetSingleton[*waveobj.Client](ctx)
-	if err != nil || clientId == nil {
-		return "", fmt.Errorf("failed to get client: %w", err)
+	if err != nil {
+		return "", fmt.Errorf("failed to get client: %w", err)
+	}
+	if clientId == nil {
+		return "", fmt.Errorf("client not found")
 	}
🧹 Nitpick comments (6)
frontend/app/view/term/term-wsh.tsx (1)

107-110: Placeholder handler acknowledged.

The handler logs the event and contains a TODO for future implementation. This is appropriate for the current PR scope where the job manager framework is being established but not yet fully integrated.

Would you like me to help implement the frontend logic to handle job attachment updates when you're ready to complete this integration?

pkg/wshutil/wshrouter.go (1)

258-264: Consider logging when link ID parsing fails.

If strconv.ParseInt fails, the code silently falls through to normal route lookup. While this may be intentional (treating malformed link: routes as regular routes), logging a warning could aid debugging.

Optional: Add debug logging for parse failures
 	if strings.HasPrefix(routeId, RoutePrefix_Link) {
 		linkIdStr := strings.TrimPrefix(routeId, RoutePrefix_Link)
 		linkIdInt, err := strconv.ParseInt(linkIdStr, 10, 32)
 		if err == nil {
 			return router.sendMessageToLink(msgBytes, baseds.LinkId(linkIdInt), ingressLinkId)
+		} else {
+			log.Printf("[router] warning: malformed link route id %q: %v", routeId, err)
 		}
 	}
pkg/wshrpc/wshremote/wshremote_job.go (1)

344-350: SIGTERM sent without waiting for graceful shutdown.

After sending SIGTERM, the function returns immediately without waiting for the process to terminate. This could leave the process in a zombie state temporarily or cause race conditions if the caller expects immediate termination.

Consider adding a brief wait with timeout, or document that termination is asynchronous.
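
A minimal sketch of "terminate, wait briefly, escalate", assuming the job manager is the parent of the process and holds its *os.Process; the helper name and grace period are illustrative, not code from the PR.

package example

import (
	"os"
	"syscall"
	"time"
)

func terminateWithGrace(proc *os.Process, grace time.Duration) error {
	if err := proc.Signal(syscall.SIGTERM); err != nil {
		return err
	}
	done := make(chan error, 1)
	go func() {
		// Wait is only valid when this process is the parent; a non-parent would
		// have to poll for process existence instead.
		_, err := proc.Wait()
		done <- err
	}()
	select {
	case err := <-done:
		return err // exited within the grace period
	case <-time.After(grace):
		return proc.Kill() // still running: escalate to SIGKILL
	}
}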

pkg/wshrpc/wshrpctypes.go (1)

792-798: Inconsistent TermSize usage: pointer vs value.

CommandJobControllerStartJobData.TermSize uses *waveobj.TermSize (pointer, line 797), while CommandStartJobData.TermSize uses waveobj.TermSize (value, line 718). This inconsistency could cause confusion.

Consider aligning these - using a pointer allows distinguishing "not provided" from "zero size", which may be desirable for optional fields.
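
For illustration, a local TermSize stands in for waveobj.TermSize and the JSON tags are assumed; the point is only that the pointer form can distinguish "not provided" from a zero size.

package example

type TermSize struct {
	Rows int `json:"rows"`
	Cols int `json:"cols"`
}

type valueStyle struct {
	TermSize TermSize `json:"termsize"` // zero value is ambiguous: unset vs an actual 0x0 size
}

type pointerStyle struct {
	TermSize *TermSize `json:"termsize,omitempty"` // nil unambiguously means "use defaults"
}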

cmd/wsh/cmd/wshcmd-jobdebug.go (2)

341-344: Redundant argument check.

The if len(args) == 0 check on lines 342-344 is redundant since jobDebugStartCmd already specifies Args: cobra.MinimumNArgs(1) at line 81, which will prevent the handler from running if no arguments are provided.

Suggested removal
 func jobDebugStartRun(cmd *cobra.Command, args []string) error {
-	if len(args) == 0 {
-		return fmt.Errorf("no command specified after --")
-	}
-
 	cmdToRun := args[0]
 	cmdArgs := args[1:]

166-166: Inconsistent output method.

Line 166 uses log.Printf while the rest of the command uses fmt.Printf for user-facing output. If this log statement is intended for debugging purposes, consider removing it before the feature is finalized, or switch to fmt.Printf for consistency with other output in this file.

Comment on lines +145 to +156
readyPipeRead, readyPipeWrite, err := os.Pipe()
if err != nil {
return nil, fmt.Errorf("cannot create ready pipe: %w", err)
}
defer readyPipeRead.Close()
defer readyPipeWrite.Close()

cmd := exec.Command(wshPath, "jobmanager", "--jobid", data.JobId, "--clientid", data.ClientId)
if data.PublicKeyBase64 != "" {
cmd.Env = append(os.Environ(), "WAVETERM_PUBLICKEY="+data.PublicKeyBase64)
}
cmd.ExtraFiles = []*os.File{readyPipeWrite}

⚠️ Potential issue | 🟡 Minor

Close write-end of ready pipe after passing to child process.

readyPipeWrite is passed to the child via ExtraFiles, but the parent's file descriptor should be closed after cmd.Start() succeeds to ensure the child becomes the sole owner. Otherwise, the parent holding the write end open could prevent proper EOF signaling.

Suggested fix
 	cmd.ExtraFiles = []*os.File{readyPipeWrite}
 	stdin, err := cmd.StdinPipe()
 	if err != nil {
 		return nil, fmt.Errorf("cannot create stdin pipe: %w", err)
 	}
 	// ... stdout, stderr pipe setup ...

 	if err := cmd.Start(); err != nil {
 		return nil, fmt.Errorf("cannot start job manager: %w", err)
 	}
+	readyPipeWrite.Close() // Close parent's write end after child inherits it
 	log.Printf("RemoteStartJobCommand: job manager process started\n")

And remove the defer readyPipeWrite.Close() from line 150.

🤖 Prompt for AI Agents
In `@pkg/wshrpc/wshremote/wshremote_job.go` around lines 145 - 156, The parent
currently defers closing readyPipeWrite which keeps the write-end open even
after passing it to the child; instead remove the defer for readyPipeWrite and,
after creating cmd (exec.Command(...)) and setting cmd.ExtraFiles, call
cmd.Start() and on success immediately close readyPipeWrite in the parent so the
child is the sole owner; if cmd.Start() fails, close readyPipeWrite before
returning the error to avoid leaks. Reference readyPipeRead, readyPipeWrite,
cmd, ExtraFiles and cmd.Start() when making the change.

@sawka sawka merged commit ae3e9f0 into main Jan 22, 2026
6 of 7 checks passed
@sawka sawka deleted the sawka/sessionmgr2 branch January 22, 2026 00:54