
PR-07: Skill Reorg — Utilities (Teil 1): Aphorisms, Browser, Cloudflare, CreateCLI, Documents, Evals, PAIUpgrade, Parser, Pdf#74

Merged
Steffen025 merged 5 commits into main from release/v3.0-pr07-utilities-reorg-teil1
Mar 16, 2026

Conversation


@Steffen025 Steffen025 commented Mar 16, 2026

PR-07: Utilities Reorg Part 1 (130 files)

Summary

Moves 11 skill directories into the Utilities/ namespace structure as part of the PAI v3.0 skill reorganization.

Affected Skills

  • Aphorisms → Utilities/Aphorisms
  • Browser → Utilities/Browser
  • Cloudflare → Utilities/Cloudflare
  • CreateCLI → Utilities/CreateCLI
  • CreateSkill → Utilities/CreateSkill
  • Delegation → Utilities/Delegation
  • Documents (incl. Docx, Pdf, Pptx, Xlsx) → Utilities/Documents + subdirectories
  • Evals → Utilities/Evals
  • PAIUpgrade → Utilities/PAIUpgrade
  • Parser → Utilities/Parser

Scope

  • 130 files (all RENAME/ADD, i.e. new paths)
  • No content changes; pure path reorganization
  • Old paths will be deleted in PR-10

Review Note

Review for correctness, OpenCode conformance, and remaining Claude→OpenCode issues.

Jira

WARNEX-81

Summary by CodeRabbit

  • New Features

    • Aphorisms: curated thematic aphorism database plus add/find/search/research workflows and usage tracking.
    • Browser Automation v2.0: code-first Playwright browser, persistent session API, CLI tools (navigate/screenshot/verify/extract/interact), examples, and test suite.
    • Evals Framework: new graders (code- & model-based), suite management, failure→task converter, Algorithm bridge and CLI utilities.
    • Docx tooling: OOXML editors with tracked-changes/comments, pack/unpack/validate scripts.
  • Documentation

    • Extensive SKILL.md and workflow docs across Aphorisms, Browser, Evals, Docx, CreateCLI/CreateSkill, Cloudflare.

Move 130 skill files into Utilities/ subdirectory structure:
- Aphorisms → Utilities/Aphorisms
- Browser → Utilities/Browser
- Cloudflare → Utilities/Cloudflare
- CreateCLI → Utilities/CreateCLI
- CreateSkill → Utilities/CreateSkill
- Documents (Docx, Pdf, Pptx, Xlsx) → Utilities/Documents
- Evals → Utilities/Evals
- Prompting → Utilities/Prompting
- SECUpdates → Utilities/SECUpdates
- System → Utilities/System
- VoiceServer → Utilities/VoiceServer

Part of PAI v3.0 skill reorganization (PR-07 of 12)

coderabbitai bot commented Mar 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds many new Utilities skills and tooling: an Aphorisms database and workflows; a Playwright-based Browser v2 (library, persistent session server, CLI, examples, tests); OOXML/Docx Python editor and pack/unpack/validate tools; a full Evals framework (graders, suite management, tools); plus numerous SKILL docs and workflows for CreateCLI, CreateSkill, Cloudflare, Documents, Delegation, and related utilities.

Changes

Cohort / File(s) / Summary

  • Aphorisms Skill
    Files: .opencode/skills/Utilities/Aphorisms/Database/aphorisms.md, .opencode/skills/Utilities/Aphorisms/SKILL.md, .opencode/skills/Utilities/Aphorisms/Workflows/*
    Adds a static aphorism database plus comprehensive skill documentation and workflows (add/find/search/research), a theme taxonomy, and usage tracking.

  • Browser Skill v2.0
    Files: .opencode/skills/Utilities/Browser/index.ts, .opencode/skills/Utilities/Browser/Tools/Browse.ts, .opencode/skills/Utilities/Browser/Tools/BrowserSession.ts, .opencode/skills/Utilities/Browser/package.json, .opencode/skills/Utilities/Browser/tsconfig.json, .opencode/skills/Utilities/Browser/*Workflows/*, .opencode/skills/Utilities/Browser/examples/*, .opencode/skills/Utilities/Browser/README.md, .opencode/skills/Utilities/Browser/SKILL.md
    Implements a PlaywrightBrowser class with an exported singleton and a rich API, a persistent BrowserSession HTTP server, a Bun CLI, examples, and comprehensive tests; adds package/tsconfig and workflow docs. Public TS types and the instance are exported.

  • Docx / OOXML Tools
    Files: .opencode/skills/Utilities/Docx/Scripts/*, .opencode/skills/Utilities/Docx/Scripts/document.py, .opencode/skills/Utilities/Docx/Scripts/utilities.py, .opencode/skills/Utilities/Docx/Scripts/__init__.py, .opencode/skills/Utilities/Docx/SKILL.md, .opencode/skills/Utilities/Docx/docx-js.md, .opencode/skills/Utilities/Docx/ooxml.md, .opencode/skills/Utilities/Docx/LICENSE.txt
    Adds a Python OOXML library (XMLEditor, DocxXMLEditor, Document manager) with tracked-changes/comments APIs, pack/unpack/validate scripts, pretty-printing, CLI entry points, and extensive DOCX/OOXML docs. New public classes and functions added.

  • Evals Framework
    Files: .opencode/skills/Utilities/Evals/*, .opencode/skills/Utilities/Evals/Graders/*, .opencode/skills/Utilities/Evals/Tools/*, .opencode/skills/Utilities/Evals/Suites/*, .opencode/skills/Utilities/Evals/Data/*
    Introduces System-Evals: PROJECT/SKILL docs, a base grader registry, many code- and model-based graders (registered), suite-management utilities, AlgorithmBridge and FailureToTask tools, templates and domain patterns; new exported TS APIs and CLI utilities.

  • CreateCLI & CreateSkill
    Files: .opencode/skills/Utilities/CreateCLI/*, .opencode/skills/Utilities/CreateSkill/*
    Adds documentation, patterns, and workflows for CLI and skill generation, canonicalization, migration, validation, and TypeScript patterns.

  • Delegation & Documents Skills
    Files: .opencode/skills/Utilities/Delegation/SKILL.md, .opencode/skills/Utilities/Documents/SKILL.md, .opencode/skills/Utilities/Documents/Workflows/ProcessLargePdfGemini3.md
    New Delegation guidance for multi-agent orchestration and a Documents skill with a multimodal PDF-processing workflow.

  • Cloudflare Skill
    Files: .opencode/skills/Utilities/Cloudflare/SKILL.md, .opencode/skills/Utilities/Cloudflare/Workflows/*
    Adds Cloudflare deployment and troubleshooting workflow docs with verification and browser-check integration.

  • Misc / Config & Small Docs
    Files: .coderabbit.yaml, assorted SKILL/workflow docs across Utilities
    Minor locale config change and many new documentation files across Utilities (Evals best practices, templates, CreateCLI patterns, Browser workflows, etc.).

Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant CLI as "CLI (Browse.ts)"
  participant Session as "BrowserSession (HTTP API)"
  participant Playwright as "PlaywrightBrowser (index.ts)"
  participant Page as "Browser Page / Process"
  participant FS as "Filesystem / State"

  CLI->>Session: POST /navigate or /screenshot (args)
  Session->>Playwright: invoke navigate/capture/evaluate APIs
  Playwright->>Page: perform goto/click/evaluate/wait
  Page-->>Playwright: emit console / network / dialog events
  Playwright->>FS: persist session state, logs, screenshot
  Playwright-->>Session: return result (JSON / buffer)
  Session-->>CLI: HTTP response with payload + diagnostics
```
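The round-trip shown in the diagram can be sketched as a tiny in-process dispatcher. The endpoint names come from the diagram; the handler and result shapes below are purely illustrative assumptions, not the actual BrowserSession API.

```typescript
// Illustrative stand-in for the BrowserSession HTTP API routing shown above.
// Real handlers would call into PlaywrightBrowser; these just echo arguments.
type SessionResult = { ok: boolean; endpoint: string; payload?: unknown };

const handlers: Record<string, (args: Record<string, unknown>) => SessionResult> = {
  '/navigate': (args) => ({ ok: true, endpoint: '/navigate', payload: { url: args.url } }),
  '/screenshot': (args) => ({ ok: true, endpoint: '/screenshot', payload: { path: args.path } }),
};

function dispatch(endpoint: string, args: Record<string, unknown> = {}): SessionResult {
  const handler = handlers[endpoint];
  // Unknown endpoints produce a failed result instead of throwing.
  if (!handler) return { ok: false, endpoint };
  return handler(args);
}
```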

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐰
I hop through lines of prose and code,
Collected quotes in tidy mode,
Browsers hum while snapshots glow,
Docx edits and Evals in a row,
A rabbit cheers — behold the new node!



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.opencode/skills/Utilities/Browser/Workflows/Update.md (1)

81-87: ⚠️ Potential issue | 🟡 Minor

Fenced code block without a language tag

The block starting at line 81 should be given a language (text is enough) so Markdownlint stays clean.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Browser/Workflows/Update.md around lines 81 - 87,
The fenced code block containing the lines starting with "# Last sync:
2026-01-03" is missing a language tag; update that triple-backtick fence to
include a language identifier (e.g., ```text) so Markdownlint stops flagging it
— locate the fenced block in Update.md and add the language after the opening
``` while keeping the block contents unchanged.
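For reference, the MD040 fix amounts to adding a language token after the opening fence while leaving the block contents unchanged, e.g.:

````markdown
<!-- Before: flagged by markdownlint MD040 -->
```
# Last sync: 2026-01-03
```

<!-- After: language tag added, contents unchanged -->
```text
# Last sync: 2026-01-03
```
````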
🟡 Minor comments (38)
.opencode/skills/Utilities/Delegation/SKILL.md-60-80 (1)

60-80: ⚠️ Potential issue | 🟡 Minor

Heading numbering is inconsistent (duplicate "3.").

Lines 60 and 72 both read "### 3.". That makes cross-references within the skill doc harder; please renumber consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Delegation/SKILL.md around lines 60 - 80, the heading numbering is inconsistent: change the second "### 3." (the "Foreground Agents" heading) to "### 4." and adjust all subsequent section numbers as needed, so that "Background Agents" remains "### 3." and "Foreground Agents" is unambiguously "### 4."; check the "Background Agents" and "Foreground Agents" sections in SKILL.md and correct only the visible numbers in the Markdown headings.
.opencode/skills/Utilities/Delegation/SKILL.md-50-52 (1)

50-52: ⚠️ Potential issue | 🟡 Minor

Fenced code blocks without language tags (markdownlint MD040).

The blocks starting at lines 50, 64, 103, 148, and 160 lack a language tag. Please add e.g. bash or text so rendering and linting stay consistent.

Suggested patch

````diff
-```
+```text
 Task(subagent_type="Engineer", isolation: "worktree", prompt="...")
````

````diff
-```
+```text
 Task(subagent_type="Engineer", run_in_background: true, prompt="...")
````

````diff
-```
+```text
 1. TeamCreate(team_name="my-project")           # Creates team + task list
 2. TaskCreate(subject="Implement auth module")   # Create team tasks
 3. Task(subagent_type="Engineer", team_name="my-project", name="auth-engineer")  # Spawn teammate
 4. TaskUpdate(taskId="1", owner="auth-engineer") # Assign task
 5. SendMessage(type="message", recipient="auth-engineer", content="...")  # Coordinate
````

````diff
-```
+```text
 Task(subagent_type="general-purpose", model="haiku", max_turns=3, prompt="...")
````

````diff
-```
+```text
 Task(subagent_type="general-purpose", prompt="...")  # or specialized agent type
````

Also applies to: 64-66, 103-109, 148-150, 160-162

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Delegation/SKILL.md around lines 50 - 52, several fenced code blocks containing Task(...) examples and the numbered TeamCreate/TaskCreate sequence lack a language tag; update each opening triple-backtick fence for the blocks that include Task(subagent_type="Engineer", isolation: "worktree", prompt="..."), Task(subagent_type="Engineer", run_in_background: true, prompt="..."), the numbered block starting with "1. TeamCreate(team_name="my-project")", Task(subagent_type="general-purpose", model="haiku", max_turns=3, prompt="...") and Task(subagent_type="general-purpose", prompt="...") to include an explicit language identifier such as "text" or "bash" (e.g., change ``` to ```text) so markdownlint MD040 is satisfied and rendering is consistent.
.opencode/skills/Utilities/Docx/Scripts/__init__.py-1-1 (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Fix file permissions: the file should not be executable.

The file has the executable flag set, although __init__.py files are not executable scripts. Please remove the execute permission.

🔧 Suggested fix

Run the following command to correct the permission:

```shell
chmod -x .opencode/skills/Utilities/Docx/Scripts/__init__.py
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/__init__.py at line 1, the file __init__.py in the Scripts package has its executable bit set; remove the executable permission (unset the executable flag) so it is a plain Python package file rather than a script (e.g., update the file mode to remove execute permissions for owner/group/others).
.opencode/skills/Utilities/CreateSkill/Workflows/CreateSkill.md-86-88 (1)

86-88: ⚠️ Potential issue | 🟡 Minor

Markdownlint MD040 in embedded example blocks

The inner fences carry no language tag. Please add text/markdown so the file passes lint.

Also applies to: 102-107, 110-114, 119-119

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/CreateSkill.md around lines 86 - 88, update the nested fenced code blocks in the markdown examples so the inner fences include a language tag (e.g., text or markdown) to satisfy markdownlint MD040; specifically edit the example that contains the text "Running WorkflowName in SkillName..." and the other similar embedded blocks in the same document so each inner fence starts with a language identifier instead of an empty fence.
.opencode/skills/Utilities/CreateSkill/Workflows/UpdateSkill.md-22-24 (1)

22-24: ⚠️ Potential issue | 🟡 Minor

Markdownlint MD040 on the path block

The code block has no language tag. Please use e.g. text.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/UpdateSkill.md around lines 22 - 24, the existing triple-backtick code block containing "~/.opencode/skills/CORE/SkillSystem.md" is missing a language tag, which triggers MD040; fix it by changing the opening fence from ``` to e.g. ```text (or a fitting language) so the block carries a language identifier and the markdownlint warning is resolved.
.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md-289-289 (1)

289-289: ⚠️ Potential issue | 🟡 Minor

Standardize the directory name on line 289 to Tools/

The checklist uses tools/, while the rest of the doc works with Tools/. Please unify.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md at line 289, the checklist line with the entry "`tools/` directory exists (even if empty)" is inconsistent with the rest of the documentation, which uses "`Tools/`"; update this entry in the document (CanonicalizeSkill.md) by changing `tools/` to `Tools/` so the directory naming is consistent throughout.
.opencode/skills/Utilities/CreateSkill/SKILL.md-239-240 (1)

239-240: ⚠️ Potential issue | 🟡 Minor

Directory name inconsistent: tools/ vs Tools/

The examples use tools/ on line 239, while the rest of the document calls for Tools/ (TitleCase). On case-sensitive systems this produces wrong paths.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/SKILL.md around lines 239 - 240, the directory name in the example is inconsistent: in the SKILL.md entry that currently shows "tools/", replace it with the TitleCase variant "Tools/" so all occurrences match the rest of the document; in particular check and update the line with "→ Creates SKILL.md, Workflows/, tools/" and all similar examples so they use "Tools/" (search for "tools/" and adjust every match).
.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md-149-185 (1)

149-185: ⚠️ Potential issue | 🟡 Minor

Step number assigned twice (Step 6)

There are two different Step 6 sections. Please renumber so the sequence stays unambiguous.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md around lines 149 - 185, the file contains two headings named "Step 6" ("Step 6: Convert YAML Frontmatter" and "Step 6: Add Workflow Routing to Body"), which duplicates the numbering; please bump the second one, "Step 6: Add Workflow Routing to Body", and all subsequent steps up by one (e.g. to "Step 7: Add Workflow Routing to Body") and consistently update all referenced step numbers in the document so the sequence remains unambiguous.
.opencode/skills/Utilities/CreateSkill/Workflows/ValidateSkill.md-149-157 (1)

149-157: ⚠️ Potential issue | 🟡 Minor

tools/ and Tools/ are used interchangeably

The structure check says tools/, but later the doc says Tools/ (e.g. line 163). This should be unified to a single naming standard.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/ValidateSkill.md around lines 149 - 157, the document mixes "tools/" and "Tools/" directory names; standardize on a single casing (use "tools/" lowercase) throughout ValidateSkill.md by replacing any occurrences of "Tools/" with "tools/" (including headings, checklist items, examples and the Step 7a section) and ensure any related checks or prose consistently refer to the lowercase "tools/" directory.
.opencode/skills/Utilities/CreateSkill/Workflows/CreateSkill.md-216-217 (1)

216-217: ⚠️ Potential issue | 🟡 Minor

Checklist references the wrong directory (tools/)

Line 216 requires tools/, although the workflow otherwise creates and uses Tools/ (e.g. lines 50, 56). Please use Tools/ consistently.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/CreateSkill.md around lines 216 - 217, the checklist mistakenly uses the directory name `tools/`; change the entry "- [ ] `tools/` directory exists (even if empty)" to "- [ ] `Tools/` directory exists (even if empty)" so it is consistent with the other references to `Tools/` in the workflow.
.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md-22-24 (1)

22-24: ⚠️ Potential issue | 🟡 Minor

Markdownlint MD040: missing language tags on several fenced blocks

Please set explicit languages (text, bash, markdown) on all affected blocks.

Also applies to: 62-65, 68-71, 74-77, 80-83, 112-115, 118-121, 124-127, 197-199, 250-250, 257-257

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md around lines 22 - 24, there are multiple fenced code blocks using plain ``` with no language tag (e.g. the block showing "~/.opencode/skills/CORE/SkillSystem.md"); update each such fence to include an explicit language token (use bash for shell paths/commands, text for plain paths or outputs, markdown for embedded markdown snippets) so markdownlint MD040 is satisfied; ensure every affected triple-backtick block in the file is modified to include the appropriate language tag.
.opencode/skills/Utilities/CreateSkill/SKILL.md-51-57 (1)

51-57: ⚠️ Potential issue | 🟡 Minor

Markdownlint MD040: language tags missing on several fenced code blocks

Several code blocks are defined without a language tag. Please set e.g. bash, text, or markdown so lint stays clean.

Also applies to: 61-67, 123-128, 131-137, 143-153, 218-220, 234-241, 244-251, 254-261

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/SKILL.md around lines 51 - 57, several fenced code blocks in SKILL.md are missing language tags (e.g., the block that begins with "skills/SkillName/SKILL.md # Skill root"), which triggers markdownlint MD040; update every fenced code block in this file to include an explicit language (use "text" for plain file lists, "bash" for shell snippets, or "markdown" for embedded markdown), including the other blocks referenced in the comment ranges (61-67, 123-128, 131-137, 143-153, 218-220, 234-241, 244-251, 254-261), so all triple-backtick fences become ```text or ```bash as appropriate to satisfy the linter.
.opencode/skills/Utilities/CreateSkill/Workflows/ValidateSkill.md-22-24 (1)

22-24: ⚠️ Potential issue | 🟡 Minor

Markdownlint MD040: add language tags to several code blocks

The affected fenced blocks should be given language tags (text, bash, markdown).

Also applies to: 98-100, 117-121

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/CreateSkill/Workflows/ValidateSkill.md around lines 22 - 24, several fenced code blocks (e.g., the ones containing the literal string "~/.opencode/skills/CORE/SkillSystem.md" and the other blocks noted in the review) are missing language tags and trigger markdownlint MD040; edit each fenced block in ValidateSkill.md to add an appropriate language tag (use text for plain paths, bash for shell snippets, or markdown for embedded markdown) immediately after the opening backticks so the blocks become ```text, ```bash, or ```markdown.
.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts-47-56 (1)

47-56: ⚠️ Potential issue | 🟡 Minor

Missing error handling for malformed JSON lines.

If a line in the JSONL file is malformed, JSON.parse throws and the entire function fails. For a persistent data file the function should be more robust.

🛡️ Suggested fix with error handling

```diff
 export function loadFailures(): FailureLog[] {
   if (!existsSync(FAILURES_LOG)) return [];

   const content = readFileSync(FAILURES_LOG, 'utf-8');
   return content
     .trim()
     .split('\n')
     .filter(Boolean)
-    .map(line => JSON.parse(line) as FailureLog);
+    .map((line, index) => {
+      try {
+        return JSON.parse(line) as FailureLog;
+      } catch (e) {
+        console.warn(`Warning: skipped malformed line ${index + 1}`);
+        return null;
+      }
+    })
+    .filter((f): f is FailureLog => f !== null);
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts around lines 47 -
56, The loadFailures function currently calls JSON.parse on every line and will
throw if any line is malformed; update loadFailures to handle per-line parse
errors by wrapping the JSON.parse(line) call in a try/catch, skip lines that
fail to parse (optionally collecting or logging the parse error and the
offending line), and return only successfully parsed FailureLog objects;
reference the loadFailures function and the FAILURE_LOG constant (FAILURES_LOG)
so you modify the map over lines to safe parsing rather than directly calling
JSON.parse.
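The tolerant-parsing idea above can be sketched as a standalone function. The FailureLog shape here is a minimal stand-in for illustration, not the real type from FailureToTask.ts.

```typescript
// Hedged sketch: skip malformed JSONL lines instead of failing the whole load.
// FailureLog is an assumed, simplified shape.
interface FailureLog {
  description: string;
  category: string;
}

function parseFailureLines(content: string): FailureLog[] {
  return content
    .trim()
    .split('\n')
    .filter(Boolean)
    .flatMap((line, index) => {
      try {
        return [JSON.parse(line) as FailureLog];
      } catch {
        // A bad line is logged and skipped; the good lines still load.
        console.warn(`Skipping malformed JSONL line ${index + 1}`);
        return [];
      }
    });
}
```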
.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts-289-295 (1)

289-295: ⚠️ Potential issue | 🟡 Minor

Missing runtime validation for the severity parameter.

The type assertion values.severity as 'low' | 'medium' | 'high' | 'critical' does not validate at runtime. With an invalid input such as -s invalid, the value is stored unchecked.

🛡️ Suggested fix with validation

```diff
+    const validSeverities = ['low', 'medium', 'high', 'critical'] as const;
+    const severity = validSeverities.includes(values.severity as any)
+      ? (values.severity as 'low' | 'medium' | 'high' | 'critical')
+      : 'medium';
+
     const failure = logFailure({
       description: args.join(' '),
       category: values.category ?? 'unknown',
-      severity: values.severity as 'low' | 'medium' | 'high' | 'critical',
+      severity,
       expected_behavior: values.expected,
       actual_behavior: values.actual,
     });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts around lines 289 - 295, the severity is currently only assumed statically via a type assertion (`values.severity as 'low' | 'medium' | 'high' | 'critical'`); add runtime validation before calling `logFailure`: check `values.severity` against a whitelist array or set of ['low','medium','high','critical'], and use only the validated value (or fall back to a safe default such as 'low'/'unknown', or throw/log an error) instead of the raw input; update the spot in FailureToTask.ts around the `failure = logFailure({... severity: ...})` call and consider a small helper function `isValidSeverity` for reuse.
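A minimal standalone version of that whitelist check might look like this; the function and constant names are illustrative, not taken from the actual file.

```typescript
// Illustrative runtime validation for a CLI severity flag.
const VALID_SEVERITIES = ['low', 'medium', 'high', 'critical'] as const;
type Severity = (typeof VALID_SEVERITIES)[number];

function parseSeverity(input: string | undefined, fallback: Severity = 'medium'): Severity {
  // includes() gives a real runtime check, unlike a bare type assertion.
  return VALID_SEVERITIES.includes(input as Severity) ? (input as Severity) : fallback;
}
```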
.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts-93-107 (1)

93-107: ⚠️ Potential issue | 🟡 Minor

Inconsistent handling of category casing.

inferDomain (line 87) normalizes the category with .toLowerCase(), but inferGraders uses the category directly in includes(). With mixed casing such as "File_Targeting", the domain is detected correctly but the grader logic never fires.

♻️ Suggested fix for consistent normalization

```diff
 function inferGraders(category: string, failure: FailureLog): GraderConfig[] {
   const graders: GraderConfig[] = [];
+  const normalizedCategory = category.toLowerCase();

   // Always add tool call verification for coding/agent failures
-  if (['file_targeting', 'wrong_file', 'tool_sequence'].includes(category)) {
+  if (['file_targeting', 'wrong_file', 'tool_sequence'].includes(normalizedCategory)) {
     graders.push({
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts around lines 93 -
107, The graders selection in inferGraders uses category without normalization
causing mismatches with inferDomain which calls .toLowerCase(); modify
inferGraders to normalize the incoming category (e.g., const normalized =
category.toLowerCase()) and use normalized in the includes() check and in any
subsequent comparisons for the 'file_targeting', 'wrong_file', 'tool_sequence'
branch so mixed-case inputs like "File_Targeting" trigger the tool_calls grader
consistently with inferDomain.
.opencode/skills/Utilities/Evals/PROJECT.md-733-809 (1)

733-809: ⚠️ Potential issue | 🟡 Minor

The file structure in PROJECT.md is outdated and does not match the actual directory layout.

The documented file structure in PROJECT.md differs substantially from the actual structure of the Evals skill. Update the documentation:

Documented (wrong):

  • cli/, scorers/, use-cases/, results/, storage/

Actually present:

  • Data/, Graders/, Suites/, Tools/, Types/, UseCases/, Workflows/

The Tools/ and Workflows/ directories already exist per PAI v3.0 conventions but are missing from the documentation diagram. In addition, several directories have different names (e.g. Graders instead of scorers, UseCases instead of use-cases). Please update the file-structure documentation to reflect current reality.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/PROJECT.md around lines 733 - 809, Update
PROJECT.md's file-structure diagram to match the actual Evals-Skill layout:
replace the outdated entries (cli/, scorers/, use-cases/, results/, storage/)
with the real top-level directories (Data/, Graders/, Suites/, Tools/, Types/,
UseCases/, Workflows/) and ensure correct casing and PAI v3.0 conventions; keep
the same diagram style and examples (e.g., retain lines for README.md,
package.json, tsconfig.json) and add brief notes if any legacy dirs still exist
so the diagram in PROJECT.md accurately reflects the repository.
.opencode/skills/Utilities/Documents/Workflows/ProcessLargePdfGemini3.md-619-619 (1)

619-619: ⚠️ Potential issue | 🟡 Minor

Minor language fix: hyphenate the compound adjective.

"~30 page research paper" → "~30-page research paper".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Documents/Workflows/ProcessLargePdfGemini3.md at line 619, the phrase in the heading "Typical Processing Times (on ~30 page research paper):" should hyphenate the compound adjective; replace "~30 page research paper" with "~30-page research paper" in the line containing "Typical Processing Times (on ~30 page research paper):".
.opencode/skills/Utilities/Evals/TemplateIntegration.md-57-62 (1)

57-62: ⚠️ Potential issue | 🟡 Minor

Same path inconsistency in the rubric example.

Here too, the path should be updated to Utilities/Evals/.

🔧 Suggested fix

````diff
 ```bash
 bun run ~/.opencode/Templates/Tools/RenderTemplate.ts \
   -t Evals/Rubric.hbs \
-  -d ~/.opencode/skills/Evals/UseCases/<name>/rubric.yaml \
-  -o ~/.opencode/skills/Evals/UseCases/<name>/rubric.md
+  -d ~/.opencode/skills/Utilities/Evals/UseCases/<name>/rubric.yaml \
+  -o ~/.opencode/skills/Utilities/Evals/UseCases/<name>/rubric.md
 ```
````

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/TemplateIntegration.md around lines 57 - 62, update the example command that runs RenderTemplate.ts so the data (-d) and output (-o) paths point to the Utilities/Evals location rather than skills/Evals; specifically edit the invocation of ~/.opencode/Templates/Tools/RenderTemplate.ts (the command referencing Evals/Rubric.hbs) and change any occurrences of ~/.opencode/skills/Evals/UseCases/<name>/rubric.yaml and ~/.opencode/skills/Evals/UseCases/<name>/rubric.md to ~/.opencode/skills/Utilities/Evals/UseCases/<name>/rubric.yaml and ~/.opencode/skills/Utilities/Evals/UseCases/<name>/rubric.md respectively.
<details>
<summary>.opencode/skills/Utilities/Evals/TemplateIntegration.md-21-25 (1)</summary><blockquote>

`21-25`: _⚠️ Potential issue_ | _🟡 Minor_

**Path inconsistency after the skill reorganization.**

The CLI examples use the old path `~/.opencode/skills/Evals/` instead of the new path `~/.opencode/skills/Utilities/Evals/`. This could cause confusion or execution errors.


<details>
<summary>🔧 Suggested fix</summary>

```diff
 bun run ~/.opencode/Templates/Tools/RenderTemplate.ts \
   -t Evals/Judge.hbs \
-  -d ~/.opencode/skills/Evals/UseCases/<name>/judge-config.yaml \
-  -o ~/.opencode/skills/Evals/UseCases/<name>/judge-prompt.md
+  -d ~/.opencode/skills/Utilities/Evals/UseCases/<name>/judge-config.yaml \
+  -o ~/.opencode/skills/Utilities/Evals/UseCases/<name>/judge-prompt.md
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/TemplateIntegration.md around lines 21 -
25, the CLI examples still use the old path
~/.opencode/skills/Evals/UseCases/<name>/...; update all occurrences in the
instructions so the commands point to
~/.opencode/skills/Utilities/Evals/UseCases/<name>/judge-config.yaml and
.../judge-prompt.md (in the RenderTemplate.ts invocation using the
Evals/Judge.hbs template), i.e. replace both the -d and -o paths with the new
Utilities/Evals paths.


</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Evals/CLIReference.md-51-54 (1)</summary><blockquote>

`51-54`: _⚠️ Potential issue_ | _🟡 Minor_

**Update the path to the EvalServer directory.**


<details>
<summary>🔧 Suggested fix</summary>

```diff
-cd ~/.opencode/skills/Evals/EvalServer
+cd ~/.opencode/skills/Utilities/Evals/EvalServer
 bun run dev  # Starts on http://localhost:5173
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/CLIReference.md around lines 51 - 54,
update the path in the EvalServer start instruction: replace the line "cd
~/.opencode/skills/Evals/EvalServer" with the correct directory path to the
EvalServer (e.g. "cd ~/.opencode/skills/Utilities/Evals/EvalServer") and leave
the subsequent start command "bun run dev  # Starts on http://localhost:5173"
unchanged; verify that the path string in CLIReference.md matches the actual
project directory.
.opencode/skills/Utilities/Evals/CLIReference.md-68-83 (1)

68-83: ⚠️ Potential issue | 🟡 Minor

Storage paths and missing language specifier.

The directory tree uses the old path and needs a language specifier.

🔧 Suggested fix
```diff
-```
-~/.opencode/skills/Evals/
+```text
+~/.opencode/skills/Utilities/Evals/
 ├── UseCases/
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/CLIReference.md around lines 68 - 83,
Update the example directory header to the correct path and add a language
specifier for the code block: replace the old path string
"~/.opencode/skills/Evals/" with "~/.opencode/skills/Utilities/Evals/" and
ensure the fenced block is marked (e.g., "text") so the tree snippet is rendered
correctly; locate the snippet that begins with "~/.opencode/skills/Evals/" and
adjust that header and its fence language while leaving the UseCases/ Results/
EvalServer/ structure unchanged.
.opencode/skills/Utilities/Evals/SKILL.md-113-116 (1)

113-116: ⚠️ Potential issue | 🟡 Minor

Path inconsistency in the ALGORITHM integration.

The AlgorithmBridge path in this section also needs updating.

🔧 Suggested fix

```diff
 # Run eval and update ISC row
-bun run ~/.opencode/skills/Evals/Tools/AlgorithmBridge.ts -s regression-core -r 3 -u
+bun run ~/.opencode/skills/Utilities/Evals/Tools/AlgorithmBridge.ts -s regression-core -r 3 -u
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/SKILL.md around lines 113 - 116, the
documentation contains an outdated path to the AlgorithmBridge script; update
the bun run invocation line in .opencode/skills/Utilities/Evals/SKILL.md so it
points to the current location of the AlgorithmBridge script (use the correct
path to AlgorithmBridge.ts in the new Tools/Evals layout) and check that the
invocation still passes the flags (-s, -r, -u) unchanged.
.opencode/skills/Utilities/Evals/SKILL.md-92-107 (1)

92-107: ⚠️ Potential issue | 🟡 Minor

Multiple path inconsistencies in the CLI reference.

All CLI paths use the old scheme ~/.opencode/skills/Evals/ instead of ~/.opencode/skills/Utilities/Evals/. This affects AlgorithmBridge.ts, FailureToTask.ts and SuiteManager.ts.

🔧 Suggested fix for all CLI commands

```diff
 # Run an eval suite
-bun run ~/.opencode/skills/Evals/Tools/AlgorithmBridge.ts -s <suite>
+bun run ~/.opencode/skills/Utilities/Evals/Tools/AlgorithmBridge.ts -s <suite>

 # Log a failure for later conversion
-bun run ~/.opencode/skills/Evals/Tools/FailureToTask.ts log "description" -c category -s severity
+bun run ~/.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts log "description" -c category -s severity

 # Convert failures to test tasks
-bun run ~/.opencode/skills/Evals/Tools/FailureToTask.ts convert-all
+bun run ~/.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts convert-all

 # Manage suites
-bun run ~/.opencode/skills/Evals/Tools/SuiteManager.ts create <name> -t capability -d "description"
-bun run ~/.opencode/skills/Evals/Tools/SuiteManager.ts list
-bun run ~/.opencode/skills/Evals/Tools/SuiteManager.ts check-saturation <name>
-bun run ~/.opencode/skills/Evals/Tools/SuiteManager.ts graduate <name>
+bun run ~/.opencode/skills/Utilities/Evals/Tools/SuiteManager.ts create <name> -t capability -d "description"
+bun run ~/.opencode/skills/Utilities/Evals/Tools/SuiteManager.ts list
+bun run ~/.opencode/skills/Utilities/Evals/Tools/SuiteManager.ts check-saturation <name>
+bun run ~/.opencode/skills/Utilities/Evals/Tools/SuiteManager.ts graduate <name>
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/SKILL.md around lines 92 - 107, Update the
CLI examples in SKILL.md to use the correct path prefix by replacing occurrences
of ~/.opencode/skills/Evals/ with ~/.opencode/skills/Utilities/Evals/ for the
command examples referencing AlgorithmBridge.ts, FailureToTask.ts, and
SuiteManager.ts; edit the section containing the bash snippets so each
invocation (AlgorithmBridge.ts, FailureToTask.ts log/convert-all, and
SuiteManager.ts create/list/check-saturation/graduate) points to
~/.opencode/skills/Utilities/Evals/ to ensure the documented paths match the
actual file locations.
.opencode/skills/Utilities/Evals/CLIReference.md-17-43 (1)

17-43: ⚠️ Potential issue | 🟡 Minor

Path inconsistencies in all CLI commands.

All CLI paths reference the old scheme ~/.opencode/skills/Evals/ instead of the new path ~/.opencode/skills/Utilities/Evals/.

🔧 Suggested fix

```diff
 # Create new use case
-bun run ~/.opencode/skills/Evals/EvalServer/cli.ts use-case create --name <name>
+bun run ~/.opencode/skills/Utilities/Evals/EvalServer/cli.ts use-case create --name <name>

 # List all use cases
-bun run ~/.opencode/skills/Evals/EvalServer/cli.ts use-case list
+bun run ~/.opencode/skills/Utilities/Evals/EvalServer/cli.ts use-case list

 # Show use case details
-bun run ~/.opencode/skills/Evals/EvalServer/cli.ts use-case show <name>
+bun run ~/.opencode/skills/Utilities/Evals/EvalServer/cli.ts use-case show <name>
```

The same applies to Test Case Management (lines 30-36) and Run Evaluations (lines 40-43).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/CLIReference.md around lines 17 - 43,
update all example CLI command paths to use the new directory prefix
~/.opencode/skills/Utilities/Evals/ instead of the old
~/.opencode/skills/Evals/; specifically edit the use-case commands (references
to cli.ts), test-case commands (add/list referencing cli.ts), and the run
evaluation command (cli-run.ts) so each invocation points to
~/.opencode/skills/Utilities/Evals/EvalServer/cli.ts or
~/.opencode/skills/Utilities/Evals/EvalServer/cli-run.ts respectively, ensuring
consistent paths throughout the file.
.opencode/skills/Utilities/Evals/Graders/ModelBased/PairwiseComparison.ts-67-81 (1)

67-81: ⚠️ Potential issue | 🟡 Minor

Inconsistency between aggregateWinner and score on position swap.

The score computation in lines 78-81 overwrites the previously set score, but aggregateWinner remains unchanged. As a result, e.g. aggregateWinner = 'output' can end up combined with score = 0.25, which is contradictory.

♻️ Consistent computation

```diff
+    // Compute score and winner consistently
+    if (positionSwap && results.length === 2) {
+      score = (outputWins + ties * 0.5) / 2;
+      if (score > 0.5) {
+        aggregateWinner = 'output';
+      } else if (score < 0.5) {
+        aggregateWinner = 'reference';
+      } else {
+        aggregateWinner = 'tie';
+      }
+    } else {
       if (outputWins > referenceWins) {
         score = 1.0;
         aggregateWinner = 'output';
       } else if (referenceWins > outputWins) {
         score = 0.0;
         aggregateWinner = 'reference';
       } else {
         score = 0.5;
         aggregateWinner = 'tie';
       }
-
-    // For the score, also consider partial wins
-    if (positionSwap && results.length === 2) {
-      score = (outputWins + ties * 0.5) / 2;
     }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Graders/ModelBased/PairwiseComparison.ts
around lines 67 - 81, The position-swap branch updates score without updating
aggregateWinner, causing inconsistent pairs (e.g., aggregateWinner='output' with
score=0.25); after computing the adjusted score in the positionSwap &&
results.length === 2 block (which uses outputWins, referenceWins, ties),
recompute and set aggregateWinner so it matches the new score (e.g.,
aggregateWinner = 'output' if score > 0.5, 'reference' if score < 0.5, 'tie' if
score === 0.5), keeping the existing variables score and aggregateWinner in
PairwiseComparison.ts consistent.
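The fix in the suggested diff boils down to one rule: compute the score first, then derive the winner from it, so the two values can never disagree. A minimal standalone sketch (not the file's actual code; the names `outputWins`, `referenceWins`, `ties`, and `positionSwap` mirror the diff above):

```typescript
// Sketch: derive the winner from the final score, so pairs like
// (winner='output', score=0.25) cannot occur.
type Winner = 'output' | 'reference' | 'tie';

function aggregate(
  outputWins: number,
  referenceWins: number,
  ties: number,
  positionSwap: boolean
): { score: number; winner: Winner } {
  let score: number;
  if (positionSwap && outputWins + referenceWins + ties === 2) {
    // Two trials with swapped positions: a tie counts as half a win.
    score = (outputWins + ties * 0.5) / 2;
  } else {
    score = outputWins > referenceWins ? 1.0 : referenceWins > outputWins ? 0.0 : 0.5;
  }
  const winner: Winner = score > 0.5 ? 'output' : score < 0.5 ? 'reference' : 'tie';
  return { score, winner };
}

console.log(aggregate(1, 0, 1, true)); // → { score: 0.75, winner: 'output' }
console.log(aggregate(0, 1, 1, true)); // → { score: 0.25, winner: 'reference' }
```

With the score as the single source of truth, the position-swap branch can no longer produce a winner label that contradicts the partial-win score.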
.opencode/skills/Utilities/Evals/Graders/Base.ts-94-122 (1)

94-122: ⚠️ Potential issue | 🟡 Minor

Missing error handling in runGraders.

If a grader throws an exception during grade(), the entire aggregation aborts. The remaining graders are never executed.

♻️ More robust implementation

```diff
   for (const grader of graders) {
-    const result = await grader.grade(context);
-    results.push(result);
+    try {
+      const result = await grader.grade(context);
+      results.push(result);
+    } catch (e) {
+      // Produce an error result instead of aborting
+      results.push({
+        grader_type: grader.type,
+        weight: grader.getWeight(),
+        score: 0,
+        passed: false,
+        duration_ms: 0,
+        reasoning: `Grader error: ${e}`,
+      });
+      if (grader.isRequired()) {
+        allRequiredPassed = false;
+      }
+    }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Graders/Base.ts around lines 94 - 122, The
runGraders function should catch exceptions from each grader.grade call so one
failing grader doesn't stop the loop; wrap the await grader.grade(context) in a
try/catch, and on error push a GraderResult that indicates failure (e.g., score
0, passed false, include an error message in result.details or similar) while
still using grader.getWeight() and grader.isRequired() for aggregation; ensure
totalWeight/weightedSum logic remains correct (use 0 score for errored graders)
and that allRequiredPassed is set false for required graders that errored so
final aggregate_score and passed reflect the failure.
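End to end, the error-isolating loop could look like the following hedged sketch. It assumes only a minimal `Grader` shape with the `grade`/`getWeight`/`isRequired` methods named in the diff; the aggregate return shape and context type are illustrative, not the file's actual API:

```typescript
// Sketch: a throwing grader counts as score 0 and, if required,
// fails the run — but the remaining graders still execute.
interface Grader {
  type: string;
  grade(context: unknown): Promise<{ score: number; passed: boolean }>;
  getWeight(): number;
  isRequired(): boolean;
}

async function runGraders(graders: Grader[], context: unknown) {
  let weightedSum = 0;
  let totalWeight = 0;
  let allRequiredPassed = true;
  for (const grader of graders) {
    let score = 0;
    let passed = false;
    try {
      const result = await grader.grade(context);
      score = result.score;
      passed = result.passed;
    } catch {
      // Errored grader: keep score 0, passed false, continue the loop.
    }
    weightedSum += score * grader.getWeight();
    totalWeight += grader.getWeight();
    if (grader.isRequired() && !passed) allRequiredPassed = false;
  }
  return {
    aggregate_score: totalWeight > 0 ? weightedSum / totalWeight : 0,
    passed: allRequiredPassed,
  };
}
```

A required grader that throws now drags `passed` to false while a passing optional grader still contributes its weight, so the weighted average stays well-defined.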
.opencode/skills/Utilities/Evals/Graders/ModelBased/NaturalLanguageAssert.ts-109-117 (1)

109-117: ⚠️ Potential issue | 🟡 Minor

Fallback parsing is unreliable.

The fallback logic only checks whether the text contains "assertion X" AND "true", regardless of context. A sentence like "assertion 1 is definitely not true" would incorrectly be interpreted as passed: true.

♻️ Suggested improvement

```diff
       // Try to find by content if numbered format didn't work
-      const containsTrue = text.toLowerCase().includes(`assertion ${i + 1}`) &&
-        text.toLowerCase().includes('true');
+      // Fallback: look for "assertion X: true" or similar patterns
+      const fallbackPattern = new RegExp(
+        `assertion\\s*${i + 1}[:\\s]*(is\\s+)?true`,
+        'i'
+      );
+      const containsTrue = fallbackPattern.test(text);

       return {
         assertion,
         passed: containsTrue,
-        explanation: 'Could not parse result',
+        explanation: containsTrue ? 'Parsed via fallback' : 'Could not parse result',
       };
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Graders/ModelBased/NaturalLanguageAssert.ts
around lines 109 - 117, The current fallback uses a simple containsTrue check
that looks for "assertion X" and "true" anywhere, causing false positives;
replace that with a targeted parsing using a regex or token scan that locates
the specific "assertion {i+1}" occurrence and then inspects the immediate clause
for a boolean value (e.g., "assertion 1: true", "assertion 1 is true", or
"assertion 1 — false") while explicitly rejecting matches where negation words
("not", "isn't", "never", "no", "false") appear in close proximity to "true";
update the logic around the containsTrue variable (and the returned passed
value) so it uses this clause-scoped result and still falls back to 'Could not
parse result' if no unambiguous boolean is found.
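A standalone comparison shows why the clause-scoped pattern matters. The `naive`/`scoped` helper names are illustrative; the regex itself is the one from the suggested diff:

```typescript
// Naive substring check vs. clause-scoped regex fallback.
const naive = (text: string, i: number): boolean =>
  text.toLowerCase().includes(`assertion ${i + 1}`) &&
  text.toLowerCase().includes('true');

const scoped = (text: string, i: number): boolean =>
  new RegExp(`assertion\\s*${i + 1}[:\\s]*(is\\s+)?true`, 'i').test(text);

const negated = 'assertion 1 is definitely not true';
console.log(naive(negated, 0));  // → true  (false positive)
console.log(scoped(negated, 0)); // → false (negated clause rejected)
console.log(scoped('Assertion 1: true', 0)); // → true
```

The scoped regex only inspects the words immediately after the assertion number, so "true" appearing further away in the sentence cannot leak in; genuinely free-form verdicts would still need the proximity-based negation check the prompt describes.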
.opencode/skills/Utilities/Cloudflare/SKILL.md-27-27 (1)

27-27: ⚠️ Potential issue | 🟡 Minor

Please clean up the markdownlint findings (MD040/MD058).

Several code fences have no language specifier (lines 27, 42, 84, 92, 100), and the table around line 72 should be surrounded by blank lines.

Also applies to: 42-42, 72-74, 84-84, 92-92, 100-100

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Cloudflare/SKILL.md at line 27, Update the
Markdown to satisfy markdownlint rules by adding a language identifier to each
code fence at the specified locations (the triple-backtick code fences at lines
27, 42, 84, 92, and 100) — choose the appropriate language tag for the snippet
(e.g., json, bash, js, md) so each fence becomes ```<lang> ... ```, and ensure
the table near line 72 is surrounded by a blank line above and below it (insert
one empty line before the table start and one after the table end) so the table
is properly separated from surrounding content.
.opencode/skills/Utilities/Cloudflare/SKILL.md-51-52 (1)

51-52: ⚠️ Potential issue | 🟡 Minor

The account ID variable name is inconsistent with Workflows/Create.md.

Line 51 uses CF_ACCOUNT_ID, while the Create workflow uses CLOUDFLARE_ACCOUNT_ID. Consistent naming reduces misconfiguration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Cloudflare/SKILL.md around lines 51 - 52, The ENV
name for the Cloudflare account is inconsistent: SKILL.md uses CF_ACCOUNT_ID
while Workflows/Create.md uses CLOUDFLARE_ACCOUNT_ID; update SKILL.md to use
CLOUDFLARE_ACCOUNT_ID (or vice‑versa across docs) so the environment variable
name is consistent, and ensure the referenced symbol CF_ACCOUNT_ID in the
"Account ID" line is replaced with CLOUDFLARE_ACCOUNT_ID and any related
examples or README references are aligned.
.opencode/skills/Utilities/Cloudflare/Workflows/Create.md-18-25 (1)

18-25: ⚠️ Potential issue | 🟡 Minor

Code fence without language specifier (MD040).

The block starting at line 18 should be tagged with a language (e.g. `text`) to keep markdownlint clean.

🔧 Suggestion
```diff
-```
+```text
 workers/mcp-server-name/
 ├── src/
 │   └── simple.js      # Main worker code
 ├── wrangler.toml      # Cloudflare config
 ├── package.json       # Dependencies
 └── README.md          # Documentation
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Cloudflare/Workflows/Create.md around lines 18 -
25, the fenced code block in Create.md (the directory tree starting with
"workers/mcp-server-name/") is missing a language tag, which triggers MD040;
update that block in .opencode/skills/Utilities/Cloudflare/Workflows/Create.md
to include a language identifier (e.g., use a "text" fence instead of a bare
one) by prepending the language token to the opening backticks of the sample
directory listing.
.opencode/skills/Utilities/Documents/SKILL.md-27-28 (1)

27-28: ⚠️ Potential issue | 🟡 Minor

Fenced code blocks without a language specifier (MD040).

Please give the blocks a language (e.g. `text`) so markdownlint stays stable.

✅ Example fix
```diff
-```
+```text
 Running the **WorkflowName** workflow in the **Documents** skill to ACTION...
```
</details>


Also applies to: 204-205, 213-214, 222-223

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Documents/SKILL.md around lines 27 - 28, Several
fenced code blocks in SKILL.md (e.g., the block containing "Running the
WorkflowName workflow in the Documents skill to ACTION...") are missing
a language tag; add a language (for example text) after the opening backticks
for those blocks (also update the other affected blocks referenced at lines
204-205, 213-214, 222-223) so Markdown lint rule MD040 is satisfied and the
document renders consistently.


</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Cloudflare/Workflows/Troubleshoot.md-40-42 (1)</summary><blockquote>

`40-42`: _⚠️ Potential issue_ | _🟡 Minor_

**Wording in the workflow is unprofessional and should be neutralized.**

Please keep the wording factual so the workflow remains suitable for team use.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Cloudflare/Workflows/Troubleshoot.md around lines
40 - 42, the heading and the instruction "NEVER USE FUCKING CURL, ONLY
BROWSER-AUTOMATION (or Chrome MCP fallback)" are unprofessional; replace them
with factual wording, e.g. a neutral heading and a clear instruction such as
"Prefer browser automation (Chrome MCP as fallback)". Keep the substantive
reference to "Chrome MCP tools" and the instruction to use "browser
navigate/screenshot/act", remove the all-caps formatting and profanity, and
phrase the lines factually and appropriately for team use.


</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Cloudflare/Workflows/Troubleshoot.md-499-532 (1)</summary><blockquote>

`499-532`: _⚠️ Potential issue_ | _🟡 Minor_

**Example output code block without a language specifier (MD040).**

Please set `text` on the fence.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Cloudflare/Workflows/Troubleshoot.md around lines
499 - 532, the markdown code fence for the example output block (the block
beginning with "🔍 Cloudflare Deployment Troubleshooter") has no language
specifier; change the opening fence to a "text" fence so it sets the language
and satisfies rule MD040, and leave the troubleshooter block content itself
unchanged.


</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Docx/Scripts/document.py-60-68 (1)</summary><blockquote>

`60-68`: _⚠️ Potential issue_ | _🟡 Minor_

**The default author is still `Claude` instead of OpenCode-neutral.**

For the intended Claude→OpenCode cleanup, the defaults should be standardized to neutral or configurable values.



Also applies to: 620-621

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 60 - 68,
Change the hard-coded defaults for author and initials to neutral/configurable
values: replace the constructor signature default author "Claude" and initials
"C" in the __init__ method (def __init__(self, xml_path, rsid: str, author: str
= "Claude", initials: str = "C")) with a neutral default such as "OpenCode" and
"OC" or make them configurable via a module-level constant/ENV var; also update
the other occurrence in the file where the same defaults are set (the similar
initializer near the bottom referenced in the review) so both places use the new
neutral/configurable defaults and ensure any docstring/comments reflect the
change.
```

</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Docx/SKILL.md-229-246 (1)</summary><blockquote>

`229-246`: _⚠️ Potential issue_ | _🟡 Minor_

**Example code blocks without a language specifier (MD040).**

Please add `text` as the fence language.


<details>
<summary>✅ Example fix</summary>

```diff
-```
+```text
 User: "Create a Word doc with my consulting proposal"
 → Reads docx-js.md for syntax
 ...
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/SKILL.md around lines 229 - 246, the MD040
warning indicates the example code fences lack a language; update each
triple-backtick fence used for the examples (the three blocks that start with
User: "Create a Word doc...", User: "Add tracked changes...", and User: "What's
in this Word doc?") to include the language specifier "text" (i.e., change the
bare opening fence to a "text" fence) so the fenced blocks declare text as
their language; ensure you only modify the fence markers and not the inner
content.


</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Aphorisms/SKILL.md-14-33 (1)</summary><blockquote>

`14-33`: _⚠️ Potential issue_ | _🟡 Minor_

**The PAI v3.0 SKILL format is incomplete (OPTIONAL section missing).**

The file has a clear MANDATORY block but no explicit OPTIONAL block. Please add one so the skill template stays consistent.
As per coding guidelines, "SKILL.md follows PAI v3.0 format (USE WHEN triggers, MANDATORY/OPTIONAL sections)".

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Aphorisms/SKILL.md around lines 14 - 33, The
SKILL.md currently contains a MANDATORY voice-notification block but is missing
the required OPTIONAL section per PAI-v3.0-SKILL-Format; add an explicit
OPTIONAL block below the MANDATORY block that provides fallback or additional
steps (e.g., alternative notification methods, examples of non-blocking
text-only notifications, and any post-invocation cleanup), label it "##
OPTIONAL" and include the example curl/text output formats consistent with the
existing MANDATORY content so the document follows the MANDATORY/OPTIONAL
structure expected by the skill template.
```

</details>

</blockquote></details>
<details>
<summary>.opencode/skills/Utilities/Aphorisms/Workflows/SearchAphorisms.md-39-76 (1)</summary><blockquote>

`39-76`: _⚠️ Potential issue_ | _🟡 Minor_

**Multiple code blocks without a language tag (MD040).**

Please tag the fences with a language (e.g. `text`, `markdown`, `bash`), otherwise the markdownlint error persists.



Also applies to: 118-119, 135-136, 151-152, 164-165, 181-182, 204-205, 227-228, 353-354, 364-365, 375-376, 386-387, 401-402, 418-419, 434-435, 492-493, 508-509, 529-530

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Aphorisms/Workflows/SearchAphorisms.md around
lines 39 - 76, the markdown contains multiple fenced code blocks without a
language tag (e.g., the blocks starting with lines like User: "Search aphorisms
about resilience", User: "Search for quotes with 'curiosity'", and other User:
examples); add an appropriate language identifier (for example text or
markdown) immediately after each opening triple-backtick so the fences declare
a language and satisfy MD040, eliminating the markdownlint errors across all
similar blocks referenced in the file.


</details>

</blockquote></details>

</blockquote></details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `1984514d-42e2-4019-8082-43b3e62f7f5a`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between d9e691ba460679f729902e4d0912cfb50c8aa58c and 8a4140d450cd4f8604d38209997dae7b4072e0c4.

</details>

<details>
<summary>⛔ Files ignored due to path filters (1)</summary>

* `.opencode/skills/Utilities/Browser/bun.lock` is excluded by `!**/*.lock`

</details>

<details>
<summary>📒 Files selected for processing (129)</summary>

* `.opencode/skills/Utilities/Aphorisms/Database/aphorisms.md`
* `.opencode/skills/Utilities/Aphorisms/SKILL.md`
* `.opencode/skills/Utilities/Aphorisms/Workflows/AddAphorism.md`
* `.opencode/skills/Utilities/Aphorisms/Workflows/FindAphorism.md`
* `.opencode/skills/Utilities/Aphorisms/Workflows/ResearchThinker.md`
* `.opencode/skills/Utilities/Aphorisms/Workflows/SearchAphorisms.md`
* `.opencode/skills/Utilities/Browser/README.md`
* `.opencode/skills/Utilities/Browser/SKILL.md`
* `.opencode/skills/Utilities/Browser/Tools/Browse.ts`
* `.opencode/skills/Utilities/Browser/Tools/BrowserSession.ts`
* `.opencode/skills/Utilities/Browser/Workflows/Extract.md`
* `.opencode/skills/Utilities/Browser/Workflows/Interact.md`
* `.opencode/skills/Utilities/Browser/Workflows/Screenshot.md`
* `.opencode/skills/Utilities/Browser/Workflows/Update.md`
* `.opencode/skills/Utilities/Browser/Workflows/VerifyPage.md`
* `.opencode/skills/Utilities/Browser/examples/comprehensive-test.ts`
* `.opencode/skills/Utilities/Browser/examples/screenshot.ts`
* `.opencode/skills/Utilities/Browser/examples/verify-page.ts`
* `.opencode/skills/Utilities/Browser/index.ts`
* `.opencode/skills/Utilities/Browser/package.json`
* `.opencode/skills/Utilities/Browser/tsconfig.json`
* `.opencode/skills/Utilities/Cloudflare/SKILL.md`
* `.opencode/skills/Utilities/Cloudflare/Workflows/Create.md`
* `.opencode/skills/Utilities/Cloudflare/Workflows/Troubleshoot.md`
* `.opencode/skills/Utilities/CreateCLI/FrameworkComparison.md`
* `.opencode/skills/Utilities/CreateCLI/Patterns.md`
* `.opencode/skills/Utilities/CreateCLI/SKILL.md`
* `.opencode/skills/Utilities/CreateCLI/TypescriptPatterns.md`
* `.opencode/skills/Utilities/CreateCLI/Workflows/AddCommand.md`
* `.opencode/skills/Utilities/CreateCLI/Workflows/CreateCli.md`
* `.opencode/skills/Utilities/CreateCLI/Workflows/UpgradeTier.md`
* `.opencode/skills/Utilities/CreateSkill/SKILL.md`
* `.opencode/skills/Utilities/CreateSkill/Workflows/CanonicalizeSkill.md`
* `.opencode/skills/Utilities/CreateSkill/Workflows/CreateSkill.md`
* `.opencode/skills/Utilities/CreateSkill/Workflows/UpdateSkill.md`
* `.opencode/skills/Utilities/CreateSkill/Workflows/ValidateSkill.md`
* `.opencode/skills/Utilities/Delegation/SKILL.md`
* `.opencode/skills/Utilities/Documents/SKILL.md`
* `.opencode/skills/Utilities/Documents/Workflows/ProcessLargePdfGemini3.md`
* `.opencode/skills/Utilities/Docx/LICENSE.txt`
* `.opencode/skills/Utilities/Docx/Ooxml/Scripts/pack.py`
* `.opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py`
* `.opencode/skills/Utilities/Docx/Ooxml/Scripts/validate.py`
* `.opencode/skills/Utilities/Docx/SKILL.md`
* `.opencode/skills/Utilities/Docx/Scripts/__init__.py`
* `.opencode/skills/Utilities/Docx/Scripts/document.py`
* `.opencode/skills/Utilities/Docx/Scripts/utilities.py`
* `.opencode/skills/Utilities/Docx/docx-js.md`
* `.opencode/skills/Utilities/Docx/ooxml.md`
* `.opencode/skills/Utilities/Evals/BestPractices.md`
* `.opencode/skills/Utilities/Evals/CLIReference.md`
* `.opencode/skills/Utilities/Evals/Data/DomainPatterns.yaml`
* `.opencode/skills/Utilities/Evals/Graders/Base.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/RegexMatch.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/StateCheck.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/StaticAnalysis.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/StringMatch.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/ToolCallVerification.ts`
* `.opencode/skills/Utilities/Evals/Graders/CodeBased/index.ts`
* `.opencode/skills/Utilities/Evals/Graders/ModelBased/LLMRubric.ts`
* `.opencode/skills/Utilities/Evals/Graders/ModelBased/NaturalLanguageAssert.ts`
* `.opencode/skills/Utilities/Evals/Graders/ModelBased/PairwiseComparison.ts`
* `.opencode/skills/Utilities/Evals/Graders/ModelBased/index.ts`
* `.opencode/skills/Utilities/Evals/Graders/index.ts`
* `.opencode/skills/Utilities/Evals/PROJECT.md`
* `.opencode/skills/Utilities/Evals/SKILL.md`
* `.opencode/skills/Utilities/Evals/ScienceMapping.md`
* `.opencode/skills/Utilities/Evals/ScorerTypes.md`
* `.opencode/skills/Utilities/Evals/Suites/Regression/core-behaviors.yaml`
* `.opencode/skills/Utilities/Evals/TemplateIntegration.md`
* `.opencode/skills/Utilities/Evals/Tools/AlgorithmBridge.ts`
* `.opencode/skills/Utilities/Evals/Tools/FailureToTask.ts`
* `.opencode/skills/Utilities/Evals/Tools/SuiteManager.ts`
* `.opencode/skills/Utilities/Evals/Tools/TranscriptCapture.ts`
* `.opencode/skills/Utilities/Evals/Tools/TrialRunner.ts`
* `.opencode/skills/Utilities/Evals/Types/index.ts`
* `.opencode/skills/Utilities/Evals/UseCases/Regression/task_file_targeting_basic.yaml`
* `.opencode/skills/Utilities/Evals/UseCases/Regression/task_no_hallucinated_paths.yaml`
* `.opencode/skills/Utilities/Evals/UseCases/Regression/task_tool_sequence_read_before_edit.yaml`
* `.opencode/skills/Utilities/Evals/UseCases/Regression/task_verification_before_done.yaml`
* `.opencode/skills/Utilities/Evals/Workflows/CompareModels.md`
* `.opencode/skills/Utilities/Evals/Workflows/ComparePrompts.md`
* `.opencode/skills/Utilities/Evals/Workflows/CreateJudge.md`
* `.opencode/skills/Utilities/Evals/Workflows/CreateUseCase.md`
* `.opencode/skills/Utilities/Evals/Workflows/RunEval.md`
* `.opencode/skills/Utilities/Evals/Workflows/ViewResults.md`
* `.opencode/skills/Utilities/PAIUpgrade/SKILL.md`
* `.opencode/skills/Utilities/PAIUpgrade/Tools/Anthropic.ts`
* `.opencode/skills/Utilities/PAIUpgrade/Workflows/CheckForUpgrades.md`
* `.opencode/skills/Utilities/PAIUpgrade/Workflows/FindSources.md`
* `.opencode/skills/Utilities/PAIUpgrade/Workflows/ReleaseNotesDeepDive.md`
* `.opencode/skills/Utilities/PAIUpgrade/Workflows/ResearchUpgrade.md`
* `.opencode/skills/Utilities/PAIUpgrade/sources.json`
* `.opencode/skills/Utilities/PAIUpgrade/youtube-channels.json`
* `.opencode/skills/Utilities/Parser/EntitySystem.md`
* `.opencode/skills/Utilities/Parser/Lib/parser.ts`
* `.opencode/skills/Utilities/Parser/Lib/validators.ts`
* `.opencode/skills/Utilities/Parser/Prompts/entity-extraction.md`
* `.opencode/skills/Utilities/Parser/Prompts/link-analysis.md`
* `.opencode/skills/Utilities/Parser/Prompts/summarization.md`
* `.opencode/skills/Utilities/Parser/Prompts/topic-classification.md`
* `.opencode/skills/Utilities/Parser/README.md`
* `.opencode/skills/Utilities/Parser/SKILL.md`
* `.opencode/skills/Utilities/Parser/Schema/content-schema.json`
* `.opencode/skills/Utilities/Parser/Schema/schema.ts`
* `.opencode/skills/Utilities/Parser/Tests/fixtures/example-output.json`
* `.opencode/skills/Utilities/Parser/Utils/collision-detection.ts`
* `.opencode/skills/Utilities/Parser/Web/README.md`
* `.opencode/skills/Utilities/Parser/Web/debug.html`
* `.opencode/skills/Utilities/Parser/Web/index.html`
* `.opencode/skills/Utilities/Parser/Web/parser.js`
* `.opencode/skills/Utilities/Parser/Web/simple-test.html`
* `.opencode/skills/Utilities/Parser/Web/styles.css`
* `.opencode/skills/Utilities/Parser/Workflows/BatchEntityExtractionGemini3.md`
* `.opencode/skills/Utilities/Parser/Workflows/CollisionDetection.md`
* `.opencode/skills/Utilities/Parser/Workflows/DetectContentType.md`
* `.opencode/skills/Utilities/Parser/Workflows/ExtractArticle.md`
* `.opencode/skills/Utilities/Parser/Workflows/ExtractBrowserExtension.md`
* `.opencode/skills/Utilities/Parser/Workflows/ExtractNewsletter.md`
* `.opencode/skills/Utilities/Parser/Workflows/ExtractPdf.md`
* `.opencode/skills/Utilities/Parser/Workflows/ExtractTwitter.md`
* `.opencode/skills/Utilities/Parser/Workflows/ExtractYoutube.md`
* `.opencode/skills/Utilities/Parser/Workflows/ParseContent.md`
* `.opencode/skills/Utilities/Parser/entity-index.json`
* `.opencode/skills/Utilities/Pdf/LICENSE.txt`
* `.opencode/skills/Utilities/Pdf/SKILL.md`
* `.opencode/skills/Utilities/Pdf/Scripts/check_bounding_boxes.py`
* `.opencode/skills/Utilities/Pdf/Scripts/check_bounding_boxes_test.py`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines +89 to +90
private pendingDialog: DialogInfo | null = null
private autoHandleDialogs: boolean = false

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Inspect the file .opencode/skills/Utilities/Browser/index.ts
wc -l .opencode/skills/Utilities/Browser/index.ts

Repository: Steffen025/pai-opencode

Length of output: 114


🏁 Script executed:

# Inspect lines 1-100 (imports and class definition)
head -100 .opencode/skills/Utilities/Browser/index.ts | cat -n

Repository: Steffen025/pai-opencode

Length of output: 3340


🏁 Script executed:

# Inspect lines 80-120 (dialog properties and attachPageListeners)
sed -n '80,120p' .opencode/skills/Utilities/Browser/index.ts | cat -n

Repository: Steffen025/pai-opencode

Length of output: 1673


🏁 Script executed:

# Inspect lines 180-210 (dialog event listener)
sed -n '180,210p' .opencode/skills/Utilities/Browser/index.ts | cat -n

Repository: Steffen025/pai-opencode

Length of output: 1104


🏁 Script executed:

# Inspect lines 686-720 (handleDialog method)
sed -n '686,720p' .opencode/skills/Utilities/Browser/index.ts | cat -n

Repository: Steffen025/pai-opencode

Length of output: 1327


Dialog handle is not stored – handleDialog() cannot process the active dialog.

Currently attachPageListeners() (lines 182–199) stores only the DialogInfo, not the dialog handle itself. handleDialog() (lines 686–709) then tries to handle the next dialog via page.once('dialog', ...) instead of processing the already buffered one. As a result, the active dialog can never be closed.

The import (line 15) also does not include the Dialog type.

🔧 Suggested fix
-import { chromium, firefox, webkit, type Browser, type Page, type BrowserContext } from 'playwright'
+import { chromium, firefox, webkit, type Browser, type Page, type BrowserContext, type Dialog } from 'playwright'

 export class PlaywrightBrowser {
   private browser: Browser | null = null
   private context: BrowserContext | null = null
   private page: Page | null = null
   private consoleLogs: ConsoleLogEntry[] = []
   private networkLogs: NetworkLogEntry[] = []
   private requestTimings: Map<string, number> = new Map()
   private pendingDialog: DialogInfo | null = null
+  private pendingDialogHandle: Dialog | null = null
   private autoHandleDialogs: boolean = false
   private dialogResponse: string | boolean = true

   // In attachPageListeners():
   page.on('dialog', async dialog => {
+    this.pendingDialogHandle = dialog
     this.pendingDialog = {
       type: dialog.type() as DialogInfo['type'],
       message: dialog.message(),
       defaultValue: dialog.defaultValue()
     }

     if (this.autoHandleDialogs) {
       if (typeof this.dialogResponse === 'string') {
         await dialog.accept(this.dialogResponse)
       } else if (this.dialogResponse) {
         await dialog.accept()
       } else {
         await dialog.dismiss()
       }
       this.pendingDialog = null
+      this.pendingDialogHandle = null
     }
   })

   // In handleDialog():
   async handleDialog(action: 'accept' | 'dismiss', promptText?: string): Promise<void> {
-    const page = this.ensurePage()
+    this.ensurePage()
 
     if (!this.pendingDialog) {
       throw new Error('No pending dialog to handle')
     }
 
-    page.once('dialog', async dialog => {
-      if (action === 'accept') {
-        await dialog.accept(promptText)
-      } else {
-        await dialog.dismiss()
-      }
-    })
+    const dialog = this.pendingDialogHandle
+    if (!dialog) {
+      throw new Error('No pending dialog handle to manage')
+    }
+    if (action === 'accept') {
+      await dialog.accept(promptText)
+    } else {
+      await dialog.dismiss()
+    }
 
     this.pendingDialog = null
+    this.pendingDialogHandle = null
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Browser/index.ts around lines 89 - 90, the
implementation currently buffers only a DialogInfo object instead of the real
dialog handle, so handleDialog() cannot close the active dialog; adjust
attachPageListeners() so that it additionally stores the real dialog object
(type Dialog) in a new or existing property (e.g. pendingDialog: Dialog | null
instead of DialogInfo), update the class property definition (remove/replace
private pendingDialog: DialogInfo | null with private pendingDialog: Dialog |
null) and add the Dialog type to the import block; change handleDialog() so
that it first checks whether pendingDialog exists and uses that direct handle
(instead of page.once('dialog', ...)), falling back to page.once('dialog', ...)
only when no buffered dialog exists; make sure the autoHandleDialogs logic
still works with the new pendingDialog type.

Comment on lines +103 to +111
function json(data: any, status = 200): Response {
return new Response(JSON.stringify(data), {
status,
headers: {
'Content-Type': 'application/json',
'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Methods': 'GET, POST, DELETE, OPTIONS',
'Access-Control-Allow-Headers': 'Content-Type'
}

⚠️ Potential issue | 🔴 Critical

Unauthenticated session API with Access-Control-Allow-Origin: * is critically exploitable

Currently any foreign web page can drive the local browser server directly (including /evaluate, /navigate, /stop) and read the responses. At minimum, enforce host binding to localhost plus token-based auth, and restrict CORS.

🔒 Hardening proposal
 const CONFIG = {
   port: parseInt(process.env.BROWSER_PORT || '9222'),
+  host: '127.0.0.1',
+  apiToken: process.env.BROWSER_SESSION_TOKEN || '',
   headless: process.env.BROWSER_HEADLESS === 'true',
@@
-function json(data: any, status = 200): Response {
+function json(data: any, status = 200, origin = 'http://localhost'): Response {
   return new Response(JSON.stringify(data), {
@@
-      'Access-Control-Allow-Origin': '*',
+      'Access-Control-Allow-Origin': origin,
@@
 }
+
+function requireAuth(req: Request): Response | null {
+  if (!CONFIG.apiToken) return null
+  const token = req.headers.get('x-browser-session-token')
+  if (token !== CONFIG.apiToken) return error('Unauthorized', 401)
+  return null
+}
@@
 const server = Bun.serve({
+  hostname: CONFIG.host,
   port: CONFIG.port,
@@
   async fetch(req) {
+    const authError = requireAuth(req)
+    if (authError) return authError

Also applies to: 153-160, 169-420

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Browser/Tools/BrowserSession.ts around lines 103
- 111, The json helper currently emits Access-Control-Allow-Origin: * exposing
session endpoints; change it to use a configured allowedOrigin (default to
http://localhost or derived from a BIND_HOST/BIND_ORIGIN env var) and set
Access-Control-Allow-Credentials appropriately, and update CORS response headers
for OPTIONS. Additionally, enforce token-based auth by validating an
Authorization header (e.g., a shared token/env variable) in all session route
handlers such as evaluate, navigate, and stop (or a central middleware used by
those handlers) and reject requests without the valid token or from non-local
hosts; ensure the same allowedOrigin value and auth-check are applied for the
endpoints referenced (json, evaluate, navigate, stop).

Comment on lines +177 to +200
const result: TestResult = {
passed: true,
issues: [],
consoleErrors: [],
networkErrors: [],
visualIssues: [],
};

try {
// Note: These Chrome MCP commands should be executed by the OpenCode agent
// This TypeScript code indicates what should be done
console.log(" → Navigate to site");
console.log(" → Take screenshot");
console.log(" → Check console errors");
console.log(" → Check network requests");

// The agent should use:
// mcp__Chrome__navigate_page({ url })
// mcp__Chrome__take_screenshot()
// mcp__Chrome__list_console_messages()
// mcp__Chrome__list_network_requests()

result.issues.push("AGENT_ACTION_REQUIRED: Use Chrome MCP to test deployment");


⚠️ Potential issue | 🔴 Critical

Browser test can currently pass as successful without having run.

testDeploymentWithChrome() leaves passed at true even though only AGENT_ACTION_REQUIRED is recorded; an untested deployment is therefore reported as a success.

✅ Suggested fix
-    result.issues.push("AGENT_ACTION_REQUIRED: Use Chrome MCP to test deployment");
+    result.passed = false;
+    result.issues.push("AGENT_ACTION_REQUIRED: Use Chrome MCP to test deployment");

Also applies to: 306-309

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Cloudflare/Workflows/Troubleshoot.md around lines
177 - 200, testDeploymentWithChrome() sets result.passed = true at
initialization then pushes "AGENT_ACTION_REQUIRED" but never clears passed to
false, so untested deployments are reported as successful; update the logic in
testDeploymentWithChrome()/where TestResult result is built (result) to set
result.passed = false whenever you add an agent-action issue like
"AGENT_ACTION_REQUIRED" (or more generally when any non-empty issues array is
present), or invert the initialization so passed is computed after checks (e.g.,
passed = issues.length === 0 && consoleErrors.length === 0 &&
networkErrors.length === 0 && visualIssues.length === 0) to ensure the presence
of AGENT_ACTION_REQUIRED marks the test as failed.
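The invariant the fix establishes — passed derived from the issue lists instead of initialized to true — can be sketched in Python; the parameter names mirror the TestResult fields, and the helper itself is illustrative:

```python
def compute_passed(issues, console_errors, network_errors, visual_issues) -> bool:
    """A deployment test passes only when every issue list is empty."""
    return not (issues or console_errors or network_errors or visual_issues)
```

Computing passed at the end, rather than initializing it optimistically, makes it impossible for a pending AGENT_ACTION_REQUIRED marker to be reported as success.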


for (const command of params.commands) {
try {
const result = await $`cd ${workingDir} && ${command}`.quiet().nothrow();

⚠️ Potential issue | 🔴 Critical

Critical command-injection vulnerability.

The command string from params.commands is interpolated directly into the shell. A malicious command string such as ruff; rm -rf / would execute both commands.

🔒 Recommended mitigation
+// Whitelist of allowed commands
+const ALLOWED_COMMANDS = ['ruff', 'mypy', 'bandit', 'eslint', 'tsc', 'biome'];
+
 for (const command of params.commands) {
+  const baseCommand = command.split(/\s+/)[0];
+  if (!ALLOWED_COMMANDS.includes(baseCommand)) {
+    results.push({
+      command,
+      passed: false,
+      output: `Command not allowed: ${baseCommand}`,
+      warnings: 0,
+      errors: 1,
+    });
+    continue;
+  }

Alternatively: use Bun.spawn with array arguments instead of a shell string.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Graders/CodeBased/StaticAnalysis.ts at line
29, the code performs shell interpolation of `command` (from `params.commands`)
in `const result = await $`cd ${workingDir} && ${command}`.quiet().nothrow();`,
which enables command injection; change the execution so that no shell-joined
string interpolation is used: call Bun.spawn (or a similar API) with an array of
arguments (e.g. ["ruff", "<args>"]) and pass `workingDir` as the cwd option;
additionally validate/whitelist the elements of `params.commands` before
spawning and strip/forbid shell metacharacters so that `command` is never
interpolated directly into a shell line.
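The no-shell pattern asked for above can be shown in Python as well: split the command into an argv list, reject tools outside a whitelist, and run without a shell so metacharacters like `;` are never interpreted. The allowed-tool set is taken from the suggestion; run_checked is an illustrative name:

```python
import shlex
import subprocess

# Whitelist from the review suggestion; a real project would load this from config
ALLOWED_COMMANDS = {"ruff", "mypy", "bandit", "eslint", "tsc", "biome"}

def run_checked(command: str, working_dir: str):
    """Split command into argv, reject non-whitelisted tools, and run
    without shell=True so the string is never joined back into a shell line."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        return None  # rejected: empty command or tool not on the whitelist
    return subprocess.run(argv, cwd=working_dir, capture_output=True, text=True)
```

Note that an injection attempt like `ruff; rm -rf /` fails the whitelist check, because shlex.split yields `ruff;` as the first token, which is not an allowed tool.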

Comment on lines +67 to +81
executor: async (t, trialNum) => {
// For ALGORITHM integration, we use a simplified executor
// that captures the current agent's work
const transcript = createTranscript(t.id, `trial_${trialNum}`, {
turns: [
{ role: 'system', content: t.description },
{ role: 'assistant', content: 'Task executed via ALGORITHM' },
],
toolCalls: [],
});

return {
output: 'Executed via ALGORITHM bridge',
transcript,
};

⚠️ Potential issue | 🔴 Critical

The bridge executor does not evaluate a real algorithm run.

Neither is THE ALGORITHM invoked here nor is an existing output/transcript carried over; every trial receives the same synthetic placeholder. score and passed are therefore coupled not to the real implementation but only to how the graders react to this stub. Please feed the real executor, or the real transcript, into TrialRunner.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Tools/AlgorithmBridge.ts around lines 67 -
81, The executor in AlgorithmBridge.ts currently returns a synthetic transcript
created via createTranscript and a placeholder output, which prevents real
algorithm runs from affecting score/passed; change executor (the async function
defined with parameters (t, trialNum)) to invoke the real algorithm runner
and/or accept the actual transcript/output produced by TrialRunner (or whatever
real runner method is used), then populate the returned object with that real
output and transcript instead of the hardcoded placeholder; ensure the
implementation forwards errors and preserves toolCalls from the real run so
downstream grading uses the true execution results.

Security fixes applied:
- unpack.py: prevent zip-slip via safe extractall with path validation
- document.py: fix import casing (ooxml.scripts -> Ooxml.Scripts)
- BinaryTests.ts: prevent command injection by splitting command string to array
- StateCheck.ts: prevent path traversal via resolve()+prefix check; mask env secrets in results
- StaticAnalysis.ts: prevent command injection by splitting command string to array
- SuiteManager.ts: validate suite name (alphanumeric/hyphens/underscores only); prevent silent overwrite of existing suites

Config fix:
- .coderabbit.yaml: change language from 'de' to 'en' for English reviews

Intentionally skipped:
- Browser/index.ts Dialog handle: architectural refactor, out of scope for this PR
- BrowserSession.ts CORS: intentional local-dev-only setting
- Cloudflare/Troubleshoot.md TypeScript in markdown: markdown doc, not runnable
- Evals/AlgorithmBridge.ts synthetic executor: intentional design placeholder

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (2)
.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts (1)

24-36: ⚠️ Potential issue | 🔴 Critical

Command injection vulnerability partially addressed but still present.

The code now splits command into commandParts (line 33), which helps with that variable, but workingDir and testFile are still interpolated directly into the shell template. The cd ${workingDir} && pattern uses shell chaining, allowing injection via crafted paths (e.g., workingDir = "; rm -rf /; cd ").

Use Bun.spawn with the cwd option to eliminate shell interpolation entirely:


🔒 Proposed fix using Bun.spawn
+import { resolve, normalize } from 'path';
+
+// Add path validation helper
+function validatePath(basePath: string, targetPath: string): string {
+  const resolvedBase = resolve(basePath);
+  const resolvedTarget = resolve(basePath, targetPath);
+  if (!resolvedTarget.startsWith(resolvedBase)) {
+    throw new Error(`Path traversal detected: ${targetPath}`);
+  }
+  return resolvedTarget;
+}

 // In grade() method, replace the shell execution:
-const commandParts = command.split(/\s+/);
-const result = await $`cd ${workingDir} && timeout ${Math.ceil(timeout/1000)} ${commandParts} ${testFile}`
-  .quiet()
-  .nothrow();
+const safeWorkingDir = resolve(workingDir);
+const safeTestFile = validatePath(safeWorkingDir, testFile);
+const commandParts = command.split(/\s+/);
+const timeoutSec = Math.ceil(timeout / 1000);
+
+const proc = Bun.spawn(['timeout', String(timeoutSec), ...commandParts, safeTestFile], {
+  cwd: safeWorkingDir,
+  stdout: 'pipe',
+  stderr: 'pipe',
+});
+
+const [stdout, stderr] = await Promise.all([
+  new Response(proc.stdout).text(),
+  new Response(proc.stderr).text(),
+]);
+const exitCode = await proc.exited;
+
+const passed = exitCode === 0;
+results.push({
+  file: testFile,
+  passed,
+  output: stdout.slice(-500),
+  error: passed ? undefined : stderr.slice(-500),
+});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts around
lines 24 - 36, Replace the shell interpolation that builds a command string with
Bun.spawn to avoid any shell injection: stop using the template that
interpolates workingDir and testFile in the `$` call in BinaryTests.ts and
instead call Bun.spawn with cwd set to workingDir and args set to an array built
from commandParts plus testFile (where commandParts comes from detectTestCommand
or params.test_command); implement timeout handling via an AbortController (or
Bun.spawn's timeout mechanism) so the child is killed after params.timeout_ms,
capture stdout/stderr and exit code to populate the existing results entry (the
same shape used for result/passed/output/error), and ensure no user-controlled
values are ever concatenated into a shell string.
.opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py (1)

21-24: ⚠️ Potential issue | 🔴 Critical

Path containment check is still bypassable with startswith.

Lines 22-23 compare string prefixes; paths like /tmp/out_evil/... still satisfy startswith("/tmp/out"). This weakens the zip-slip mitigation.

Suggested fix
 with zipfile.ZipFile(input_file) as zf:
-    for member in zf.namelist():
-        member_path = (output_path / member).resolve()
-        if not str(member_path).startswith(str(output_path.resolve())):
-            raise ValueError(f"Unsafe zip entry rejected (zip-slip): {member}")
-    zf.extractall(output_path)
+    root = output_path.resolve()
+    for info in zf.infolist():
+        target = (root / info.filename).resolve()
+        try:
+            target.relative_to(root)
+        except ValueError:
+            raise ValueError(f"Unsafe zip entry rejected (zip-slip): {info.filename}") from None
+    zf.extractall(root)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py around lines 21 -
24, The current zip-slip check uses string startswith and can be bypassed;
replace that logic by resolving both paths and using Path.relative_to (or
catching ValueError from relative_to) to ensure member_path is actually inside
output_path: compute member_path = (output_path / member).resolve(), compute
out_resolved = output_path.resolve(), then try
member_path.relative_to(out_resolved) and raise the ValueError if it raises; do
this for each member before calling zf.extractall so the check cannot be
bypassed by path string prefixes.
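The Path.relative_to containment check from the suggested fix can be exercised as a standalone Python sketch; safe_extract is an illustrative wrapper name around the same logic:

```python
import zipfile
from pathlib import Path

def safe_extract(zip_path: str, output_dir: str) -> None:
    """Reject zip-slip entries via Path.relative_to, which (unlike
    str.startswith) cannot be fooled by sibling dirs sharing a prefix."""
    root = Path(output_dir).resolve()
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            target = (root / info.filename).resolve()
            try:
                target.relative_to(root)
            except ValueError:
                raise ValueError(
                    f"Unsafe zip entry rejected (zip-slip): {info.filename}"
                ) from None
        zf.extractall(root)
```

An archive entry like `../evil.txt` resolves to a path outside root, so relative_to raises ValueError and extraction is refused before any file is written.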
🧹 Nitpick comments (4)
.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts (1)

42-43: Output truncation discards context; consider keeping head and tail.

Keeping only the last 500 characters may lose important information from the beginning of test output (e.g., which test failed first). Consider keeping both the first and last portions:

♻️ Optional: preserve head and tail of output
-output: result.stdout.toString().slice(-500),  // Last 500 chars
-error: passed ? undefined : result.stderr.toString().slice(-500),
+output: truncateMiddle(result.stdout.toString(), 1000),
+error: passed ? undefined : truncateMiddle(result.stderr.toString(), 1000),

Where truncateMiddle keeps first 400 + ...truncated... + last 400 chars if over limit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts around
lines 42 - 43, The current assignment for output and error uses only the last
500 chars (result.stdout.toString().slice(-500) /
result.stderr.toString().slice(-500)), which can drop important head context;
replace this with a truncation helper (e.g., truncateMiddle) that returns the
full string if under limit or first N + "…truncated…" + last N when over limit,
and call it for both output and error (use result.stdout.toString() and
result.stderr.toString()) so test logs preserve both head and tail; ensure the
helper is used consistently and preserves existing behavior when the string
length is below the limit.
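A minimal Python version of the proposed truncateMiddle helper (the name and 400-character windows are taken from the suggestion above) illustrates the head-plus-tail behavior:

```python
def truncate_middle(text: str, limit: int = 1000, keep: int = 400) -> str:
    """Keep the head and tail of long output so both the first failure
    and the final summary survive truncation."""
    if len(text) <= limit:
        return text
    return text[:keep] + "...truncated..." + text[-keep:]
```

Strings at or under the limit pass through unchanged, which preserves the existing behavior for short test logs.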
.opencode/skills/Utilities/Docx/Scripts/document.py (3)

130-132: Remove redundant import.

The datetime and timezone imports are already present at module level (line 33). This local import is unnecessary.

♻️ Proposed fix
     def _inject_attributes_to_nodes(self, nodes):
         """Inject RSID, author, and date attributes into DOM nodes where applicable.
         ...
         """
-        from datetime import datetime, timezone
-
         timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 130 - 132,
Remove the redundant local import of datetime and timezone inside the block that
sets timestamp; the module already imports these at top-level, so delete the
local "from datetime import datetime, timezone" and keep the timestamp
assignment using datetime.now(timezone.utc). This change touches the timestamp
variable assignment (timestamp = datetime.now(timezone.utc).strftime(...)) so
ensure that code uses the top-level datetime and timezone imports.

650-653: Replace print statement with logging.

Library code should avoid unconditional print() statements as they can interfere with consuming applications. Use the logging module for diagnostic output.

♻️ Proposed fix

Add import at top of file:

import logging

Then replace the print:

         # Generate RSID if not provided
         self.rsid = rsid if rsid else _generate_rsid()
-        print(f"Using RSID: {self.rsid}")
+        logging.debug(f"Using RSID: {self.rsid}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 650 - 653,
Replace the unconditional print by using the logging module: add an import
logging at top and create or use a module logger (e.g., logger =
logging.getLogger(__name__)); then change the print(f"Using RSID: {self.rsid}")
in the constructor/initializer where self.rsid is set (and where
_generate_rsid() is called) to an appropriate logger call such as logger.debug
or logger.info so library consumers control output via logging configuration.

833-836: Consider adding context manager support for reliable cleanup.

The __del__ destructor is not guaranteed to run (e.g., circular references, interpreter shutdown). For more reliable cleanup of the temporary directory, consider implementing the context manager protocol (__enter__/__exit__) or providing an explicit close() method.

♻️ Example context manager addition
def __enter__(self):
    return self

def __exit__(self, exc_type, exc_val, exc_tb):
    self.close()
    return False

def close(self):
    """Clean up temporary directory."""
    if hasattr(self, "temp_dir") and Path(self.temp_dir).exists():
        shutil.rmtree(self.temp_dir)

Usage:

with Document('workspace/unpacked') as doc:
    doc.add_comment(...)
    doc.save()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 833 - 836,
The destructor (__del__) is unreliable for removing the temporary directory; add
context manager support and an explicit cleanup method: implement __enter__
(return self), __exit__ (call a new close() and return False), and a close()
method that contains the existing cleanup logic (check temp_dir via hasattr and
Path.exists then shutil.rmtree). Update any use sites or docs to prefer "with
Document(...) as doc:" or call doc.close() explicitly to ensure deterministic
cleanup; keep __del__ as a fallback that calls close().
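The proposed context-manager protocol can be sketched standalone in Python; TempWorkspace is an illustrative stand-in for the Document class:

```python
import shutil
import tempfile
from pathlib import Path

class TempWorkspace:
    """Minimal sketch of the context-manager cleanup proposed for Document."""

    def __init__(self) -> None:
        self.temp_dir = tempfile.mkdtemp()

    def __enter__(self) -> "TempWorkspace":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> bool:
        self.close()
        return False  # do not swallow exceptions

    def close(self) -> None:
        """Idempotent explicit cleanup of the temporary directory."""
        if hasattr(self, "temp_dir") and Path(self.temp_dir).exists():
            shutil.rmtree(self.temp_dir)

    def __del__(self) -> None:
        self.close()  # best-effort fallback; not guaranteed to run
```

With this shape, `with TempWorkspace() as ws: ...` removes the directory deterministically on exit, and `__del__` remains only as a safety net.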
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py:
- Line 35: The generated RSID assignment using random.choices for suggested_rsid
is flagged S311; suppress it by adding an inline noqa comment that explains this
value is only a local, ephemeral hint and is never persisted (e.g., append "#
noqa: S311 — local-only, non-cryptographic RSID for user hinting, not stored or
reused"). Update the line where suggested_rsid is created (the call to
random.choices) to include that explanatory comment so Ruff knows this usage is
intentional.
- Around line 11-12: Replace the use of assert for CLI validation in unpack.py:
do not rely on "assert len(sys.argv) == 3" because asserts are disabled with -O;
instead perform an explicit check of sys.argv, and if the length is incorrect
print a clear usage message and exit with a non-zero status (e.g., via sys.exit
or raising SystemExit). Ensure you still assign input_file = sys.argv[1] and
output_dir = sys.argv[2] after the validation (or use argparse to parse/validate
arguments and provide help text) so the script always enforces correct
invocation even under Python optimization.

In @.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts:
- Around line 55-57: The current score and passed calculations use
params.test_files.length which can be inconsistent or zero; change them to use
results.length instead and guard against zero to avoid division by zero: compute
passCount from results.filter(r => r.passed).length, set score = results.length
? passCount / results.length : 0, and set passed = results.length > 0 &&
passCount === results.length; update the uses of score and passed accordingly
(symbols: passCount, score, passed, params.test_files, results).

---

Duplicate comments:
In @.opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py:
- Around line 21-24: The current zip-slip check uses string startswith and can
be bypassed; replace that logic by resolving both paths and using
Path.relative_to (or catching ValueError from relative_to) to ensure member_path
is actually inside output_path: compute member_path = (output_path /
member).resolve(), compute out_resolved = output_path.resolve(), then try
member_path.relative_to(out_resolved) and raise the ValueError if it raises; do
this for each member before calling zf.extractall so the check cannot be
bypassed by path string prefixes.

In @.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts:
- Around line 24-36: Replace the shell interpolation that builds a command
string with Bun.spawn to avoid any shell injection: stop using the template that
interpolates workingDir and testFile in the `$` call in BinaryTests.ts and
instead call Bun.spawn with cwd set to workingDir and args set to an array built
from commandParts plus testFile (where commandParts comes from detectTestCommand
or params.test_command); implement timeout handling via an AbortController (or
Bun.spawn's timeout mechanism) so the child is killed after params.timeout_ms,
capture stdout/stderr and exit code to populate the existing results entry (the
same shape used for result/passed/output/error), and ensure no user-controlled
values are ever concatenated into a shell string.

---

Nitpick comments:
In @.opencode/skills/Utilities/Docx/Scripts/document.py:
- Around line 130-132: Remove the redundant local import of datetime and
timezone inside the block that sets timestamp; the module already imports these
at top-level, so delete the local "from datetime import datetime, timezone" and
keep the timestamp assignment using datetime.now(timezone.utc). This change
touches the timestamp variable assignment (timestamp =
datetime.now(timezone.utc).strftime(...)) so ensure that code uses the top-level
datetime and timezone imports.
- Around line 650-653: Replace the unconditional print by using the logging
module: add an import logging at top and create or use a module logger (e.g.,
logger = logging.getLogger(__name__)); then change the print(f"Using RSID:
{self.rsid}") in the constructor/initializer where self.rsid is set (and where
_generate_rsid() is called) to an appropriate logger call such as logger.debug
or logger.info so library consumers control output via logging configuration.
- Around line 833-836: The destructor (__del__) is unreliable for removing the
temporary directory; add context manager support and an explicit cleanup method:
implement __enter__ (return self), __exit__ (call a new close() and return
False), and a close() method that contains the existing cleanup logic (check
temp_dir via hasattr and Path.exists then shutil.rmtree). Update any use sites
or docs to prefer "with Document(...) as doc:" or call doc.close() explicitly to
ensure deterministic cleanup; keep __del__ as a fallback that calls close().

In @.opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts:
- Around line 42-43: The current assignment for output and error uses only the
last 500 chars (result.stdout.toString().slice(-500) /
result.stderr.toString().slice(-500)), which can drop important head context;
replace this with a truncation helper (e.g., truncateMiddle) that returns the
full string if under limit or first N + "…truncated…" + last N when over limit,
and call it for both output and error (use result.stdout.toString() and
result.stderr.toString()) so test logs preserve both head and tail; ensure the
helper is used consistently and preserves existing behavior when the string
length is below the limit.
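The middle-truncation helper described above boils down to the following pattern. The actual file is TypeScript; this Python sketch only illustrates the shape, and the name `truncate_middle` and the 500/200 limits are assumptions taken from the review text.

```python
def truncate_middle(s: str, limit: int = 500, keep: int = 200) -> str:
    """Keep both head and tail of an over-long string, not just the tail."""
    if len(s) <= limit:
        return s  # unchanged when under the limit
    return s[:keep] + "…truncated…" + s[-keep:]
```

Applied to both `stdout` and `stderr`, this preserves the leading context (test names, first failure) that a plain `slice(-500)` drops.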

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 564d8cd2-c9d0-4dd5-ba43-6f8e4add3737

📥 Commits

Reviewing files that changed from the base of the PR and between 8a4140d and a150931.

📒 Files selected for processing (7)
  • .coderabbit.yaml
  • .opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py
  • .opencode/skills/Utilities/Docx/Scripts/document.py
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/StateCheck.ts
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/StaticAnalysis.ts
  • .opencode/skills/Utilities/Evals/Tools/SuiteManager.ts
✅ Files skipped from review due to trivial changes (1)
  • .coderabbit.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/StaticAnalysis.ts
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/StateCheck.ts

unpack.py:
- Replace assert with explicit if + sys.exit(1) for CLI validation (assert disabled under -O)
- Replace startswith() zip-slip check with Path.relative_to() (not bypassable by prefix match)
- Add noqa: S311 comment on random.choices RSID (local-only hint, not cryptographic)
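The first two unpack.py changes can be sketched together. This is a hedged illustration, not the script's actual code: `safe_extract` and the usage string are made up for the example; only the `relative_to()` guard and the explicit `sys.exit(1)` pattern come from the review.

```python
import sys
import zipfile
from pathlib import Path


def safe_extract(zip_path: str, out_dir: str) -> None:
    """Reject zip entries that would escape out_dir (zip-slip)."""
    out_resolved = Path(out_dir).resolve()
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            member_path = (out_resolved / member).resolve()
            try:
                # startswith() prefix checks are bypassable
                # (/out vs /out-evil); relative_to() is not.
                member_path.relative_to(out_resolved)
            except ValueError as err:
                raise ValueError(
                    f"Unsafe zip entry rejected (zip-slip): {member}"
                ) from err
            zf.extract(member, out_resolved)


if __name__ == "__main__":
    # Explicit validation instead of assert (assert is stripped under -O).
    if len(sys.argv) != 3:
        print("usage: unpack.py <zip> <out_dir>", file=sys.stderr)
        sys.exit(1)
    safe_extract(sys.argv[1], sys.argv[2])
```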

BinaryTests.ts:
- Use results.length instead of params.test_files.length for score/passed calculation
- Guard against zero-division: score = results.length ? passCount/results.length : 0

document.py:
- Remove redundant local 'from datetime import datetime, timezone' inside _inject_attributes_to_nodes (already imported at module level)
- Replace print(f'Using RSID: ...') with logger.debug() — library code must not write to stdout unconditionally

Intentionally skipped:
- BinaryTests.ts Bun.spawn rewrite: Bun $ with array interpolation is already injection-safe; full spawn rewrite is scope creep
- document.py context manager (__enter__/__exit__/close()): architectural API addition, not a bug
- BinaryTests.ts truncateMiddle helper: feature enhancement, not a bug

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
.opencode/skills/Utilities/Docx/Scripts/document.py (1)

63-71: Replace legacy default author value to match OpenCode naming.

Line 63 and Line 621 still default author to "Claude". This keeps legacy branding in generated metadata and examples.

Suggested patch
-        self, xml_path, rsid: str, author: str = "Claude", initials: str = "C"
+        self, xml_path, rsid: str, author: str = "OpenCode", initials: str = "O"
@@
-            author: Author name for tracked changes and comments (default: "Claude")
-            initials: Author initials (default: "C")
+            author: Author name for tracked changes and comments (default: "OpenCode")
+            initials: Author initials (default: "O")
@@
-        author="Claude",
-        initials="C",
+        author="OpenCode",
+        initials="O",
@@
-            author: Default author name for comments (default: "Claude")
-            initials: Default author initials for comments (default: "C")
+            author: Default author name for comments (default: "OpenCode")
+            initials: Default author initials for comments (default: "O")

Also applies to: 621-633

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 63 - 71,
Update the legacy default author string "Claude" to "OpenCode" in the
constructor signatures and any other places setting default metadata: change the
default for the author parameter in the Document class __init__(self, xml_path,
rsid: str, author: str = "Claude", initials: str = "C") to author: str =
"OpenCode" and make the same replacement at the second occurrence around the
later constructor/initializer (the block at lines ~621–633 that currently
defaults author to "Claude"); ensure any docstring examples or default metadata
assignments in the same file that reference the old default are updated to
"OpenCode" as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py:
- Around line 25-28: The except block that catches ValueError from
member_path.relative_to(out_resolved) should preserve the original exception
context; change the handler to capture the caught exception (e.g., "except
ValueError as err:") and re-raise the new ValueError with explicit chaining
("raise ValueError(f'Unsafe zip entry rejected (zip-slip): {member}') from err")
so the original traceback is preserved; update the exception clause surrounding
member_path.relative_to(out_resolved) accordingly.

In @.opencode/skills/Utilities/Docx/Scripts/document.py:
- Around line 1208-1210: Do not early-return when _has_relationship(editor,
"comments.xml") is true; instead ensure the companion relationships and content
types are present by checking/adding the 'commentsExtended', 'commentsIds', and
'commentsExtensible' pieces and their content-types/relationship entries if
missing. Remove the return that short-circuits on _has_relationship(editor,
"comments.xml") in both occurrences and replace with logic that still
verifies/creates the supplemental relationships/content types (the same checks
used when comments.xml is absent). Keep using the existing helper methods (e.g.,
_has_relationship and whatever add_relationship/add_content_type helpers are
present) so replies' metadata is fully wired.
- Around line 704-712: Ensure xml_path cannot escape self.unpacked_path by
resolving and validating it before creating DocxXMLEditor: build file_path by
joining self.unpacked_path and xml_path, resolve file_path and
self.unpacked_path (or use Path.relative_to) and reject if the resolved
file_path is not inside the resolved self.unpacked_path (raise ValueError for
absolute paths or .. traversal); only after that call DocxXMLEditor(file_path,
rsid=self.rsid, author=self.author, initials=self.initials) and store in
self._editors.

---

Nitpick comments:
In @.opencode/skills/Utilities/Docx/Scripts/document.py:
- Around line 63-71: Update the legacy default author string "Claude" to
"OpenCode" in the constructor signatures and any other places setting default
metadata: change the default for the author parameter in the Document class
__init__(self, xml_path, rsid: str, author: str = "Claude", initials: str = "C")
to author: str = "OpenCode" and make the same replacement at the second
occurrence around the later constructor/initializer (the block at lines ~621–633
that currently defaults author to "Claude"); ensure any docstring examples or
default metadata assignments in the same file that reference the old default are
updated to "OpenCode" as well.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dedb8b6d-9ab0-4aa2-b9fa-ce0365c4ce90

📥 Commits

Reviewing files that changed from the base of the PR and between a150931 and ae8a56b.

📒 Files selected for processing (3)
  • .opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py
  • .opencode/skills/Utilities/Docx/Scripts/document.py
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • .opencode/skills/Utilities/Evals/Graders/CodeBased/BinaryTests.ts

unpack.py:
- Add exception chaining (raise ... from err) to preserve original ValueError context

document.py:
- __getitem__: validate xml_path stays within unpacked_path using resolve()+relative_to()
- _ensure_comment_relationships: replace single early-return with per-item checks
  for each companion relationship; re-query get_next_rid() per item
- _ensure_comment_content_types: replace single early-return with per-item checks
  for each companion content type
- Replace default author 'Claude' with 'OpenCode' and initials 'C' with 'O'
  in both DocxXMLEditor and Document constructors (branding migration)

Intentionally skipped:
- Bun.$ array interpolation rewrite to Bun.spawn (not a genuine injection vector)
- Mass code-fence language tagging of reference files (scope creep)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/skills/Utilities/Docx/Scripts/document.py:
- Around line 918-934: The loop that assigns existing[int(comment_id)] must
avoid raising on malformed w:id values; modify the code around the comment_elem
loop to parse comment_id defensively (mirroring _get_next_comment_id) by trying
to convert comment_id to int inside a try/except (or using a safe numeric check)
and skipping the comment when conversion fails or the id is not a non-negative
integer; keep the rest of the logic (finding para_id from w:p and storing into
existing using the validated integer key) unchanged so malformed IDs no longer
abort initialization.
- Around line 637-649: Add a fast-fail guard to validate the required "word"
directory exists under the unpacked content: after creating self.unpacked_path
and setting self.word_path (the "word" folder used by other methods), check that
self.word_path.exists() and self.word_path.is_dir(), and if not raise a
ValueError with a clear message (e.g., "Missing required 'word' directory in
DOCX unpacked content"). Update the constructor in document.py where
self.unpacked_path and self.word_path are assigned to perform this check so
later uses of self.word_path fail early with a helpful error.
- Around line 704-717: The cache uses the raw xml_path as the key which allows
aliased paths to create duplicate editors; in __getitem__ normalize xml_path
before using it (e.g., convert to Path, resolve/relative_to against
self.unpacked_path or call .resolve() then
.relative_to(self.unpacked_path.resolve()) and use that normalized relative
string or path as the key) so lookups and assignments to self._editors use the
normalized path instead of the original xml_path; ensure file_path is still
validated and passed to DocxXMLEditor(rsid=self.rsid, author=self.author,
initials=self.initials) but store and return self._editors[normalized_key].
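The defensive ID parse suggested for `_load_existing_comments` follows this shape (a sketch: the hypothetical `(comment_id, para_id)` tuples stand in for the real lxml element handling, and the function name is illustrative):

```python
def load_existing_comments(comment_elems):
    """Map valid non-negative integer comment IDs to para IDs, skipping malformed ones."""
    existing = {}
    for comment_id, para_id in comment_elems:
        try:
            key = int(comment_id)
        except (TypeError, ValueError):
            continue  # malformed w:id -- skip instead of aborting init
        if key < 0:
            continue  # IDs must be non-negative integers
        existing[key] = para_id
    return existing
```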

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: add211be-7e01-485e-bdb6-38a3661a6bd8

📥 Commits

Reviewing files that changed from the base of the PR and between ae8a56b and 9dcd7d5.

📒 Files selected for processing (2)
  • .opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py
  • .opencode/skills/Utilities/Docx/Scripts/document.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .opencode/skills/Utilities/Docx/Ooxml/Scripts/unpack.py

document.py:
- _load_existing_comments: wrap int(comment_id) in try/except ValueError and
  skip non-negative-integer IDs (mirrors _get_next_comment_id pattern)
- Document.__init__: add fast-fail guard after self.word_path assignment —
  raise ValueError if 'word' directory is missing or not a directory
- __getitem__: normalize cache key using resolve()+relative_to() so aliased
  paths (e.g. 'word/../word/document.xml') reuse the same editor instance;
  traversal detection is now a side-effect of the same normalization step

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
.opencode/skills/Utilities/Docx/Scripts/document.py (1)

640-648: Protect temp workspace from constructor-failure leaks.

At Line 641, a temp directory is created before operations at Lines 643 and 647 that can raise. If initialization fails, cleanup is non-deterministic (__del__ timing), so temp dirs can be left behind.

Proposed patch
-        self.temp_dir = tempfile.mkdtemp(prefix="docx_")
-        self.unpacked_path = Path(self.temp_dir) / "unpacked"
-        shutil.copytree(self.original_path, self.unpacked_path)
-
-        # Pack original directory into temporary .docx for validation baseline (outside unpacked dir)
-        self.original_docx = Path(self.temp_dir) / "original.docx"
-        pack_document(self.original_path, self.original_docx, validate=False)
+        self.temp_dir = tempfile.mkdtemp(prefix="docx_")
+        try:
+            self.unpacked_path = Path(self.temp_dir) / "unpacked"
+            shutil.copytree(self.original_path, self.unpacked_path)
+
+            # Pack original directory into temporary .docx for validation baseline (outside unpacked dir)
+            self.original_docx = Path(self.temp_dir) / "original.docx"
+            pack_document(self.original_path, self.original_docx, validate=False)
+        except Exception:
+            shutil.rmtree(self.temp_dir, ignore_errors=True)
+            raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 640 - 648,
The constructor creates a temp dir (tempfile.mkdtemp) and assigns self.temp_dir
before running operations that can raise (shutil.copytree and pack_document),
risking leaked temp files; change initialization so you create the temp dir into
a local variable first, perform shutil.copytree(self.original_path,
local_unpacked) and pack_document(self.original_path, local_original_docx,
validate=False), and only after those succeed assign self.temp_dir,
self.unpacked_path and self.original_docx; alternatively wrap the steps in a
try/except/finally that removes the temp dir (shutil.rmtree) on any exception to
guarantee cleanup. Ensure references in the fix match symbols temp_dir,
unpacked_path, original_docx, shutil.copytree, pack_document and
tempfile.mkdtemp.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/skills/Utilities/Docx/Scripts/document.py:
- Around line 710-715: The exception raised in the traversal guard loses the
original cause; when catching the ValueError from
file_path.resolve().relative_to(self.unpacked_path.resolve()) (the code
computing normalized_key), capture the original exception (e.g., except
ValueError as err) and re-raise the new ValueError(f"Path traversal detected:
{xml_path}") from err so the original traceback is preserved for debugging.

---

Nitpick comments:
In @.opencode/skills/Utilities/Docx/Scripts/document.py:
- Around line 640-648: The constructor creates a temp dir (tempfile.mkdtemp) and
assigns self.temp_dir before running operations that can raise (shutil.copytree
and pack_document), risking leaked temp files; change initialization so you
create the temp dir into a local variable first, perform
shutil.copytree(self.original_path, local_unpacked) and
pack_document(self.original_path, local_original_docx, validate=False), and only
after those succeed assign self.temp_dir, self.unpacked_path and
self.original_docx; alternatively wrap the steps in a try/except/finally that
removes the temp dir (shutil.rmtree) on any exception to guarantee cleanup.
Ensure references in the fix match symbols temp_dir, unpacked_path,
original_docx, shutil.copytree, pack_document and tempfile.mkdtemp.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cf370a97-45c9-4a45-84f6-0d14b9e58919

📥 Commits

Reviewing files that changed from the base of the PR and between 9dcd7d5 and 0e8dee8.

📒 Files selected for processing (1)
  • .opencode/skills/Utilities/Docx/Scripts/document.py

Comment on lines +710 to +715
try:
normalized_key = str(
file_path.resolve().relative_to(self.unpacked_path.resolve())
)
except ValueError:
raise ValueError(f"Path traversal detected: {xml_path}")

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify missing exception chaining in __getitem__
rg -n -C2 'except ValueError:\s*\n\s*raise ValueError\(f"Path traversal detected: \{xml_path\}"\)' .opencode/skills/Utilities/Docx/Scripts/document.py

Repository: Steffen025/pai-opencode

Length of output: 258


🏁 Script executed:

sed -n '705,720p' .opencode/skills/Utilities/Docx/Scripts/document.py

Repository: Steffen025/pai-opencode

Length of output: 872


Preserve original exception cause in traversal guard.

At line 715, the raised ValueError drops the original exception context from .relative_to(). Chain it with from err to preserve debugging information.

-        except ValueError:
-            raise ValueError(f"Path traversal detected: {xml_path}")
+        except ValueError as err:
+            raise ValueError(f"Path traversal detected: {xml_path}") from err
🧰 Tools
🪛 Ruff (0.15.5)

[warning] 715-715: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/skills/Utilities/Docx/Scripts/document.py around lines 710 - 715,
The exception raised in the traversal guard loses the original cause; when
catching the ValueError from
file_path.resolve().relative_to(self.unpacked_path.resolve()) (the code
computing normalized_key), capture the original exception (e.g., except
ValueError as err) and re-raise the new ValueError(f"Path traversal detected:
{xml_path}") from err so the original traceback is preserved for debugging.

@Steffen025 Steffen025 merged commit 6d565ea into main Mar 16, 2026
4 checks passed
@Steffen025 Steffen025 deleted the release/v3.0-pr07-utilities-reorg-teil1 branch March 16, 2026 10:47