
Release 2.1.0: User Resources & Flow Files, Knowledge Base, ToolCall Logging, Assistant Flow Control, and Frontend Modernization #319

Merged
asdek merged 305 commits into main from feature/next-release
May 29, 2026

asdek commented May 28, 2026

Description of the Change

This PR merges the accumulated feature/next-release branch (304 commits) into main, forming the 2.1.0 release. It groups together every feature, fix, and improvement that landed since v2.0.0. The summary below is organized by theme rather than by commit so reviewers can grasp the overall shape of the release without reading the full log.

Problem

After 2.0.0, PentAGI had no way for users to bring their own files into the system, no first-class UI or API for the pgvector knowledge store, no visibility into individual tool calls, and no way for the assistant to observe or steer a running flow. The frontend list pages each reinvented filtering, pagination, and navigation, the dashboard had noticeable input lag, and the toolchain (React, Apollo, Vite, TypeScript, Zod) was a major version behind. Several reliability gaps remained around task cancellation, malformed LLM tool-call arguments, subscription backpressure, and vector-search correctness.

Solution

This release ships the following grouped changes.

File management layer (new)

  • User Resources — a persistent, per-user file library with MD5-deduplicated storage, a virtual path filesystem, full REST + GraphQL CRUD (upload, mkdir, move, copy, delete, download), real-time subscriptions, and multi-source/multi-path batch operations executed atomically.
  • Flow files — per-flow workspace files synced into worker containers (/work/uploads, /work/resources), the ability to pull files back out of a running container, and promotion of flow files into the user library. Attached files are injected into agent prompts as a structured <task_files> block.
  • File Manager UI — a reusable tree component (multi-select, keyboard navigation, drag-and-drop, sortable columns, bulk actions, overwrite workflow) used by both the new /resources page and the flow Files tab.
  • Upload limits enforced on both ends: 300 MB/file, 1000 files/request, 2 GB total, 255-byte names. Upload paths are hardened against traversal and symlink attacks.
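
The limits in the last bullet could be checked on the client before a request is sent. The following is a minimal sketch, not the project's implementation: `PendingUpload`, `utf8ByteLength`, and `validateUploadBatch` are hypothetical names, and the sketch assumes MB and GB are binary units (MiB and GiB), which the notes do not specify.

```typescript
// Limits quoted in the release notes. Constant names are hypothetical,
// and binary units (MiB / GiB) are an assumption.
const MAX_FILE_BYTES = 300 * 1024 * 1024;
const MAX_FILES_PER_REQUEST = 1000;
const MAX_TOTAL_BYTES = 2 * 1024 * 1024 * 1024;
const MAX_NAME_BYTES = 255;

interface PendingUpload {
  name: string;
  size: number; // size in bytes
}

// Counts the UTF-8 encoded length of a string without relying on TextEncoder.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const char of text) {
    const codePoint = char.codePointAt(0) ?? 0;
    if (codePoint < 0x80) bytes += 1;
    else if (codePoint < 0x800) bytes += 2;
    else if (codePoint < 0x10000) bytes += 3;
    else bytes += 4;
  }
  return bytes;
}

// Returns a list of human-readable violations; an empty list means the batch is acceptable.
function validateUploadBatch(files: PendingUpload[]): string[] {
  const errors: string[] = [];
  if (files.length > MAX_FILES_PER_REQUEST) {
    errors.push(`too many files: ${files.length} > ${MAX_FILES_PER_REQUEST}`);
  }
  let totalBytes = 0;
  for (const file of files) {
    totalBytes += file.size;
    if (file.size > MAX_FILE_BYTES) {
      errors.push(`${file.name}: file exceeds the per-file limit`);
    }
    if (utf8ByteLength(file.name) > MAX_NAME_BYTES) {
      errors.push(`${file.name}: name exceeds ${MAX_NAME_BYTES} bytes`);
    }
  }
  if (totalBytes > MAX_TOTAL_BYTES) {
    errors.push("batch exceeds the total size limit");
  }
  return errors;
}
```

The name limit is measured in bytes rather than characters, so a name made of 128 two-byte characters (256 bytes) is rejected even though it has fewer than 255 characters.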

Knowledge Base management (new)

  • GraphQL/REST CRUD plus semantic search over the pgvector store, with admin/user scoping, re-embedding on update, real-time subscriptions, and per-user document ownership.
  • A new /knowledges UI with list/detail pages, a TipTap markdown editor, partial updates, inline rename/delete, and a collapsible semantic-search input (?qs=).
  • Vector search was rewritten onto direct sqlc queries — fixing a bug where document IDs were dropped and removing unsafe SQL string interpolation.
  • A text anonymization service (anonymizeText) exposed via GraphQL/REST and surfaced in the knowledge editor.

ToolCall observability (new)

  • Tool calls are now logged through a dedicated provider and exposed via GraphQL queries/subscriptions and a REST Toolcall API for real-time inspection.

Assistant flow management (new)

  • The assistant can now observe and control active flows from chat via get_flow_status, stop_flow, submit_flow_input, patch_flow_subtasks, and a blocking waitFlowCompletion tool. A summarizer LRU cache avoids redundant LLM calls.

Agent language policy

  • A unified engagement-log vs. technical-channel language policy across all 11 agent prompt templates, forcing English for vector-store and search-engine queries while keeping user-facing messages in the engagement language.

LLM providers & models

  • DeepSeek V4 (deepseek-v4-flash / deepseek-v4-pro) is the new default with per-role thinking control and updated pricing/context metadata. Qwen gains per-role thinking control; model configs for qwen/kimi/glm/deepseek/gemini were refreshed. Added vLLM Qwen 3.6 (incl. 35B FP8) and Azure OpenAI reference configs, and a clear, actionable error when an Ollama model lacks tool support.

Configuration & infrastructure

  • New/updated settings: TERMINAL_TOOL_TIMEOUT (default raised to 1200 s, clamped), PostgreSQL connection pooling (DB_MAX_OPEN_CONNS / DB_MAX_IDLE_CONNS / DB_VECTOR_MAX_CONNS), EMBEDDING_MAX_TEXT_BYTES, and exposed version / isDevelopMode in the Settings API.
  • Frontend package manager migrated to pnpm; CI build trigger moved from master to main.
  • New per-model/per-agent token usage analytics query for the flow dashboard.

Frontend platform modernization

  • Unified list tables (DataTable + useTableState) with URL-synced filter/pagination/sorting, multi-column search, contextual empty states, detail-page Prev/Next navigation, and centralized per-route document titles.
  • Mobile-responsive headers and pickers; inline rename/finish/delete across flows, templates, and knowledges.
  • Performance: lazy-loaded PDF renderer, dashboard period-switch INP cut from 434 ms to 134 ms, useOptimistic for instant rename/favorite feedback, debounced filtering.
  • Major dependency upgrades: React 19, Apollo Client v4, Vite 8 (Rolldown), TypeScript 6, Zod v4, graphql-codegen v6/v7, and the shadcn new-york-v4 component style. Vitest suite grew from 475 to 541 tests.
  • Broad accessibility sweep (aria-labels on icon buttons, form field ids, Radix dialog compliance).

Reliability fixes

  • Cancellable context during subtask generation (no more false-success on cancelled tasks) and task/subtask interruption handling.
  • Malformed tool-call JSON now falls back to {} (prevents LiteLLM 400 retry loops); JSON control characters are sanitized.
  • csum out-of-range clamp, aslog deadlock fix, assistant nil-channel deadlock fix, and subscription backpressure handling.
  • Custom prompts are now actually applied to new sessions.
  • Numerous frontend fixes (pagination URL loops, filter races, production Apollo title crash, friendly not-found toasts).
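
The tool-call fallback can be illustrated with a small sketch. The actual fix lives in the Go backend; this TypeScript version only shows the rule, and `normalizeToolCallArguments` is a hypothetical name. It also assumes that non-object JSON values (arrays, numbers, strings) are treated as malformed, which the notes do not state.

```typescript
// Returns a JSON object string that is always safe to forward to the provider.
// Anything that does not parse to a plain JSON object collapses to "{}".
function normalizeToolCallArguments(raw: string): string {
  const trimmed = raw.trim();
  if (trimmed === "") {
    return "{}";
  }
  try {
    const parsed: unknown = JSON.parse(trimmed);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      // Re-serialize so the forwarded payload is canonical JSON.
      return JSON.stringify(parsed);
    }
  } catch {
    // Truncated or otherwise invalid JSON: fall through to the empty object.
  }
  return "{}";
}
```

Forwarding `{}` instead of the broken payload means the request stays syntactically valid, so a proxy that validates arguments has no reason to reject it and trigger another retry.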

Documentation & RFCs

  • Extensive user-facing docs (first-use guide, pentesting methodology, memory lifecycle, OAuth callbacks, Docker mirror, OSINT scenarios, flow Files tab, Graphiti beta limitations, DeepSeek V4, Vertex AI clarification) and two design RFCs (flow concurrency + completion webhooks, MCP client integration) added under examples/proposals/ with no runtime code.

Closes #285, #298, #300, #310

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔧 Configuration change
  • 🧪 Test update
  • 🛡️ Security update

Areas Affected

  • Core Services (Frontend UI/Backend API)
  • AI Agents (Researcher/Developer/Executor)
  • Security Tools Integration
  • Memory System (Vector Store/Knowledge Base)
  • Monitoring Stack (Grafana/OpenTelemetry)
  • Analytics Platform (Langfuse)
  • External Integrations (LLM/Search APIs)
  • Documentation
  • Infrastructure/DevOps

Testing and Verification

Test Configuration

PentAGI Version: 2.1.0 (feature/next-release)
Docker Version: 24.0.x+
Host OS: macOS / Linux
LLM Provider: OpenAI, Anthropic, DeepSeek V4, Qwen (vLLM), Ollama
Node Version: 24.12
Enabled Features: Core, Analytics (Langfuse), pgvector Knowledge Base

Test Steps

  1. Build and start the stack (docker compose up -d); log in and confirm the new /resources and /knowledges pages load.
  2. Upload files to the User Resources library, attach them to a new flow, and confirm they appear under /work/uploads and /work/resources in the worker container.
  3. Create/edit a knowledge document, run a semantic search via the collapsible search input, and trigger the anonymize action.
  4. Start a flow, then from the assistant use get_flow_status / stop_flow / submit_flow_input and observe ToolCall logs streaming in real time.
  5. Exercise list pages (filter, multi-column search, pagination, Prev/Next detail navigation, inline rename/delete) and verify URL/state sync.
  6. Run backend (go test ./...) and frontend (pnpm run test) suites.

Test Results

  • Backend: new handler coverage for FlowFileService (0% → 60–90%) and expanded ResourceService / KnowledgeStore tests pass.
  • Frontend: 541/541 Vitest tests pass; pnpm run lint clean (0 errors).
  • Manual smoke across all list/detail pages and the new file/knowledge flows confirmed, including mobile viewports.

Security Considerations

  • Flow file uploads are hardened against path traversal and symlink escapes; upload size/count limits are enforced on both backend and frontend.
  • Knowledge vector search now uses parameterized sqlc queries, removing prior fmt.Sprintf SQL string interpolation.
  • All resource/knowledge/toolcall endpoints enforce user/admin privilege scoping; new privileges (anonymize.call, toolcall.*) are added via migration.
  • Text anonymization is available to scrub sensitive data from stored knowledge.
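
As a rough illustration of the path-traversal part of the hardening, a relative upload path can be validated segment by segment before it touches the filesystem. This is a sketch under stated assumptions, not the backend's Go code: `sanitizeRelativePath` is a hypothetical name, rejecting backslashes is an extra conservative choice, and symlink checks are out of scope because they need filesystem access.

```typescript
// Returns a normalized relative path, or null when the input must be rejected.
function sanitizeRelativePath(input: string): string | null {
  // Reject empty input, NUL bytes, absolute paths, and backslash separators.
  if (input === "" || input.includes("\0") || input.startsWith("/") || input.includes("\\")) {
    return null;
  }
  const segments: string[] = [];
  for (const segment of input.split("/")) {
    if (segment === "" || segment === ".") {
      continue; // collapse "//" and "./"
    }
    if (segment === "..") {
      return null; // refuse any parent reference, even one that stays inside the root
    }
    segments.push(segment);
  }
  return segments.length > 0 ? segments.join("/") : null;
}
```

Rejecting every `..` segment outright is stricter than resolving it, but it removes the need to reason about where the resolved path ends up.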

Performance Impact

  • Shared PostgreSQL connection pools (sqlc + GORM) and a shared pgxpool for pgvector reduce connection overhead.
  • Container file sync is incremental (missing files copied once).
  • Frontend: dashboard period-switch INP reduced from 434 ms to 134 ms, PDF renderer lazy-loaded (~2.0 MB → ~500 KB initial JS on the report route), KnowledgesProvider scoped to its routes (avoids a ~2.1 MB payload on every page), debounced filtering, and useOptimistic for instant UI feedback.

Documentation Updates

  • README.md updates
  • API documentation updates
  • Configuration documentation updates
  • GraphQL schema updates
  • Other: Swagger/OpenAPI specs; two design RFCs under examples/proposals/

Deployment Notes

Breaking / migration notes:

  • DeepSeek: deployments using legacy deepseek-chat / deepseek-reasoner should migrate to deepseek-v4-flash / deepseek-v4-pro before the upstream deprecation on 2026-07-24.
  • Database env vars: connection-pool settings consolidated to DB_MAX_OPEN_CONNS, DB_MAX_IDLE_CONNS, DB_VECTOR_MAX_CONNS — verify against .env.example.
  • TERMINAL_TOOL_TIMEOUT default raised from 600 s to 1200 s; review if a lower value was intentional.
  • Frontend dev now requires pnpm (was npm).
  • Database migrations run automatically via goose at startup (new privileges, user_id backfill, knowledge/resources/toolcall schema).

Deployment steps:

  1. Pull the merged main branch.
  2. Rebuild: docker compose build.
  3. Restart: docker compose up -d (migrations apply on boot).

Checklist

Code Quality

  • My code follows the project's coding standards
  • I have added/updated necessary documentation
  • I have added tests to cover my changes
  • All new and existing tests pass
  • I have run go fmt and go vet (for Go code)
  • I have run pnpm run lint (for TypeScript/JavaScript code)

Security

  • I have considered security implications
  • Changes maintain or improve the security model
  • Sensitive information has been properly handled

Compatibility

  • Changes are backward compatible
  • Breaking changes are clearly marked and documented
  • Dependencies are properly updated

Documentation

  • Documentation is clear and complete
  • Comments are added for non-obvious code
  • API changes are documented

Additional Notes

This is an aggregate release PR: all individual feature/fix PRs were already reviewed and merged into feature/next-release. Community contributions in this window came from @mason5052 (custom prompts fix, Ollama/DeepSeek provider fixes, flow-file hardening, extensive docs and RFCs) and @Kairos-T (documentation mermaid fix). Core work by @asdek (backend: resources/flow files, knowledge base, toolcalls, assistant flow tools, providers, reliability) and @sirozha (frontend: File Manager, knowledges UI, unified tables, mobile UX, platform upgrades).

asdek and others added 30 commits April 27, 2026 10:03
…lities

- Added new endpoints for managing flow files, including listing, uploading, and deleting files within flow workspaces.
- Introduced FlowFile model to represent file metadata.
- Enhanced GraphQL schema to support flow file operations and subscriptions for real-time updates.
- Updated API documentation to reflect new flow file functionalities.
…tegration

- Added methods for non-recursive directory listing and file stat operations in the Docker client.
- Implemented a new API endpoint to retrieve files from a running container's directory.
- Updated documentation to reflect new file operations and API changes.
- Introduced data structures for container file metadata and integrated them into the flow file service.
- Enhanced flow file management capabilities with improved synchronization between local and container file systems.
- UserResource model with MD5-deduplicated blob storage and virtual path filesystem
- REST API for resource CRUD (upload, mkdir, move, copy, delete, download)
- GraphQL query/mutations with resourceIds support on createFlow, putUserInput, createAssistant, callAssistant
- Resource → flow copy with hierarchy restore; incremental container sync (find missing, copy once)
- FlowWorker.PutResources delegates copy, docker push and flowFileAdded events
- Agent prompts updated with {{.UserFiles}} XML listing of /work/uploads and /work/resources
- Resource subscriptions: resourceAdded/Updated/Deleted
- Restructured single-case tests for List, Mkdir, Download, and CleanupOrphanBlobs into table-driven scenarios.
- Extended Upload, Delete, Copy, and Move scenarios with admin, forbidden, malformed JSON, invalid path, missing source, and trailing-slash hint cases.
- Added dedicated tests for blob deduplication on upload/copy, non-multipart bodies, orphan blob preservation on rollback, and direct deleteOrphanBlob coverage.
- Introduced helpers for parameterized multipart field names, custom UID context, and on-disk blob counting.
- Added table-driven scenarios for all 8 endpoints (Get/Upload/Delete/Download flow files, Pull from container, GetContainerFiles, AddResourcesToFlow, AddResourceFromFlow) covering success paths, all privilege combinations (view/upload/admin/cross-user), error responses (forbidden/not-found/conflict/invalid request), and security checks (path traversal, symlink rejection).
- Introduced reusable test infrastructure: sqlite-backed flows/user_resources schema, fakeDockerClient implementing the full docker.DockerClient interface, flowFileCaptureSubscriptions recording both FlowPublisher and ResourcePublisher events, and helpers for multipart upload bodies and container TAR fixtures.
- Added direct unit tests for shellQuote, parseFlowIDParam, cleanupPendingUploads, and flowScopeForFiles privilege matrix.
- Lifts handler coverage from 0% to 60-90% across the file handlers, and total services package coverage from 10.2% to 28.6%.
…m-image-guide

Signed-off-by: Dmitry Ng <19asdek91@gmail.com>
…image-guide

docs: add OpenVAS custom image guide
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Dmitry Ng <19asdek91@gmail.com>
…emplate

docs: add scope-of-work pentest template
* docs: add OSINT integration scenarios

* docs: clarify OSINT scenario payloads

* docs: clarify OSINT provider identifiers

* docs: distinguish OSINT provider ids

---------

Co-authored-by: Mason Kim(ZINUS US_SALES) <mkim@zinus.com>
…-management-docs

docs: explain assistant flow management for active flows
…doc-links

docs: fix processor wizard integration links
…-guidance

docs: clarify web UI account setup
…ss agent prompts

- Replaced ambiguous "user's language" guidance in tools/args.go with explicit engagement-log vs technical-channel markers per field, with strong English-only requirement for vector-store and search-engine queries.
- Added a unified LANGUAGE POLICY block to every agent prompt (primary_agent, assistant, pentester, coder, installer, searcher, memorist, generator, refiner, reporter, enricher), tailored per agent based on its actual tool set.
- Extended template variables and tool access (TerminalToolName, FileToolName) for coder, pentester, installer, memorist, generator, refiner, and enricher to match their runtime tool registrations.
- Fixed inverted UseAgents condition and removed misleading vector-store write references in assistant prompt; corrected MEMORY SYSTEM INTEGRATION for mode-specific tool references.
- Compressed COMPLETION REQUIREMENTS across templates and aligned closing-tool guidance with the channel mapping (engagement-log message vs technical-channel result).

Fixes #285.

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
- introduce src/components/file-manager with tree rendering, search highlighting, multi-select (single / toggle / shift-range), keyboard navigation (arrows, Home/End, Space, Cmd+A, Esc) and bulk delete
- expose unified actions[] API with downloadAction / copyPathAction / deleteAction factory helpers instead of separate built-in props
- split orchestration into useFileManagerExpansion and useFileManagerSelection hooks; derive expanded/selected state during render without useEffect, preserving Set identity for memo-friendly downstream
- add ARIA tree semantics (role=tree on the container, treeitem rows, roving tabindex via activeRowPath, aria-expanded/aria-selected)
- pure tree utils: O(1) folder lookup via Map, normalizeRootGroups, dedupeOverlappingPaths, findNodeByPath, plus 37 vitest unit tests
- ship shadcn checkbox primitive (depends on @radix-ui/react-checkbox) which the file manager uses for row and select-all controls
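
The commit names `dedupeOverlappingPaths` without describing it. One plausible reading, sketched here as an assumption rather than the committed code, is that a selection containing both a folder and its descendants is reduced to the folder alone, so a bulk action does not process the same subtree twice.

```typescript
// Hypothetical behavior: keep only paths that are not inside another selected path.
function dedupeOverlappingPaths(paths: string[]): string[] {
  // Strip trailing slashes so "docs" and "docs/" count as the same entry.
  const normalized = paths.map((path) => path.replace(/\/+$/, ""));
  // Sorting guarantees that an ancestor is visited before its descendants.
  const unique = Array.from(new Set(normalized)).sort();
  const kept: string[] = [];
  for (const path of unique) {
    const coveredByAncestor = kept.some((ancestor) => path.startsWith(ancestor + "/"));
    if (!coveredByAncestor) {
      kept.push(path);
    }
  }
  return kept;
}
```

Comparing against `ancestor + "/"` rather than the bare prefix keeps siblings such as `docs` and `docs2/x` separate.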

Made-with: Cursor
- replace the inline file tree implementation in flow-files with the new FileManager component
- compose row actions via factory helpers (downloadAction / copyPathAction / deleteAction) instead of bespoke menus
- delegate search highlighting, expand/collapse, multi-select and bulk delete to the shared component; the page now only owns upload, drag-and-drop, pull-from-container and per-file delete confirmation
- switch to the typed axios helpers (api / unwrapApiResponse / getApiErrorMessage) for upload, pull and delete calls

Made-with: Cursor
- add ApiResponse<T> / ApiSuccessResponse<T> / ApiErrorResponse / ApiHttpError types describing the backend protocol
- expose a typed api wrapper with helper methods (api.get / api.post / api.put / api.delete) that returns ApiResponse<T> directly and accepts per-call AxiosRequestConfig
- add unwrapApiResponse(...) and getApiErrorMessage(...) helpers so callers no longer reimplement success/error branching
- set a sensible default request timeout (30s) on the shared axios instance; long uploads still opt out via { timeout: 0 }
- migrate user-provider and password-change-form from the raw axios instance to the new typed api helpers, dropping their bespoke error-shape interfaces

Made-with: Cursor
- handleConfirm now accepts () => Promise<void> | void; the dialog awaits the promise before closing
- show a Loader2 spinner on the confirm button while the handler is in flight, and disable both confirm and cancel
- block onOpenChange and outside-clicks while processing so the dialog can't be dismissed mid-action
- caller no longer needs to manage its own loading state for confirm-then-async flows

Made-with: Cursor
- collapse multi-line imports and helpers in flow-form back onto single lines per the active prettier config; no behavioral changes

Made-with: Cursor
- vendor the shadcn ai/file-tree primitive (recursive collapsible tree with shared expand/select context) for future AI chat surfaces
- vendor the shadcn crud-file-manager and skeleton-file-manager demo blocks as design references; not wired into any route yet

Made-with: Cursor
- split monolithic FileManager into FileManagerTreeNode, FileManagerBulkActionsBar, and useFileManagerKeyboardNavigation
- group FileManagerProps columns/search into nested configs and rename visibility booleans from show* to is*Visible per project convention
- replace abbreviations (idx, i, Sep, Item) with full names across components, hooks, and tests
- add ARIA tree-pattern attributes (aria-level, aria-posinset, aria-setsize, aria-multiselectable) and fix aria-hidden hiding the Select-all checkbox
- add labels.formatModified to localize the date column
- make walkTree strictly pure by hiding its internal accumulator
- move side effects out of the setState updater in useFileManagerSelection
- drop redundant stopPropagation in favor of the data-fm-skip-row-click marker

Made-with: Cursor
Drops `components/ai/file-tree.tsx` and `components/blocks/{crud,skeleton}/*`
reference snippets that were never imported anywhere; the real
file manager lives under `components/file-manager/` and is the only
implementation in use.

Made-with: Cursor
- Replace resources feature mocks with real GraphQL + REST integration
  (Apollo subscriptions-backed cache, dedicated hooks for search, upload,
  copy, move, mkdir, delete, plus conflict / mkdir / copy / move dialogs
  and a shared FileDropZone UI component).
- Let flows attach user resources on creation and in chat messages:
  FlowForm exposes resourceIds as a form field with a multi-select
  dropdown (file/folder icons), wired through createFlow, createAssistant,
  putUserInput and callAssistant mutations.
- Add attach-resources and save-as-resource (promote) dialogs to the
  flow files page; unify drag-and-drop via a shared
  hooks/use-files-drag-and-drop.
- Harden file manager: align skeleton layout (grid + column config) with
  real rows, extract use-file-manager-dnd, add group selection state and
  tree-node accessibility fixes.
- Normalize numeric id/userId from REST /resources/ responses to strings
  so GraphQL ID-typed consumers (zod-validated resource picker) work.
  Marked with TODO(backend) for removal once the REST endpoint matches
  the GraphQL ID scalar.
- Ignore *.tsbuildinfo artifacts.
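
The id normalization described above amounts to a small mapping step. A minimal sketch, assuming a response shape with only `id`, `userId`, and an illustrative `name` field; the interface and function names are hypothetical.

```typescript
// Shape as it might arrive from REST: ids can be numbers.
interface RestResource {
  id: number | string;
  userId: number | string;
  name: string; // illustrative extra field
}

// Shape expected by GraphQL ID-typed consumers: ids are always strings.
interface NormalizedResource {
  id: string;
  userId: string;
  name: string;
}

function normalizeResourceIds(resource: RestResource): NormalizedResource {
  return { ...resource, id: String(resource.id), userId: String(resource.userId) };
}
```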

Made-with: Cursor
sirozha and others added 26 commits May 22, 2026 18:02
…02c3)

Reverts the O10 part of commit 04702c3. The original change renamed
the provider clone URL parameter from `?id=` to `?clone=`. On second
look the rename solved no real problem:

- No security issue: PentAGI is a single-tenant local admin tool, so
  there is no cross-tenant info leak from a shared URL.
- No bug: the form behaviour was identical before and after.
- No UX impact: the URL is generated by the "Clone" action, not typed
  by users; the bare `?id=` is also explicit enough in context
  (settings-provider.tsx gates it on `isNew`, so the meaning is
  "source provider to clone from").

What the rename did cost: it broke backward compatibility with any
bookmarks/links saved in `?id=` form. Restore `?id=` to remove the
needless churn.

The rest of commit 04702c3 (O7 dialog case, O8 "API Tokens" unify,
O11 button label + JSDoc, O12 italic placeholder) addressed real
inconsistencies and stays in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves QA report observation O9. Every list page used to render the
bare "No results." label when its DataTable had no rows — both for a
truly empty dataset and when a filter narrowed the result set to zero.
The Resources page already had a better pattern (shadcn <Empty> block
with a filter-aware copy that cited the query), so propagate it to the
shared DataTable.

API
- New optional `empty?: { entityName?: string }` prop on `DataTable`.
- When `entityName` is provided, the body cell renders an `<Empty>`
  block:
    filter active → Search icon + "No matches" + "No <entity> match
    <code>{query}</code>. Try a different query."
    no filter      → Inbox icon + "No <entity> yet"
  and the pagination footer mirrors the entity name ("No <entity>"
  instead of "No results").
- When `entityName` is omitted, the existing bare "No results." cell
  and "No results" footer are kept verbatim — preserves backward-compat
  for call sites that have not migrated and for the existing test that
  asserts the legacy copy.
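
The branching described in this API section can be summarized as a pure function. This is a sketch of the decision table only, not the component: `describeEmptyState` and `EmptyStateCopy` are hypothetical names, the query is rendered in plain quotes instead of a `<code>` element, and "filter active" is assumed to mean a non-empty query.

```typescript
interface EmptyStateCopy {
  icon: "search" | "inbox" | null;
  title: string;
  description: string | null;
  footer: string;
}

function describeEmptyState(entityName: string | undefined, query: string): EmptyStateCopy {
  if (entityName === undefined) {
    // Legacy fallback for call sites that have not passed `empty.entityName`.
    return { icon: null, title: "No results.", description: null, footer: "No results" };
  }
  const footer = `No ${entityName}`;
  if (query !== "") {
    // Filter-empty: the filter narrowed the result set to zero rows.
    return {
      icon: "search",
      title: "No matches",
      description: `No ${entityName} match "${query}". Try a different query.`,
      footer,
    };
  }
  // Data-empty: there is nothing to show even without a filter.
  return { icon: "inbox", title: `No ${entityName} yet`, description: null, footer };
}
```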

Per-page wiring
Plural lowercase names match how a screen reader reads them mid-
sentence:
  flows                  → "flows"
  templates              → "templates"
  knowledges             → "knowledge documents"
  /settings/providers    → "providers"
  /settings/api-tokens   → "API tokens"
  /settings/prompts ×2   → "agent prompts" / "tool prompts"

Tests
- Existing `'No results.'` assertion stays valid: it renders a
  DataTable without the `empty` prop, exercising the legacy fallback
  path.
- Added 3 new cases under `DataTable — empty results`:
    1. legacy fallback when `empty.entityName` is omitted
    2. data-empty "No <entity> yet" when no filter is active
    3. filter-empty "No <entity> match <query>. Try a different
       query." with the query cited verbatim

Verified
- 0 ESLint errors, 44 pre-existing warnings.
- 511/511 vitest pass in sequential mode and when the affected files
  run isolated. A pre-existing concurrency flake in
  `detail-navigation-sheet.test.tsx` (`typing narrows the listbox
  immediately when searchDebounceMs=0`) occasionally surfaces under
  vitest's default `--file-parallelism` — unrelated to this change,
  passes deterministically with `--no-file-parallelism` and when the
  file is run in isolation.
- Browser sweep across 7 tables (`/flows`, `/templates`,
  `/knowledges`, `/settings/{providers,api-tokens,prompts(×2)}`) all
  render the new copy with `?q=ZZZZZZ`.

References
- shadcn `Empty` component (already vendored in
  `components/ui/empty.tsx`)
- shadcn.io `tables-empty` block FAQ (best-practice: distinguish
  filter-empty vs data-empty via `hasActiveFilters` boolean, ground
  the empty cell in a Lucide icon, surface a "next-step" hint).
- Existing precedent in `pages/resources/resources.tsx` —
  `No resources match <code>{q}</code>. Try a different query.`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The JSDoc blocks added in afb9b63 mostly paraphrased the 12 lines of
code below them. Reviewing line-by-line:

- "two axes" / "filter active vs no filter" / per-state copy — code
  reads identically and is right next to the comment.
- "kept inline because tightly coupled to effectiveQuery" — every
  helper in this file is inline, that's the existing convention.
- "colSpan={columns.length} parent" — the one call site is 10 lines
  below in the same file.
- "for callers that have not migrated yet" — `empty?:` being optional
  already says this.
- "so existing tests stay green" — the test itself documents the
  contract; deleting the fallback fails it loudly.

Kept the one piece TypeScript can't express: a single-line hint that
`entityName` wants the plural lowercase form so the generated copy
reads naturally mid-sentence. That shows up in IDE autocomplete and
saves a "Flows" → "No Flows match" first-try mistake.

No behaviour change. 28/28 DataTable tests pass, 0 ESLint errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README: align supported-models intro with the local convention used
  by every other provider section ("Models marked with `*` are used
  in default configuration"), so the asterisk on each model ID has a
  near-by explanation.
- Installer help (`LLMFormDeepSeekHelp`): swap the legacy
  "DeepSeek-Chat" / "DeepSeek-Reasoner" bullets in "Default PentAGI
  Models" for the current `deepseek-v4-flash` / `deepseek-v4-pro`
  defaults so the wizard guidance matches the bundled config.

No code, schema, or LiteLLM prefix behavior changes.
…text

- Update model descriptions to reflect V4 1M context window (up to 384K output)
  instead of legacy 128K wording in models.yml and README.
- Split Flash and Pro pricing per official DeepSeek API docs:
  - deepseek-v4-flash: input 0.14 / output 0.28 / cache_hit 0.0028 per 1M tokens
  - deepseek-v4-pro:   input 0.435 / output 0.87 / cache_hit 0.003625 per 1M tokens
- Apply per-role price split across all 13 role configs in both the embedded
  config.yml and the user-facing examples/configs/deepseek.provider.yml.
- Replace stale "cache pricing is 10% of input cost" claim in the README,
  which no longer holds for either V4 model.
- No change to LiteLLM prefix behavior, role-to-model mapping, lifecycle,
  queues, GraphQL schema, migrations, frontend, or installer flow.
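
To make the pricing split concrete, here is a small cost estimate using the per-1M-token figures quoted above. It is a sketch under one assumption: cached input tokens are billed at the cache_hit rate in place of the input rate. The type and function names are hypothetical, and the currency is whatever the provider's price list uses.

```typescript
interface ModelPricing {
  input: number; // price per 1M uncached input tokens
  output: number; // price per 1M output tokens
  cacheHit: number; // price per 1M cached input tokens
}

interface TokenUsage {
  uncachedInputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
}

// Figures taken from the commit message above.
const DEEPSEEK_V4_FLASH: ModelPricing = { input: 0.14, output: 0.28, cacheHit: 0.0028 };
const DEEPSEEK_V4_PRO: ModelPricing = { input: 0.435, output: 0.87, cacheHit: 0.003625 };

function estimateCost(pricing: ModelPricing, usage: TokenUsage): number {
  const weighted =
    usage.uncachedInputTokens * pricing.input +
    usage.cachedInputTokens * pricing.cacheHit +
    usage.outputTokens * pricing.output;
  return weighted / 1e6;
}

// Cache-hit price as a fraction of the regular input price.
function cacheHitShareOfInput(pricing: ModelPricing): number {
  return pricing.cacheHit / pricing.input;
}
```

With these numbers the cache-hit price is about 2% of the input price for Flash and about 0.83% for Pro, which is why a flat "10% of input cost" statement no longer fits either model.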
…h roles

DeepSeek V4 thinking mode defaults to enabled on both deepseek-v4-flash
and deepseek-v4-pro; non-thinking behavior requires an explicit toggle.
Per official docs, in thinking mode temperature/top_p/presence_penalty/
frequency_penalty are ignored, so the existing Flash role sampling knobs
would have been silently no-ops without the toggle.

Add extra_body.thinking.type=disabled to the five non-thinking Flash
roles so deepseek-v4-flash actually runs in non-thinking mode and
honors the role's temperature/top_p settings:

- simple, simple_json, adviser, searcher, enricher

Pro roles (primary_agent, assistant, generator, refiner, reflector,
coder, installer, pentester) intentionally keep thinking enabled (the
V4 default) for reasoning, tool-use, and security analysis.

PentAGI provider config already supports extra_body as a first-class
yaml field on AgentConfig and forwards it through openai.WithExtraBody,
which the vxcontrol langchaingo fork serializes at the top level of the
Chat Completions request - the same pattern Kimi uses for tool_choice.
No code, schema, or LiteLLM prefix changes required.
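
The effect of the toggle on the outgoing request can be sketched as follows. This is an illustration of the merge described above, not the Go provider code: `RoleConfig`, `ChatMessage`, and `buildChatRequest` are hypothetical names, and the temperature value is a placeholder rather than a value from the bundled config.

```typescript
interface RoleConfig {
  model: string;
  temperature?: number;
  topP?: number;
  extraBody?: Record<string, unknown>;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(role: RoleConfig, messages: ChatMessage[]): Record<string, unknown> {
  const body: Record<string, unknown> = { model: role.model, messages };
  if (role.temperature !== undefined) body["temperature"] = role.temperature;
  if (role.topP !== undefined) body["top_p"] = role.topP;
  // extra_body keys are merged at the top level of the request body.
  return { ...body, ...role.extraBody };
}

// A non-thinking Flash role: the toggle lets the sampling settings take effect.
const simpleRole: RoleConfig = {
  model: "deepseek-v4-flash",
  temperature: 0.7, // placeholder value
  extraBody: { thinking: { type: "disabled" } },
};

// A Pro role keeps the V4 default (thinking enabled), so no toggle is sent.
const coderRole: RoleConfig = { model: "deepseek-v4-pro" };
```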

Touches:
- backend/pkg/providers/deepseek/config.yml (embedded production config)
- examples/configs/deepseek.provider.yml (user-facing example)

No change to role-to-model mapping, model metadata, pricing, README
wording, LiteLLM prefix, unrelated providers, lifecycle, queues, or
installer flow.
Issue #310 asks how to provide a Google Vertex AI API key in .env for
Anthropic Claude. PentAGI currently has no dedicated Vertex AI provider
path in code: backend/pkg/config and backend/cmd/installer do not read
VERTEX_API_KEY, GOOGLE_APPLICATION_CREDENTIALS, or any vertex_ai
variable. The supported routes for Claude today are direct Anthropic
(ANTHROPIC_API_KEY / ANTHROPIC_SERVER_URL) and AWS Bedrock (BEDROCK_*).

Document this explicitly so users do not assume a hidden Vertex AI
configuration path exists:

- README.md: add a NOTE callout inside the Anthropic Provider
  Configuration section listing the supported routes and pointing
  users who need Vertex AI today at the OpenAI-compatible custom LLM
  provider path (LLM_SERVER_URL / LLM_SERVER_KEY / LLM_SERVER_MODEL)
  fronted by a translating gateway, with a caveat that reliability
  depends on the gateway.
- backend/docs/config.md: add a matching Note paragraph under the
  Anthropic section that points at the AWS Bedrock and custom LLM
  provider sections, and states that no VERTEX_API_KEY or
  GOOGLE_APPLICATION_CREDENTIALS variable is wired into provider
  initialization today.

Docs-only change. No runtime Go code, no installer behavior, no
generated files, no new environment variables. All env var names cited
in the new text already exist in the current PentAGI .env.example,
backend/pkg/config, and backend/cmd/installer.
DataTableFilter / InputSearch listed `onQueryChange` (an inline arrow from
the parent — fresh reference per render) in the emit effect's deps. After
the X-clear handler set `lastEmitted=''` and propagated upstream, the
parent re-rendered and handed back a new handler reference, re-running
the effect while `debouncedValue` was still the pre-clear value
(`useState` inside `useDebouncedValue` hadn't yet been updated by its
trailing `setTimeout`). The effect saw `debouncedValue='alpha' !==
lastEmitted=''` and re-emitted `'alpha'`, snapping URL/state back for
~50–80 ms. JSDOM regression captured `emitted = ['', 'alpha', '']`.

Stash the handler in `useLatestRef` and drop it from deps so the effect
fires only when the observed value actually changes. Same fix in
InputSearch's Esc-clear path. New test in data-table.test.tsx pins the
behavior — fails with the previous code (`['', 'alpha', '']`), passes
now (`['', '']`).
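
The re-emit can be replayed outside React with a small model. This is a simplified sketch, not the component's code: `createEffect` is a hypothetical stand-in for React's dependency comparison, and `simulateClear` is an assumed name that walks through the clear sequence with the handler inside or outside the deps. It only shows whether `'alpha'` comes back; it does not try to match the exact number of `''` emissions in the real component after the fix.

```typescript
// Hypothetical stand-in for React's effect scheduling: rerun `effect`
// only when some dependency changed identity (Object.is comparison).
function createEffect(effect: () => void): (deps: readonly unknown[]) => void {
  let previous: readonly unknown[] | undefined;
  return (deps) => {
    const before = previous;
    previous = deps;
    const changed =
      before === undefined ||
      before.length !== deps.length ||
      deps.some((dep, index) => !Object.is(dep, before[index]));
    if (changed) effect();
  };
}

function simulateClear(handlerInDeps: boolean): string[] {
  const emitted: string[] = [];
  let debouncedValue = 'alpha'; // lags behind the input by one debounce window
  let lastEmitted = 'alpha';
  let onQueryChange = (query: string): void => {
    emitted.push(query);
  };

  const runEmitEffect = createEffect(() => {
    if (debouncedValue !== lastEmitted) {
      lastEmitted = debouncedValue;
      onQueryChange(debouncedValue);
    }
  });
  const render = (): void =>
    runEmitEffect(handlerInDeps ? [debouncedValue, onQueryChange] : [debouncedValue]);

  render(); // mount: 'alpha' was already emitted, nothing to do

  // X-clear: emit '' immediately, before the debounce catches up.
  lastEmitted = '';
  onQueryChange('');

  // Parent re-renders and hands back a fresh inline arrow.
  onQueryChange = (query: string): void => {
    emitted.push(query);
  };
  render();

  // The trailing debounce timer finally publishes the cleared value.
  debouncedValue = '';
  render();

  return emitted;
}

const withHandlerInDeps = simulateClear(true); // handler identity re-runs the effect
const withHandlerBehindRef = simulateClear(false); // handler kept out of the deps
```

In this model the first run replays the captured `['', 'alpha', '']` sequence, while the second never re-emits `'alpha'` because a new handler identity alone no longer re-runs the effect.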

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously DataTableFilter (and InputSearch) layered four cells holding
the same logical value — `localValue` (sync useState), `debouncedValue`
(async useState inside `useDebouncedValue`), `lastEmittedReference`
(ref), and the parent's `query` prop (async round-trip) — and stitched
them together with two opposing `useEffect`s. The async `debouncedValue`
cell could be read out of phase from any other effect, which leaked
through as four distinct races: X clear after debounce flush (URL/state
snap back ~50–80 ms), X clear *before* debounce flush (pending timer
fires after clear), external `query` change mid-debounce (pending
typed value clobbers external), and tail-end character loss on fast
typing across the round-trip boundary.

Drop the extra storage cell. The debounce is now an imperative
`pendingTimerReference` mutated only by `handleChange`,
`cancelPendingEmit`, and the external-sync effect. All outbound
emission goes through one `emit(next)` function that synchronously
cancels any pending timer before sending. `handleClear` → `emit('')`.
External `query` change → cancel pending + accept. Unmount → cancel.
`onQueryChange` lives behind `useLatestRef` so effect deps stay
stable across parent re-renders.
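
A framework-free sketch of the single-emit-path shape described above. It is an illustration under assumptions, not the code in data-table.tsx: `createFilterController`, `acceptExternal`, `dispose`, and `getLocalValue` are hypothetical names, and the React state and ref are modelled as plain closure variables.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>;

interface FilterController {
  handleChange(next: string): void;
  handleClear(): void;
  acceptExternal(query: string): void;
  dispose(): void;
  getLocalValue(): string;
}

// Hypothetical model: in the component these would be handlers plus one ref.
function createFilterController(
  onQueryChange: (query: string) => void,
  delayMs: number,
): FilterController {
  let localValue = '';
  let pendingTimer: TimerHandle | undefined;

  const cancelPendingEmit = (): void => {
    if (pendingTimer !== undefined) {
      clearTimeout(pendingTimer);
      pendingTimer = undefined;
    }
  };

  // The only outbound path: cancel any pending timer first, then send.
  const emit = (next: string): void => {
    cancelPendingEmit();
    localValue = next;
    onQueryChange(next);
  };

  return {
    handleChange(next) {
      localValue = next;
      cancelPendingEmit();
      pendingTimer = setTimeout(() => emit(next), delayMs);
    },
    handleClear: () => emit(''),
    // External `query` change: drop the pending typed value and accept.
    acceptExternal(query) {
      cancelPendingEmit();
      localValue = query;
    },
    // Unmount: nothing may fire afterwards.
    dispose: cancelPendingEmit,
    getLocalValue: () => localValue,
  };
}

const emitted: string[] = [];
const filter = createFilterController((query) => {
  emitted.push(query);
}, 300);

filter.handleChange('alpha'); // schedules a debounced emit
filter.handleClear(); // cancels it and emits '' right away
```

Because every exit goes through `emit` or `cancelPendingEmit`, a timer scheduled before the clear has nothing left to fire.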

Four regression tests in data-table.test.tsx pin all four scenarios.
Verified by rolling back the fix temporarily — the X-clear race test
fails with the exact `emitted = ['', 'alpha', '']` from the original
browser repro. With the fix in place all four pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…llback

The previous iteration moved DataTableFilter / InputSearch off
`useDebouncedValue` and onto an inline `setTimeout` + `pendingTimerRef`
+ `cancelPendingEmit` dance. That fixed all four race scenarios but
left imperative scheduling living inside components — a code smell
flagged in review. After surveying the React 19.2 surface area
(`useDeferredValue` is for *render* priority, not side-effect throttling;
`useTransition` doesn't apply to controlled inputs; `useEffectEvent` is
restricted to handlers called from within Effects) and `use-debounce`'s
API, the right shape is a small, dependency-free hook.

`useDebouncedCallback(fn, delayMs)` returns a stable callable with
`.cancel()` / `.isPending()`. Timer state lives in a closure variable
inside a `useState` lazy initialiser, so:
- the returned identity never changes (safe in effect deps and memoised
  children);
- there are no refs to read during render (no React Compiler complaints);
- `fn` and `delayMs` go through `useLatestRef` so the dispatched call
  always sees the freshest closure — inline arrows in parents are free.
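
The closure-based timer can be sketched without React. This is a minimal model under assumptions, not the hook's source: `createDebouncedCallback` and the `LatestRef` interface are hypothetical, and the two ref arguments stand in for what `useLatestRef` would provide. In the hook, the factory would run once inside the `useState` lazy initialiser.

```typescript
interface LatestRef<T> {
  current: T;
}

interface DebouncedCallback<Args extends unknown[]> {
  (...args: Args): void;
  cancel(): void;
  isPending(): boolean;
}

// Hypothetical factory: the timer lives in this closure, so the returned
// function keeps a single identity for as long as the caller holds it.
function createDebouncedCallback<Args extends unknown[]>(
  fnRef: LatestRef<(...args: Args) => void>,
  delayRef: LatestRef<number>,
): DebouncedCallback<Args> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const cancel = (): void => {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
  };

  const call = (...args: Args): void => {
    cancel(); // a burst collapses into the last call
    timer = setTimeout(() => {
      timer = undefined;
      fnRef.current(...args); // read at fire time, so the freshest closure runs
    }, delayRef.current);
  };

  return Object.assign(call, { cancel, isPending: () => timer !== undefined });
}

const calls: string[] = [];
const fnRef: LatestRef<(value: string) => void> = {
  current: (value) => {
    calls.push(value);
  },
};
const debounced = createDebouncedCallback(fnRef, { current: 250 });

debounced('a');
debounced('al'); // replaces the pending 'a'
const pendingAfterBurst = debounced.isPending();
debounced.cancel();
const pendingAfterCancel = debounced.isPending();
```

Reading `fnRef.current` only when the timer fires is what lets a parent pass a new inline arrow on every render without resetting the debounce.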

DataTableFilter / InputSearch shrink to one useState + one useRef
(`lastEmittedReference` distinguishes our own router round-trip from a
true external change) + one useEffect (external sync). All four
regression tests added in the previous commit still pass; five new unit
tests pin `useDebouncedCallback` semantics (burst coalescing, cancel,
isPending, latest-closure read, unmount cleanup). Browser repro on
/flows for both X-after-flush and X-before-flush scenarios: a single
transition `v="" u=""`, no resurrection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After landing the imperative-timer fix and extracting `useDebouncedCallback`
into our own hook, the use-site count climbed past three and the case for
a single, battle-tested contract beat the case for in-tree code. Drops
both our `useDebouncedValue` and `useDebouncedCallback` and routes every
caller through `use-debounce@10.1.1` (1M+ weekly downloads, ~3KB gzip).

API mapping at each use-site:
- value debounce: `useDebouncedValue(v, ms)` → `const [v] = useDebounce(v, ms)`
- callback debounce: `useDebouncedCallback(fn, ms)` → identical signature,
  same `.cancel()`/`.isPending()` semantics we relied on, plus `.flush()`
  and leading/trailing/maxWait options we now get for free.

Files touched (7 use-sites): data-table.tsx, input-search.tsx,
use-table-state.ts, use-table-query-filter.ts, knowledges-provider.tsx,
use-flow-files-search.ts, use-detail-navigation.ts.

pnpm + Vite gotcha: pnpm hoists into `.pnpm/<name>@<ver>/`, so any
transitive `import React from 'react'` from a third-party package can
resolve to its own physical copy. With one path through our code and
another through use-debounce, the runtime crashed at `useRef` with
"Cannot read properties of null (reading 'useRef')". Added
`resolve.dedupe: ['react', 'react-dom']` in vite.config.ts so the
bundle ships exactly one React.

Tests:
- new input-search.test.tsx (10 tests) covers collapsed/expanded
  presentation, expand-on-click + rAF focus, burst-typing coalesce,
  Esc-clear mid-debounce, two-stage Esc semantics, external-source-wins
  race, Ctrl+F / Cmd+F global hotkey expansion.
- data-table.test.tsx +5 tests: deep-link initial filterValue,
  6-keystroke burst coalesce, unmount-mid-debounce cleanup, two
  independent DataTables on one page (settings/prompts shape),
  back-button-style external clear, three-round type/clear stress.
- full suite: 541/541 passing, lint 0 errors.

Manual smoke via Chrome DevTools MCP on every page that mounts a
filter input — /flows, /templates, /settings/api-tokens,
/settings/prompts (both tables independent), /knowledges (filter
+ semantic search hitting the server through ?qs=). All scenarios
collapse to a single `[{v="", u=""}]` transition; no URL re-emit
after X, no leftover timer after Esc, no cross-table leakage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Running `pnpm prettier:fix` brought three files into compliance:
- data-table.test.tsx — new race-scenario suite added in the previous
  commit; prettier preferred a slightly different multi-line shape for a
  couple of `render(<FilterHost />)` calls.
- input-search.test.tsx — new file in the previous commit; same minor
  multi-line reshaping for `render(<SearchHost />)`.
- settings-api-tokens.tsx — pre-existing single-line nested ternary that
  predates this branch, now collapsed to one line per prettier.

No behavior change. Format-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewing 17451e7: the dedupe was added in reaction to a runtime
`Cannot read properties of null (reading 'useRef')` that I attributed
to dual React copies via pnpm hoist. But I applied two changes at
once — clearing `node_modules/.vite` and adding dedupe — and never
isolated which one actually fixed it.

Empirical re-check just now: production build, dev server, and
the affected page all work with dedupe removed. The real cause was
the stale Vite pre-bundle (`.vite/deps/`), which doesn't auto-rebuild
after `pnpm add` in a running dev server. Cache clear alone fixes it.

That makes the dedupe defensive hygiene at best, and the 7-line
war-story comment alongside it actively misleads — it pins a problem
on a mechanism that wasn't the cause. Removing both keeps the config
honest. QA report updated to record the misdiagnosis as a process
note instead of an architectural rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tion

After the `chore(frontend): upgrade graphql-codegen toolchain` commit
(8a3d4e9) the generated `src/graphql/types.ts` carried every base type
twice — once as `export enum Foo` (from `typescript`) and once as
`export type Foo = 'a' | 'b'` (from `typescript-operations`). Vite's
dependency scan failed on the duplicate declarations with 29
`Identifier 'X' has already been declared` parse errors, which made
the dev server occasionally die on cold start.

Root cause: this was a stale v5-style config running on v6 plugins.
The v6 migration guide is explicit — `preResolveTypes` has been
removed, and `typescript-operations@6` now emits base types itself,
so the canonical setup is `plugins: ['typescript-operations']` rather
than `['typescript', 'typescript-operations']`. Keeping both meant the
two plugins independently emitted the full type set in different
shapes.

Bisect confirmed the split: `typescript` alone → 1327 lines, no
duplicates; `typescript-operations` alone → 1311 lines, no
duplicates; the pair → 2639 lines with 29 duplicate type names.
With `typescript-react-apollo` also in the mix the regenerated file
ballooned to 8819 lines, half of which was the shadow set.

Fix:
- Drop the `typescript` plugin from the codegen config.
- Drop `preResolveTypes: true` (removed in v6).
- Add `enumType: 'native'` so the call sites that use enum-style
  access (`KnowledgeDocType.Answer`, `AgentConfigType.Adviser`, …)
  keep compiling — without it, v6 defaults to string-literal unions.
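
A sketch of what the resulting config might look like. The schema path, the documents glob, and the presence of `typescript-react-apollo` in the final plugin list are assumptions; only the output path, the dropped `typescript` plugin, the removed `preResolveTypes`, and `enumType: 'native'` come from the description above.

```typescript
// In a real `codegen.ts` this object would be typed as `CodegenConfig`
// from '@graphql-codegen/cli' and exported as the default export.
const codegenConfig = {
  schema: './schema.graphqls', // hypothetical path
  documents: ['src/graphql/**/*.graphql'], // hypothetical glob
  generates: {
    'src/graphql/types.ts': {
      // No 'typescript' plugin: per the migration notes above,
      // typescript-operations@6 emits the base types itself.
      plugins: ['typescript-operations', 'typescript-react-apollo'],
      config: {
        // `preResolveTypes` was removed in v6, so it is not listed.
        // Keep real TS enums for `KnowledgeDocType.Answer`-style access.
        enumType: 'native',
      },
    },
  },
};
```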

Result:
- `src/graphql/types.ts` shrinks 8819 → 7586 lines, zero duplicate
  declarations.
- `KnowledgeDocType`, `AgentConfigType`, etc. remain real TS `enum`s
  so `KnowledgeDocType.Answer`-style access keeps working.
- `pnpm run build` clean, lint clean, 531/531 tests passing.
- Vite dev start no longer prints `[PARSE_ERROR] Identifier X has
  already been declared`. `/knowledges` (the page that exercises
  enum access) loads with 10 rows and zero console errors.

Migration guide:
https://the-guild.dev/graphql/codegen/docs/migration/operations-and-client-preset-from-5-0

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…th DataTableFilter)

`DataTableFilter` already ships a conditional trailing `<X>` clear
button; `InputSearch` (the collapsible variant used for /knowledges
semantic search and the same shape sitting in flow detail searches)
only offered keyboard Esc to clear. Unify so every search field across
the app has the same shape.

Implementation:
- Extract a shared `handleClear` used by both the new X button and the
  first-Esc branch. The path cancels any pending debounce, emits `''`
  upstream, and explicitly returns focus to the input — this matches
  the Esc-clear path (where focus is already on the input) and
  side-steps the programmatic-clear collapse effect, so the field
  stays open and ready for fresh typing.
- Render the X only while `isExpanded && localValue` so it can never
  appear on a collapsed/empty field. `aria-label` is contextual:
  "Clear search knowledge documents" / "Clear semantic search" / etc.,
  derived from the host page's `ariaLabel`.
- Update the JSDoc to describe both clear entry points.

Tests (input-search.test.tsx, 15 total — 10 baseline + 5 new):
- absent on empty mount
- present on deep-linked non-empty mount
- shows/hides as the user types and clears
- click clears value, emits `''` upstream, keeps focus on input,
  prevents the trigger from re-appearing
- click mid-typing drops any pending debounce — no leaked emit

Browser verified on /knowledges semantic search:
beforeClear { value: 'jwt', url: '?qs=jwt', xPresent: true }
afterClear  { value: '',    url: '',       xPresent: false, focusOnInput: true }

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Followed up on the "no war-story comments" rule (memory file added a
few commits ago, then promptly broken in the X-button addition).
Pass through every comment with the test "remove it — does a senior
reading the code for the first time understand less?".

Removed:
- "Same two-cell shape as DataTableFilter, see its docstring" — cross-
  file rationale belongs in commit history.
- "Hint shown over the collapsed trigger. Kbd self-themes…" — Kbd
  configuration detail, not part of this component's contract.
- "Mirror size='sm' button geometry used by HeaderButton" — micro
  implementation detail.
- "Collapsed trigger sits flush… has-[>button]:ml-[-0.45rem] handles
  the analogous adjustment" — CSS quirk story.
- "The trigger is a real <button> — never a <div> with onClick" —
  generic HTML/a11y factoid, not specific to this file.
- "The component is controlled — parent owns searchQuery" — visible
  from the props shape.
- "Belt-and-braces for programmatic clears" — defensive-style war
  language, replaced with one line stating the effect's purpose.
- "Initialise expanded if the parent already has a non-empty query" —
  duplicates `useState(() => searchQuery.trim().length > 0)`.

Kept (in single-line / sub-3-line form):
- `COMMIT_DEBOUNCE_MS` shares timing with DataTableFilter.
- `External → local sync` effect intent.
- `rAF` before focus so motion's transform has started.
- `handleInputBlur` reads `localValue`, not the prop (subtle invariant).
- Auto-collapse effect purpose.
- `inputRef.current?.focus()` in `handleClear` (otherwise auto-collapse
  fires and the field disappears under the user).
- Top-level JSDoc shrunk from 28 lines to 5 — what it does plus the
  two equivalent clear paths.

Behaviour unchanged. 15/15 tests still pass, lint clean. File: 340 → 260
lines (-24%).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ent.change

Baseline 8/10 fail (80% flake) → 20/20 pass. Root cause was `user.type`
racing the Radix Sheet focus trap, not the `useDebounce` macrotask
theory from the prior investigation: with `defaultOpen=true` Radix sets
`body { pointer-events: none }`, swallowing user-event's internal click,
so keystrokes route to whatever element currently holds focus (the
SheetContent close button) instead of the input. `inputValue` stays
empty and the listbox never narrows.

`fireEvent.change` mirrors the sidestep already used in the
URL-filter AND test in the same file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lint warnings 44 → 25. The remaining 14 `any` warnings sit in
settings-provider.tsx (provider config form + test results dialog,
needs domain knowledge — deferred per scope). The other 11 warnings
are React Compiler "Compilation Skipped" notices, not code issues.

- lib/log.ts: any → unknown (5) — logger accepts arbitrary input
- shared/markdown.tsx: typed the recursive ReactNode walker with a
  `hasChildrenProp` type guard around `isValidElement`; the components
  map now uses react-markdown's `Components` type so `code` / `pre`
  overrides infer their props (5)
- settings-prompt.tsx: `ControllerProps.control: Control<FieldValues>`,
  `field: ControllerRenderProps<…>`, validation result typed against
  `ValidatePromptMutation['validatePrompt']`, dropped a stray
  `as any` on `removeEventListener` options (4)
- settings-prompts.tsx: row sub-component params typed via
  `Row<T>` from @tanstack/react-table (2)
- settings-providers.tsx: provider icon map typed against
  `SVGProps<SVGSVGElement>`, row sub-component via `Row<Provider>`,
  recursive `getFields(obj)` narrowed from `any` to `unknown` with a
  single boundary cast for `Object.entries` (3)
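
The `any` to `unknown` narrowing with one boundary cast can be illustrated as follows. This is a hypothetical `getFields`, not the one in settings-providers.tsx: the real function's return shape is not shown in this log, so the sketch assumes it collects dotted paths of leaf fields.

```typescript
// Hypothetical shape: collect the dotted path of every leaf field.
function getFields(obj: unknown, prefix = ''): string[] {
  // Narrow `unknown` with runtime checks instead of trusting `any`.
  if (typeof obj !== 'object' || obj === null || Array.isArray(obj)) {
    return [];
  }
  // Single boundary cast: after the checks above, `obj` is a plain object.
  const entries = Object.entries(obj as Record<string, unknown>);
  const fields: string[] = [];
  for (const [key, value] of entries) {
    const path = prefix === '' ? key : `${prefix}.${key}`;
    const isNested = typeof value === 'object' && value !== null && !Array.isArray(value);
    if (isNested) {
      fields.push(...getFields(value, path)); // `value` stays `unknown` here
    } else {
      fields.push(path);
    }
  }
  return fields;
}

const fieldPaths = getFields({
  model: 'x',
  limits: { tokens: 1, retry: { max: 3 } },
  tags: ['a'],
});
```

Keeping the parameter as `unknown` means callers can pass any value, and the cast is confined to the one line where `Object.entries` needs an indexable type.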

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cy-rfc

docs(rfc): propose persistent flow queue and completion webhooks
…dels

fix(deepseek): update default model names to DeepSeek V4
docs(rfc): propose MCP client integration design
…ig-docs

docs(llm): clarify Vertex AI configuration options
…ters

Updated the Qwen agent configuration to include `extra_body` parameters for thinking control across various models. Added `enable_thinking` and `preserve_thinking` options for reasoning agents, while utility agents have `enable_thinking` set to false. Adjusted the Qwen client initialization to support these new configurations. Updated test report to reflect changes in success rates and latencies.
…sks generation stage

Enhanced the flowWorker's task processing by introducing a cancellable context for task execution. This change ensures that tasks can be properly cancelled without reporting false success states. The `runTask` method has been refactored to utilize a new `execTask` method, which centralizes task execution logic and maintains context integrity. This update improves flow control and error handling during task creation and execution.
…rriers

Added new test cases to validate containment barriers in the `ResolvePulledStagedTarget` and `ZipRelativePaths` functions, ensuring that paths escaping the designated directories are rejected. Updated `knowledge_test.go` to cover a nil embedder and error propagation during document creation, improving test coverage and robustness.
@asdek asdek merged commit a112db2 into main May 29, 2026
5 of 6 checks passed
@asdek asdek deleted the feature/next-release branch May 29, 2026 13:17

Development

Successfully merging this pull request may close these issues.

[Bug]: Agent outputs Russian in message/result fields despite Flow.language = "English"

5 participants