feat(talent-market): trading agent templates + MCP auto-install + onboarding/picker fixes#491

Open
cinderzhan wants to merge 24 commits into main from cinder/agent-market-trading-and-mcp

@cinderzhan cinderzhan commented Apr 27, 2026

Closes #471, Closes #474 — both PRs' commits are ancestors of this branch (verified via git merge-base --is-ancestor). Merging this PR will land their content in main; GitHub will auto-close them via the keywords above.

Summary

Three layers of work on the Talent Market and chat experience.

1. Talent Market expansion (templates + UX)

  • 21 new agent templates across trading, software-development, marketing, and office categories
  • Brings built-in template count from 4 → 25 (4 legacy + 11 software-development/marketing/office + 10 trading)
  • New 5th tab "交易投资 / Trading"; Popular tab expanded from 8 to 11
  • Search box that matches against name / description / capability bullets / category, both EN and ZH
  • Per-template Chinese translations (frontend i18n map, no schema changes)
  • Cleaner tab style (subtle underline on active, no font-weight jitter, modal height locked across tabs)
  • Folder-based template loader (backend/agent_templates/<slug>/{meta.yaml,soul.md,bootstrap.md}) replaces inline Python literals

2. MCP / Smithery integration

  • Templates can declare default_mcp_servers in meta.yaml — auto-imported at agent creation
  • Fixes a latent bug where _get_smithery_api_key was returning ciphertext (encrypted in DB but read raw) → 401 from Smithery
  • Live tools/list from runtime server overrides Smithery registry's stale schema (registry advertised sql for shibui/finance; live server requires user_prompt + query)
  • Accept: application/json, text/event-stream header added on Smithery Connect calls (was returning 406)
  • Bare-name lookup fallback in _execute_mcp_tool for when LLM drops the mcp_<server>_ prefix

3. Onboarding + model picker

  • default_skills from template now actually copied to <agent_dir>/skills/ (was silently ignored)
  • Locale-aware first-turn greeting: WS connection passes lang=zh|en; backend prepends a directive so qwen-max etc. respond in user's UI language
  • Onboarding ritual locks on first chunk of greeting, not just deliverable — fires exactly once per (agent, user) regardless of completion
  • Tool list suppressed on greeting turn (~50% prompt reduction → faster TTFT)
  • Model picker dropdown rendered via React Portal with smart up/down placement so it never clips off-viewport
  • Picker change persists to agent.primary_model_id via PATCH; tenant default change migrates following agents
  • "默认" (Default) badge tracks agent's saved default (was tracking tenant default → confusing after picker sync)

Diff stats

  • 21 new agent template folders (meta.yaml + soul.md + bootstrap.md per slug, 63 files)
  • 2 new builtin skills (Market Data, Financial Calendar)
  • New Alembic migration: add_agent_template_default_mcp_servers
  • Backend: agents.py, websocket.py, onboarding.py, resource_discovery.py, agent_tools.py, enterprise.py, template_seeder.py, skill_seeder.py
  • Frontend: TalentMarketModal, ModelSwitcher, PostHireSettingsModal, Chat, AgentDetail + new templateTranslations.ts

Test plan

  • Boot platform fresh, verify Talent Market shows 25 templates across the 4 category tabs (plus Popular), each in its correct category
  • Create a Watchlist Monitor in zh locale → name persists as "盯盘助手", greeting in Chinese
  • Verify mcp_shibui_finance_* tools auto-bind to the new agent
  • Ask "AAPL 现价" ("AAPL current price"); expect the agent to call unlock_financial_analysis, then stock_data_query with user_prompt + query args
  • Switch model in chat picker → verify agent.primary_model_id updates in DB and the "默认" badge follows
  • Change tenant default → verify agents that were following old default migrate to new
  • Open dropdown near top of viewport → expect it to flip downward instead of clipping

🤖 Generated with Claude Code

cinderzhan and others added 24 commits April 24, 2026 14:09
…Talent Market APIs

- New agent_user_onboardings junction table; a row exists per pair the user
  has already been onboarded to, and is inserted as soon as the agent starts
  streaming its first greeting chunk — the lock fires the instant the user
  sees the agent respond.
- onboarding.py service:
    * resolve_onboarding_prompt picks the right system prompt per turn:
        founding (template.bootstrap_content) when this user is the first
        to ever chat with the agent, welcoming (generic built-in) when
        someone else was there first.
    * mark_onboarded writes the junction row with ON CONFLICT DO NOTHING
      semantics to survive concurrent first-turns.
- WebSocket handler prepends the resolved prompt, skips persistence for the
  synthetic "kind=onboarding_trigger" turn the frontend fires, and flips the
  junction row on the first chunk. Old file-based bootstrap.md flow removed
  entirely (no disk write, no post-turn file-existence checks).
- Alembic migrations:
    * add_agent_bootstrap_fields now only adds capability_bullets and
      bootstrap_content to agent_templates (the short-lived
      Agent.bootstrapped flag is no longer part of the design).
    * add_agent_user_onboardings creates the junction table, backfills rows
      for every (agent, user) pair with chat history so established
      relationships never re-onboard, and drops Agent.bootstrapped.
- AgentTemplate gets bootstrap_content + capability_bullets; seeder authors
  founding rituals for the 4 built-in templates.
- Tenant.default_model_id auto-sets on first enabled model added; set-default
  endpoint lets admins reassign; wizard and Talent Market direct-hire both
  inherit the default when no primary_model_id is supplied.
- /tenants/me is readable by any authenticated member so the wizard and chat
  model switcher can tag the current default.
- WebSocket accepts a per-message model_id override for session-scoped model
  switching without persisting the choice.
- AgentOut gains onboarded_for_me (computed per-request from the junction)
  in place of the deprecated bootstrapped flag.
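
A minimal sketch of the idempotent junction write described above, using SQLite from the standard library so it runs anywhere. The table and column names are assumptions for illustration; the real service presumably runs against the project's own database layer.

```python
import sqlite3


def mark_onboarded(conn: sqlite3.Connection, agent_id: str, user_id: str) -> None:
    """Insert the (agent, user) junction row; a duplicate insert is a no-op."""
    conn.execute(
        "INSERT INTO agent_user_onboardings (agent_id, user_id) VALUES (?, ?) "
        "ON CONFLICT (agent_id, user_id) DO NOTHING",
        (agent_id, user_id),
    )
    conn.commit()


def is_onboarded(conn: sqlite3.Connection, agent_id: str, user_id: str) -> bool:
    """Return True when the pair already has a junction row."""
    row = conn.execute(
        "SELECT 1 FROM agent_user_onboardings WHERE agent_id = ? AND user_id = ?",
        (agent_id, user_id),
    ).fetchone()
    return row is not None


# Assumed schema: a composite primary key gives ON CONFLICT something to hit.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE agent_user_onboardings ("
    "agent_id TEXT NOT NULL, user_id TEXT NOT NULL, "
    "PRIMARY KEY (agent_id, user_id))"
)

# Two concurrent first turns would both call this; the second must not fail.
mark_onboarded(conn, "agent-1", "user-1")
mark_onboarded(conn, "agent-1", "user-1")
```

The composite key is what makes the second call harmless: the conflict is resolved by the database rather than by a read-then-write check that could race.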
…g kickoff, ModelSwitcher

- TalentMarketModal replaces the sidebar create-agent button with a grid of
  built-in templates plus a dashed custom-agent card that routes to the
  existing wizard.
- PostHireSettingsModal collects visibility (company/only me) + preferred
  model before creating; 仅创建 (Create only) vs 立即对话 (Chat now) actions;
  the latter lands on /agents/:id#chat so users skip the status tab on first
  entry.
- Onboarding kickoff (AgentDetail + Chat): when agent.onboarded_for_me is
  false and a new session is opened, fire a tagged {kind: "onboarding_trigger"}
  message. Backend swallows the user turn and streams the assistant greeting
  — the agent opens the conversation itself, no visible user bubble. One-shot
  per (agent, session) via a ref set. Founding vs welcoming content is
  decided server-side.
- ModelSwitcher: compact pill in the chat toolbar, docked right next to the
  send button with a fixed gap via a margin-left:auto right-hugging group.
  Value resets to agent.primary_model_id on every mount for session scope;
  dropdown tags the tenant default with 默认 (Default).
- AgentCreate wizard preselects tenant.default_model_id in the model step.
- EnterpriseSettings LLM tab shows a 默认 (Default) badge on the current
  default and a 设为默认 (Set as default) button on other enabled models.
- Sidebar rename: 新建数字员工 (New digital employee) → 招聘新成员 (Hire new
  member); opens the Talent Market.
- en.json + zh.json: talentMarket.*, nav.hire, postHire.* entries
- Agent interface: onboarded_for_me replaces the deprecated bootstrapped
- tenantApi.me() for the shared /tenants/me endpoint
- enterpriseApi.setDefaultModel() for the admin set-default action
…erable

Post-review feedback: the previous prompt was only injected on the trigger
turn, so the agent followed instructions on turn 1 (greeted + asked one
question) but defaulted back to asking clarifying questions on turn 2. The
deliverable step never landed.

Rework the injection so both turns of the ritual are guarded:

- resolve_onboarding_prompt now returns an OnboardingInjection(prompt,
  lock_on_first_chunk) with the real user-message count interpolated.
  It keeps firing for a given (user, agent) pair until user_turns >= 1
  AND streaming begins; the junction row is only committed after the
  deliverable turn, so if the user disconnects mid-greeting they still
  get the ritual on their next visit.

- All four built-in bootstrap_contents (PM, Designer, Product Intern,
  Market Researcher) rewritten around a {user_turns} branch:
    * user_turns == 0 → warm one-line greeting + ONE tight question;
      stop. No scope/team/context/deadline follow-ups.
    * user_turns >= 1 → whatever they named is the subject. DO NOT ask
      clarifying questions. Produce a concrete first deliverable inline
      tailored to the role: a one-page project snapshot for PM, a
      quick-win audit for Designer, a competitive snapshot for the
      intern, a landscape map for the researcher. Close with a single
      clear next-step offer.
- The generic welcoming prompt (non-founders) follows the same two-turn
  branch so every first-meeting feels consistent.
- Onboarding sessions now get titled "Onboarding" up front in the WS
  trigger path — the subsequent auto-title logic only overrides titles
  that start with "Session ", so this stays sticky across the flow.
- Fixed list_sessions filter that was hiding onboarding sessions from
  the "我的会话" (My Sessions) tab: it counted only user messages, but agent-initiated
  sessions have none yet. Count total messages instead.
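
The two-turn injection in this commit can be sketched roughly as follows. The bootstrap text is a shortened placeholder, and the function signature is an assumption; only the names OnboardingInjection, resolve_onboarding_prompt, lock_on_first_chunk, {name}, and {user_turns} come from the message above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnboardingInjection:
    prompt: str
    lock_on_first_chunk: bool


# Placeholder bootstrap text; the real templates are much longer.
BOOTSTRAP = (
    "You are {name}. The user has sent {user_turns} message(s) so far.\n"
    "If user_turns == 0: give a warm one-line greeting, ask ONE question, stop.\n"
    "If user_turns >= 1: do not ask clarifying questions; produce the first "
    "deliverable inline and close with one next-step offer."
)


def resolve_onboarding_prompt(bootstrap: str, name: str, user_turns: int) -> OnboardingInjection:
    """Interpolate the real turn count and decide whether this turn may lock."""
    prompt = bootstrap.format(name=name, user_turns=user_turns)
    # As of this commit, only the deliverable turn commits the junction row.
    return OnboardingInjection(prompt=prompt, lock_on_first_chunk=user_turns >= 1)


greeting = resolve_onboarding_prompt(BOOTSTRAP, "Alex", user_turns=0)
deliverable = resolve_onboarding_prompt(BOOTSTRAP, "Alex", user_turns=1)
```

Because the same prompt is injected on both turns with a different count, the model sees the branch it is on instead of having to infer it from history.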
…amed greeting

Three issues from post-push review:

1. Opening a new session with an agent that had already completed onboarding
   re-triggered the ritual. Root cause: the frontend's agent query was
   cached with onboarded_for_me=false from before the lock fired, so the
   kickoff effect fired again on the fresh session; backend then accepted
   the trigger and the "Please begin the onboarding" placeholder reached
   the LLM, which dutifully started over.
   Fix: WebSocket guards stale triggers — if the pair already has a
   junction row, short-circuit before LLM and emit a {type: "onboarded"}
   event. Frontend invalidates the cached agent query on that event (and
   when the real lock fires on turn 2), so subsequent sessions observe
   onboarded_for_me=true and skip the kickoff.

2. Greeting content was too terse — no name, no capability pitch, no
   emphasis on key info.
   Fix: prompts now greet the user by display_name, bold the agent name,
   list 2–3 bolded capabilities, bold the single question, and bold
   section headers + next-step phrases in the deliverable turn. The
   welcoming prompt (non-founders) follows the same structure.
   Templates now interpolate {user_name} alongside {name} and {user_turns}.

3. Session title now explicitly set to "Onboarding" on the trigger turn so
   it's identifiable in the session list before the user has typed
   anything (previously already shipped; kept intact here).

Backend, frontend, and prompts all land together so the lock round-trip
and the richer content surface in the same release.
…dal/toast

Users were seeing browser-default confirm() and alert() popups (session delete,
model connectivity test, etc.) that broke visual consistency with the rest of
the app. Introduces a unified dialog/toast system so every notification uses
the Clawith UI.

- DialogProvider + useDialog(): centered modal with Promise-based confirm()
  and alert() API, info/success/warning/error types, collapsible details for
  long error payloads (e.g. LLM connectivity test raw errors).
- ToastProvider + useToast(): top-right auto-dismissing notifications with
  the same type/details support for non-critical errors.
- Migrated all 46 native alert/confirm call sites across 9 files following
  the split: destructive or must-acknowledge -> dialog; non-critical -> toast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a folder-based template loader in template_seeder.py that reads
backend/agent_templates/<slug>/{meta.yaml, soul.md, bootstrap.md} and
merges with the existing Python DEFAULT_TEMPLATES list.
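
As a rough illustration of the loader shape (not the actual template_seeder.py code), the sketch below walks one folder per slug and merges the result with an in-code list. The flat `key: value` parser is a toy stand-in for a real YAML parser, and the merge precedence (folder entries win on a name clash) is an assumption.

```python
from pathlib import Path


def parse_flat_meta(text: str) -> dict:
    """Toy stand-in for a YAML parser: handles only flat `key: value` lines."""
    meta = {}
    for line in text.splitlines():
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta


def load_folder_templates(root: Path) -> list:
    """Read <root>/<slug>/{meta.yaml, soul.md, bootstrap.md} into dicts."""
    templates = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        meta_path = folder / "meta.yaml"
        if not meta_path.exists():
            continue  # not a template folder
        template = parse_flat_meta(meta_path.read_text(encoding="utf-8"))
        template["slug"] = folder.name
        template["soul"] = (folder / "soul.md").read_text(encoding="utf-8")
        template["bootstrap_content"] = (folder / "bootstrap.md").read_text(encoding="utf-8")
        templates.append(template)
    return templates


def merge_templates(python_defaults: list, folder_templates: list) -> list:
    """Merge by template name; folder entries override (assumed precedence)."""
    merged = {t["name"]: t for t in python_defaults}
    merged.update({t["name"]: t for t in folder_templates})
    return list(merged.values())
```

Keeping the loader and the merge separate means the legacy Python literals can be retired one template at a time.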

Ship 11 new built-in templates across 3 categories:
  - software-development (5): frontend-developer, backend-architect,
    code-reviewer, devops-automator, rapid-prototyper
  - marketing (5): growth-hacker, content-creator, seo-specialist,
    tiktok-strategist, linkedin-content-creator
  - office (1): chief-of-staff

Also realign the existing 4 legacy templates' categories to the new
3-bucket taxonomy:
  - Project Manager: management -> office
  - Designer: design -> software-development
  - Product Intern: product -> software-development
  - Market Researcher: research -> marketing

Each new template follows the minimal 4-section soul structure
(Identity / Personality / Work Style / Boundaries) extended with
clawith-runtime bullets for workspace usage, memory sinking, and
heartbeat focus. Bootstrap prompts align with the two-turn ritual
(greeting turn vs deliverable turn) introduced by onboarding.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Design review document for the folder-based template expansion:
rationale, taxonomy, soul/bootstrap contract, rollout plan, and the
confirmed answers to the open questions raised during review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	frontend/src/pages/Chat.tsx
… tabs

Bootstrap intros: drop the redundant "I'm X, your X" pattern that read
awkwardly when users kept the template's default name. New form is
"I'm **{name}** — <one-line value prop>" which works whether the agent
is named "Alex" or "Growth Hacker".

Bootstrap deliverables: rework sections that asked the LLM to produce
nested lists ("each with A, B, C") to use a single-line compound format
with " | " separators. Fixes broken ordered-list numbering when the
renderer saw `- bullet` lines interleaved with `1.` numbered items.
Affects: growth-hacker, content-creator, seo-specialist,
tiktok-strategist, code-reviewer, devops-automator.

Talent Market UI: add four category tabs (Popular / Software Development
/ Marketing / Office) above the template grid. "Popular" is the default
and shows a curated set of 8 broadly useful roles; the other three tabs
filter by `category`. Tabs are i18n'd with Chinese fallbacks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tabs:
- Drop the per-tab gap and bold-on-active jitter; tabs now sit flush,
  with consistent font-weight (500) for stable layout
- Active indicator uses text-primary color (matches the screenshot
  reference) instead of accent — reads as "selected" without screaming
- Active tab's 2px underline overlaps the rail via -1px marginBottom
  so the indicator sits exactly on the line, not below it
- Hover state on inactive tabs nudges color to text-primary for
  affordance without layout shift

Modal sizing:
- Replace `maxHeight: '88vh'` with `height: 'min(88vh, 720px)'` so the
  dialog stays the same size when switching tabs (Office tab has only
  2 cards and previously caused the modal to shrink awkwardly)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two new builtin skills under category "trading", supporting the
upcoming trading agent templates (see AGENT_MARKET_TRADING_SPEC.md).

market-data (icon MD)
- Recommends Smithery's shibui/finance MCP server (free, no API key)
- Covers US equities: quotes, OHLCV history (64 years), financial
  statements, 56 pre-computed technical indicators, EPS data
- Documents a step-by-step protocol: check tools → install via MCP_INSTALLER
  → unlock_financial_analysis → stock_data_query
- Notes v1 scope (US stocks only) and v2 roadmap (futures, FX, crypto)

financial-calendar (icon FC)
- v1 is a structured wrapper around web-research with curated query
  templates (no Smithery MCP currently fits the earnings/macro calendar
  use case)
- Covers: earnings releases, FOMC schedule, US data calendar (CPI, NFP,
  GDP), central bank decisions
- Caches results in memory/calendar_<YYYY-MM>.md to avoid repeated
  fetches, with re-verification rules for high-impact events
- v2 roadmap: dedicated MCP backed by finnhub or trading-economics

Both skills are non-default (won't auto-attach to existing agents);
trading templates in Phase 1 will reference them via default_skills in
meta.yaml.

Also marks the 6 trading-spec open questions as resolved with the
agreed answers from review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the full trading track to the Agent Market: 10 templates spanning
intel, monitoring, analysis, risk, and review — all in analysis-education
positioning, no broker execution.

Phase 1 — core 6:
  - Market Intel Aggregator (MIA): daily signal/noise news brief
  - Macro Watcher (MW): central banks, data prints, geopolitical events
  - Watchlist Monitor (WM): intraday alerts with active-hours discipline
  - Technical Analyst (TA): 4-section chart reads with explicit invalidation
  - Risk Manager (RM): Stage / Guards / Push flow modeled after OpenAlice's
    Trading-as-Git, but never touches a broker — produces a parameter card
    that the user manually enters
  - Trading Journal Coach (TJC): weekly review, behavior pattern detection,
    rule evolution into memory/trading_rules.md (closes the loop with RM)

Phase 2 — supplementary 4:
  - Earnings & Filings Analyst (EFA): operating change / risk change /
    valuation anchor change reads from 10-K/10-Q/8-K
  - COT Report Analyst (COT): weekly CFTC positioning digest with
    extreme detection
  - Pre-Market & Open Briefer (PMB): one-screen 8am ET brief on US
    trading days, silent otherwise
  - Tilt & Bias Coach (TBC): 5-question check-in with GO/PAUSE/STOP
    state output

All trading souls carry three hardcoded compliance lines:
  - "I frame everything as analysis or education, never investment advice"
  - "I never place, modify, or cancel orders, never enter brokerage
    credentials, never touch private keys"
  - "Every directional or numerical claim ships with its source and
    confidence — guesses tagged 'my read', historical data with as-of date"

All trading bootstraps include a one-line disclaimer in the greeting turn:
"_I help with research, analysis, and discipline — I won't place trades
or give investment advice._"

Frontend:
  - TalentMarketModal: add 5th tab "交易投资 / Trading"
  - Popular tab expanded from 8 to 11 (added MIA + WM + TJC)

Total Agent Market state: 25 templates across 4 categories
(marketing 6, office 2, software-development 7, trading 10).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ting

Three improvements to the Talent Market and onboarding flow.

1. Search box in Talent Market
   - Header gets a magnifier-prefixed input with clear button
   - Searching matches against name, description, capability bullets, and
     category, in both English and the localized strings, so a Chinese
     keyword like "盯盘" (watching the market) finds Watchlist Monitor
   - When the query is non-empty, tabs go un-highlighted and the grid
     shows all matches across categories. Clicking any tab clears the
     search and returns to category view
   - Empty state messaging differentiates "no match for X" vs "no
     templates in this category"

2. Per-template Chinese translations
   - New frontend/src/i18n/templateTranslations.ts maps all 25 built-in
     template names (English source-of-truth) to their localized name,
     description, and capability bullets
   - TemplateCard renders the localized version when i18n.language
     starts with 'zh', falls through to English otherwise
   - Backend stays single-source-English; no schema changes

3. First-turn greeting follows user's interface language
   - Frontend (Chat.tsx, AgentDetail.tsx) appends &lang=zh|en to the WS
     connect URL based on i18n.language
   - Backend WS endpoint accepts lang as a query param and threads it
     into resolve_onboarding_prompt
   - onboarding.py prepends a one-line locale directive to the bootstrap
     content so Turn 0 lands in Chinese for zh users (Turn 1+ falls back
     to the soul's standard language-detection rule, matching whatever
     language the user actually typed)
   - Zero changes to the 25 template bootstraps — directive is wrapper-level
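
The bilingual search in item 1 lives in the TypeScript frontend; the matching rule itself is sketched here in Python for illustration. The card fields and the shape of the translations map are assumptions, and only the localized name "盯盘助手" is taken from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateCard:
    name: str
    description: str
    category: str
    bullets: list = field(default_factory=list)


# Hypothetical translations map keyed by the English template name.
ZH = {"Watchlist Monitor": {"name": "盯盘助手"}}


def matches(card: TemplateCard, query: str, translations: dict) -> bool:
    """Case-insensitive substring match over English and localized strings."""
    q = query.strip().lower()
    if not q:
        return True  # empty query: the category view applies instead
    zh = translations.get(card.name, {})
    haystack = [
        card.name,
        card.description,
        card.category,
        *card.bullets,
        zh.get("name", ""),
        zh.get("description", ""),
        *zh.get("bullets", []),
    ]
    return any(q in text.lower() for text in haystack)


card = TemplateCard(
    name="Watchlist Monitor",
    description="Intraday alerts with active-hours discipline",
    category="trading",
)
```

Searching both string sets at once is what lets a zh keyword find a template whose stored name is English.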

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sh in greetings

Two follow-up fixes for the zh-locale flow.

1. PostHireSettingsModal now translates the template's name and
   description through translateTemplate() before posting to the
   create-agent endpoint. Without this, the DB persisted the English
   template name (e.g. "Rapid Prototyper") even when the user picked
   the card showing "快速原型工程师" — the agent then displayed in
   English everywhere forever.

2. The locale directive in onboarding.py was too soft: the LLM was
   treating bootstrap's hardcoded English snippets ("Hi {user_name}!",
   "MVP scoping — strip an idea to...", etc.) as verbatim strings
   to copy rather than templates to translate. Strengthened the
   directive with explicit guidance:
     - The bootstrap below is in English but is a STRUCTURE TEMPLATE
     - Translate ALL example greetings, labels, pitches, questions
     - Preserve markdown structure (bold, bullets) but use Chinese words
     - Keep proper nouns + conventional acronyms in English
   English locale gets a lighter directive (no translation needed).

Note: existing agents created before this fix retain their English
names in the DB. Users can rename them manually or delete + re-create
from the Talent Market.
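
A wrapper-level directive of this kind might look like the sketch below. The directive wording is illustrative, not the text shipped in onboarding.py, and the function name is hypothetical.

```python
ZH_DIRECTIVE = (
    "Respond in Chinese. The bootstrap below is written in English but is a "
    "STRUCTURE TEMPLATE: translate ALL example greetings, labels, pitches, and "
    "questions. Preserve markdown structure (bold, bullets). Keep proper nouns "
    "and conventional acronyms in English."
)
EN_DIRECTIVE = "Respond in English."


def with_locale_directive(bootstrap: str, lang: str) -> str:
    """Prepend a locale directive without touching the template itself."""
    directive = ZH_DIRECTIVE if lang.lower().startswith("zh") else EN_DIRECTIVE
    return f"{directive}\n\n{bootstrap}"


prompt_zh = with_locale_directive("Hi {user_name}! I'm **{name}**.", "zh")
prompt_en = with_locale_directive("Hi {user_name}! I'm **{name}**.", "en")
```

Since the directive is prepended at call time, none of the template bootstraps need a per-language copy.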

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The create_agent handler was only consuming `data.skill_ids` (user-picked)
+ globally-default skills (mcp-installer / skill-creator / complex-task-
executor). It silently ignored the template's `default_skills` list.

Result: trading templates declared `default_skills: ["market-data"]` in
their meta.yaml, but the resulting agent's filesystem at
`~/.clawith/data/agents/<id>/skills/` only had the three globals — no
market-data SKILL.md. The agent therefore never knew it should install
shibui/finance via MCP_INSTALLER, and silently fell back to web search
forever.

Fix: load the AgentTemplate, expand its default_skills folder names into
Skill IDs, and union them into the merged skill set before the file-copy loop. Now
trading agents come up with market-data SKILL.md in place, which guides
them to auto-import the shibui/finance MCP using the system Smithery key
(already configured at company level).
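
The merge can be pictured as a set union over three sources. This is a sketch under assumptions: the function name, the folder-to-ID mapping, and the ID values are hypothetical.

```python
def resolve_skill_ids(user_skill_ids, global_default_ids, template_default_skills, folder_to_id):
    """Union user-picked, globally-default, and template-declared skills."""
    # Expand folder names such as "market-data" into skill IDs; unknown
    # folder names are skipped rather than failing agent creation (assumed).
    template_ids = {folder_to_id[f] for f in template_default_skills if f in folder_to_id}
    return set(user_skill_ids) | set(global_default_ids) | template_ids


# Hypothetical folder-name -> ID mapping.
FOLDER_TO_ID = {"market-data": 101, "financial-calendar": 102, "mcp-installer": 1}

skill_ids = resolve_skill_ids(
    user_skill_ids=[],
    global_default_ids=[1, 2, 3],
    template_default_skills=["market-data"],
    folder_to_id=FOLDER_TO_ID,
)
```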

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds end-to-end "MCP pre-install" so trading agents come up with
shibui/finance ready to call, instead of relying on the agent to
discover and install it via the MCP_INSTALLER skill on first use
(which depends on LLM-side compliance — qwen-max in particular
invents a fake `market_data` tool name instead of reading SKILL.md).

Path B from the design discussion. Three pieces:

1. Schema — AgentTemplate gets `default_mcp_servers: list[str]`
   - Alembic migration `add_default_mcp_servers` adds the column
   - Loader in template_seeder.py reads it from meta.yaml
   - Seeder upserts it (both create + update branches)

2. Trading templates declare their MCP needs — 8 of 10 trading
   templates (all except macro-watcher and tilt-bias-coach which
   don't use market-data) now ship with:
     default_mcp_servers:
       - "shibui/finance"

3. create_agent runs the imports — after skill files copy, if the
   template's default_mcp_servers is non-empty:
     a. Commit the in-flight transaction so the agent row exists
        in DB (otherwise import_mcp_from_smithery's parallel session
        gets FK violations on agent_tools.agent_id)
     b. For each server_id, call import_mcp_from_smithery with the
        system Smithery key (no per-agent prompt)
     c. Each MCP's tools land in the Tool table and AgentTool binds
        them to the new agent. Idempotent — Tool by mcp_server_url
        is reused on subsequent agents
   Failures are logged + swallowed; agent creation succeeds even if
   Smithery is down.
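
The "logged + swallowed" behaviour in step 3 could be sketched like this. `import_fn` stands in for import_mcp_from_smithery, whose real signature is not shown in this PR, so the single-argument call is an assumption.

```python
import logging

logger = logging.getLogger("agent_creation")


def import_default_mcp_servers(server_ids, import_fn):
    """Try each import; log failures and keep going so creation never fails."""
    imported = []
    for server_id in server_ids:
        try:
            import_fn(server_id)
            imported.append(server_id)
        except Exception:
            # Smithery being down must not block agent creation.
            logger.exception("MCP auto-import failed for %s; continuing", server_id)
    return imported
```

Returning the list of successful imports gives the caller something to log or surface without turning a partial failure into an error.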

Also fixes a latent bug discovered along the way: _get_smithery_api_key
was reading the encrypted ciphertext from tool config (encrypted by
api.tools._encrypt_sensitive_fields) and sending it raw to Smithery —
producing 401 Unauthorized. Now decrypts via app.core.security.decrypt_data
with a fallback to plaintext for legacy keys.
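
The decrypt-with-fallback pattern can be illustrated as below. `toy_decrypt` is a deliberately trivial stand-in (base64 with an "enc:" prefix) and is NOT how app.core.security.decrypt_data works; only the try-then-fall-back shape is the point.

```python
import base64


def toy_decrypt(value: str) -> str:
    """Stand-in for the real decrypt call; raises ValueError on non-ciphertext."""
    if not value.startswith("enc:"):
        raise ValueError("not ciphertext")
    return base64.b64decode(value[4:]).decode("utf-8")


def read_api_key(stored: str, decrypt=toy_decrypt) -> str:
    """Decrypt the stored key, falling back to the raw value for legacy rows."""
    try:
        return decrypt(stored)
    except Exception:
        # Legacy keys were saved in plaintext before encryption was added.
        return stored


encrypted = "enc:" + base64.b64encode(b"sk-test").decode("ascii")
```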

Verified end-to-end: created an agent from Watchlist Monitor with
this code path → DB shows mcp_shibui_finance_unlock_financial_analysis
and mcp_shibui_finance_stock_data_query bound to the new agent, MCP
URL https://finance--shibui.run.tools, Smithery returned 201 Created.
3.45s end-to-end (including the network round-trip to Smithery).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime path that invokes an MCP tool via Smithery Connect
(_execute_via_smithery_connect) was sending POSTs with only:
  Authorization: Bearer <key>
  Content-Type: application/json

Smithery Connect (and shibui/finance behind it) returns SSE format
"event: message\ndata: {...}" for tools/call responses, and rejects
clients that don't advertise both content types in Accept with:
  HTTP 406 Not Acceptable: Client must accept both
  application/json and text/event-stream

Fix: declare both in Accept. The downstream parser already handles
both JSON and SSE response shapes — the only thing missing was the
header that lets the server actually send SSE.
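
A sketch of the request headers plus a parser that accepts both response shapes. The function names are hypothetical and the parser is a simplification of whatever the downstream code does; it handles a plain JSON body or a single SSE event.

```python
import json


def smithery_headers(api_key: str) -> dict:
    """Headers for a Smithery Connect POST, advertising both content types."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }


def parse_mcp_response(body: str) -> dict:
    """Parse either a plain JSON body or one SSE event carrying JSON data."""
    text = body.strip()
    if text.startswith("{"):
        return json.loads(text)
    # SSE: collect the payload of every "data:" line in the event.
    data_lines = [
        line[len("data:"):].strip()
        for line in text.splitlines()
        if line.startswith("data:")
    ]
    if not data_lines:
        raise ValueError("no JSON body or SSE data line found")
    return json.loads("\n".join(data_lines))


sse_body = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {"ok": true}}\n\n'
json_body = '{"jsonrpc": "2.0", "id": 1, "result": {"ok": true}}'
```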

Repro before fix: create a Watchlist Monitor (with shibui MCP
auto-installed by the previous commit), ask "苹果的现价是?" ("What is
Apple's current price?"). Agent
calls mcp_shibui_finance_unlock_financial_analysis and
mcp_shibui_finance_stock_data_query. Both return:
  ❌ MCP tool error: Not Acceptable: Client must accept both
  application/json and text/event-stream
Agent then falls back to jina_search and returns a stale yfinance URL.

After fix: same query → both MCP tools resolve normally and return
real Shibui data.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…+ bare-name lookup

Two interlocking bugs that together kept shibui/finance unusable end-to-end.

1. Stale registry schema
   import_mcp_from_smithery was reading tool inputSchemas from
   registry.smithery.ai/servers/<qn> (registry detail). Smithery's
   registry can drift behind what the deployed server actually
   accepts. shibui/finance was a clear case:

     registry advertised:
       stock_data_query(sql: string)
     deployed server required:
       stock_data_query(user_prompt: string, query: string)

   Result: agent built tool calls with `sql=...` per the registered
   schema, server replied "3 validation errors: missing user_prompt,
   missing query". Also: the registry listed only 2 of the 7 tools the live
   server exposes; those it omitted include get_database_schema and the
   load_*_workflow prompts.

   Fix: after auto-creating the Smithery Connect connection, POST
   tools/list to the runtime endpoint and replace the registry tool
   set with whatever the live server returns. Falls back to the
   registry data if the live call fails.

   Verified: re-importing shibui now produces 7 tools with correct
   schemas (user_prompt + query, not sql).
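
   The live-over-registry preference can be sketched as below. `fetch_live_tools` is a hypothetical callable wrapping the tools/list POST, and treating an empty live result as a reason to fall back is an assumption beyond what the commit states.

```python
def resolve_tool_schemas(registry_tools, fetch_live_tools):
    """Prefer the live server's tools/list; fall back to registry data."""
    try:
        live = fetch_live_tools()
    except Exception:
        return registry_tools  # live call failed: keep the registry schema
    # Assumed: an empty live list is treated like a failed call.
    return live or registry_tools


registry = [{"name": "stock_data_query", "inputSchema": {"required": ["sql"]}}]
live = [{"name": "stock_data_query", "inputSchema": {"required": ["user_prompt", "query"]}}]
```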

2. Bare-name lookup fallback
   When the LLM (qwen-max in particular) calls an MCP tool, it
   sometimes drops the clawith mcp_<server>_ prefix and uses the
   bare server-side name. The dispatcher in _execute_mcp_tool only
   looked up by Tool.name (the prefixed form), so calls like
   `unlock_financial_analysis` returned "Unknown tool: ..." even
   though `mcp_shibui_finance_unlock_financial_analysis` existed.

   Fix: if the prefixed lookup misses, retry by Tool.mcp_tool_name.
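
   A minimal sketch of the two-step lookup, using an in-memory list in place of the Tool table. The `Tool` dataclass and `find_tool` helper are illustrative; only the name and mcp_tool_name fields come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str           # prefixed form, e.g. mcp_<server>_<tool>
    mcp_tool_name: str  # bare server-side name


def find_tool(tools, requested: str) -> Optional[Tool]:
    """Look up by prefixed name first, then retry by the bare MCP name."""
    for tool in tools:
        if tool.name == requested:
            return tool
    for tool in tools:
        if tool.mcp_tool_name == requested:
            return tool
    return None


TOOLS = [
    Tool("mcp_shibui_finance_unlock_financial_analysis", "unlock_financial_analysis"),
    Tool("mcp_shibui_finance_stock_data_query", "stock_data_query"),
]
```

   Checking the prefixed name first keeps the exact match authoritative if two servers ever expose the same bare tool name.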

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rable

The greeting turn was intentionally not locking the (agent, user)
junction row — only user_turns >= 1 (deliverable turn) committed it.
The intent was defensive: if a user disconnected before replying we'd
re-greet them next time so the conversation didn't feel mid-flow.

Real-world feedback contradicts that: users who close the tab during
the greeting and come back later see the onboarding ritual fire a
second time. The bootstrap bullets, the "Hi {user_name}!", and the
opening question all replay — confusing because the chat already
shows the prior greeting in history.

Fix: always set lock_on_first_chunk=True. The WS handler's stream
callback (websocket.py:531) calls mark_onboarded on the very first
streamed chunk, so even a mid-greeting disconnect leaves the
junction row committed and subsequent opens skip onboarding.

This makes "ritual fires exactly once per (agent, user) pair" the
guarantee, regardless of whether the user replied or not.
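
The first-chunk lock amounts to a one-shot callback wrapped around the stream. The class below is an illustrative sketch; the actual callback in websocket.py is not shown in this PR.

```python
class FirstChunkLock:
    """Call mark_onboarded exactly once, on the first streamed chunk."""

    def __init__(self, mark_onboarded):
        self._mark_onboarded = mark_onboarded
        self._fired = False

    def on_chunk(self, chunk: str) -> str:
        if not self._fired:
            # Flip the flag first so a failure in the callback cannot re-fire.
            self._fired = True
            self._mark_onboarded()
        return chunk


calls = []
lock = FirstChunkLock(lambda: calls.append("locked"))
streamed = [lock.on_chunk(c) for c in ["Hi", " there", "!"]]
```

If the client disconnects after the first chunk, the junction row is already written, which is the behaviour the commit is after.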

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tion

Onboarding's greeting turn (user_turns == 0) produces a fully
templated reply per the bootstrap_content: "Hi {user_name}!", 2-3
capability bullets, one targeted question. It never calls tools.
But the LLM was being given the full ~30-tool function-call schema
in the system prompt anyway — costing about 3,600 tokens of context
on top of the 2,600-token static prompt and the bootstrap itself.

For a Watchlist Monitor with shibui MCP installed, the prompt
budget on first hello hits ~6,400 tokens. Qwen-max chews on that
for several seconds before the first chunk shows up, so users see
"Analysing…" / "thinking…" for a noticeably long time on every new agent's
first chat.

Wire a skip_tools flag end-to-end:

  OnboardingInjection now exposes is_greeting_turn (true on
  user_turns == 0). The WS handler reads it and passes
  skip_tools=True to call_llm_with_failover. call_llm_with_failover
  threads it to call_llm (both primary and fallback). call_llm
  short-circuits get_agent_tools_for_llm to an empty list when set.

Net effect on the greeting turn: ~3,600 tokens removed from the
prompt (about half), TTFT drops correspondingly. Deliverable turn
(user_turns >= 1) still gets the full tool list, since the agent
typically does want to query data, write workspace files, etc. on
its first useful action.
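
The flag threading might look like the sketch below. The function names match the commit message, but their signatures, the stubbed tool list, and the request dict are assumptions made to keep the example runnable.

```python
def get_agent_tools_for_llm(agent_id: str) -> list:
    """Stub: the real function would load the agent's bound tool schemas."""
    return [{"name": "mcp_shibui_finance_stock_data_query"}, {"name": "jina_search"}]


def call_llm(agent_id: str, messages: list, skip_tools: bool = False) -> dict:
    """Build the request; the greeting turn sends no tool schemas at all."""
    tools = [] if skip_tools else get_agent_tools_for_llm(agent_id)
    return {"messages": messages, "tools": tools}


def call_llm_with_failover(agent_id, messages, skip_tools=False,
                           primary=call_llm, fallback=call_llm):
    """Pass skip_tools to both the primary and the fallback call."""
    try:
        return primary(agent_id, messages, skip_tools=skip_tools)
    except Exception:
        return fallback(agent_id, messages, skip_tools=skip_tools)


greeting_request = call_llm_with_failover("agent-1", [], skip_tools=True)
deliverable_request = call_llm_with_failover("agent-1", [], skip_tools=False)
```

Passing the flag through the failover wrapper matters: a fallback model that silently re-attached the tool list would bring the extra prompt tokens back on exactly the slow path.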

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… default

Three connected fixes around how a user's model selection flows
through the system.

1. Dropdown clipping (frontend/src/components/ModelSwitcher.tsx)
   ModelSwitcher's popover used position: absolute and was getting
   sliced by an ancestor's overflow:hidden in the chat input bar
   (visible in screenshot: only the bottom of the menu shows).
   Render the popover via React Portal into document.body with
   position:fixed and viewport-anchored coords (recomputed on
   scroll/resize). Click-outside check now also accounts for the
   portaled popover so clicking inside doesn't dismiss it.

2. Chat picker persists to agent (Chat.tsx + AgentDetail.tsx)
   Previously the picker was a session-only override; closing the
   tab reverted to whatever agent.primary_model_id had been at
   creation. Users expected the picker and the agent's saved
   default to be the same thing — switching in chat should make
   it stick.
   New handleModelChange wraps onChange: optimistic local state
   update, PATCH /agents/{id} to persist primary_model_id, then
   invalidate the agent query so detail-page settings refresh.
   Rolls back local state on PATCH failure.

3. Changing the tenant default migrates agents that follow it (api/enterprise.py)
   set_default_llm_model now records the previous tenant default
   before overwriting it, then runs UPDATE agents SET primary_model_id
   = new WHERE tenant_id matches AND primary_model_id = previous.
   Effect: agents that were "implicitly following" the old default
   follow the new one. Agents with an explicitly user-chosen model
   (different from the old default) are left alone.
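
The migration rule in item 3 can be sketched as a pure function over an
in-memory list. This is only an illustration: the actual change is a SQL
UPDATE in the Python backend, and the `Agent` shape, the function name, and
the handling of a missing previous default are assumptions made here.

```typescript
// Hypothetical in-memory model; the real migration is a SQL UPDATE.
interface Agent {
  id: string;
  tenantId: string;
  primaryModelId: string | null;
}

// Returns a new array in which agents of `tenantId` that pointed at the
// previous tenant default now point at the new one. All other agents,
// including those with an explicitly chosen model, are returned unchanged.
function migrateFollowingAgents(
  agents: Agent[],
  tenantId: string,
  previousDefault: string | null,
  newDefault: string,
): Agent[] {
  // Assumption: with no previous default (or no actual change) nothing
  // can have been "following", so nothing is migrated.
  if (previousDefault === null || previousDefault === newDefault) {
    return agents.slice();
  }
  return agents.map((a) =>
    a.tenantId === tenantId && a.primaryModelId === previousDefault
      ? { ...a, primaryModelId: newDefault }
      : a,
  );
}
```

Matching on the previous default is what distinguishes "following" agents
from explicitly configured ones; an agent whose chosen model happens to equal
the old default is indistinguishable from a follower and is migrated too.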

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… mount

Previous useEffect only set overrideModelId when it was null:

  if (agent?.primary_model_id && overrideModelId === null) {
    setOverrideModelId(agent.primary_model_id);
  }

So once the picker had ANY value, later changes to agent.primary_model_id
(via the settings form, tenant default migration, or other paths) never
flowed back into the chat picker — it kept showing whatever was loaded
on first mount.

User report (translated from Chinese): "After setting a default model
for each agent, the chat still isn't linked to it."

Fix: drop the null guard. Sync whenever agent.primary_model_id changes
and differs from the current picker value. The handleModelChange path
does not cause a feedback loop: after the PATCH, agent.primary_model_id
equals the picker value, so the !== check is false and no setState runs.
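
The sync condition described above can be isolated as a small pure function.
This is a sketch, not the code in Chat.tsx: `pickerSyncTarget` is a
hypothetical helper name, and in the component the result would feed
setOverrideModelId inside the effect.

```typescript
// Decide what the picker should show after agent.primary_model_id changes.
// Returns the value to set, or null when no state update is needed.
function pickerSyncTarget(
  agentPrimaryModelId: string | null | undefined,
  overrideModelId: string | null,
): string | null {
  // Agent not loaded yet, or it has no saved default: nothing to sync.
  if (!agentPrimaryModelId) return null;
  // Already in sync (e.g. right after handleModelChange's PATCH): skip setState.
  if (agentPrimaryModelId === overrideModelId) return null;
  // The old code returned a value only when overrideModelId was null;
  // dropping that guard lets later changes flow back into the picker.
  return agentPrimaryModelId;
}
```

Returning null for the "already equal" case is what keeps the
handleModelChange path from triggering a second state update.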

Chat.tsx also adds wsSessionId to the dep array so the picker resets to
the agent's current default when the user starts a new conversation, per
the request that the default "take effect each time a new conversation
is created".

AgentDetail's chat tab gets the same null-guard fix; session reset is
deferred there because wsSessionId is declared deeper in the component
and the existing tab structure keeps the picker visible across sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ning

Two follow-ups on the chat picker.

1. "默认" ("Default") badge semantics
   The badge was tagging the tenant-level default. After the prior fix
   that made the picker selection follow agent.primary_model_id, the
   tenant tag drifted from what the user thought of as "the default":
   if the agent default was qwen-max but the tenant default was deepseek,
   the ✓ landed on qwen-max while "默认" sat on deepseek, which was
   confusing.

   Pass agent.primary_model_id as the badge prop in both Chat.tsx and
   AgentDetail.tsx. The badge now marks "this agent's saved default",
   which matches what users see as the chat's default model.
   (ModelSwitcher's prop name stays `tenantDefaultId` for compat; only
   the value passed in changed.)

2. Smart vertical placement
   Previous popover always opened upward via a fixed
   `bottom: calc(100vh - top + 4px)`. When the trigger sits near the
   top of the viewport (e.g. embedded chat in a narrow panel near the
   page header), the upward menu spilled off the top — clipped or
   completely invisible.

   useLayoutEffect now measures spaceAbove vs spaceBelow on open,
   picks `placement: 'above' | 'below'` based on which has more room
   (preferring above to match the chevron's prior visual flow), and
   caps maxHeight to the available room minus an 8px viewport pad.
   Recomputes on scroll/resize so the menu doesn't go off-screen
   while open.
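
The placement decision in item 2 can be sketched as a pure function of the
trigger's position and the viewport height. The function and type names are
hypothetical, and breaking ties toward 'above' is an assumption about how
"preferring above" is applied; only the 8px pad comes from the description.

```typescript
type Placement = 'above' | 'below';

interface PopoverLayout {
  placement: Placement;
  maxHeight: number; // px
}

const VIEWPORT_PAD = 8; // px kept clear at the viewport edge

// triggerTop / triggerBottom are viewport coordinates, as returned by
// getBoundingClientRect() on the trigger element.
function choosePlacement(
  triggerTop: number,
  triggerBottom: number,
  viewportHeight: number,
): PopoverLayout {
  const spaceAbove = triggerTop;
  const spaceBelow = viewportHeight - triggerBottom;
  // Assumption: pick the roomier side, with ties going to 'above'.
  const placement: Placement = spaceAbove >= spaceBelow ? 'above' : 'below';
  const room = placement === 'above' ? spaceAbove : spaceBelow;
  // Never report a negative height when the trigger hugs the edge.
  return { placement, maxHeight: Math.max(0, room - VIEWPORT_PAD) };
}
```

In the component this would run inside useLayoutEffect and again from the
scroll/resize listeners, so the popover is positioned before the browser
paints.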

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25019fcd1b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +203 to +206
  if not model.tenant_id:
      raise HTTPException(status_code=400, detail="Model is not tenant-scoped")
  if not model.enabled:
      raise HTTPException(status_code=400, detail="Model is disabled")

P1: Restrict default-model updates to the caller's tenant

This endpoint accepts any model_id and never checks whether an org_admin belongs to the same tenant as that model, so a tenant admin who knows another tenant's model UUID can change that tenant's default model and trigger agent migrations there. Add an authorization guard (allow platform admins, otherwise require current_user.tenant_id == model.tenant_id) before mutating tenant state.
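
The suggested guard amounts to a two-branch predicate. The sketch below is
illustrative only: the endpoint itself is Python, and the `Caller` and
`ModelRef` shapes and the role strings are assumptions, not the project's
actual types.

```typescript
// Hypothetical shapes; field and role names are assumptions.
interface Caller {
  role: 'platform_admin' | 'org_admin';
  tenantId: string | null;
}

interface ModelRef {
  tenantId: string | null;
}

// True when the caller may change the default model of the tenant that
// owns `model`.
function canSetTenantDefault(caller: Caller, model: ModelRef): boolean {
  // Platform admins may act on any tenant.
  if (caller.role === 'platform_admin') return true;
  // Everyone else must belong to the tenant that owns the model.
  return caller.tenantId !== null && caller.tenantId === model.tenantId;
}
```

In the endpoint, a false result would map to a 403 raised before any tenant
state is mutated.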

Useful? React with 👍 / 👎.

Comment on lines 761 to +764
   return fetch(`/api${url}`, {
       ...options,
       headers: { 'Content-Type': 'application/json', ...(token ? { Authorization: `Bearer ${token}` } : {}) },
-  }).then(async (r) => {
-      if (!r.ok) {
-          const bodyText = await r.text();
-          let detail: unknown;
-          try {
-              detail = bodyText ? JSON.parse(bodyText)?.detail : undefined;
-          } catch {
-              detail = bodyText;
-          }
-          const message = typeof detail === 'string'
-              ? detail
-              : bodyText?.trim() || `HTTP ${r.status}`;
-          const error: any = new Error(message);
-          error.status = r.status;
-          error.detail = detail;
-          throw error;
-      }
-      return r.json();
-  });
+  }).then(r => r.json());

P1: Reject non-2xx responses in fetchAuth

fetchAuth now always resolves with r.json() regardless of HTTP status, so callers stop getting exceptions on 4xx/5xx and can treat error payloads as successful data. In this file several paths assume failures throw (for example, list data is used immediately), which can lead to silent failures or runtime crashes when backend returns {detail: ...}. Restore an r.ok check and throw on non-success statuses.

Useful? React with 👍 / 👎.

   )}
   {existingProvider && (
-      <button className="btn btn-ghost btn-sm" style={{ color: 'var(--error)' }} onClick={() => confirm('Are you sure you want to delete this configuration?') && deleteProvider.mutate(existingProvider.id)}>
+      <button className="btn btn-ghost btn-sm" style={{ color: 'var(--error)' }} onClick={async () => { const ok = await dialog.confirm('确定要删除此配置吗?', { title: '删除配置', danger: true, confirmLabel: '删除' }); if (ok) deleteProvider.mutate(existingProvider.id); }}>

P2: Define dialog before using it in OrgTab delete action

The delete button handler calls dialog.confirm(...), but OrgTab does not create a dialog instance via useDialog(). Clicking this button will throw a ReferenceError and prevent identity-provider deletion from working. Initialize the hook inside OrgTab (or route through an existing helper) before using it.

Useful? React with 👍 / 👎.
