What this is
A direction-setting issue, not a roadmap checklist. Items here are research trajectories ClawLoop is actively exploring. They inform where the stable APIs are heading so contributors and researchers can engage early. Expect the shape to change as we learn; nothing here is a release commitment.
The question
ClawLoop's Evolver protocol tunes a single agent (prompts, playbooks, weights, routing) from its own traces. The natural next step is one level up: an optimizer whose subject is the learner itself — its reflector, its evolver, its reward-composition weights, its router policy, its weight-training recipe.
We call this a hyperagent for learner tuning. Its job is to search over learner configurations so that the resulting learners are better at learning from any given stream of experience.
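To make "search over learner configurations" concrete, here is a minimal sketch of an outer loop that mutates a learner config and keeps improvements. Everything named here (`LearnerConfig`, its fields, `mutate`, `hill_climb`, `toy_score`) is hypothetical and is not ClawLoop's schema or API. Plain hill climbing stands in for whatever search backend is eventually chosen, and the scoring function is a toy stand-in for "run the learner for N iterations and measure how much it improved".

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LearnerConfig:
    """Hypothetical learner knobs, for illustration only."""
    reflector_style: str = "concise"
    evolver_cadence: int = 10      # episodes between evolver runs
    reward_weight: float = 0.5     # weight on one reward signal, kept in [0, 1]
    router_threshold: float = 0.7  # kept in [0, 1]


def mutate(cfg, rng):
    """Return a copy of cfg with at most one field perturbed."""
    name = rng.choice(
        ["reflector_style", "evolver_cadence", "reward_weight", "router_threshold"]
    )
    if name == "reflector_style":
        return replace(cfg, reflector_style=rng.choice(["concise", "verbose", "critical"]))
    if name == "evolver_cadence":
        return replace(cfg, evolver_cadence=max(1, cfg.evolver_cadence + rng.choice([-5, 5])))
    # Continuous fields: small additive step, clamped to [0, 1].
    value = getattr(cfg, name) + rng.uniform(-0.1, 0.1)
    return replace(cfg, **{name: min(1.0, max(0.0, value))})


def hill_climb(score, start, steps, seed=0):
    """Greedy outer loop: accept a mutated config only if it scores strictly higher."""
    rng = random.Random(seed)
    best, best_score = start, score(start)
    for _ in range(steps):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(cfg):
    """Toy objective (an assumption): pretends the best learner has
    reward_weight near 0.8 and evolver_cadence near 20."""
    return -abs(cfg.reward_weight - 0.8) - abs(cfg.evolver_cadence - 20) / 100


if __name__ == "__main__":
    best, best_score = hill_climb(toy_score, LearnerConfig(), steps=200)
    print(best, best_score)
```

In a real setting each call to `score` would be an expensive learner run, which is why the credit-assignment and search-space questions below matter more than the choice of outer loop.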
Concrete research threads
- Search space. What is the meaningful surface to vary — prompt templates, reflector style, evolver cadence, reward-signal weights, router thresholds — and how do we parameterize it without combinatorial blowup?
- Credit assignment. A learner config's quality is only visible after N iterations. How do we score configs from short rollouts without overfitting to early-iteration noise?
- Transfer. Do learner configs that work on one task family transfer to another, or does every domain need its own? Early evidence from the Evolver work suggests partial transfer; we want to pin this down.
- Backends. SkyDiscover (Berkeley, Apache 2.0) is a strong candidate for the search layer. GEPA-style prompt evolution is another. We want to compare.
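One possible baseline for the credit-assignment thread is successive halving: score every config on a short rollout, keep the top fraction, and re-score the survivors on a longer one. The sketch below is a generic illustration of that technique, not something ClawLoop, SkyDiscover, or GEPA is known to implement. The `rollout(config, n_iters)` callable, the dict-shaped configs, and the noise model in `noisy_rollout` are all assumptions. Note that it presumes short-rollout scores correlate with long-run quality, which is exactly the open question above.

```python
import random


def successive_halving(configs, rollout, min_iters=2, eta=2):
    """Keep the top 1/eta of configs at each rung, multiplying the
    per-config rollout budget by eta for the survivors.

    Returns (winner, total_iterations_spent).
    """
    survivors = list(configs)
    budget = min_iters
    total_iters = 0
    while len(survivors) > 1:
        # Each surviving config gets one rollout of `budget` iterations.
        total_iters += len(survivors) * budget
        ranked = sorted(survivors, key=lambda c: rollout(c, budget), reverse=True)
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0], total_iters


def make_noisy_rollout(seed=0):
    """Toy rollout (an assumption): observed score is the config's hidden
    quality plus Gaussian noise whose scale shrinks as the budget grows."""
    rng = random.Random(seed)

    def noisy_rollout(config, n_iters):
        return config["quality"] + rng.gauss(0.0, 1.0 / n_iters)

    return noisy_rollout


if __name__ == "__main__":
    candidates = [{"name": f"cfg{i}", "quality": i / 10} for i in range(8)]
    winner, spent = successive_halving(candidates, make_noisy_rollout())
    print(winner["name"], spent)
```

With noisy short rollouts a good config can be eliminated in the first rung, so the comparison worth running is how often the winner matches the true best config as `min_iters` varies.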
Prior art worth reading
- GEPA — prompt-level evolution.
- SkyDiscover — evolution backend, Apache 2.0.
- DGM / DGM-H (Meta) — hyperagent pattern. License is CC-BY-NC-SA; useful for concepts only.
Related
- Evolver protocol lives in clawloop/evolvers/.
Engage
Comment with papers, critiques, or pointers. If you want to collaborate, reach out.