Add callback support for standalone activities #9786

Open
fretz12 wants to merge 20 commits into temporalio:main from fretz12:fredtzeng/saa-callbacks

Conversation

Contributor

@fretz12 fretz12 commented Apr 2, 2026

What changed?

Added completion callback support to standalone activities:

  • Callback lifecycle: When a standalone activity reaches a terminal state (completed, failed, canceled, terminated, timed out), it now fires any registered Nexus completion callbacks — the same mechanism workflows already use.
  • Frontend validation: Callback URL length, header size, and endpoint allowlist validation is now shared between workflows and standalone activities via callbacks.ValidateCallbacks, refactored out of the workflow handler.
  • Describe response: DescribeActivityExecution now returns callback state (CallbackInfo) including trigger, status, attempt count, and failure details.
  • Start response: StartActivityExecutionResponse now includes a Link_Activity_ identifying the started (or reused) activity.
  • Config: Renamed MaxCHASMCallbacksPerWorkflow to MaxCallbacksPerExecution since it now applies to both workflows and standalone activities.

Why?

The v2 scheduler needs to start standalone activities on a schedule and be notified when they complete, so it can track action results, handle overlap policies, and support pause-on-failure. Workflows already have this via CompletionCallbacks + Nexus callback delivery. This PR gives standalone activities the same capability, reusing the existing callback library rather than reimplementing the delivery/retry/backoff logic. This is a prerequisite for the scheduler's Invoker to call StartActivityExecution with a callback pointing back to the Scheduler component.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Member

@bergundy bergundy left a comment

The logic you've added LGTM overall, here are the gaps:

  • Missing validation in the frontend for the callbacks.
  • Missing callback info in the describe response.
  • Missing returning links in the StartActivityExecution response.

Comment thread chasm/lib/activity/activity.go Outdated
Comment thread chasm/lib/activity/activity.go
@fretz12 fretz12 force-pushed the fredtzeng/saa-callbacks branch from fc5ff9e to 7da34b7 Compare April 7, 2026 03:18
Copilot AI left a comment

Pull request overview

Adds Nexus completion-callback support to standalone activities, reusing the existing callbacks library/validation used by workflows, and renames the CHASM callback limit dynamic config to apply to both workflows (CHASM path) and standalone activities.

Changes:

  • Attach/validate CompletionCallbacks on StartActivityExecution for standalone activities and trigger delivery on terminal activity close.
  • Centralize callback validation in components/callbacks.ValidateCallbacks and reuse it for workflow start validation.
  • Rename dynamic config from MaxCHASMCallbacksPerWorkflow to MaxCallbacksPerExecution, and extend functional tests for callbacks + response link fields.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/standalone_activity_test.go | Adds functional coverage for callback acceptance, describe visibility, and delivery on activity terminal states. |
| service/history/workflow/mutable_state_impl.go | Switches CHASM workflow callback cap to MaxCallbacksPerExecution. |
| service/history/configs/config.go | Wires new MaxCallbacksPerExecution config getter into history config. |
| service/frontend/workflow_handler.go | Replaces local workflow callback validation with shared callbacks.ValidateCallbacks. |
| components/callbacks/config.go | Introduces shared ValidateCallbacks and internalizes URL allowlist validation helpers. |
| components/callbacks/config_test.go | Adds unit tests for ValidateCallbacks; updates allowlist tests to new internal method name. |
| common/dynamicconfig/constants.go | Renames/defines MaxCallbacksPerExecution dynamic config setting and description. |
| chasm/lib/activity/frontend.go | Validates/normalizes callbacks on StartActivityExecution requests for standalone activities. |
| chasm/lib/activity/config.go | Adds callback-related dynamic config accessors to standalone activity config. |
| chasm/lib/activity/handler.go | Persists callbacks on activity start and returns an Activity link in StartActivityExecutionResponse. |
| chasm/lib/activity/activity.go | Stores callbacks on the activity, triggers them on close, and surfaces callback info in DescribeActivityExecution. |
| go.mod / go.sum | Bumps go.temporal.io/api dependency to a newer dev revision. |
| cmd/tools/getproto/files.go | Removes a stray leading line from generated file header. |

Comment thread tests/standalone_activity_test.go
Comment thread tests/standalone_activity_test.go
Comment thread tests/standalone_activity_test.go
Comment thread chasm/lib/activity/frontend.go
Comment thread service/history/workflow/mutable_state_impl.go
Comment thread common/dynamicconfig/constants.go Outdated
@fretz12 fretz12 marked this pull request as ready for review April 7, 2026 17:06
@fretz12 fretz12 requested review from a team as code owners April 7, 2026 17:06
Comment thread chasm/lib/activity/frontend.go Outdated
}

if cbs := req.GetCompletionCallbacks(); len(cbs) > 0 {
if err := callbacks.ValidateCallbacks(
Contributor Author

I'd prefer to force all errors in callbacks.ValidateCallbacks to return serviceerrors, but didn't want to break anything existing. Open to advice here

Member

I'd be fine if you switched to service errors. That's very common in the codebase. It's technically redundant because it adds a header with a serialized proto that most SDKs don't care about (only Go SDK does).

Contributor Author

done

@fretz12 fretz12 requested a review from bergundy April 7, 2026 17:12
Comment thread chasm/lib/activity/activity.go Outdated
cbInfos = append(cbInfos, &workflowpb.CallbackInfo{
Callback: cbSpec,
// WorkflowClosed is the only trigger variant in the proto.
Trigger: &workflowpb.CallbackInfo_Trigger{Variant: &workflowpb.CallbackInfo_Trigger_WorkflowClosed{}},
Contributor

What about adding a separate trigger variant for standalone activities? I can see us not wanting to move CallbackInfo from workflowpb because of backwards compatibility, but this feels like something we could introduce more easily now than later. If this has already been discussed and decided, that's fine too.

Contributor Author

Yes! That was caught in the API PR. It's now activity specific too

Contributor

@dandavison dandavison left a comment

Looks great, I did a pass and made some suggestions.

My main concern is it's not clear to me that we should release this using the workflow CallbackInfo. We'll want to straighten that out in the future, but that would be a backwards incompatible API change. So it seems to me that we need to straighten that out now.

return srv.URL
}

func (s *standaloneActivityTestSuite) TestCallbacks() {
Contributor

As I suggested above, would be good to add coverage of the timeout cases since they can cause attempt.CompleteTime not to be set.

Contributor Author

Yes, added tests for all terminal cases

return serviceerror.NewInvalidArgumentf("unsupported callback variant: %T", variant)
}

id := fmt.Sprintf("%s-%d", requestID, idx)
Contributor

Can we add a comment explaining what desirable properties follow from this ID-naming scheme (and add the same comment to chasm/lib/workflow/workflow.go)?

Can someone confirm that the ID naming scheme used by HSM callbacks, which was designed to address a replication concern that I haven't fully understood yet, is not required by CHASM workflow/SAA callbacks?

https://github.com/temporalio/temporal/blob/main/service/history/workflow/mutable_state_impl.go#L3174-L3176
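
As one illustration of a property such a comment might describe: because the ID is derived only from the request ID and the callback's index, replaying the same request produces the same IDs, so an attach step keyed by ID can skip callbacks it has already stored. The sketch below assumes a map-backed store and a hypothetical `attach` helper; only the `fmt.Sprintf("%s-%d", requestID, idx)` scheme comes from the diff, and whether the PR relies on this property is not confirmed in the thread.

```go
package main

import "fmt"

// callbackID mirrors the scheme in the diff: request ID plus callback index.
func callbackID(requestID string, idx int) string {
	return fmt.Sprintf("%s-%d", requestID, idx)
}

// attach is a hypothetical helper that stores callbacks keyed by ID and
// reports how many were newly added. Re-sending the same request adds none.
func attach(store map[string]string, requestID string, urls []string) int {
	added := 0
	for idx, u := range urls {
		id := callbackID(requestID, idx)
		if _, exists := store[id]; exists {
			continue // same request replayed; nothing to do
		}
		store[id] = u
		added++
	}
	return added
}

func main() {
	store := map[string]string{}
	urls := []string{"http://localhost/cb-a", "http://localhost/cb-b"}
	fmt.Println(attach(store, "req-1", urls)) // first delivery of req-1
	fmt.Println(attach(store, "req-1", urls)) // retry of req-1
	fmt.Println(attach(store, "req-2", urls)) // a different request
}
```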

Contributor Author

Added a comment, please double check accuracy folks

Member

No need to mention HSM here IMHO.

Contributor Author

Removed

Comment thread chasm/lib/activity/activity.go Outdated
Comment on lines +396 to +399
opts.Error = opErr
}

return opts, nil
Contributor

It should either be success or failure, so we don't expect to get here. WDYT about:

Suggested change
- opts.Error = opErr
- }
- return opts, nil
+ opts.Error = opErr
+ return opts, nil
+ }
+ return nexusrpc.CompleteOperationOptions{}, serviceerror.NewInternalf("activity in status %v has no outcome", a.Status)

Contributor Author

refactored

Comment thread chasm/lib/activity/activity.go Outdated
if details := attempt.GetLastFailureDetails(); details != nil {
failure = details.GetFailure()
}
}
Contributor

We do this check-in-two-places-for-a-failure logic in another place. I worry that something will forget to do it in the future. Adding a documented helper would reduce that risk. WDYT about introducing this helper:

diff --git a/chasm/lib/activity/activity.go b/chasm/lib/activity/activity.go
index 43f583674..8cf5a583a 100644
--- a/chasm/lib/activity/activity.go
+++ b/chasm/lib/activity/activity.go
@@ -362,16 +362,7 @@ func (a *Activity) GetNexusCompletion(ctx chasm.Context, _ string) (nexusrpc.Com
 		return opts, nil
 	}
 
-	var failure *failurepb.Failure
-	if f := outcome.GetFailed(); f != nil {
-		failure = f.GetFailure()
-	}
-	if failure == nil {
-		if details := attempt.GetLastFailureDetails(); details != nil {
-			failure = details.GetFailure()
-		}
-	}
-
+	failure := a.terminalFailure(ctx)
 	if failure != nil {
 		state := nexus.OperationStateFailed
 		message := "operation failed"
@@ -947,15 +938,23 @@ func (a *Activity) outcome(ctx chasm.Context) *apiactivitypb.ActivityExecutionOu
 			Value: &apiactivitypb.ActivityExecutionOutcome_Result{Result: successful.GetOutput()},
 		}
 	}
-	if failure := activityOutcome.GetFailed().GetFailure(); failure != nil {
+	if failure := a.terminalFailure(ctx); failure != nil {
 		return &apiactivitypb.ActivityExecutionOutcome{
 			Value: &apiactivitypb.ActivityExecutionOutcome_Failure{Failure: failure},
 		}
 	}
+	return nil
+}
+
+// terminalFailure returns the failure for a closed activity. The failure may be stored in
+// Outcome.Failed (terminated, canceled, timed out) or in LastAttempt.LastFailureDetails
+// (failed after exhausting retries). Returns nil if no failure is found.
+func (a *Activity) terminalFailure(ctx chasm.Context) *failurepb.Failure {
+	if f := a.Outcome.Get(ctx).GetFailed(); f != nil {
+		return f.GetFailure()
+	}
 	if details := a.LastAttempt.Get(ctx).GetLastFailureDetails(); details != nil {
-		return &apiactivitypb.ActivityExecutionOutcome{
-			Value: &apiactivitypb.ActivityExecutionOutcome_Failure{Failure: details.GetFailure()},
-		}
+		return details.GetFailure()
 	}
 	return nil
 }

Contributor Author

refactored

Comment thread chasm/lib/activity/activity.go Outdated
}
}
return nil
}
Contributor

This is an identical copy of the function used for workflow CHASM callbacks: https://github.com/temporalio/temporal/blob/main/chasm/lib/workflow/workflow.go#L56

That suggests that it should be moved into chasm/lib/callback.

Contributor Author

refactored

Comment thread chasm/lib/activity/activity.go Outdated
attempt := a.LastAttempt.Get(ctx)
opts := nexusrpc.CompleteOperationOptions{
StartTime: attempt.GetStartedTime().AsTime(),
CloseTime: attempt.GetCompleteTime().AsTime(),
Contributor

There's a little bug here. This isn't set on schedule-to-start and schedule-to-close timeouts, nor when terminated in SCHEDULED state. I believe the fix is

Suggested change
- CloseTime: attempt.GetCompleteTime().AsTime(),
+ CloseTime: ctx.ExecutionInfo().CloseTime,

Failing test case
	t.Run("ScheduleToStartTimeoutWithCallbacks", func(t *testing.T) {
		activityID := testcore.RandomizeStr(t.Name())
		taskQueue := testcore.RandomizeStr(t.Name())

		ch := &completionHandler{
			requestCh:         make(chan *nexusrpc.CompletionRequest, 1),
			requestCompleteCh: make(chan error, 1),
		}
		defer func() {
			close(ch.requestCh)
			close(ch.requestCompleteCh)
		}()
		callbackAddress := s.runNexusCompletionHTTPServer(t, ch)

		_, err := s.FrontendClient().StartActivityExecution(ctx, &workflowservice.StartActivityExecutionRequest{
			Namespace:    s.Namespace().String(),
			ActivityId:   activityID,
			ActivityType: s.tv.ActivityType(),
			Identity:     s.tv.WorkerIdentity(),
			Input:        defaultInput,
			TaskQueue: &taskqueuepb.TaskQueue{
				Name: taskQueue,
			},
			StartToCloseTimeout:    durationpb.New(1 * time.Minute),
			ScheduleToStartTimeout: durationpb.New(1 * time.Second),
			RequestId:              s.tv.Any().String(),
			CompletionCallbacks: []*commonpb.Callback{{
				Variant: &commonpb.Callback_Nexus_{Nexus: &commonpb.Callback_Nexus{Url: callbackAddress}},
			}},
		})
		require.NoError(t, err)

		// No worker polls -- activity will time out waiting to be started.

		// Verify the callback is delivered with failure state and non-zero CloseTime.
		select {
		case completion := <-ch.requestCh:
			require.Equal(t, nexus.OperationStateFailed, completion.State)
			var failureErr *nexus.FailureError
			require.ErrorAs(t, completion.Error.Cause, &failureErr)
			require.Contains(t, failureErr.Failure.Message, "ScheduleToStart")
			// StartTime may be zero (activity was never started), but CloseTime must not be.
			require.False(t, completion.CloseTime.IsZero(), "CloseTime must not be zero for a timed-out activity")
			ch.requestCompleteCh <- nil
		case <-ctx.Done():
			require.Fail(t, "timed out waiting for completion callback")
		}

		descResp, err := s.FrontendClient().DescribeActivityExecution(ctx, &workflowservice.DescribeActivityExecutionRequest{
			Namespace:  s.Namespace().String(),
			ActivityId: activityID,
		})
		require.NoError(t, err)
		require.Equal(t, enumspb.ACTIVITY_EXECUTION_STATUS_TIMED_OUT, descResp.GetInfo().GetStatus())
	})

Contributor Author

Good catch. Changed. Also added extra tests for all terminal states

Member

@bergundy bergundy left a comment

You're missing the ability to attach callbacks to a running activity when use existing is specified. You'll want to express that with on_conflict_options on the StartActivityExecutionRequest.

Comment thread chasm/lib/activity/activity.go Outdated
// Implements callback.CompletionSource.
func (a *Activity) GetNexusCompletion(ctx chasm.Context, _ string) (nexusrpc.CompleteOperationOptions, error) {
if !a.LifecycleState(ctx).IsClosed() {
return nexusrpc.CompleteOperationOptions{}, serviceerror.NewFailedPrecondition("activity has not completed yet")
Member

This is a bug if it happens.

Suggested change
- return nexusrpc.CompleteOperationOptions{}, serviceerror.NewFailedPrecondition("activity has not completed yet")
+ return nexusrpc.CompleteOperationOptions{}, serviceerror.NewInternal("activity has not completed yet")

Contributor Author

done

Comment thread chasm/lib/activity/activity.go Outdated
opts := nexusrpc.CompleteOperationOptions{
CloseTime: ctx.ExecutionInfo().CloseTime,
}
if startedTime := attempt.GetStartedTime(); startedTime != nil {
Member

This shouldn't be attempt started time IMHO, it should be the schedule time for the activity.

Contributor Author

done

Comment thread components/callbacks/config.go Outdated

// ValidateCallbacks validates completion callbacks: count, URL length, endpoint allowlist, header size, and normalizes
// header keys to lowercase.
func ValidateCallbacks(
Member

Don't add more code to the HSM implementation. All new code should go in chasm/lib.

Contributor Author

it's been refactored

Comment thread tests/standalone_activity_test.go Outdated
Comment on lines +5181 to +5183
var failureErr *nexus.FailureError
require.ErrorAs(t, completion.Error.Cause, &failureErr)
require.Equal(t, defaultFailure.Message, failureErr.Failure.Message)
Member

Use NexusFailureToTemporalFailure and use the SDK's default failure converter to perform assertions on the errors.

Contributor Author

done

require.Equal(t, nexus.OperationStateFailed, completion.State)
require.False(t, completion.StartTime.IsZero())
require.False(t, completion.CloseTime.IsZero())
var failureErr *nexus.FailureError
Member

Same as above please.

Contributor Author

ack

require.Equal(t, nexus.OperationStateCanceled, completion.State)
require.False(t, completion.StartTime.IsZero())
require.False(t, completion.CloseTime.IsZero())
var failureErr *nexus.FailureError
Member

Same here, this should be a canceled error from the Go SDK.

Contributor Author

ack

case completion := <-ch.requestCh:
require.Equal(t, nexus.OperationStateFailed, completion.State)
var failureErr *nexus.FailureError
require.ErrorAs(t, completion.Error.Cause, &failureErr)
Member

Same here, use Temporal errors for the comparison.

Contributor Author

ack

Comment thread tests/standalone_activity_test.go Outdated
require.Equal(t, enumspb.ACTIVITY_EXECUTION_STATUS_TIMED_OUT, descResp.GetInfo().GetStatus())
})

t.Run("ScheduleToCloseTimeoutWithCallbacks", func(t *testing.T) {
Member

No need to check all of the different timeout types, it's all the same outcome.

Contributor Author

ok, kept schedule-to-start and added a comment
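
The "same outcome" point can be sketched as a mapping from activity terminal status to the Nexus operation state reported in the callback. The assertions quoted in this review show failed and timed-out activities delivered as failed and canceled activities delivered as canceled; the completed and terminated rows below are assumptions, and the status and state types are local stand-ins rather than the real enums.

```go
package main

import (
	"errors"
	"fmt"
)

// ActivityStatus is a local stand-in for the activity execution status enum.
type ActivityStatus int

const (
	Running ActivityStatus = iota
	Completed
	Failed
	Canceled
	Terminated
	TimedOut
)

// OperationState is a local stand-in for the Nexus operation state.
type OperationState string

const (
	StateSucceeded OperationState = "succeeded"
	StateFailed    OperationState = "failed"
	StateCanceled  OperationState = "canceled"
)

// operationState maps a terminal status to the state sent in the callback.
// Every timeout type collapses into TimedOut and therefore into StateFailed.
func operationState(s ActivityStatus) (OperationState, error) {
	switch s {
	case Completed:
		return StateSucceeded, nil // assumption: not shown in the thread
	case Failed, TimedOut:
		return StateFailed, nil
	case Terminated:
		return StateFailed, nil // assumption: not shown in the thread
	case Canceled:
		return StateCanceled, nil
	default:
		// Reaching this means a callback fired before the activity closed.
		return "", errors.New("activity has not completed yet")
	}
}

func main() {
	for _, s := range []ActivityStatus{Completed, Failed, Canceled, Terminated, TimedOut, Running} {
		state, err := operationState(s)
		fmt.Println(s, state, err)
	}
}
```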

@fretz12 fretz12 requested a review from bergundy April 13, 2026 18:29
Comment thread service/frontend/fx.go
Comment thread chasm/lib/activity/handler.go Outdated
}

if cbs := request.GetCompletionCallbacks(); len(cbs) > 0 {
maxCallbacks := h.config.MaxCHASMCallbacksPerExecution(request.GetNamespace())
Contributor Author

This uses MaxCHASMCallbacksPerExecution (default 2000) while the frontend Validator uses MaxCallbacksPerWorkflow (default 32). Should standalone activities use their own CHASM-specific limit, or share the same limit as workflows?

Contributor Author

Alternatively, we could remove the count check from the Validator entirely, since it's a per-component concern that depends on currentCount.
I could use some advice here.

Member

It should be the same limit across all archetypes. We have to validate in the CHASM transaction because we don't know how many callbacks are already attached to the execution. The 32 limit is for the HSM implementation that is already deprecated.
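
A small sketch of why the count has to be checked inside the transaction: the frontend only sees the callbacks in the current request, whereas the limit applies to the total attached to the execution. The function name and the numbers below are illustrative assumptions, not the PR's code.

```go
package main

import "fmt"

// checkCallbackLimit is a hypothetical transaction-side check: it needs the
// number of callbacks already attached, which only the execution knows.
func checkCallbackLimit(attached, adding, maxPerExecution int) error {
	if attached+adding > maxPerExecution {
		return fmt.Errorf("cannot attach %d callbacks: %d already attached, limit is %d",
			adding, attached, maxPerExecution)
	}
	return nil
}

func main() {
	const limit = 4

	// A request-only check (attached = 0) passes for three callbacks.
	fmt.Println(checkCallbackLimit(0, 3, limit))

	// Inside the transaction, two callbacks are already attached, so the
	// same request would push the total over the limit.
	fmt.Println(checkCallbackLimit(2, 3, limit))
}
```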

Member

@bergundy bergundy left a comment

Still don't see on-conflict-policy and use_existing implemented.

Comment thread chasm/lib/callback/validator.go Outdated

func validatorProvider(dc *dynamicconfig.Collection) *Validator {
return NewValidator(
dynamicconfig.MaxCHASMCallbacksPerWorkflow.Get(dc),
Member

This DC still says "per workflow"

Contributor Author

It now uses callback.maxPerExecution in chasm/lib/callback/config.go

Comment thread common/dynamicconfig/constants.go Outdated
"system.maxCallbacksPerExecution",
// NOTE (seankane): MaxCHASMCallbacksPerWorkflow is temporary, this will be removed and replaced with MaxCallbacksPerWorkflow
// once CHASM is fully enabled
MaxCHASMCallbacksPerWorkflow = NewNamespaceIntSetting(
Member

Noted above, the name execution was correct. This is not just for workflows.

Member

You can move this into chasm/lib/callback/config.go AFAIC and rename to callback.maxPerExecution

Contributor Author

refactored into chasm/lib/callback/config.go and renamed to callback.maxPerExecution

Comment thread service/frontend/fx.go
// chasm/lib/callback/fx.go and read directly from callback.AllowedAddresses.
func callbackValidatorProvider(dc *dynamicconfig.Collection) *callback.Validator {
return callback.NewValidator(
callback.MaxPerExecution.Get(dc),
Contributor Author

The previous frontend callback validation used the HSM config MaxCallbacksPerWorkflow. I want to be extra sure that changing the default from 32 to 2000 won't break anything.
FYI @bergundy

Member

I think it's fine since the history handler should be validating the total number of attached callbacks anyways. This is just an initial sanity validation step.

Member

But please confirm in the workflow implementation.

Contributor Author

Confirmed:
workflows

@fretz12 fretz12 requested a review from bergundy April 14, 2026 18:55
Comment thread chasm/lib/activity/activity.go Outdated
return callback.ScheduleStandbyCallbacks(ctx, a.Callbacks)
}

type attachCallbacksRequest struct {
Member

I don't think you need this request struct; you can pass in the arguments separately. A single struct is useful if you want to call UpdateComponent with a method directly.

Contributor Author

refactored

cbs := frontendReq.GetCompletionCallbacks()
if !result.Created && frontendReq.GetOnConflictOptions().GetAttachCompletionCallbacks() && len(cbs) > 0 {
ref := chasm.NewComponentRef[*Activity](result.ExecutionKey)
_, _, err := chasm.UpdateComponent(
Member

You shouldn't start a second transaction here. Use chasm.UpdateWithStartExecution instead of StartExecution above.

Contributor Author

I tried switching to UpdateWithStartExecution, but it breaks the FAIL conflict policy case. The engine's UpdateWithStartExecution always calls updateFn when the execution already exists; it doesn't check BusinessIDConflictPolicy. So with the FAIL policy, instead of returning ExecutionAlreadyStartedError, updateFn runs successfully and returns Created=false with no error.
To make this work, the engine would need to respect BusinessIDConflictPolicyFail before running updateFn. I added a TODO for now; we can follow up in a separate PR.
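
The behavior being asked of the engine can be shown with a toy model. The sketch below is not the CHASM engine API: `engine`, `updateWithStart`, the policy constants, and the error value are all hypothetical. It only illustrates the ordering issue, namely that the conflict policy has to be consulted before the update function runs on an existing execution.

```go
package main

import (
	"errors"
	"fmt"
)

// ConflictPolicy is a hypothetical stand-in for the business ID conflict policy.
type ConflictPolicy int

const (
	PolicyFail ConflictPolicy = iota
	PolicyUseExisting
)

// ErrAlreadyStarted is a hypothetical stand-in for the already-started error.
var ErrAlreadyStarted = errors.New("execution already started")

// engine is a toy store: activity ID -> attached callback IDs.
type engine struct {
	callbacks map[string][]string
}

// updateWithStart creates the execution if it is missing. If it exists, the
// policy is checked first; only PolicyUseExisting lets the update proceed.
func (e *engine) updateWithStart(id string, policy ConflictPolicy, cbs []string) (bool, error) {
	existing, ok := e.callbacks[id]
	if !ok {
		e.callbacks[id] = append([]string(nil), cbs...)
		return true, nil
	}
	if policy == PolicyFail {
		return false, ErrAlreadyStarted
	}
	e.callbacks[id] = append(existing, cbs...)
	return false, nil
}

func main() {
	e := &engine{callbacks: map[string][]string{}}
	fmt.Println(e.updateWithStart("a1", PolicyFail, []string{"cb-0"}))
	fmt.Println(e.updateWithStart("a1", PolicyFail, []string{"cb-1"}))
	fmt.Println(e.updateWithStart("a1", PolicyUseExisting, []string{"cb-1"}))
}
```

With this ordering, start and attach happen in one step for the use-existing case, and the FAIL case still surfaces an already-started error instead of silently updating.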

Comment thread tests/standalone_activity_test.go Outdated
{Variant: &commonpb.Callback_Nexus_{Nexus: &commonpb.Callback_Nexus{Url: "http://localhost/use-existing-cb"}}},
},
OnConflictOptions: &commonpb.OnConflictOptions{
AttachCompletionCallbacks: true,
Member

You should validate that all 3 attach options are set at once.

Member

It's what we've done for workflow AFAIR. It's a bit silly but we should keep consistent with workflows.

Contributor Author

done
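
A sketch of the "all three attach options at once" rule follows. Only `AttachCompletionCallbacks` appears in the code quoted in this thread; the other two field names (`AttachRequestID`, `AttachLinks`) and the decision to accept an options value with nothing set are assumptions, and the struct is a local stand-in for the proto message.

```go
package main

import (
	"errors"
	"fmt"
)

// OnConflictOptions is a local stand-in for the proto message. Field names
// other than AttachCompletionCallbacks are assumed for illustration.
type OnConflictOptions struct {
	AttachRequestID           bool
	AttachCompletionCallbacks bool
	AttachLinks               bool
}

// validateOnConflictOptions requires the three attach flags to be set
// together. Treating "none set" as valid is an assumption of this sketch.
func validateOnConflictOptions(o *OnConflictOptions) error {
	if o == nil {
		return nil
	}
	all := o.AttachRequestID && o.AttachCompletionCallbacks && o.AttachLinks
	none := !o.AttachRequestID && !o.AttachCompletionCallbacks && !o.AttachLinks
	if !all && !none {
		return errors.New("on-conflict options must attach request ID, completion callbacks, and links together")
	}
	return nil
}

func main() {
	fmt.Println(validateOnConflictOptions(&OnConflictOptions{AttachCompletionCallbacks: true}))
	fmt.Println(validateOnConflictOptions(&OnConflictOptions{
		AttachRequestID: true, AttachCompletionCallbacks: true, AttachLinks: true,
	}))
}
```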

Comment thread tests/standalone_activity_test.go Outdated
require.False(t, resp.GetStarted())
})

t.Run("AttachesCallbacksToExistingActivity", func(t *testing.T) {
Member

Missing coverage for attaching to a new activity, existing activity and that the second start with on conflict options is idempotent.

Contributor Author

added

@bergundy bergundy changed the title Added callback support for standalone activities Add callback support for standalone activities Apr 15, 2026
@fretz12 fretz12 requested a review from bergundy April 15, 2026 19:39