Add fine-grained partition properties to replace Kind() checks #9900
Replace Kind() == STICKY checks with IsEphemeral() where the behavior applies to all ephemeral partition types (sticky, and future types like worker-commands), not just sticky queues specifically. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9f1f466 to 526bf30
526bf30 to a163283
a163283 to 2b99d6a
…ants These are used in IsEphemeral() code paths that apply to all ephemeral partition types, not just sticky queues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2b99d6a to 8e2ff4f
I think I'd like to split this up further:
The goal is we want to use metadata TTLs for any queue that has the property: if the queue is idle (no poller or tasks) for 48 hours, it's okay to drop all tasks in it. Sticky fits that description, as do worker command queues. I think basically any worker-specific task queue would, right?
This one feels a little different. For more "scheduling"/"placement" approaches to matching, where we send a high volume of tasks over a worker-specific channel, we may well want fairness there even though it's a worker-specific queue. (The problem with fairness + TTLs is for tasks; it doesn't apply to metadata. Though yeah, maybe it's silly to TTL metadata without TTLs on tasks. We'd eventually want a scavenger.)
I'm not sure here, I guess by definition worker-specific doesn't need versioning. But maybe some other kind of "ephemeral" queues could be versioned?
Also potentially different. In theory we could have a high-volume worker-specific queue accepting/dispatching enough tasks that we'd want to spread them over a few partitions. After dynamic partition scaling, maybe we should relax this for all queues, including sticky, and just default to 1 but let them scale?
I think this mostly has to do with versioning?
More behavior differences:
- Sticky queues get userdata (and thus config) from a normal "parent". Should other ephemeral queues do that too?
- "Return error on Add*Task if no poller in last 10s". That could apply to worker commands, maybe with a different timeout though? 5m?
- Priority backlog forwarding: this is tied to having a normal "parent" partition(s), should probably match that.
- Migration to v2 table: this is tied to fairness. Can be relaxed eventually.
So we could have (hand-waving draft): [table not preserved]
Oh, and the original one that started this: Appears in metric labels
f2e771c to 3591c16
a1a9cca to 994395c
Replace the single IsEphemeral() method on the Partition interface with four specific property methods: HasTTLExpiry(), SupportsFairness(), SupportsVersioning(), and SupportsPartitions(). Each call site now uses the property that governs its behavior, making it clearer why a check exists and easier to set correct values for new partition types. PartitionCount() returns 1 when SupportsPartitions() is false, enforcing the single-partition constraint explicitly. Describe task queue checks use Kind() == NORMAL since only user-facing queues can be described — this is about the queue's visibility, not a specific property. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
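As a rough illustration of the commit above, the four properties and the single-partition rule might look like the following. This is a minimal sketch, not the code in this PR: the trimmed-down Partition interface, the normalPartition and stickyPartition types, the partitionCount helper, and the property values each type returns are all assumptions made for the example.

```go
package main

import "fmt"

// Partition is a hypothetical, trimmed-down stand-in for the real interface.
// Only the four property methods named in the commit message are included.
type Partition interface {
	HasTTLExpiry() bool
	SupportsFairness() bool
	SupportsVersioning() bool
	SupportsPartitions() bool
}

// normalPartition models a user-facing queue. The values are assumptions.
type normalPartition struct{}

func (normalPartition) HasTTLExpiry() bool       { return false }
func (normalPartition) SupportsFairness() bool   { return true }
func (normalPartition) SupportsVersioning() bool { return true }
func (normalPartition) SupportsPartitions() bool { return true }

// stickyPartition models a sticky queue. The values are assumptions.
type stickyPartition struct{}

func (stickyPartition) HasTTLExpiry() bool       { return true }
func (stickyPartition) SupportsFairness() bool   { return false }
func (stickyPartition) SupportsVersioning() bool { return false }
func (stickyPartition) SupportsPartitions() bool { return false }

// partitionCount is a hypothetical helper: it returns the configured count
// only when the partition type can be split, and 1 otherwise.
func partitionCount(p Partition, configured int) int {
	if !p.SupportsPartitions() || configured < 1 {
		return 1
	}
	return configured
}

func main() {
	fmt.Println(partitionCount(normalPartition{}, 4)) // configured count is kept
	fmt.Println(partitionCount(stickyPartition{}, 4)) // forced to a single partition
}
```

A call site that used to ask "is this sticky?" can instead ask for the one property it depends on, which is what makes adding a new partition type a matter of filling in four values.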
994395c to 2a7094c
…k for load-on-demand Replace HasTTLExpiry() bool with PersistenceTTL() time.Duration so partition types own their TTL values. Use explicit Kind() != STICKY check for the load-on-demand decision since that is specific to sticky queue semantics, not a general property of partitions with TTL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
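The change from a boolean to a duration could be sketched as below. The 48-hour value is borrowed from the idle window mentioned in the review discussion and may not match the real constant; the "zero means no expiry" convention, the type names, and the hasExpiry helper are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// stickyTTL is an assumed value taken from the 48-hour idle window in the
// review discussion; the actual constant in the codebase may differ.
const stickyTTL = 48 * time.Hour

// Partition is a hypothetical, trimmed-down interface for this sketch.
type Partition interface {
	// PersistenceTTL reports how long persisted metadata may live.
	// In this sketch, zero means "never expires" (an assumed convention).
	PersistenceTTL() time.Duration
}

type normalPartition struct{}

func (normalPartition) PersistenceTTL() time.Duration { return 0 }

type stickyPartition struct{}

func (stickyPartition) PersistenceTTL() time.Duration { return stickyTTL }

// hasExpiry recovers the old boolean answer from the duration.
func hasExpiry(p Partition) bool {
	return p.PersistenceTTL() > 0
}

func main() {
	for _, p := range []Partition{normalPartition{}, stickyPartition{}} {
		fmt.Printf("%T: ttl=%v hasExpiry=%v\n", p, p.PersistenceTTL(), hasExpiry(p))
	}
}
```

Returning the duration lets each partition type own its TTL value, while callers that only need a yes/no answer can still derive it.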
Move partition metric label from type assertions in metrics helpers to the Partition interface, so each partition kind defines its own label. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make partitionIDBreakdown the primary branch in GetPerTaskQueuePartitionIDScope so future partition kinds that support partitions have a clear place to add their ID emission. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5b88951 to c68da5d
Cap the SyncState write interval at 24h so the scavenger never mistakes a queue for idle. Remove unnecessary //nolint:forbidigo on softassert calls since the lint exclusion already covers them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
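The cap described above amounts to clamping an interval to an upper bound. A minimal sketch, assuming a hypothetical capSyncStateInterval helper and constant name (how the uncapped interval is derived is not shown in this PR page and is not modeled here):

```go
package main

import (
	"fmt"
	"time"
)

// maxSyncStateInterval mirrors the 24h cap from the commit message.
// The constant name is hypothetical.
const maxSyncStateInterval = 24 * time.Hour

// capSyncStateInterval is a hypothetical helper that clamps whatever
// interval the caller computed to at most maxSyncStateInterval.
func capSyncStateInterval(desired time.Duration) time.Duration {
	if desired > maxSyncStateInterval {
		return maxSyncStateInterval
	}
	return desired
}

func main() {
	fmt.Println(capSyncStateInterval(6 * time.Hour))  // below the cap, unchanged
	fmt.Println(capSyncStateInterval(72 * time.Hour)) // clamped to the cap
}
```

Keeping the write interval well under the idle window means a live queue always refreshes its state before a scavenger could consider it idle.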
MetricTag(partitionIDBreakdown bool) lets each partition type handle its own ID breakdown, removing type assertions from the metrics helpers. Extract sticky kind check into a local variable for readability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
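One way to picture MetricTag(partitionIDBreakdown bool) is each partition type returning its own label, so the metrics helpers need no type assertions. The sketch below returns a plain string for simplicity; the real method likely returns a metrics tag type, and the label values "normal" and "sticky" as well as the struct definitions are made up for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// Partition is a hypothetical, trimmed-down interface for this sketch.
type Partition interface {
	// MetricTag returns the label value to emit for this partition.
	MetricTag(partitionIDBreakdown bool) string
}

// normalPartition carries a numeric partition ID (assumed shape).
type normalPartition struct{ id int }

// MetricTag emits the partition ID only when breakdown is enabled.
func (p normalPartition) MetricTag(partitionIDBreakdown bool) string {
	if partitionIDBreakdown {
		return strconv.Itoa(p.id)
	}
	return "normal" // placeholder label, not the real value
}

type stickyPartition struct{}

// MetricTag ignores the breakdown flag: a sticky queue has no partition ID.
func (stickyPartition) MetricTag(bool) string {
	return "sticky" // placeholder label, not the real value
}

func main() {
	parts := []Partition{normalPartition{id: 3}, stickyPartition{}}
	for _, p := range parts {
		fmt.Println(p.MetricTag(true), p.MetricTag(false))
	}
}
```

With this shape, a future partition kind that supports partitions only has to implement MetricTag to decide whether and how its ID is emitted.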
What
Add property methods to the Partition interface (PersistenceTTL(), SupportsFairness(), SupportsVersioning(), SupportsPartitions()) and use them in place of Kind() == STICKY checks where the behavior isn't sticky-specific.
Describe task queue checks use Kind() == NORMAL since only user-facing queues can be described.
Why
Kind() == STICKY checks conflated multiple independent concerns. Fine-grained properties make each check self-documenting and make it easier to set correct values for new partition types.
How did you test it?
Unit tests (go test ./service/matching/... ./common/tqid/...). No behavioral changes; pure refactor.
🤖 Generated with Claude Code