Is your feature request related to a problem? Please describe.
There is no way to retrieve a repository version's content unit IDs over the REST API. Clients that want to compute the difference between two repository versions must instead issue content filter queries (e.g. filtering content by repository_version), which can be expensive on large repositories and put significant load on the database.
Describe the solution you'd like
Expose the existing RepositoryVersion.content_ids array on the repository version detail endpoint as an opt-in field. To avoid bloating normal responses, it is only returned when the request includes the content_ids=true query parameter; otherwise the field is null. Clients can then fetch each version's content_ids and diff the sets directly, so the cost scales with the size of the change rather than the size of the repository.
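As a rough illustration of the intended client workflow, here is a minimal Python sketch. It assumes the detail response is a JSON object with a content_ids key as proposed above; the helper names (with_content_ids, diff_versions) and the example URL are hypothetical and not part of any existing client.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_content_ids(version_href: str) -> str:
    """Add the proposed content_ids=true query parameter to a version URL."""
    parts = urlsplit(version_href)
    query = dict(parse_qsl(parts.query))
    query["content_ids"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


def diff_versions(old: dict, new: dict) -> tuple:
    """Return (added, removed) content IDs between two version payloads.

    Each payload is the decoded JSON of a repository version detail
    response that was requested with content_ids=true.
    """
    for payload in (old, new):
        # A null field means the opt-in parameter was not sent.
        if payload.get("content_ids") is None:
            raise ValueError("content_ids missing; request with content_ids=true")
    old_ids = set(old["content_ids"])
    new_ids = set(new["content_ids"])
    return new_ids - old_ids, old_ids - new_ids
```

A client would fetch with_content_ids(href) for each of the two versions with whatever HTTP library it already uses, decode the JSON, and pass both payloads to diff_versions. Raising on a null field is a design choice in this sketch so that a forgotten query parameter is not mistaken for an empty version.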
Describe alternatives you've considered
- Filtering content by repository_version and diffing the results. This is the current approach and is the performance problem we're trying to avoid.
- Always returning content_ids. Rejected because the array can be very large and would unnecessarily inflate every repository version response.
Additional context
content_ids is already stored and maintained on the model (added/backfilled in prior migrations and kept current by the add/remove content hooks), so this is a purely additive serializer change with no schema migration or backfill required.
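For illustration only, the opt-in behaviour could reduce to a small conditional like the one below. This is a framework-free sketch, not the actual serializer: VersionStub and content_ids_field are hypothetical names, and matching the parameter value against the exact string "true" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass
class VersionStub:
    """Hypothetical stand-in for a RepositoryVersion row; not the real model."""
    number: int
    content_ids: list = field(default_factory=list)


def content_ids_field(version: VersionStub,
                      query_params: Mapping[str, str]) -> Optional[list]:
    """Return the stored IDs only when the client opted in, else None (null)."""
    if query_params.get("content_ids") == "true":
        # IDs are stringified so the result is JSON-serializable.
        return [str(pk) for pk in version.content_ids]
    return None
```

In a real serializer the same check would read the query parameters from the request in the serializer context.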