Migrate language S3 to brainscore-storage bucket #374

Merged
KartikP merged 5 commits into main from migrate-s3-to-brainscore-storage on Mar 3, 2026

KartikP (Contributor) commented Feb 18, 2026

Summary

  • Updates brainscore_language/utils/s3.py to use brainscore-storage bucket with brainscore-language/ key prefix
  • Aligns with the same S3 pattern used by brainscore-vision (bucket path splitting via core's load_assembly_from_s3)
  • The old brainscore-language standalone bucket is no longer referenced
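
The bucket path splitting mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in brainscore core: `split_bucket_path` is a hypothetical helper, and the assumption is that a combined path such as `brainscore-storage/brainscore-language` is split into a bucket name and a key prefix.

```python
from typing import Tuple


def split_bucket_path(bucket_path: str, filename: str) -> Tuple[str, str]:
    """Split 'bucket/prefix' into (bucket, object key) for a given filename.

    Hypothetical helper for illustration; the real logic lives in core.
    """
    # Everything before the first '/' is the bucket; the rest is the key prefix.
    bucket, _, prefix = bucket_path.partition("/")
    prefix = prefix.strip("/")
    key = f"{prefix}/{filename}" if prefix else filename
    return bucket, key
```

With this sketch, `split_bucket_path("brainscore-storage/brainscore-language", "example.nc")` yields the bucket `brainscore-storage` and the key `brainscore-language/example.nc` (the filename is made up for the example).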

Test plan

  • Verify uploads land at s3://brainscore-storage/brainscore-language/{filename}
  • Verify downloads resolve from https://brainscore-storage.s3.amazonaws.com/brainscore-language/{filename}
  • Migrate existing objects from old bucket to new path
  • Run Pereira2018 linear benchmarks to confirm ceiling loading still works
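
The first three checks come down to a fixed mapping from an old key to its new location. A small sketch of that mapping, assuming object keys in the old bucket are kept unchanged under the new prefix; the helper names are hypothetical and no S3 calls are made:

```python
from typing import Tuple

OLD_BUCKET = "brainscore-language"   # standalone bucket being retired
NEW_BUCKET = "brainscore-storage"    # shared bucket
NEW_PREFIX = "brainscore-language"   # key prefix inside the shared bucket


def migrated_location(old_key: str) -> Tuple[str, str]:
    """Map a key in the old bucket to (bucket, key) in the new layout."""
    return NEW_BUCKET, f"{NEW_PREFIX}/{old_key}"


def upload_uri(filename: str) -> str:
    """Expected s3:// URI for an uploaded file."""
    bucket, key = migrated_location(filename)
    return f"s3://{bucket}/{key}"


def download_url(filename: str) -> str:
    """Expected HTTPS URL that downloads should resolve from."""
    bucket, key = migrated_location(filename)
    return f"https://{bucket}.s3.amazonaws.com/{key}"
```

An actual migration would copy each object through an S3 client; that step is not shown here.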

Language was using a standalone "brainscore-language" bucket for uploads
and a hardcoded URL for downloads. This aligns with the vision pattern:
bucket=brainscore-storage with key prefix brainscore-language/.

Existing data on the old brainscore-language bucket will need to be
migrated separately for the linear ceilings to continue loading.

The brainscore-storage bucket does not have versioning enabled, so passing
version_id strings causes 400 errors. Make version_id optional (default None)
in load_from_s3 and remove all version_id values from data registrations,
ceiling kwargs, and packaging scripts. sha1 still provides integrity checks.
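
A rough sketch of the optional version_id idea, not the actual load_from_s3 implementation. Both helpers are hypothetical; the keyword names `Bucket`, `Key`, and `VersionId` follow the boto3 `get_object` parameters, and the sha1 check uses only the standard library.

```python
import hashlib
from typing import Dict, Optional


def build_get_object_kwargs(bucket: str, key: str,
                            version_id: Optional[str] = None) -> Dict[str, str]:
    """Build request kwargs, adding VersionId only when one is given."""
    kwargs = {"Bucket": bucket, "Key": key}
    if version_id is not None:
        # Only sent for buckets that have versioning enabled.
        kwargs["VersionId"] = version_id
    return kwargs


def sha1_matches(data: bytes, expected_sha1: str) -> bool:
    """Integrity check that works without object versioning."""
    return hashlib.sha1(data).hexdigest() == expected_sha1
```

With version_id left at its default of None, the request carries no version at all, and the sha1 comparison is what guards against a changed or corrupted object.
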

The test was calling load_tuckute2024_5subj() which defaults to a
relative CSV path that only exists on the original author's machine.
Load via load_dataset('Tuckute2024.language') instead, validating the
same assembly properties against the S3-hosted data.

mike-ferguson added the submission_prepared label Feb 18, 2026
KartikP requested review from mike-ferguson and removed request for mike-ferguson March 2, 2026 15:21
mike-ferguson (Member) left a comment

looks good

KartikP merged commit 48ac7a0 into main Mar 3, 2026
11 checks passed

Labels

submission_prepared Attached to a PR is metadata and layer mapping is successful.

2 participants