ray data robotics pipeline - droid dataset to blip annotation #47

Open
shorbaji wants to merge 7 commits into main from omar/ray-data-robotics-droid-blip

Conversation

@shorbaji
Contributor

No description provided.

shorbaji and others added 4 commits March 17, 2026 00:10
Replace instance_type specifications with required_resources for portable,
cloud-agnostic compute configuration. Specify T4 GPU via required_labels.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Robert Nishihara <rkn@anyscale.com>
@robertnishihara force-pushed the omar/ray-data-robotics-droid-blip branch from cb4bed2 to 87cafc7 on March 20, 2026 06:46
- Remove episodes_droid_v1.0.1_s3.parquet (7.2MB) from repo
- Read manifest directly from S3 using PyArrow's S3FileSystem
- Update README to document automatic manifest download
- Set max_retries to 0 for faster failure feedback
- Remove .gitignore

The manifest is now hosted at:
s3://anyscale-public-droid-dataset/droid/1.0.1/episodes_droid_v1.0.1_s3.parquet
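
For illustration, reading the manifest straight from S3 with PyArrow's `S3FileSystem` might look roughly like the sketch below. This is not the code from this PR. The helper names `s3_uri_to_path` and `read_manifest` are hypothetical, and the sketch assumes the bucket allows anonymous reads and that PyArrow can reach it without an explicit region (pass `region` if it cannot).

```python
from urllib.parse import urlparse

# Manifest location quoted in the commit message above.
MANIFEST_URI = (
    "s3://anyscale-public-droid-dataset/droid/1.0.1/"
    "episodes_droid_v1.0.1_s3.parquet"
)


def s3_uri_to_path(uri: str) -> str:
    """Convert 's3://bucket/key' into the 'bucket/key' form PyArrow filesystems take."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return f"{parsed.netloc}/{parsed.path.lstrip('/')}"


def read_manifest(uri: str = MANIFEST_URI, region=None):
    """Load the episode manifest as a pyarrow.Table without a local copy."""
    # Imported lazily so the path helper above works without pyarrow installed.
    import pyarrow.parquet as pq
    from pyarrow import fs

    # Assumption: the bucket is publicly readable, so no credentials are needed.
    s3 = fs.S3FileSystem(anonymous=True, region=region)
    return pq.read_table(s3_uri_to_path(uri), filesystem=s3)
```

Calling `read_manifest()` needs network access and returns a `pyarrow.Table`. Keeping the URI in one constant means a manifest update only touches the S3 object, not the repository.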

Benefits:
- Reduces repo size by 7.2MB
- Co-locates manifest with dataset
- Simplifies maintenance and updates
- Pipeline remains easy to use

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Robert Nishihara <rkn@anyscale.com>

2 participants