Add dependency verification documentation #110
Add docs for diagnosing and resolving package version mismatches between local and Ray worker environments, including the compare_ray_environments tool and common issues/solutions.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The one link introduced here will point to the API docs page that will appear with this:
dantasse left a comment
Thanks for the first cut here! I think for public docs we should be a lot more judicious than when we're generating a Claude internal doc: we need to be more sure that each bit is correct, and we need to be more concise.
**Symptoms**: `ModuleNotFoundError` for compiled extensions, segfaults.

**Solution**: Run Geneva from the same OS/architecture as your cluster (Linux x86_64).
- **Solution**: Run Geneva from the same OS/architecture as your cluster (Linux x86_64).
+ **Solution**: Run Geneva from the same OS/architecture as your cluster (Linux x86_64). Or, if that's not possible, install dependencies using `pip()` or `conda()` as described in [Execution Contexts](/geneva/jobs/contexts).
I don't think that solution will fix the architecture mismatch. I think there is an explicit arch + pip approach that would be needed.
Wait, it doesn't? Won't the worker just download from pip the right version of the library for their architecture?
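To make the question concrete: pip picks binary wheels for the interpreter and platform it runs on, so what matters is where the install happens. Below is a minimal, standard-library-only sketch of the fields involved. `describe_platform` and `platform_differences` are hypothetical helpers for illustration, not part of Geneva, and the worker-side dictionary is assumed to have been collected by running the same function on a worker.

```python
import platform
import sys


def describe_platform() -> dict[str, str]:
    """Collect the fields that influence which binary wheels pip selects."""
    return {
        "system": platform.system(),    # e.g. "Linux" or "Darwin"
        "machine": platform.machine(),  # e.g. "x86_64" or "arm64"
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


def platform_differences(
    local: dict[str, str], worker: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Return {field: (local_value, worker_value)} for every field that differs."""
    return {
        key: (local.get(key, "<missing>"), worker.get(key, "<missing>"))
        for key in sorted(set(local) | set(worker))
        if local.get(key) != worker.get(key)
    }
```

If the differences are non-empty, compiled extensions built or downloaded on the local machine cannot be assumed to load on the worker; whether a worker-side pip install resolves that for Geneva specifically is the open question in this thread.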
## Diagnostic Workflow

When encountering serialization or "abstract class" errors:

<Steps>
<Step>
**Run the diagnostic tool**:
```bash
python -m geneva.runners.ray.compare_env
```
</Step>
<Step>
**Check PACKAGES: version mismatches** section first.
</Step>
<Step>
**Identify critical packages**: numpy, torch, pyarrow, attrs, pydantic.
</Step>
<Step>
**Fix with manifest** for quick testing:
```python
from geneva.manifest.builder import GenevaManifestBuilder
manifest = GenevaManifestBuilder.create("fix").pip(["numpy==1.26.4"]).build()
```
</Step>
<Step>
**Build custom image** for production (if using KubeRay).
</Step>
</Steps>
I would take out this whole section. Users will probably only need some steps of this (maybe they do or don't build custom images, etc), and idk why Claude fixated on PACKAGES, I don't think that's the only interesting bit here
I think the high-level view of how the tool can help is important. I'm going to move this one section down before ### programmatic usage and improve the prose a bit.
| + kuberay-client | ||
| ``` | ||

## Common Issues and Solutions
I'd cut this section too; I don't think most of these common issues really pull their weight. We already highlighted them at the top of the page, and the solution to all of them is "use a manifest to make them the same."
## Fixing Mismatches

### Option 1: Manifest pip Dependencies
Instead of 3 options (which are not the 3 options I'd choose; "switch to Conda" is probably not going to solve your woes!), I'd do something like this:
- ### Option 1: Manifest pip Dependencies
+ To fix any potentially problematic dependency mismatches, specify them in a Geneva Manifest. For prototyping, you can specify them using pip, conda, a requirements.txt, or an environment.yml file. See [Execution contexts](/geneva/jobs/contexts) for more details. For stable production jobs, we recommend baking the dependencies into the image.
+ To fix any missing env vars, pass them as ray_init_kwargs like so...
Conda ends up being the solution for some because the default Ray images use Conda and may have conflicts.
I'm going to fold these solutions in next to the appropriate output sections so the doc flows more easily.
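As a point of reference for the env var case, Ray's `ray.init` accepts a `runtime_env` argument whose `env_vars` mapping is applied to worker processes. How Geneva forwards `ray_init_kwargs` is not shown in this thread, so the sketch below only builds the dictionary; `build_ray_init_kwargs` and the variable name `MY_SERVICE_URL` are hypothetical.

```python
def build_ray_init_kwargs(env_vars: dict[str, object]) -> dict:
    """Build keyword arguments that carry environment variables to Ray
    workers through Ray's runtime_env mechanism."""
    return {
        "runtime_env": {
            # Ray expects env var values to be strings, so coerce them.
            "env_vars": {key: str(value) for key, value in env_vars.items()},
        }
    }


# Hypothetical variables that the worker environment is missing.
ray_init_kwargs = build_ray_init_kwargs(
    {"MY_SERVICE_URL": "https://example.invalid", "MY_RETRIES": 3}
)
```

With plain Ray, the result could be passed as `ray.init(**ray_init_kwargs)`; the Geneva-side call should follow whatever the final docs specify.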
… sections
Consolidate duplicate manifest examples by moving solutions directly under relevant output sections: architecture fix under PYTHON/PLATFORM, env var passing under Environment Variables, and package fixes under Packages.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ok, I reorganized the doc so it should flow and have less repetition now.
dantasse left a comment
Ok, thanks for reorganizing and for all the changes, looks much better IMO. (couple remaining comments non-blocking)
Summary
`compare_ray_environments` tool with programmatic and CLI usage
Test plan
`npx mintlify dev`
🤖 Generated with Claude Code