Split library into cudf-free core and optional cucascade-cudf target#150
Draft
mbrobbel wants to merge 1 commit into
Draft
Split library into cudf-free core and optional cucascade-cudf target#150mbrobbel wants to merge 1 commit into
mbrobbel wants to merge 1 commit into
Conversation
Decouple the core library from libcudf: cucascade now depends only on RMM + CUDA (plus numa and kvikIO/cuFile for the disk tier), while the cudf-backed representations, built-in converters, and bandwidth profiler move to a separate cucascade-cudf target (headers under cucascade/cudf/). - Generic memory::column_metadata (opaque int32_t type tag) used by the disk tier; cudf converters translate cudf::type_id <-> the tag. - Virtualize record_writer_event/get_writer_event/rebind_stream on idata_representation so data_batch stays cudf-free. - Core converter registry ships empty; register_builtin_converters() now lives in cucascade-cudf. - CUCASCADE_BUILD_CUDF option (default ON) gates find_package(cudf) and the cudf target; cudf is exposed as a package component so core consumers never pull cudf. - Split tests and benchmarks by target. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #142. Supersedes #144.
cuCascade is meant to be a generic tiered GPU memory/data manager, but several pieces were tightly
coupled to libcudf, forcing every consumer to pull the whole cuDF stack. PR #144 addressed this
by deleting the cudf-coupled code. This PR takes a less destructive approach: it keeps the cudf
representations/converters but moves them into a separate library target,
cucascade-cudf, thatlinks libcudf. Downstream consumers that need the cudf representations (e.g. Sirius) link
cucascade-cudf; everyone else links the cudf-free corecucascadeand registers their ownconverters.
The core
cucascadetarget now depends only on RMM + CUDA (plus libnuma and kvikIO/cuFile forthe disk tier).
What changed
Core (
cuCascade::cucascade) — now cudf-freecucascade::memory::column_metadata(opaqueint32_t type_id) intoinclude/cucascade/memory/column_metadata.hpp; the disk tier uses it instead of thecudf::type_id-typed version. The cudf converters translatecudf::type_id↔ the generic tag.record_writer_event()/get_writer_event()/rebind_stream()are now virtuals onidata_representation(no-op /nullptrdefaults), sodata_batchdispatches polymorphicallyinstead of
dynamic_cast-ing to the GPU representation.register_builtin_converters()moved out of the core header.New
cuCascade::cucascade_cudftarget (headers underinclude/cucascade/cudf/, sources undersrc/cudf/)gpu_table_representation,host_data_representation/host_data_packed_representation,host_table/host_table_packed, the bandwidth profiler, and the built-in converters(
register_builtin_converters(), declared incucascade/cudf/builtin_converters.hpp).Build
CUCASCADE_BUILD_CUDFoption (defaultON) gatesfind_package(cudf)and thecucascade_cudflibrary.CUCASCADE_BUILD_CUDF=OFFproduces a fully cudf-free build.full (cudf-ON) install:
cucascade_tests(core) andcucascade_cudf_tests(cudf); benchmarks linkcucascade_cudf. Both cudf executables are gated onCUCASCADE_BUILD_CUDF.Testing
Built and ran the full suite locally (NVIDIA,
cuda-13-stableenv):cmake --preset release -DCUCASCADE_BUILD_TESTS=ON→ core +cucascade_cudf+ both test exes +benchmarks build;
ctest→ 100% passed (cucascade_tests,cucascade_cudf_tests,cucascade_topology_discovery_tests).cmake ... -DCUCASCADE_BUILD_CUDF=OFF→ core-only build succeeds; cudf tests/benchmarks skipped;core tests pass.
lddconfirmslibcucascade.sohas no libcudf, whilelibcucascade_cudf.solinks it.core
find_package(cuCascade)works without cudf (even against a cudf-ON install),COMPONENTS cudfresolves cudf and linkscuCascade::cucascade_cudf, andREQUIRED COMPONENTS cudffails cleanly against a cudf-free install.Reviewer notes
headers move from
cucascade/data/andcucascade/memory/tocucascade/cudf/. No old-pathforwarding shims are provided — consumers update their includes to
<cucascade/cudf/...>. Happyto add forwarding shims if we'd rather not break the include paths at this stage.
disk_file_format.hppstays in the core (include/cucascade/data/): it is cudf-free eventhough only the cudf converters use it today.
🤖 Generated with Claude Code