Decouple core library from libcudf #144

Draft: mbrobbel wants to merge 2 commits into NVIDIA:main from mbrobbel:cudf-dep


mbrobbel commented Jun 5, 2026

Member

Summary

Closes #142.

cuCascade is meant to be a generic, tiered GPU memory/data manager, but several pieces were tightly
coupled to libcudf — forcing every consumer to depend on the whole cuDF stack. This PR removes that
coupling: the core library now depends only on RMM + CUDA (plus libnuma, and kvikIO/cuFile for disk),
and the cuDF-specific representations, converters, and bandwidth profiler move out to the domain layer
(downstream consumers that link libcudf).

cuCascade's abstractions — idata_representation, the converter registry, data_batch, repositories,
disk I/O backends, the memory subsystem — were already generic; this PR makes the dependency boundary
match the architecture.

42 files changed, +259 −10,444

What changed

Public API / library

  • idata_representation no longer exposes any cuDF type. The cuDF-backed concrete representations are
    gone from the library — disk_data_representation is now the only in-library concrete representation.
  • record_writer_event() / get_writer_event() moved onto the idata_representation base as virtuals
    (no-op / nullptr defaults). read_only_data_batch::get_writer_event() now dispatches polymorphically
    instead of dynamic_cast-ing to the GPU type.
  • Extracted a generic, domain-agnostic memory::column_metadata (opaque int32_t type tag — the numeric
    value of a consumer's column-type enum, which cuCascade never interprets) into
    include/cucascade/memory/column_metadata.hpp, used by the disk tier.
  • The converter registry now ships empty: register_builtin_converters() is removed, and consumers
    register tier converters via register_converter<Source, Target>().
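
To make the writer-event change concrete, here is a minimal sketch of the pattern: the base interface carries virtual hooks with inert defaults, and callers dispatch through it instead of dynamic_cast-ing to a GPU type. This is illustrative only. The writer_event stand-in, both mock classes, the has_pending_writer helper, and the exact signatures are assumptions; the real methods may take a stream or return a different handle type.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the event handle; the real type is not shown in this PR.
struct writer_event {};

// Sketch of the base interface: writer-event hooks are virtual with inert defaults.
class idata_representation {
 public:
  virtual ~idata_representation() = default;

  // Default: a representation with no asynchronous writer records nothing.
  virtual void record_writer_event() {}

  // Default: there is no event to wait on.
  virtual const writer_event* get_writer_event() const { return nullptr; }
};

// Hypothetical GPU-backed representation that would live in the domain layer.
class mock_gpu_representation : public idata_representation {
 public:
  void record_writer_event() override { event_ = std::make_unique<writer_event>(); }
  const writer_event* get_writer_event() const override { return event_.get(); }

 private:
  std::unique_ptr<writer_event> event_;
};

// Hypothetical representation that keeps the defaults (e.g. host or disk resident).
class mock_host_representation : public idata_representation {};

// Callers go through the base interface; no dynamic_cast to a GPU type is needed.
inline bool has_pending_writer(const idata_representation* rep) {
  assert(rep != nullptr && "representation must not be null");
  return rep->get_writer_event() != nullptr;
}
```

With this shape, a representation that never overrides the hooks reports no pending writer, so code such as read_only_data_batch::get_writer_event() does not need to know which concrete type it holds.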

Build

  • Remove find_package(cudf) and cudf::cudf from the link interface; remove find_dependency(cudf)
    from the installed CMake config — consumers no longer transitively pull cuDF.
  • pixi: depend on librmm directly (was transitive via libcudf); rename cudf-* features → rmm-*.

Removed (now provided by the domain layer)

  • gpu_data_representation, cpu_data_representation (host_data_representation,
    host_data_packed_representation), host_table.hpp, host_table_packed.hpp.
  • The ~1,800-line built-in converters in representation_converter.cpp + register_builtin_converters().
  • The bandwidth profiler (bandwidth_profiler.{hpp,cpp}).
  • The cuDF-coupled tests + test/utils/cudf_test_utils.*, and the converter/disk benchmarks.
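
Since the built-in converters are gone, consumers now own registration. The sketch below shows one way a registry keyed on a (Source, Target) type pair could work and how a consumer might register a converter. It is a toy model, not cuCascade's registry: toy_converter_registry, my_gpu_table, my_host_table, has_converter, convert, and the callback signature are all assumptions made for illustration. Only the register_converter<Source, Target>() name comes from this PR.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct idata_representation {
  virtual ~idata_representation() = default;
};

// Hypothetical consumer-defined representations (would wrap cuDF types downstream).
struct my_gpu_table : idata_representation { int rows = 0; };
struct my_host_table : idata_representation { int rows = 0; };

class toy_converter_registry {
 public:
  // Register a typed converter; it is stored type-erased under the (Source, Target) key.
  template <typename Source, typename Target>
  void register_converter(std::function<std::unique_ptr<Target>(const Source&)> fn) {
    assert(fn && "converter must be callable");
    converters_[key<Source, Target>()] =
      [fn = std::move(fn)](const idata_representation& src)
        -> std::unique_ptr<idata_representation> {
      return fn(static_cast<const Source&>(src));
    };
  }

  template <typename Source, typename Target>
  bool has_converter() const {
    return converters_.count(key<Source, Target>()) != 0;
  }

  // Returns nullptr when no converter is registered for the pair.
  template <typename Target, typename Source>
  std::unique_ptr<Target> convert(const Source& src) const {
    auto it = converters_.find(key<Source, Target>());
    if (it == converters_.end()) { return nullptr; }
    auto out = it->second(src);
    return std::unique_ptr<Target>(static_cast<Target*>(out.release()));
  }

 private:
  using key_type = std::pair<std::type_index, std::type_index>;
  using erased_fn =
    std::function<std::unique_ptr<idata_representation>(const idata_representation&)>;

  template <typename S, typename T>
  static key_type key() {
    return {std::type_index(typeid(S)), std::type_index(typeid(T))};
  }

  std::map<key_type, erased_fn> converters_;
};
```

The registry starts empty, which mirrors the new default: a lookup for an unregistered pair simply fails until the domain layer installs its own converters.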

Docs / CI

  • Updated README, docs/*, and CLAUDE.md to the decoupled architecture; removed docs/bandwidth-profiler.md.
  • Disabled the CI benchmark job (its converter benchmarks moved out with the cuDF code).

Acceptance criteria (#142)

  • find_package(cudf) absent from CMakeLists.txt
  • cudf::cudf absent from every link target
  • No #include <cudf/...> in any cuCascade source/header (remaining cuDF mentions are explanatory comments)
  • cmake-release + build-release succeed; the build graph has zero cuDF references
  • cuCascade tests pass using only synthetic/mock representations
  • Disk I/O backends remain functional with raw GPU-buffer round-trips (test_disk_io_backend)

Testing

Built and ran the full suite locally on an NVIDIA RTX PRO 6000 (sm_120):

cmake --preset release -DCUCASCADE_BUILD_TESTS=ON
cmake --build build/release -j && (cd build/release && ctest --output-on-failure)
# 100% tests passed, 0 failed  (cucascade_tests, cucascade_topology_discovery_tests)

Verified that the build has no cuDF references in find_package calls, link targets, or #include
directives. (A fully cuDF-absent environment will confirm this at the package level once the lockfile
is re-solved against the new rmm-* features.)

Reviewer notes

  • column_metadata kept (not removed): the issue lists host_table.hpp's column_metadata among
    the types to move out, but disk_data_representation stays and needs it — so it's retained as a
    generic, cuDF-free struct rather than deleted.
  • On-disk format clarified: a disk file is raw 4 KB-aligned column buffers with no header;
    per-column metadata lives in-memory in disk_table_allocation (a file is only meaningful alongside its
    allocation). Corrected a stale CLAUDE.md claim of a disk_file_header.
  • Follow-up: add cuDF-free benchmarks (e.g. raw-buffer disk I/O) and re-enable the CI benchmark job.
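
As a rough illustration of the two notes on column_metadata and the on-disk format, the sketch below lays out headerless, 4 KB-aligned column buffers and keeps the per-column bookkeeping in memory. Only the opaque int32_t type tag and the 4 KB alignment come from this PR. The other struct fields, align_up, layout_columns, the my_column_type enum, and the choice to pad the final buffer are assumptions, not the actual disk_table_allocation layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of a cuDF-free metadata record. Fields other than type_tag are illustrative.
struct column_metadata {
  std::int32_t type_tag;    // numeric value of the consumer's enum; never interpreted here
  std::size_t size_bytes;   // payload size of the column buffer
  std::size_t file_offset;  // filled in by layout_columns()
};

constexpr std::size_t kAlignment = 4096;  // 4 KB

// Round n up to the next multiple of kAlignment.
constexpr std::size_t align_up(std::size_t n) {
  return (n + kAlignment - 1) / kAlignment * kAlignment;
}

// Assign each column a 4 KB-aligned offset in a headerless file.
// Returns the total file size (assumes the last buffer is padded as well).
inline std::size_t layout_columns(std::vector<column_metadata>& columns) {
  std::size_t offset = 0;
  for (auto& col : columns) {
    assert(offset % kAlignment == 0);
    col.file_offset = offset;
    offset = align_up(offset + col.size_bytes);
  }
  return offset;
}

// Hypothetical consumer-side enum; only its numeric value crosses the boundary.
enum class my_column_type : std::int32_t { int64 = 1, float64 = 2 };

inline std::int32_t to_tag(my_column_type t) { return static_cast<std::int32_t>(t); }
```

Because the file carries no header, the offsets and tags computed here are the only way to find and interpret each buffer, which is why a file is meaningful only alongside its in-memory allocation.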

🤖 This PR was prepared with the assistance of an AI agent (Claude Code).

Remove libcudf as a dependency of the cuCascade core library; RMM (librmm)
becomes the only direct RAPIDS dependency. The cuDF-backed GPU/host
representations and all built-in tier converters are removed and become the
responsibility of an external domain layer that links libcudf and registers
converters via register_converter().

Library / build:
- Drop find_package(cudf) and cudf::cudf from link libs; drop
  find_dependency(cudf) from the installed CMake config.
- pixi: swap libcudf -> librmm, rename cudf-* features to rmm-*.
- Delete gpu/cpu_data_representation, host_table(_packed), the ~1800-line
  built-in converters, register_builtin_converters(), the bandwidth profiler,
  and the cuDF-coupled benchmarks / tests / test utils.
- Extract a generic memory::column_metadata (opaque int32_t type tag) into
  include/cucascade/memory/column_metadata.hpp for the disk tier.
- Move record_writer_event()/get_writer_event() to the idata_representation
  base (no-op / nullptr defaults).
- disk_data_representation is now the only in-library concrete representation.

Docs / CI:
- Update README, docs/*, and CLAUDE.md to the decoupled architecture; remove
  docs/bandwidth-profiler.md.
- Disable the CI benchmark job (its converter benchmarks moved out with the
  cuDF code).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

copy-pr-bot Bot commented Jun 5, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

mbrobbel commented Jun 5, 2026

Member Author

/ok to test b24413f

mbrobbel commented Jun 8, 2026

Member Author

/ok to test 79fb183

Development

Successfully merging this pull request may close these issues.

Remove libcudf dependency from cuCascade