@dayesouza
Contributor

Adds a new CSVTableProvider for managing CSV-backed tables through the storage abstraction, unifies and renames key methods in the TableProvider interface and its implementations, and updates all usage sites and tests accordingly. It also extends the in-memory storage API to support regex-based key search.

  • Added CSVTableProvider in graphrag_storage/tables/csv_table_provider.py to support reading, writing, checking existence, and listing of tables stored as CSV files in any storage backend. Includes comprehensive error handling and logging.
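To illustrate the shape of such a provider, here is a minimal sketch, not the code from this PR. It assumes a synchronous, dict-backed storage object (`DictStorage`) and represents tables as lists of dicts using only the standard library, whereas the actual provider works with DataFrames and the real storage interface. The names `DictStorage` and `CsvTableSketch`, and the `.csv` key suffix, are hypothetical.

```python
import csv
import io
import re


class DictStorage:
    """Hypothetical stand-in for a storage backend: maps keys to text blobs."""

    def __init__(self):
        self._blobs = {}

    def get(self, key):
        return self._blobs.get(key)

    def set(self, key, value):
        self._blobs[key] = value

    def has(self, key):
        return key in self._blobs

    def find(self, pattern):
        # Return every key that the compiled regex matches.
        return [k for k in self._blobs if pattern.search(k)]


class CsvTableSketch:
    """Illustrative CSV-backed table provider (an assumption, not the PR's class)."""

    def __init__(self, storage):
        self._storage = storage

    def write(self, name, rows):
        # Serialize rows (a list of dicts sharing the same keys) to CSV text.
        buffer = io.StringIO()
        fieldnames = list(rows[0].keys()) if rows else []
        writer = csv.DictWriter(buffer, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
        self._storage.set(f"{name}.csv", buffer.getvalue())

    def read(self, name):
        key = f"{name}.csv"
        if not self._storage.has(key):
            raise ValueError(f"Table not found: {key}")
        # Values come back as strings; CSV carries no type information.
        return list(csv.DictReader(io.StringIO(self._storage.get(key))))

    def has(self, name):
        return self._storage.has(f"{name}.csv")

    def list_tables(self):
        # Table names are the stored keys with the .csv suffix removed.
        pattern = re.compile(r"\.csv$")
        return sorted(k[: -len(".csv")] for k in self._storage.find(pattern))


storage = DictStorage()
provider = CsvTableSketch(storage)
provider.write("entities", [{"id": 1, "name": "a"}])
```

Because the provider only talks to the storage object, the same logic would apply to any backend that offers equivalent get, set, has, and find operations.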

API unification and method renaming

  • Renamed has_dataframe to has and find_tables to list_tables in the TableProvider interface and all subclasses, including the Parquet provider.
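For call sites, the rename amounts to swapping method names. The sketch below is a simplified, hypothetical view of the interface after the change: only the names `has` and `list_tables` (and the old names `has_dataframe` and `find_tables`) come from this PR. The class names, the synchronous signatures, and the `write` helper are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class TableProviderSketch(ABC):
    """Hypothetical, simplified view of the interface after the rename."""

    @abstractmethod
    def has(self, table_name: str) -> bool:
        """Report whether a table exists (formerly has_dataframe)."""

    @abstractmethod
    def list_tables(self) -> list[str]:
        """Return the names of all known tables (formerly find_tables)."""


class InMemoryTables(TableProviderSketch):
    """Toy implementation used only to exercise the renamed methods."""

    def __init__(self) -> None:
        self._tables: dict[str, object] = {}

    def write(self, table_name: str, table: object) -> None:
        self._tables[table_name] = table

    def has(self, table_name: str) -> bool:
        return table_name in self._tables

    def list_tables(self) -> list[str]:
        return sorted(self._tables)


tables = InMemoryTables()
tables.write("documents", [{"id": 1}])

# Before the rename a caller would have used:
#   tables.has_dataframe("documents") and tables.find_tables()
exists = tables.has("documents")
names = tables.list_tables()
```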

Enhancements to in-memory storage

  • Added a find method to MemoryStorage to support regex-based key searching, which is now leveraged by the new CSVTableProvider for table listing.
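Regex-based key search over an in-memory store can be sketched as follows. This is an assumption about the general approach, not the actual `MemoryStorage.find` signature, which is not shown in this description; `MemoryStorageSketch` and the example keys are hypothetical.

```python
import re
from typing import Iterator


class MemoryStorageSketch:
    """Hypothetical in-memory key/value store with regex key search."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def find(self, pattern: re.Pattern) -> Iterator[str]:
        # Yield each stored key that the compiled pattern matches anywhere.
        for key in self._data:
            if pattern.search(key):
                yield key


storage = MemoryStorageSketch()
storage.set("entities.csv", "id\n1\n")
storage.set("relationships.csv", "id\n2\n")
storage.set("entities.parquet", "")

# A table provider could list its tables by searching for its file suffix.
csv_keys = sorted(storage.find(re.compile(r"\.csv$")))
```

Anchoring the pattern with `$` keeps keys such as `entities.csv.bak` out of the result, which matters when several providers share one storage namespace.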

Testing improvements

  • Added a new test suite for CSVTableProvider covering read/write, error handling, empty/mixed-type DataFrames, persistence, existence checks, and table listing.
  • Updated Parquet provider tests to use the unified has method.

@dayesouza dayesouza requested a review from a team as a code owner February 6, 2026 21:44
@dayesouza dayesouza requested a review from natoverse February 6, 2026 22:08
@dayesouza dayesouza merged commit 158a957 into main Feb 9, 2026
18 checks passed
@dayesouza dayesouza deleted the csv branch February 9, 2026 15:37