Skip to content

Conversation

@ayushsingh9720
Copy link

This PR addresses issue #5039 by refactoring the internal colnamesInt function at both the C and R levels. By introducing a mandatory source argument, the internal error messages now provide specific context (e.g., pinpointing whether a failure occurred during an on= join, setkey, or frank), making debugging significantly more intuitive for the user.

Technical Changes

  1. C Internal Logic (src/)
    src/utils.c: Modified the colnamesInt signature to accept a 5th argument, SEXP source. Updated all internal error() calls to include a %s placeholder, prepending the error with the provided context string.

src/data.table.h: Updated the function prototype to remain in sync with the new signature.

src/init.c: Updated the R-to-C registration from -1 (variable) to a strict 5 to ensure formal argument counting at the C interface level.

  1. R Wrapper and Call Sites (R/)
    R/wrappers.R: Updated the R-side wrapper for colnamesInt to accept source=NULL and pass it through the .Call interface.

Global Refactor: Performed a comprehensive audit and update of 19 call sites across the following files to ensure the new 5th argument is passed correctly with descriptive hints:

R/data.table.R (Joins, setcolorder, setnames)

R/mergelist.R (Merge operations and internal column copying)

R/setkey.R (Key assignment)

R/frank.R (Ranking)

R/duplicated.R (Unique/Duplicated checks)

R/setops.R (Set operations)

Impact
Users will no longer see generic "inscrutable" errors. For example: Old Error: Error in colnamesInt(...) : argument specifying columns received non-existing column(s): cols[2]='z' New Error: Error in colnamesInt(...) : In x table's columns in on= join, argument specifying columns received non-existing column(s): cols[2]='z'

Checklist
[x] Reproduced the original "inscrutable" error.

[x] Successfully compiled using R CMD INSTALL ..

[x] Verified that all 19 call sites are updated to avoid "unused argument" or "argument mismatch" errors.

[x] Passed manual tests for joins, setkeys, and set operations.

Closes #5039

@ayushsingh9720 ayushsingh9720 force-pushed the fix-inscrutable-colnames-errors branch 3 times, most recently from 81ceafe to 2104dc4 Compare December 30, 2025 12:14
@jangorecki
Copy link
Member

source is not a good name for an argument as it is widely used function name, I would use context instead.

Unit tests are missing.

@ayushsingh9720 ayushsingh9720 force-pushed the fix-inscrutable-colnames-errors branch 2 times, most recently from b45cd57 to 11cc256 Compare December 30, 2025 13:19
@codecov
Copy link

codecov bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.96%. Comparing base (3c06138) to head (2649ed9).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7554   +/-   ##
=======================================
  Coverage   98.96%   98.96%           
=======================================
  Files          87       87           
  Lines       16737    16743    +6     
=======================================
+ Hits        16563    16569    +6     
  Misses        174      174           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ayushsingh9720
Copy link
Author

Thank you for the feedback, @jangorecki ! I will rename the argument to context across the C and R levels. I will also add a dedicated unit test file to verify that the context hints are being correctly prepended to the error messages. I'll push these updates shortly.

@ayushsingh9720 ayushsingh9720 force-pushed the fix-inscrutable-colnames-errors branch from 11cc256 to 65f8ca6 Compare December 30, 2025 16:57
@jangorecki
Copy link
Member

@ayushsingh9720 did you use AI when preparing this PR?

@ayushsingh9720
Copy link
Author

Hi @jangorecki, yes, I've been using an AI as a , kind of a thought/learning partner to help me with the data.table codebase. as i am new to open source, it has been helping me understand the internl logic and debug build failures, but I am manually implementing the changes, running the local R CMD INSTALL builds, and verifying the error messages in my terminal to ensure everything works as intended.

@jangorecki
Copy link
Member

Ok. Check how we define tests and try to follow that logic, you can read manual of internal function test.

@ayushsingh9720 ayushsingh9720 force-pushed the fix-inscrutable-colnames-errors branch from 65f8ca6 to c9a0d8d Compare December 30, 2025 17:23
@ayushsingh9720 ayushsingh9720 force-pushed the fix-inscrutable-colnames-errors branch from c9a0d8d to 2649ed9 Compare December 30, 2025 17:50
@ayushsingh9720
Copy link
Author

checks are passing now, i have updated the unit test(Test 2357) , to follow the internal test() , as suggested. really appreciate the advice

@ayushsingh9720
Copy link
Author

@tdhock @Anirban166 @DorisAmoakohene @joshhwuu I have completed the contributor tests for issues #5039 and #6512. I've updated the Wiki and sent an introductory email to Toby regarding my interest in a mix of low-level C work and bug fixing for GSoC 2026. I've also joined the R-GSoC Slack for future coordination.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

colnamesInt makes inscrutable errors

2 participants