AVRO-4228: [c++] Fix BinaryDecoder::arrayNext() to handle negative block counts #3646
}
}

static void testArrayNegativeBlockCount() {
This test passes even without the fix above.
The reason is that the negative count is in the first block, which is read by arrayStart(), and arrayStart() already delegates to doDecodeItemCount().
Please update the test to use a negative count [also] in the second block, so that arrayNext() is the call that reads it.
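For illustration, here is a minimal sketch of what such an input could look like at the byte level. It is not the test from the PR: `appendLong` and `makeTwoBlockArray` are hypothetical helpers written only for this example, and the array is assumed to hold ints. The first block uses a positive count, the second a negative count followed by its byte size.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: append one Avro long (zigzag, then base-128 varint).
void appendLong(std::vector<uint8_t> &out, int64_t value) {
    uint64_t z = (static_cast<uint64_t>(value) << 1) ^
                 static_cast<uint64_t>(value >> 63);
    while (z >= 0x80) {
        out.push_back(static_cast<uint8_t>(z | 0x80));  // low 7 bits + continuation bit
        z >>= 7;
    }
    out.push_back(static_cast<uint8_t>(z));
}

// Hypothetical test input: [10, 20] in a positive-count block, then
// [30, 40, 50] in a negative-count block, then the end-of-array marker.
std::vector<uint8_t> makeTwoBlockArray() {
    std::vector<uint8_t> out;
    appendLong(out, 2);   // first block: 2 items (read by arrayStart())
    appendLong(out, 10);
    appendLong(out, 20);
    appendLong(out, -3);  // second block: 3 items, byte size follows (read by arrayNext())
    appendLong(out, 3);   // block size in bytes; each item below encodes to 1 byte
    appendLong(out, 30);
    appendLong(out, 40);
    appendLong(out, 50);
    appendLong(out, 0);   // end of array
    return out;
}
```

With this layout the negative count is the first thing arrayNext() reads, so a decoder that skips the byte-size long would misread the items that follow.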
lang/c++/impl/BinaryDecoder.cc
Outdated
This may lead to overflow if result is int64_t's minimum value, since negating it directly is undefined behavior.
Possible improvement:
return static_cast<size_t>(-(result + 1)) + 1;
In practice this would mean an array with 2^63 items? Quite large for a prod system in my view, but technically it's undefined behavior so you're right, needs fixing, good catch.
I don't expect such a big number to be used under normal conditions!
But an attacker can craft it manually and pass it to a system to cause problems.
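To make the overflow concern concrete, here is a small standalone sketch of the suggested expression. `blockItemCount` is a hypothetical name, and it returns `std::uint64_t` rather than `size_t` only so that the example behaves the same on any platform. Adding 1 before negating keeps the intermediate value inside the int64_t range, and the final `+ 1` is done in unsigned arithmetic.

```cpp
#include <cstdint>

// Hypothetical helper: absolute value of a block count without signed overflow.
std::uint64_t blockItemCount(std::int64_t result) {
    if (result >= 0) {
        return static_cast<std::uint64_t>(result);
    }
    // -(result + 1) is at most INT64_MAX, so the negation cannot overflow,
    // even when result == INT64_MIN.
    return static_cast<std::uint64_t>(-(result + 1)) + 1;
}
```

For a crafted count of INT64_MIN this yields 2^63 instead of invoking undefined behavior; the decoder still has to reject or bound such a count elsewhere.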
lang/c++/test/CodecTests.cc
Outdated
BOOST_CHECK_EQUAL(result[0], 10);
BOOST_CHECK_EQUAL(result[1], 20);
BOOST_CHECK_EQUAL(result[2], 30);
BOOST_CHECK_EQUAL(result[3], 40);
BOOST_CHECK_EQUAL(result[4], 50);
Suggested change:
- BOOST_CHECK_EQUAL(result[0], 10);
- BOOST_CHECK_EQUAL(result[1], 20);
- BOOST_CHECK_EQUAL(result[2], 30);
- BOOST_CHECK_EQUAL(result[3], 40);
- BOOST_CHECK_EQUAL(result[4], 50);
+ const std::vector<int32_t> expected = {10, 20, 30, 40, 50};
+ BOOST_CHECK_EQUAL_COLLECTIONS(result.begin(), result.end(), expected.begin(), expected.end());
What is the purpose of the change
BinaryDecoder::arrayNext() calls doDecodeLong() directly instead of doDecodeItemCount(), causing it to mishandle negative array block counts. Per the Avro spec, a negative block count means the absolute value is the item count followed by an additional long for the byte-size of the block. When arrayNext() reads a negative count, static_cast<size_t>(-100) produces a huge value and the byte-size long is left unconsumed, corrupting the stream position.
doDecodeItemCount() already handles this correctly and is used by arrayStart(), mapStart(), and mapNext(). Only arrayNext() bypassed it. The fix changes arrayNext() to call doDecodeItemCount() for consistency.
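The block-count rule described above can be sketched outside the Avro code base as follows. This is a simplified stand-in, not the actual BinaryDecoder implementation: `readLong`, `readItemCount`, and `readIntArray` are hypothetical names, and the input is assumed to be an in-memory byte vector holding an array of ints.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: read one Avro long (base-128 varint, then zigzag decode).
int64_t readLong(const std::vector<uint8_t> &buf, size_t &pos) {
    uint64_t z = 0;
    int shift = 0;
    while (true) {
        if (pos >= buf.size() || shift > 63) {
            throw std::runtime_error("truncated or oversized varint");
        }
        const uint8_t b = buf[pos++];
        z |= static_cast<uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) {
            break;
        }
        shift += 7;
    }
    return static_cast<int64_t>(z >> 1) ^ -static_cast<int64_t>(z & 1);
}

// Hypothetical counterpart of doDecodeItemCount(): a negative count means
// "abs(count) items, preceded by one more long holding the block's byte size".
size_t readItemCount(const std::vector<uint8_t> &buf, size_t &pos) {
    const int64_t count = readLong(buf, pos);
    if (count < 0) {
        readLong(buf, pos);  // consume the byte size so the stream stays aligned
        return static_cast<size_t>(-(count + 1)) + 1;
    }
    return static_cast<size_t>(count);
}

// Reads every block of an int array; the same count logic is used for the
// first block and for all following blocks.
std::vector<int32_t> readIntArray(const std::vector<uint8_t> &buf) {
    size_t pos = 0;
    std::vector<int32_t> items;
    for (size_t n = readItemCount(buf, pos); n != 0; n = readItemCount(buf, pos)) {
        for (size_t i = 0; i < n; ++i) {
            items.push_back(static_cast<int32_t>(readLong(buf, pos)));
        }
    }
    return items;
}
```

If the loop read later block counts with a plain `readLong` and a cast, a count of -3 would turn into a huge unsigned value and the byte-size long would be decoded as the first item, which is the failure mode this change addresses.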
This affects any array large enough to be encoded in multiple blocks with negative counts. ClickHouse independently found the same bug (ClickHouse/ClickHouse#60438, ClickHouse#23).
Verifying this change
Documentation