-
Notifications
You must be signed in to change notification settings - Fork 104
make murmurhash3_x64_128 compatible with existing cuco data structures #501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make murmurhash3_x64_128 compatible with existing cuco data structures #501
Conversation
6af4877 to
2001837
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor cleanups.
Based on the offline discussions, we are not happy with the fact that sanitize_hash has to be invoked twice in CG-based probing. @sleeepyjack Any idea how to improve the situation?
|
/ok to test |
469e39b to
2001837
Compare
|
I have some high-level questions/suggestions regarding this PR:
|
I suggested to do so since probing is the only place using
It works but technically we are not using the hash value in a proper way, is it fair to say so? |
And I agree that it makes sense to put utils with the class where they have one-time use. However, since it's a template function the syntax doesn't get much cleaner doing so:
|
include/cuco/detail/utils.cuh
Outdated
| template <typename SizeType, typename HashType> | ||
| __host__ __device__ constexpr SizeType sanitize_hash(HashType hash, std::uint32_t cg_rank) noexcept | ||
| { | ||
| return sanitize_hash<SizeType>(sanitize_hash<SizeType>(hash) + cg_rank); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There exists a scenario where this approach fails and it's the reason CI in my initial draft PR failed.
Consider the following example:
SizeTypeisint32_taka a signed type- The value of
sanitize_hash<SizeType>(hash)is very close tonumeric_limits<SizeType>::max()
In this scenario, if we compute sanitize_hash<SizeType>(hash) + cg_rank there's chance the result oxceeds numeric_limits<SizeType>::max() which would result in a signed integer overflow which is undefined behavior under the C++ abstract machine. Thus the compiler is free to produce any garbage code around this call.
To solve this we need check if sanitize_hash<SizeType>(hash) > (numeric_limits<SizeType>::max() - group.size()) (be careful with > and >=, I'm infamous for my off-by-one errors) and then compute the wrapped-around value manually in case this expression evaluates to true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation!
I think 3ae1afe covers this case now.
PointKernel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some styling nits otheriwse good to go
|
/ok to test |
PointKernel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work as always! Thanks!
| using cg_type = cooperative_groups::thread_block_tile<cg_size>; | ||
| return detail::probing_iterator<Extent>{ | ||
| cuco::detail::sanitize_hash<size_type>(hash_(probe_key) + g.thread_rank()) % upper_bound, | ||
| cuco::detail::sanitize_hash<cg_type, size_type>(g, hash_(probe_key)) % upper_bound, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move the size_type to the front of the tparam list so you don't have to specify the cg_type as it can be inferred from g.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about the intention of the API and the syntax consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the g parameter should be the first one for consistency reasons but we can still use a different ordering for the tparam list, i.e., the one that lets us make use of automatic type inference. The only tparam that cannot be inferred is the result size type so specifying the CG type is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an internal API so I don't want to bikeshed too much. I'm okay with merging it as is.
sleeepyjack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 🚀 🚀
Co-authored-by: Yunsong Wang <[email protected]>
|
/ok to test |
Make murmurhash3_x64_128 compatible with existing cuco data structures
sanitize_hashfunction fromcuco::detail::namespace toprobing_scheme_baseclass as protected memeber.santize_hashfunction to handlecuda::std::array<uint64_t, 2>, which is returned frommurmurhash3_x64_128hash function.hash_test.cuto teststatic_mapAPI with all hash functions.