Namespace field and sorting hooks#11
Conversation
Fernthedev
commented
Dec 26, 2025
- metadata: Add namespaze field and matching function to HookNameMetadata
- Basic sorting, time to rewrite to topological graph
- Implement proper topological sort (untested)
- Add topological sort tests
…checks and ensuring relative order is preserved
…plify insertion process
…ving memory management
…d recompile them for correct orig pointers
…ODO for future improvements
…directory in .gitignore and modifying test script
…uitable_installation
sc2ad
left a comment
There was a problem hiding this comment.
I haven't read through all the tests, but did finally review the sorting logic. I think we should discuss how to improve it on discord
| template <typename Context> | ||
| constexpr auto format(flamingo::HookNameMetadata const& metadata, Context& ctx) const { | ||
| return fmt::format_to(ctx.out(), "name: {}", metadata.name); | ||
| return fmt::format_to(ctx.out(), "name: {} namespaze {}", metadata.name, metadata.namespaze); |
There was a problem hiding this comment.
This string still reads namespaze instead of namespace
| // build name to iterator map | ||
| std::unordered_map<HookNameMetadata, std::list<HookInfo>::iterator, HookNameMetadataHash> name_to_iterator; | ||
| for (auto it = hooks.begin(); it != hooks.end(); ++it) { | ||
| name_to_iterator[it->metadata.name_info] = it; |
There was a problem hiding this comment.
This can be merged into the for loop above
| // If a hook specifies that it should be before/after itself, | ||
| // it's likely a mistake, but we won't consider it a cycle since it's not really a dependency. | ||
| if (name == self) { | ||
| continue; |
There was a problem hiding this comment.
Imo, this SHOULD be an error, it should report a bad priorities installation error eventually. I think the way we can do this is by just checking the incoming hook's before/after set on install instead of checking it during sorting, since we will always only sort well installed hooks.
That'll let you simplify some of the logic here.
| // find the iterator for this name | ||
| auto it = name_to_iterator.find(current_name); | ||
| if (it == name_to_iterator.end()) { | ||
| // should not happen |
There was a problem hiding this comment.
I think this should be fatal, not silent
| // append them in their original order and log a warning. Splicing invalidates | ||
| // iterators, so consume from the front until empty to preserve original order. | ||
| for (auto const& hook : hooks) { | ||
| FLAMINGO_CRITICAL( |
There was a problem hiding this comment.
This should be an error or warning log, not critical (since flamingo is not dying, the user tried to install a cyclic dependency)
| target_entry->second.fixups.target.WriteJump(it->hook_ptr); | ||
|
|
||
| // Single hook: target -> hook, hook.orig -> fixups or no_fixups | ||
| if (std::next(it) == hooks.end()) { |
There was a problem hiding this comment.
The for loop logic below can be changed to make this logic redundant:
while (std::next(it) != End)
Assign orig(next(it))
++It
It->assign orig(fixups or no fixups)
| break; | ||
| } | ||
| } | ||
| if (requires_sort) { |
There was a problem hiding this comment.
It might be cleaner to implement this logic in target metadata where when we sort for a given install, we track what names we haven't yet matched, and then when we install a new hook we just check to see if our incoming is in that set to see if we should sort, but that I'll leave as a decision to you if u want
| return ResultT::Ok(newIt); | ||
| } | ||
|
|
||
| // if no priority constraints affect us, we can install at the first suitable location that fits |
There was a problem hiding this comment.
I think this logic ends up being slower? I think we should actually think a bit more about this stuff since the ultimate goal should be to prevent recompiles, but the sort could unconditionally happen if we can guarantee some measure of reasonable speed.
It's hard to know if you are a dependee of an existing hook without more state so it can be tricky to track how you should be placed in such a way that makes sense. Also, the linear scan for a given node that has priorities is not trivial, since other hooks might need to be moved around to make space.
The only slow thing is the first node, since swapping the orig call for that is a tiny bit more annoying, but every other node getting reordered should be a trivial pointer chase operation.
I think if we just:
- always sort
- simplify some of the sorting logic to avoid some redundant loops and checks
- ensure edge cases all play nice
We can end up with a simpler impl that also works well. We should discuss this more on discord I think
sc2ad
left a comment
There was a problem hiding this comment.
Haven't read through all the sorting tests yet, but just left some high level comments here
| // check if any reference self in after/before | ||
| for (auto const& after : it->metadata.priority.afters) { | ||
| if (after.matches(it->metadata.name_info)) { | ||
| FLAMINGO_CRITICAL("Hook {} references itself in after dependencies. This is likely a mistake.", |
There was a problem hiding this comment.
These should not be CRITICAL logs, since that implies that flamingo itself will crash/fail, this is just user error so should log as warning and return the error
| { | ||
| std::vector<std::string_view> hook_names; | ||
| hook_names.reserve(hooks.size()); | ||
| for (auto const& hook : hooks) { | ||
| hook_names.push_back(hook.metadata.name_info.name); | ||
| } | ||
| FLAMINGO_DEBUG("Final hook order after topological sort: {}", fmt::join(hook_names, " -> ")); | ||
| } |
There was a problem hiding this comment.
Make this a method instead of calling it several times (probably would be useful for generic debugging too). Ideally, we also condition it behind the flamingo debug ifdef, but that doesn't need to be done in this PR
| /// @brief Topologically sorts the provided hooks by their priority constraints. | ||
| /// @param hooks The list of hooks to sort in place. | ||
| /// @return The sorted list of hooks that have cycles. | ||
| Result<std::list<HookInfo>, installation::TargetBadPriorities> topological_sort_hooks_by_priority( |
There was a problem hiding this comment.
I think we still should discuss how we can improve this (and its caller) since it seems quite verbose.
The checking logic for each individual hook also seems verbose (we can instead check when installing a hook whether it has cyclic befores or afters of itself).
The overall set of names --> hooks (ex, the actual topo graph) seems like something we could save per target, since the graph only needs to have things added or removed from it as we install/uninstall hooks.
With that, we could simplify this logic further by basically:
- Check incoming hook for immediate validity
- Add incoming hook and dependencies to graph
- Perform full topo sort with graph
- Update graph iterators to point to new node locations after sorting is complete
- Failures are reported as-is
| // Single hook: target -> hook, hook.orig -> fixups or no_fixups | ||
| if (std::next(it) == hooks.end()) { | ||
| it->assign_orig(target_entry->second.metadata.metadata.need_orig | ||
| ? target_entry->second.fixups.fixup_inst_destination.addr.data() | ||
| : reinterpret_cast<void*>(&no_fixups)); | ||
| return; | ||
| } | ||
|
|
||
| // When multiple hooks, orig is next hook | ||
| it->assign_orig(std::next(it)->hook_ptr); | ||
|
|
||
| // Multiple hooks: head, middles, tail | ||
|
|
||
| // Middles | ||
| for (++it; std::next(it) != hooks.end(); ++it) { | ||
| it->assign_orig(std::next(it)->hook_ptr); | ||
| } | ||
|
|
||
| // Tail | ||
| // 'it' now refers to the last element | ||
| it->assign_orig(target_entry->second.metadata.metadata.need_orig | ||
| ? target_entry->second.fixups.fixup_inst_destination.addr.data() | ||
| : reinterpret_cast<void*>(&no_fixups)); |
There was a problem hiding this comment.
All of this logic can be written instead as (or as a for loop):
while (std::next(it) != hooks.end()) {
it->assign_orig(std::next(it)->hook_ptr);
++it;
}
it->assign_orig(target_entry->second.metadata.metadata.need_orig
? target_entry->second.fixups.fixup_inst_destination.addr.data()
: reinterpret_cast<void*>(&no_fixups));
| // Select the end to install at | ||
| return ResultT::Ok(hooks.end()); | ||
|
|
||
| auto new_it = hooks.emplace(hooks.end(), std::move(hook_to_install)); |
There was a problem hiding this comment.
nit: this could be hooks.emplace_back instead of emplace at end
| // therefore topological | ||
|
|
||
| // If the incoming hook has any priority constraints, we may need a topological pass. | ||
| bool requires_sort = |
There was a problem hiding this comment.
I think if we can simplify the topo sort routine, we can avoid this checking logic entirely and just always do it. We will need to ensure we keep things stable, as you mentioned on discord, as I think that is an important feature that we should keep around, but otherwise doing the sort seems fine.
Recompile is cheap in comparison to a sort, so I wouldn't worry too much about that.
| auto const hook_data_result = location_or_err.value(); | ||
|
|
||
| // TODO: Recompile fixups/origs for all hooks on this target or not? | ||
| // recompile_hooks(hooked_target->second.hooks, target_info); |
There was a problem hiding this comment.
This recompile is currently in the "find" method, I'd prefer if we moved it out here and do it unconditionally after a found location (the recompile hooks method can decide if it wants to nop as an optimization, or not, in the futrure; doesn't need to do that now)