Vanadis: Enable true Out-of-Order execution for RoCC instructions #2603
Fixes #2602
Summary
This PR overhauls the VanadisRoCCInterface integration within the processor pipeline to support true Out-of-Order (OoO) execution. Previously, RoCC instructions were handled sequentially: operands were read immediately at the Issue stage and results were written directly back to architectural registers, which artificially serialized execution and introduced RAW/WAW/WAR hazards.
Technical Details
This PR implements the following changes to align RoCC instruction handling with standard Vanadis functional units:
Register Renaming for RoCC (assignRegistersToInstruction): [...] rd. This resolves WAW and WAR hazards.
Delayed Dispatch Mechanism (allocateFunctionalUnit): [...] push to the RoCC interface during the Issue stage. [...] rocc_wait_queues_ to hold issued RoCC instructions that are waiting for source operands (acting as a reservation station).
Operand Readiness Check (performExecute): [...] rocc_wait_queues_. [...] pendingIntWrites).
Correct Physical Register Write-back: [...] performExecute.
Impact
These changes allow the Vanadis CPU to continue issuing independent instructions while RoCC instructions are waiting for operands or executing long-latency tasks. This significantly improves the accuracy of performance modeling for heterogeneous systems.
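The delayed-dispatch idea listed under Technical Details can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in this PR: ToyCore, RoccInstr, renameDest, issue, dispatchReady and completeWrite are hypothetical names, the pending-write counter only loosely mirrors pendingIntWrites, and scanning the whole queue for any ready entry (rather than only the head) is an assumption about the dispatch policy.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for an issued RoCC instruction after renaming.
struct RoccInstr {
    int src1_phys;  // physical register holding the first source operand
    int src2_phys;  // physical register holding the second source operand
    int dest_phys;  // physical register allocated for rd at rename time
};

// Toy model of the wait-queue mechanism; not the Vanadis classes.
class ToyCore {
public:
    explicit ToyCore(std::size_t num_phys) : pending_writes_(num_phys, 0) {}

    // Rename: hand out a fresh physical register for a destination and
    // mark it as having an outstanding write.
    int renameDest() {
        assert(next_free_ < pending_writes_.size());
        const std::size_t p = next_free_++;
        ++pending_writes_[p];
        return static_cast<int>(p);
    }

    // Issue: park the instruction in the wait queue instead of pushing it
    // to the accelerator immediately.
    void issue(const RoccInstr& ins) { wait_queue_.push_back(ins); }

    // Execute stage: dispatch every queued instruction whose two source
    // registers have no outstanding writes. Returns the number dispatched.
    int dispatchReady() {
        int dispatched = 0;
        for (auto it = wait_queue_.begin(); it != wait_queue_.end();) {
            if (isReady(it->src1_phys) && isReady(it->src2_phys)) {
                it = wait_queue_.erase(it);  // erase returns the next entry
                ++dispatched;
            } else {
                ++it;  // operands still pending; leave it queued
            }
        }
        return dispatched;
    }

    // Write-back: a producer has written this physical register.
    void completeWrite(int phys) {
        assert(pending_writes_[static_cast<std::size_t>(phys)] > 0);
        --pending_writes_[static_cast<std::size_t>(phys)];
    }

    std::size_t waiting() const { return wait_queue_.size(); }

private:
    bool isReady(int phys) const {
        return pending_writes_[static_cast<std::size_t>(phys)] == 0;
    }

    std::vector<int> pending_writes_;  // outstanding writes per physical register
    std::size_t next_free_ = 0;        // trivial free list: next unused register
    std::deque<RoccInstr> wait_queue_; // issued RoCC instructions awaiting operands
};
```

In this model, an instruction whose source is still being produced stays in the queue while a later, independent instruction with ready sources is dispatched on the same call to dispatchReady; once completeWrite is called for the outstanding register, the earlier instruction is dispatched on the next call.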
Testing