I have a network function that duplicates atomic packets in the switch. When I duplicate atomic packets, the latency roughly triples (from 4.85 us to 15.47 us), yet there are no retransmission timeouts (RTO) and no out-of-sequence packets. If I run ib_write_lat and duplicate the write packets instead, the latency stays the same.
I find this very strange, because I would expect the NIC to drop duplicate atomic packets as soon as their PSNs are recognized as duplicates, long before they reach the processing pipeline. Why do they still hurt performance?
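
To make my assumption concrete, here is a minimal sketch of the PSN check I expect the responder to perform on each incoming request. It assumes 24-bit PSNs, with the half of the sequence space behind the expected PSN treated as duplicates. `classify_psn` and the label strings are hypothetical names for illustration only, not a real NIC or verbs API, and the sketch says nothing about what the NIC does after a packet has been classified.

```python
PSN_BITS = 24                        # PSNs are 24-bit and wrap around
PSN_MASK = (1 << PSN_BITS) - 1
HALF_WINDOW = 1 << (PSN_BITS - 1)    # assumed split between "ahead" and "behind"


def classify_psn(expected_psn: int, received_psn: int) -> str:
    """Classify a received PSN relative to the responder's expected PSN.

    Hypothetical helper: it models only the sequence-number comparison,
    not any hardware behaviour that follows it.
    """
    # Modular distance from the expected PSN, so wraparound is handled.
    diff = (received_psn - expected_psn) & PSN_MASK
    if diff == 0:
        return "expected"
    if diff < HALF_WINDOW:
        # Ahead of the expected PSN: a packet was skipped.
        return "out_of_sequence"
    # Behind the expected PSN: this request was already seen.
    return "duplicate"


if __name__ == "__main__":
    expected = 100
    for psn in (100, 99, 101):
        print(psn, classify_psn(expected, psn))
```

In my setup the switch-generated copy arrives right after the original, so under this model it would always land in the "duplicate" branch.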