Performance issues extend/push/insert on v2 #404

@alejandro-vaz

I've benched both v1 and v2, and it seems there are some performance regressions in v2.

v1 performance:

test bench_extend                      ... bench:          33.43 ns/iter (+/- 1.79)
test bench_extend_from_slice           ... bench:          34.40 ns/iter (+/- 6.53)
test bench_extend_from_slice_small     ... bench:           5.70 ns/iter (+/- 0.09)

test bench_push                        ... bench:         262.09 ns/iter (+/- 2.49)
test bench_push_small                  ... bench:          16.84 ns/iter (+/- 0.16)

test bench_insert_push                 ... bench:         276.11 ns/iter (+/- 3.33)
test bench_insert_push_small           ... bench:          20.81 ns/iter (+/- 0.21)

but on v2:

test bench_extend                      ... bench:          81.64 ns/iter (+/- 2.10)
test bench_extend_from_slice           ... bench:          83.78 ns/iter (+/- 1.58)
test bench_extend_from_slice_small     ... bench:           9.72 ns/iter (+/- 0.25)

test bench_push                        ... bench:         275.09 ns/iter (+/- 3.12)
test bench_push_small                  ... bench:          42.53 ns/iter (+/- 1.75)

test bench_insert_push                 ... bench:         305.75 ns/iter (+/- 10.22)
test bench_insert_push_small           ... bench:          45.65 ns/iter (+/- 9.09)

The most critical performance regressions are on extend methods, push_small, and insert_push_small.

It may be that the implementation for the push method changed from v1:

    pub fn push(&mut self, value: A::Item) {
        unsafe {
            let (mut ptr, mut len, cap) = self.triple_mut();
            if *len == cap {
                self.reserve_one_unchecked();
                let (heap_ptr, heap_len) = self.data.heap_mut();
                ptr = heap_ptr;
                len = heap_len;
            }
            ptr::write(ptr.as_ptr().add(*len), value);
            *len += 1;
        }
    }

to v2:

    pub fn push(&mut self, value: T) {
        let len = self.len();
        if len == self.capacity() {
            self.reserve(1);
        }
        // SAFETY: both the input and output are within the allocation
        let ptr = unsafe { self.as_mut_ptr().add(len) };
        // SAFETY: we allocated enough space in case it wasn't enough, so the address is valid for
        // writes.
        unsafe { ptr.write(value) };
        unsafe { self.set_len(len + 1) }
    }

and it seems it's cascading to all other methods that depend on it. However, performance only seems to degrade meaningfully when we stay within the SmallVec's preallocated N and never have to spill to the heap. So it's probably slower only when push is called while the data is still stored inline (fewer than N elements).
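To make the inline versus spilled distinction concrete, here is a minimal toy model. `ToySmallVec` is a hypothetical type written for this comment, not smallvec's v1 or v2 implementation. It only shows that pushes below N take a different branch than pushes after the spill. I'm assuming the `*_small` benches stay within the inline capacity, which would mean they exercise only the first branch.

```rust
/// Hypothetical toy container, NOT smallvec's implementation: up to N
/// values live inline, and the first push beyond N moves them to a Vec.
enum ToySmallVec<const N: usize> {
    Inline { buf: [u32; N], len: usize },
    Heap(Vec<u32>),
}

impl<const N: usize> ToySmallVec<N> {
    fn new() -> Self {
        ToySmallVec::Inline { buf: [0; N], len: 0 }
    }

    fn push(&mut self, value: u32) {
        match self {
            // Inline path: there is still room in the inline buffer.
            ToySmallVec::Inline { buf, len } if *len < N => {
                buf[*len] = value;
                *len += 1;
            }
            // Inline buffer is full: copy everything to the heap ("spill").
            ToySmallVec::Inline { buf, len } => {
                let mut heap = Vec::with_capacity(*len * 2 + 1);
                heap.extend_from_slice(&buf[..*len]);
                heap.push(value);
                *self = ToySmallVec::Heap(heap);
            }
            // Already spilled: behaves like a plain Vec from here on.
            ToySmallVec::Heap(heap) => heap.push(value),
        }
    }

    /// True once the contents have moved to the heap.
    fn spilled(&self) -> bool {
        matches!(self, ToySmallVec::Heap(_))
    }

    fn len(&self) -> usize {
        match self {
            ToySmallVec::Inline { len, .. } => *len,
            ToySmallVec::Heap(heap) => heap.len(),
        }
    }
}
```

With N = 4, the first four pushes stay inline and the fifth one spills. A bench that never exceeds N only measures the first arm, so any extra work on that path shows up in full.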

The benches that slowed down the most take roughly 2.2x to 2.5x as long on v2 (about 120% to 150% slower), apart from bench_extend_from_slice_small at about 1.7x. That is fairly consistent across those benches, so it points towards a shared cause.
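For reference, the slowdown per bench can be recomputed from the numbers above with a small helper. This is only a sketch: the function and constant names are made up for this comment, and the ns/iter values are copied from the bench output in this issue.

```rust
/// Ratio of the v2 time to the v1 time (2.0 means v2 takes twice as long).
fn slowdown_factor(v1_ns: f64, v2_ns: f64) -> f64 {
    v2_ns / v1_ns
}

/// Percentage by which v2 is slower than v1 (100.0 means twice as long).
fn percent_slower(v1_ns: f64, v2_ns: f64) -> f64 {
    (slowdown_factor(v1_ns, v2_ns) - 1.0) * 100.0
}

/// (bench name, v1 ns/iter, v2 ns/iter), copied from the output above.
const BENCHES: [(&str, f64, f64); 7] = [
    ("bench_extend", 33.43, 81.64),
    ("bench_extend_from_slice", 34.40, 83.78),
    ("bench_extend_from_slice_small", 5.70, 9.72),
    ("bench_push", 262.09, 275.09),
    ("bench_push_small", 16.84, 42.53),
    ("bench_insert_push", 276.11, 305.75),
    ("bench_insert_push_small", 20.81, 45.65),
];

/// Builds one summary line per bench, e.g. for printing with println!.
fn report() -> Vec<String> {
    BENCHES
        .iter()
        .map(|&(name, v1, v2)| {
            format!(
                "{}: {:.2}x ({:.0}% slower)",
                name,
                slowdown_factor(v1, v2),
                percent_slower(v1, v2)
            )
        })
        .collect()
}
```

On these numbers bench_push and bench_insert_push stay within about 5% to 11% of v1, while the extend benches and the `*_small` push benches land at roughly 2.2x to 2.5x.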

It might also be that the push is not what's causing the bottleneck but something else downstream.

Originally posted by @alejandro-vaz in #395
