@thevilledev
Contributor

Motivation

The min, max, mean, and median builtin functions use reflection for every element of the input array, calling reflect.Value.Index(i).Interface() in a loop. Each .Interface() call triggers a heap allocation, so a single call on a 100-element input performs over 100 allocations.
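To make the allocation cost concrete, here is a minimal standalone sketch that compares a reflection-based loop with a direct loop over []int using testing.AllocsPerRun. The names minReflect, minDirect, makeInput, and sink are hypothetical and exist only for this illustration; this is not the library's code, just the same access pattern in isolation.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

// sink keeps results alive so the compiler cannot discard the loops.
var sink int

// minReflect mimics the per-element reflection pattern described above.
// Hypothetical stand-in; it assumes a non-empty []int wrapped in any.
func minReflect(input any) int {
	rv := reflect.ValueOf(input)
	m := rv.Index(0).Interface().(int)
	for i := 1; i < rv.Len(); i++ {
		// Interface() boxes a copy of the element on every iteration.
		if v := rv.Index(i).Interface().(int); v < m {
			m = v
		}
	}
	return m
}

// minDirect iterates the typed slice without reflection.
// Hypothetical stand-in; it assumes a non-empty slice.
func minDirect(s []int) int {
	m := s[0]
	for _, v := range s[1:] {
		if v < m {
			m = v
		}
	}
	return m
}

// makeInput builds n ints whose smallest value is 1000 (at index 0).
func makeInput(n int) []int {
	s := make([]int, n)
	for i := range s {
		s[i] = 1000 + (i*37)%n
	}
	return s
}

func main() {
	data := makeInput(100)
	var boxed any = data

	reflAllocs := testing.AllocsPerRun(100, func() { sink = minReflect(boxed) })
	directAllocs := testing.AllocsPerRun(100, func() { sink = minDirect(data) })

	fmt.Printf("reflection: %.0f allocs/op\n", reflAllocs)
	fmt.Printf("direct:     %.0f allocs/op\n", directAllocs)
}
```

Exact counts depend on the Go version and on how the input is built, so treat the printed numbers as indicative rather than as a reproduction of the benchstat results below.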

Changes

Avoid reflection and per-element allocations for common typed slices ([]int, []float64, []any) by adding type-switch fast paths that iterate directly without calling reflect.Value.Interface().
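For readers skimming the description, the shape of the change can be sketched roughly as follows. This is a simplified illustration, not the actual diff: minFloat, toFloat, and errEmpty are hypothetical names, and the sketch always returns float64, which is an assumption made to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// errEmpty is a hypothetical error value used only in this sketch.
var errEmpty = errors.New("min: empty input")

// toFloat converts a single element to float64. Common dynamic types are
// handled by a type switch; anything else goes through reflect.
func toFloat(x any) (float64, error) {
	switch v := x.(type) {
	case int:
		return float64(v), nil
	case float64:
		return v, nil
	}
	rv := reflect.ValueOf(x)
	switch rv.Kind() {
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return float64(rv.Int()), nil
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return float64(rv.Uint()), nil
	case reflect.Float32, reflect.Float64:
		return rv.Float(), nil
	}
	return 0, fmt.Errorf("min: unsupported element type %T", x)
}

// minFloat is a hypothetical helper showing the fast-path idea:
// typed slices are iterated directly, everything else uses reflection.
func minFloat(input any) (float64, error) {
	switch s := input.(type) {
	case []int: // fast path: no reflection, no per-element boxing
		if len(s) == 0 {
			return 0, errEmpty
		}
		m := s[0]
		for _, v := range s[1:] {
			if v < m {
				m = v
			}
		}
		return float64(m), nil
	case []float64: // fast path
		if len(s) == 0 {
			return 0, errEmpty
		}
		m := s[0]
		for _, v := range s[1:] {
			if v < m {
				m = v
			}
		}
		return m, nil
	case []any: // fast path: elements are already interface values
		if len(s) == 0 {
			return 0, errEmpty
		}
		m, err := toFloat(s[0])
		if err != nil {
			return 0, err
		}
		for _, x := range s[1:] {
			f, err := toFloat(x)
			if err != nil {
				return 0, err
			}
			if f < m {
				m = f
			}
		}
		return m, nil
	}

	// Fallback: the original per-element reflection pattern.
	rv := reflect.ValueOf(input)
	if rv.Kind() != reflect.Slice && rv.Kind() != reflect.Array {
		return 0, fmt.Errorf("min: expected a slice or array, got %T", input)
	}
	if rv.Len() == 0 {
		return 0, errEmpty
	}
	m, err := toFloat(rv.Index(0).Interface())
	if err != nil {
		return 0, err
	}
	for i := 1; i < rv.Len(); i++ {
		f, err := toFloat(rv.Index(i).Interface())
		if err != nil {
			return 0, err
		}
		if f < m {
			m = f
		}
	}
	return m, nil
}

func main() {
	fmt.Println(minFloat([]int{3, 1, 2}))    // fast path
	fmt.Println(minFloat([]int32{9, 4, 6})) // reflection fallback
}
```

The type switch costs one comparison per call rather than one reflection round trip per element, and unmatched slice types keep their existing behaviour through the fallback.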

Added benchmarks:

  • Benchmark_min
  • Benchmark_max
  • Benchmark_mean
  • Benchmark_median

Benchstat results against master branch:

cpu: Apple M1
          │  master.out  │               fix.out               │
          │    sec/op    │   sec/op     vs base                │
_min-8      2850.5n ± 0%   427.7n ± 1%  -85.00% (p=0.000 n=10)
_max-8      2877.5n ± 1%   443.2n ± 0%  -84.60% (p=0.000 n=10)
_mean-8     2140.0n ± 1%   227.2n ± 0%  -89.38% (p=0.000 n=10)
_median-8    4.801µ ± 1%   1.360µ ± 1%  -71.67% (p=0.000 n=10)
geomean      3.030µ        491.9n       -83.76%

          │  master.out  │               fix.out                │
          │     B/op     │     B/op      vs base                │
_min-8      1648.00 ± 0%     48.00 ± 0%  -97.09% (p=0.000 n=10)
_max-8      1648.00 ± 0%     48.00 ± 0%  -97.09% (p=0.000 n=10)
_mean-8     1656.00 ± 0%     56.00 ± 0%  -96.62% (p=0.000 n=10)
_median-8   4.391Ki ± 0%   2.047Ki ± 0%  -53.38% (p=0.000 n=10)
geomean     2.071Ki          128.2       -93.95%

          │  master.out  │              fix.out               │
          │  allocs/op   │ allocs/op   vs base                │
_min-8      102.000 ± 0%   2.000 ± 0%  -98.04% (p=0.000 n=10)
_max-8      102.000 ± 0%   2.000 ± 0%  -98.04% (p=0.000 n=10)
_mean-8     103.000 ± 0%   3.000 ± 0%  -97.09% (p=0.000 n=10)
_median-8    211.00 ± 0%   11.00 ± 0%  -94.79% (p=0.000 n=10)
geomean       122.6        3.390       -97.24%

Further comments

Other typed slices (e.g., []int32, []uint64, custom numeric types) fall back to the original reflection-based path. Open to feedback on whether more typed slices should get fast paths here.

Similar optimizations could potentially be applied to other builtins like sum.

Avoid reflection and per-element allocations for common typed slices
([]int, []float64, []any) by adding type-switch fast paths that iterate
directly without calling reflect.Value.Interface().

Falls back to reflection for other slice types to maintain compatibility.

Signed-off-by: Ville Vesilehto <[email protected]>
@antonmedv
Member

Nice! I like such optimizations!

@thevilledev
Contributor Author

Thanks! So much fun :)

Needs more tests and fine-tuning, but I'll mark it ready once there 🤞

@thevilledev thevilledev changed the title perf(builtin): min, max, mean, median fast paths WIP perf(builtin): min, max, mean, median fast paths Jan 21, 2026