
feat(ruvector-cnn): CNN contrastive learning + SIMD optimization fixes#252

Merged
ruvnet merged 5 commits into main from feat/cnn-contrastive-research on Mar 11, 2026

Conversation


ruvnet (Owner) commented Mar 11, 2026

Summary

This PR adds CNN contrastive learning capabilities to ruvector and fixes critical SIMD/backbone issues discovered during code review.

New Features

CNN Contrastive Learning Crate

  • InfoNCE Loss (SimCLR-style) with temperature scaling and numerical stability
  • Triplet Loss with configurable margin and hard/soft modes
  • Augmentation Pipeline (requires augmentation feature): random crop, flip, color jitter, grayscale, blur
  • MobileNet-V3 Backbone (Small/Large variants)
  • SIMD Optimizations: AVX2, NEON, WASM SIMD128
  • WASM Bindings (ruvector-cnn-wasm)
  • npm Package (ruvector-cnn) with TypeScript definitions
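The two contrastive losses above can be sketched in scalar form. This is a minimal illustration, not the crate's actual API: function names and signatures here are hypothetical, and the similarities/distances are assumed to be precomputed (e.g. cosine similarities of L2-normalized embeddings).

```rust
/// Illustrative InfoNCE (SimCLR-style) loss for one anchor: `pos` is the
/// similarity to the positive pair, `negs` the similarities to negatives.
/// Temperature sharpens the softmax; subtracting the max before exp()
/// gives the numerical stability mentioned above (log-sum-exp trick).
fn info_nce(pos: f32, negs: &[f32], temperature: f32) -> f32 {
    let logits: Vec<f32> = std::iter::once(pos)
        .chain(negs.iter().copied())
        .map(|s| s / temperature)
        .collect();
    // log-sum-exp with max subtraction: exp() never sees a huge argument
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let lse = max + logits.iter().map(|l| (l - max).exp()).sum::<f32>().ln();
    // negative log-probability of the positive pair
    lse - logits[0]
}

/// Illustrative triplet loss with a configurable margin: `d_ap` is the
/// anchor-positive distance, `d_an` the anchor-negative distance.
/// The hinge (hard mode) clamps at zero once the margin is satisfied.
fn triplet_loss(d_ap: f32, d_an: f32, margin: f32) -> f32 {
    (d_ap - d_an + margin).max(0.0)
}
```

Both losses shrink as the positive pair moves closer than the negatives, which is the training signal the augmentation pipeline feeds.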

Bug Fixes (Closes #251)

SIMD Functions Were Stubs

Implemented real SIMD for:

  • conv_3x3_avx2, conv_3x3_avx2_fma - now process 8 output channels simultaneously
  • depthwise_conv_3x3_avx2 - now processes 8 channels at once
  • global_avg_pool_avx2 - vectorized accumulation
  • max_pool_2x2_avx2 - uses _mm256_max_ps
  • conv_3x3_wasm, depthwise_conv_3x3_wasm - 4 channels with SIMD128
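For reference, the computation these kernels vectorize is an ordinary 3x3 "valid" convolution. The sketch below is a scalar version with an assumed signature (not the crate's), the kind of reference the SIMD paths can be tested against; the AVX2 kernels compute the same sums for 8 output channels per iteration, the WASM SIMD128 ones for 4, with a scalar remainder loop like this.

```rust
/// Scalar reference for a 3x3 "valid" convolution on one HxW channel,
/// stored row-major. Output is (H-2) x (W-2). SIMD kernels produce
/// bit-for-bit comparable sums across several channels per iteration.
fn conv_3x3_scalar(input: &[f32], h: usize, w: usize, kernel: &[f32; 9]) -> Vec<f32> {
    let (oh, ow) = (h - 2, w - 2);
    let mut out = vec![0.0f32; oh * ow];
    for oy in 0..oh {
        for ox in 0..ow {
            let mut acc = 0.0f32;
            // 3x3 window anchored at (oy, ox)
            for ky in 0..3 {
                for kx in 0..3 {
                    acc += input[(oy + ky) * w + (ox + kx)] * kernel[ky * 3 + kx];
                }
            }
            out[oy * ow + ox] = acc;
        }
    }
    out
}
```

A unit test comparing a SIMD kernel's output against this scalar version is the simplest way to confirm the stubs are really gone.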

MobileNetV3 Forward Pass

  • Fixed block processing (was completely skipped)
  • Deprecated MobileNetV3Small/MobileNetV3Large in favor of unified MobileNetV3
  • Fixed hardcoded channel counts
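Structurally, the forward-pass fix means the activation is now actually threaded through each block. A hedged sketch of the shape of that fix, with all names and the block body hypothetical (a real inverted-residual block would expand, run a depthwise conv, squeeze-excite, and project):

```rust
/// Stand-in for one backbone block; here it just applies ReLU so the
/// sketch stays self-contained.
struct Block;

impl Block {
    fn forward(&self, x: Vec<f32>) -> Vec<f32> {
        x.into_iter().map(|v| v.max(0.0)).collect()
    }
}

struct MobileNetV3 {
    blocks: Vec<Block>,
}

impl MobileNetV3 {
    fn forward(&self, mut x: Vec<f32>) -> Vec<f32> {
        // The bug: this loop was skipped, so the stem output went
        // straight to the head and the blocks never ran.
        for block in &self.blocks {
            x = block.forward(x);
        }
        x
    }
}
```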

Dead Code

  • Silenced unused field warnings with proper annotations
  • Fixed undefined feature reference
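The annotation pattern used for these fixes looks roughly like the following; the field and feature names are illustrative, matching the commit notes below rather than exact source:

```rust
struct Optimizer {
    // Field is read only on the training path, so the lint is silenced
    // with a targeted allow rather than removing the field.
    #[allow(dead_code)]
    momentum: f32,
}

// Feature-gated import: only compiled when the `augmentation` feature
// is enabled, so builds without it see neither the import nor a warning.
#[cfg(feature = "augmentation")]
use rand::Rng;
```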

Test Plan

  • All 36 unit tests pass
  • Doc-tests pass
  • WASM crate compiles
  • npm package ready

Files Changed

crates/ruvector-cnn/           # Core crate (36 tests)
crates/ruvector-cnn-wasm/      # WASM bindings
npm/packages/ruvector-cnn/     # npm package
docs/research/cnn/             # Research documentation
docs/adr/ADR-088-...           # Architecture decision

🤖 Generated with claude-flow

Reuven and others added 4 commits March 11, 2026 13:35
- Add ruvector-cnn crate with SIMD-optimized convolutions and contrastive losses
- Implement InfoNCE (SimCLR) and TripletLoss for contrastive learning
- Add MobileNet-V3 inspired backbone architecture
- Include AVX2, NEON, WASM SIMD support with scalar fallback
- Add WASM bindings (ruvector-cnn-wasm) for browser/Node.js
- Add npm package with TypeScript definitions
- Include comprehensive research docs and ADR-088
- 36 tests passing

Co-Authored-By: claude-flow <ruv@ruv.net>
## SIMD Implementations (was using scalar fallbacks)
- AVX2: conv_3x3_avx2, conv_3x3_avx2_fma, depthwise_conv_3x3_avx2
- AVX2: global_avg_pool_avx2, max_pool_2x2_avx2
- WASM: conv_3x3_wasm, depthwise_conv_3x3_wasm

All now use real SIMD intrinsics processing 8 (AVX2) or 4 (WASM)
channels simultaneously with scalar fallback for remainders.

## Backbone Fixes
- Deprecated MobileNetV3Small/Large (use unified MobileNetV3 instead)
- Implemented actual block processing in forward() methods
- Fixed hardcoded channel counts in global_avg_pool calls

## Dead Code Fixes
- Added #[allow(dead_code)] for momentum field (used in training)
- Added #[allow(dead_code)] for rng field (feature-gated)
- Added #[cfg(feature = "augmentation")] for rand::Rng import
- Commented out undefined "parallel" feature reference

Co-Authored-By: claude-flow <ruv@ruv.net>
…ation

- Add Winograd F(2,3) transforms for 2.25x faster 3x3 convolutions
- Implement π-calibrated INT8 quantization with anti-resonance offsets
- Apply 4x loop unrolling with 4 accumulators to AVX2 convolutions
- Update README with practical intro, capabilities table, benchmarks
- Update npm README with simpler language and examples
- Add CNN image embeddings to root README capabilities

Co-Authored-By: claude-flow <ruv@ruv.net>

ruvnet commented Mar 11, 2026

Additional Changes (Latest Commit)

Added more optimizations and documentation improvements:

New Features

  • Winograd F(2,3) transforms — 2.25x faster 3x3 convolutions (36→16 multiplications per 2x2 output)
  • π-calibrated INT8 quantization — Uses π-derived anti-resonance offsets to prevent bucket collapse
  • 4x loop unrolling — AVX2 convolutions with 4 independent accumulators for better ILP
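The Winograd saving is easiest to see in 1D: F(2,3) produces two outputs of a 3-tap filter with 4 multiplications instead of 6, and nesting it in 2D gives F(2x2, 3x3) with 16 multiplications instead of 36, the 2.25x quoted above. A hedged 1D sketch, independent of the crate's winograd.rs:

```rust
/// Winograd F(2,3): two outputs of a 3-tap correlation over 4 inputs,
/// using 4 multiplications where direct evaluation needs 6.
fn winograd_f23(d: [f32; 4], g: [f32; 3]) -> [f32; 2] {
    // Filter transform (in practice precomputed and cached per filter,
    // as the "filter caching" in winograd.rs suggests).
    let g0 = g[0];
    let g1 = 0.5 * (g[0] + g[1] + g[2]);
    let g2 = 0.5 * (g[0] - g[1] + g[2]);
    let g3 = g[2];
    // Input transform plus the 4 elementwise multiplications.
    let m1 = (d[0] - d[2]) * g0;
    let m2 = (d[1] + d[2]) * g1;
    let m3 = (d[2] - d[1]) * g2;
    let m4 = (d[1] - d[3]) * g3;
    // Output transform.
    [m1 + m2 + m3, m2 - m3 - m4]
}
```

Expanding the output transform recovers exactly `y0 = d0*g0 + d1*g1 + d2*g2` and `y1 = d1*g0 + d2*g1 + d3*g2`, so the result matches direct convolution up to floating-point rounding.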

Documentation Updates

File Changes

  • crates/ruvector-cnn/README.md: practical intro, Quick Start examples, capabilities/performance tables, use cases
  • npm/packages/ruvector-cnn/README.md: simpler language, step-by-step examples, troubleshooting
  • README.md: added CNN image embeddings to root capabilities list

New Files

  • simd/winograd.rs — Winograd transform matrices, filter caching, conv function
  • simd/quantize.rs — π-calibrated INT8 quantization with AVX2 SIMD

All 36 unit tests + 4 doc tests pass.

- Add unsafe blocks for WASM SIMD intrinsics (v128_load/v128_store)
- Disable wasm-opt to avoid SIMD validation issues
- Build and include WASM bindings in npm package
- Update npm package.json with all WASM files
- Published to npm as @ruvector/cnn@0.1.0

Co-Authored-By: claude-flow <ruv@ruv.net>
@ruvnet ruvnet merged commit d172324 into main Mar 11, 2026
7 checks passed
@ruvnet ruvnet deleted the feat/cnn-contrastive-research branch March 11, 2026 21:41
Development

Successfully merging this pull request may close these issues:

ruvector-cnn: SIMD optimizations were stubbed, backbone forward() was broken