
Add security improvements #23

Open
rcmstark wants to merge 8 commits into master from refactor/security

Conversation


@rcmstark rcmstark commented Apr 10, 2026

Summary

  • Port all security fixes from Python reference implementation
  • Security: hedged RFC 6979 nonces (§3.6 extra entropy: the same key and message yield a different signature on each call while keeping RFC 6979's protection against RNG failure), low-S normalization, public-key on-curve validation, hash truncation, a branch-balanced Montgomery ladder, curve-specific doubling shortcuts (A=0 for secp256k1, A=-3 for prime256v1), Tonelli-Shanks modular square roots, extended-Euclidean modular inverses, and an infinity guard in fromJacobian
  • Performance: a mixed affine+Jacobian addition fast path, bit-by-bit NAF generator multiplication backed by a precomputed affine [G, 2G, 4G, ..., 2^n*G] table (zero doublings during signing), Shamir's trick with Joint Sparse Form, and GLV endomorphism splitting for secp256k1 (each 256-bit scalar becomes two ~128-bit halves, so verification runs a 4-scalar simultaneous multi-exponentiation)
  • 74 tests across 10 separate test files matching Python structure
  • Benchmark class added
  • README updated with a security section, benchmark numbers, and performance notes
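
Not part of the diff, just an illustrative sketch for reviewers of the low-S rule from the list above (class and method names here are hypothetical, not the PR's actual code; N is secp256k1's published group order):

```java
import java.math.BigInteger;

public class LowS {
    // secp256k1 group order N (a published curve constant).
    static final BigInteger N = new BigInteger(
        "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141", 16);

    // If s falls in the upper half of [1, N-1], replace it with N - s.
    // Both (r, s) and (r, N - s) verify against the same message, so
    // accepting only the low value removes the trivial malleability vector.
    static BigInteger normalizeS(BigInteger s) {
        return s.compareTo(N.shiftRight(1)) > 0 ? N.subtract(s) : s;
    }

    public static void main(String[] args) {
        BigInteger high = N.subtract(BigInteger.valueOf(7)); // deliberately high-S
        System.out.println(normalizeS(high)); // 7: mirrored to the low value
        System.out.println(normalizeS(BigInteger.valueOf(7))); // 7: already low
    }
}
```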

Test plan

  • All 74 tests passing (./gradlew test)
  • Benchmark: sign 0.8ms, verify 1.3ms
  • Security audit: all 9 checks pass

- Replace Fermat's-little-theorem inversion (modPow with exponent p-2)
  with BigInteger.modInverse (extended Euclidean), 2-3x faster for
  256-bit operands
- Fixed-base windowed scalar multiplication (2^4-ary method) with
  precomputed generator table, cuts sign time substantially
- Skip A*pz^4 term in jacobianDouble for secp256k1 (A=0); use
  3*(px-pz^2)*(px+pz^2) shortcut for prime256v1 (A=-3)
- Cache curve.nBitLength to avoid recomputing per call

Benchmark (100 rounds):
  sign:   2.9ms -> 1.5ms
  verify: 1.8ms -> 1.7ms
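
The inversion swap can be sanity-checked standalone (illustrative class, not the PR's code; P is secp256k1's published field prime):

```java
import java.math.BigInteger;

public class InverseDemo {
    // secp256k1 field prime p (a published curve constant).
    static final BigInteger P = new BigInteger(
        "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F", 16);

    public static void main(String[] args) {
        BigInteger a = new BigInteger("123456789ABCDEF", 16);
        // Old path: Fermat's little theorem, a^(p-2) mod p -- one full
        // 256-bit modular exponentiation per inversion.
        BigInteger viaFermat = a.modPow(P.subtract(BigInteger.valueOf(2)), P);
        // New path: the JDK's built-in extended Euclidean algorithm.
        BigInteger viaEuclid = a.modInverse(P);
        System.out.println(viaFermat.equals(viaEuclid));                         // true
        System.out.println(a.multiply(viaEuclid).mod(P).equals(BigInteger.ONE)); // true
    }
}
```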
When qz=1 (affine), qz^2=qz^3=1 so U1=px, S1=py, nz=H*pz.
Saves four field multiplications per add.
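
A standalone sketch of that qz=1 fast path, paired with the A=0 doubling it feeds off (illustrative class using textbook Jacobian formulas and secp256k1's published constants, not the PR's actual code; correctness is checked against the curve equation y^2 = x^3 + 7):

```java
import java.math.BigInteger;

public class MixedAdd {
    static final BigInteger P = new BigInteger(
        "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F", 16);
    static final BigInteger GX = new BigInteger(
        "79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798", 16);
    static final BigInteger GY = new BigInteger(
        "483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8", 16);

    // Jacobian doubling with the secp256k1 shortcut: A = 0, so A*pz^4 vanishes.
    static BigInteger[] jacobianDouble(BigInteger[] p) {
        BigInteger ysq = p[1].multiply(p[1]).mod(P);
        BigInteger s = p[0].multiply(ysq).shiftLeft(2).mod(P);                    // 4*px*py^2
        BigInteger m = p[0].multiply(p[0]).multiply(BigInteger.valueOf(3)).mod(P); // 3*px^2
        BigInteger nx = m.multiply(m).subtract(s.shiftLeft(1)).mod(P);
        BigInteger ny = m.multiply(s.subtract(nx))
                .subtract(ysq.multiply(ysq).shiftLeft(3)).mod(P);                 // M*(S-nx) - 8*py^4
        BigInteger nz = p[1].multiply(p[2]).shiftLeft(1).mod(P);
        return new BigInteger[] { nx, ny, nz };
    }

    // Mixed add: p is Jacobian, (qx, qy) is affine, i.e. qz = 1,
    // so U1 = px and S1 = py come for free.
    static BigInteger[] mixedAdd(BigInteger[] p, BigInteger qx, BigInteger qy) {
        BigInteger z2 = p[2].multiply(p[2]).mod(P);
        BigInteger u2 = qx.multiply(z2).mod(P);
        BigInteger s2 = qy.multiply(z2).multiply(p[2]).mod(P);
        BigInteger h = u2.subtract(p[0]).mod(P);
        BigInteger r = s2.subtract(p[1]).mod(P);
        BigInteger h2 = h.multiply(h).mod(P);
        BigInteger h3 = h2.multiply(h).mod(P);
        BigInteger v = p[0].multiply(h2).mod(P);                                  // U1*H^2
        BigInteger nx = r.multiply(r).subtract(h3).subtract(v.shiftLeft(1)).mod(P);
        BigInteger ny = r.multiply(v.subtract(nx)).subtract(p[1].multiply(h3)).mod(P);
        BigInteger nz = h.multiply(p[2]).mod(P);                                  // nz = H*pz
        return new BigInteger[] { nx, ny, nz };
    }

    // Convert back to affine, then confirm y^2 = x^3 + 7 (mod p).
    static boolean onCurveAffine(BigInteger[] jac) {
        BigInteger zi = jac[2].modInverse(P);
        BigInteger x = jac[0].multiply(zi.pow(2)).mod(P);
        BigInteger y = jac[1].multiply(zi.pow(3)).mod(P);
        return y.multiply(y).mod(P).equals(x.pow(3).add(BigInteger.valueOf(7)).mod(P));
    }

    public static void main(String[] args) {
        BigInteger[] g = { GX, GY, BigInteger.ONE };
        BigInteger[] g2 = jacobianDouble(g);    // 2G in Jacobian form, pz != 1
        BigInteger[] g3 = mixedAdd(g2, GX, GY); // 2G + G via the fast path
        System.out.println(onCurveAffine(g2));  // true
        System.out.println(onCurveAffine(g3));  // true
    }
}
```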
Replace window table with affine [G, 2G, 4G, ..., 2^nBitLength*G]
and width-2 NAF expansion: each non-zero digit triggers one
mixed add and zero doublings. ~86 adds on average vs ~256
doublings for a 256-bit scalar, and every add hits the mixed
affine+Jacobian fast path because the table is in affine form.
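
The recoding itself is small enough to sketch (illustrative class, not the PR's code); each non-zero digit maps to one mixed add against the precomputed 2^i*G entry:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class Naf {
    // Non-adjacent form: signed digits in {-1, 0, 1}, least significant
    // first, with no two consecutive non-zero digits (~1/3 density).
    static List<Integer> naf(BigInteger k) {
        List<Integer> digits = new ArrayList<>();
        while (k.signum() > 0) {
            int d = 0;
            if (k.testBit(0)) {
                d = 2 - k.and(BigInteger.valueOf(3)).intValue(); // +1 or -1
                k = k.subtract(BigInteger.valueOf(d));           // leaves k = 0 mod 4
            }
            digits.add(d);
            k = k.shiftRight(1);
        }
        return digits;
    }

    // sum(digits[i] * 2^i), to confirm the recoding is lossless.
    static BigInteger reconstruct(List<Integer> digits) {
        BigInteger r = BigInteger.ZERO;
        for (int i = digits.size() - 1; i >= 0; i--)
            r = r.shiftLeft(1).add(BigInteger.valueOf(digits.get(i)));
        return r;
    }

    public static void main(String[] args) {
        BigInteger k = new BigInteger("123456789ABCDEF123456789ABCDEF", 16);
        List<Integer> d = naf(k);
        long nonZero = d.stream().filter(x -> x != 0).count();
        System.out.println(reconstruct(d).equals(k)); // true: lossless recoding
        // Each non-zero digit costs one mixed add against the affine table;
        // zero digits and doublings cost nothing in this scheme.
        System.out.println(nonZero + " adds for a " + k.bitLength() + "-bit scalar");
    }
}
```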
Replace the raw-binary Shamir scan with JSF (Solinas 2001): signed
digits in {-1, 0, 1}, arranged so that any three consecutive digit
columns include at least one all-zero column. Joint density drops
from ~3/4 to ~1/2, cutting the add count in verification's
n1*p1 + n2*p2 by a third.
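
A standalone sketch of the JSF recoding (following the algorithm as given in Hankerson-Menezes-Vanstone; illustrative class, not the PR's code):

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class Jsf {
    // Joint Sparse Form (Solinas 2001): recode two non-negative scalars
    // into signed-digit rows over {-1, 0, 1} whose joint non-zero column
    // density is ~1/2 on average. Columns are least significant first.
    static int[][] jsf(BigInteger k1, BigInteger k2) {
        List<int[]> cols = new ArrayList<>();
        int d1 = 0, d2 = 0; // carries from the signed-residue corrections
        while (k1.signum() > 0 || k2.signum() > 0 || d1 != 0 || d2 != 0) {
            BigInteger t1 = k1.add(BigInteger.valueOf(d1));
            BigInteger t2 = k2.add(BigInteger.valueOf(d2));
            int u1 = digit(t1, t2), u2 = digit(t2, t1);
            cols.add(new int[] { u1, u2 });
            if (2 * d1 == 1 + u1) d1 = 1 - d1;
            if (2 * d2 == 1 + u2) d2 = 1 - d2;
            k1 = k1.shiftRight(1);
            k2 = k2.shiftRight(1);
        }
        return cols.toArray(new int[0][]);
    }

    // Signed residue of t mod 4, flipped when the companion scalar would
    // otherwise force adjacent non-zero columns.
    static int digit(BigInteger t, BigInteger other) {
        if (!t.testBit(0)) return 0;
        int u = 2 - t.and(BigInteger.valueOf(3)).intValue(); // +1 or -1
        int m8 = t.and(BigInteger.valueOf(7)).intValue();
        if ((m8 == 3 || m8 == 5) && other.and(BigInteger.valueOf(3)).intValue() == 2)
            u = -u;
        return u;
    }

    // sum(cols[i][row] * 2^i), to confirm each row is lossless.
    static BigInteger reconstructRow(int[][] cols, int row) {
        BigInteger r = BigInteger.ZERO;
        for (int i = cols.length - 1; i >= 0; i--)
            r = r.shiftLeft(1).add(BigInteger.valueOf(cols[i][row]));
        return r;
    }

    public static void main(String[] args) {
        BigInteger k1 = new BigInteger("1B7E151628AED2A6ABF7158809CF4F3C", 16);
        BigInteger k2 = new BigInteger("2B992DDFA23249D6", 16);
        int[][] cols = jsf(k1, k2);
        System.out.println(reconstructRow(cols, 0).equals(k1)
                && reconstructRow(cols, 1).equals(k2)); // true: both rows lossless
    }
}
```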
Split each 256-bit scalar k into two ~128-bit scalars (k1, k2)
with k = k1 + k2*lambda (mod N) via Babai rounding against the
Gauss-reduced basis. Verification's n1*p1 + n2*p2 becomes a
4-scalar simultaneous multi-exponentiation over (p1, phi(p1),
p2, phi(p2)) with a 16-entry subset-sum table -- half the loop
length of the Shamir+JSF path.
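
The decomposition can be sketched end to end without hardcoded split constants (illustrative class, not the PR's code: lambda is derived at runtime as a nontrivial cube root of unity mod N, and the short basis is Gauss-reduced from the obvious generators rather than taken from a table):

```java
import java.math.BigInteger;

public class Glv {
    // secp256k1 group order N (a published curve constant).
    static final BigInteger N = new BigInteger(
        "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141", 16);
    // A nontrivial cube root of unity mod N (3 divides N-1 for secp256k1).
    static final BigInteger LAMBDA = findLambda();
    // Gauss/Lagrange-reduced basis of the lattice {(a, b) : a + b*lambda = 0 mod N}.
    static final BigInteger[][] BASIS = reduceBasis();

    static BigInteger findLambda() {
        BigInteger e = N.subtract(BigInteger.ONE).divide(BigInteger.valueOf(3));
        BigInteger l = BigInteger.ONE;
        for (int g = 2; l.equals(BigInteger.ONE); g++)
            l = BigInteger.valueOf(g).modPow(e, N);
        return l;
    }

    static BigInteger dot(BigInteger[] a, BigInteger[] b) {
        return a[0].multiply(b[0]).add(a[1].multiply(b[1]));
    }

    // Nearest-integer division, correct for either sign of b.
    static BigInteger roundDiv(BigInteger a, BigInteger b) {
        if (b.signum() < 0) { a = a.negate(); b = b.negate(); }
        BigInteger[] qr = a.shiftLeft(1).add(b).divideAndRemainder(b.shiftLeft(1));
        return qr[1].signum() < 0 ? qr[0].subtract(BigInteger.ONE) : qr[0];
    }

    static BigInteger[][] reduceBasis() {
        BigInteger[] v1 = { N, BigInteger.ZERO };
        BigInteger[] v2 = { LAMBDA, BigInteger.ONE.negate() };
        while (true) {
            if (dot(v2, v2).compareTo(dot(v1, v1)) < 0) {
                BigInteger[] t = v1; v1 = v2; v2 = t;
            }
            BigInteger m = roundDiv(dot(v1, v2), dot(v1, v1));
            if (m.signum() == 0) return new BigInteger[][] { v1, v2 };
            v2 = new BigInteger[] { v2[0].subtract(m.multiply(v1[0])),
                                    v2[1].subtract(m.multiply(v1[1])) };
        }
    }

    // Babai rounding: express (k, 0) in the reduced basis, round the
    // coordinates, and keep the short error vector (k1, k2).
    static BigInteger[] decompose(BigInteger k) {
        BigInteger[] v1 = BASIS[0], v2 = BASIS[1];
        BigInteger det = v1[0].multiply(v2[1]).subtract(v1[1].multiply(v2[0]));
        BigInteger c1 = roundDiv(k.multiply(v2[1]), det);
        BigInteger c2 = roundDiv(k.multiply(v1[1]).negate(), det);
        BigInteger x1 = k.subtract(c1.multiply(v1[0])).subtract(c2.multiply(v2[0]));
        BigInteger x2 = c1.multiply(v1[1]).add(c2.multiply(v2[1])).negate();
        return new BigInteger[] { x1, x2 }; // x1 + x2*lambda == k (mod N)
    }

    public static void main(String[] args) {
        BigInteger k = new BigInteger(
            "8F8A276C19F4149656B280621E358CCE24F5F52542772691EE69063B74F15D15", 16).mod(N);
        BigInteger[] kk = decompose(k);
        System.out.println(kk[0].add(kk[1].multiply(LAMBDA)).mod(N).equals(k)); // true
        System.out.println(Math.max(kk[0].abs().bitLength(),
                kk[1].abs().bitLength()) <= 130); // true: both halves ~128 bits
    }
}
```

Which of the two cube roots the search finds does not matter here, since the lattice basis is derived from the same root the congruence is checked against.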

glvParams is nullable on Curve; only secp256k1 sets it. Other
curves fall back to Shamir+JSF via multiplyAndAdd(..., curve).
@rcmstark rcmstark force-pushed the refactor/security branch from d3d456f to 2b532bd on April 19, 2026 at 15:21