ML-DSA Concrete Rounding
This is AI generated note to help my contribution to VCV-io project.
LatticeCrypto/MLDSA/Concrete/Rounding.lean provides executable, formally verified
implementations of FIPS 204 [National Institute of Standards and Technology, 2024] Algorithms 35–40 — the six coefficient-wise rounding
primitives that ML-DSA uses to compress keys, create signer hints, and verify signatures.
Why rounding exists in ML-DSA
ML-DSA is a Fiat–Shamir-with-aborts signature scheme. During signing, the signer commits to and sends the high-order part to derive the challenge hash. During verification, the verifier reconstructs an approximation
and uses the hint from the signature to recover from that approximation, then checks that the commitment hash matches. The rounding primitives make this work:
| Role | Primitive | Where used |
|---|---|---|
| Key compression | Power2Round |
Drop the least-significant bits of to produce the public-key component |
| Signing commitment | HighBits / LowBits |
Compute from ; keep for the validity check |
| Hint creation | MakeHint |
Record whether adding to changes the high bits |
| Hint use | UseHint |
Recover from during verification |
FIPS 204 algorithms covered
| Algorithm | FIPS 204 ref | Lean name (coefficient) | Lean name (polynomial) |
|---|---|---|---|
| Power2Round | Algorithm 35 | power2RoundCoeff |
power2RoundHigh / power2RoundLow |
| Decompose | Algorithm 36 | decomposeCoeff |
— (used internally) |
| HighBits | Algorithm 37 | highBitsCoeff |
highBits |
| LowBits | Algorithm 38 | lowBitsCoeff |
lowBits |
| MakeHint | Algorithm 39 | makeHintCoeff |
makeHint |
| UseHint | Algorithm 40 | useHintCoeff |
useHint |
Power2Round (Alg 35)
Splits into a high part and a balanced low part satisfying
where = droppedBits = 13 in ML-DSA. The Lean implementation uses standard
integer division with a sign-balancing step (the ≤ power2Scale / 2 branch).
Decompose (Alg 36)
Like Power2Round, but splits with respect to instead of :
There is one special case: when the straightforward quotient would give
, the standard decomposition would place very close to
and make the high bits unstable under small perturbations.
Decompose redirects that case to , ensuring the high
bits change predictably when hints are applied. This is the if alpha * r1 = modulus - 1
branch in decomposeCoeff.
HighBits / LowBits (Algs 37–38)
Thin projections of Decompose: and .
MakeHint (Alg 39)
A single Boolean indicating whether adding to crosses a rounding boundary.
UseHint (Alg 40)
Given hint and , returns the high bits of without knowing directly, by adjusting up or down by 1 modulo according to the sign of .
The wrap-around: why matters
For all approved ML-DSA parameter sets, the prime modulus and stride satisfy exactly, where :
| Parameter set | ||||
|---|---|---|---|---|
| ML-DSA-44 | 95 232 | 190 464 | 44 | 8 380 416 = |
| ML-DSA-65/87 | 261 888 | 523 776 | 16 | 8 380 416 = |
This means , so tiles into exactly buckets of width , with left over as a single isolated point.
Bucket layout
graph LR
subgraph "Zq — integers mod q = 8 380 417"
direction LR
B0["Bucket 0\n[0, alpha)"]
B1["Bucket 1\n[alpha, 2*alpha)"]
Bd["…"]
Bm1["Bucket m-1\n[alpha*(m-1), q-2]"]
seam["⚠ q-1 ≡ -1\nwould be bucket m"]
end
B0 --- B1 --- Bd --- Bm1 --- seam
seam -->|"Decompose redirects\nr1 := 0, r0 := r0 - 1"| B0
The value in is adjacent to with distance 1,
not . If bucket were allowed, small perturbations near could
cause to jump between and — a change of — which
UseHint’s correction cannot recover. By redirecting to bucket 0,
Decompose ensures the seam sits at a single, well-known point and high bits change
by at most 1 across every rounding boundary.
Decompose decision flowchart
flowchart TD
r["r in Zq"] --> rp["r+ = r mod q"]
rp --> r0["r0 = r+ mod± alpha\n(balanced into (-alpha/2, alpha/2])"]
r0 --> r1["r1 = (r+ - r0) / alpha"]
r1 --> check{"alpha * r1 = q - 1?\ni.e. r1 = m"}
check -- "No\n(normal case)" --> out_normal["output (r1, r0)"]
check -- "Yes\n(seam: r near -1)" --> out_wrap["output (0, r0 - 1)\nmerge into bucket 0"]
Consequences in the Lean proofs
In Rq (coefficients live in ), the identity
becomes , captured by the local lemma hwrap
inside decomposeCoeff_eq. The related alpha_mul_pred_m_eq lemma then gives
which is needed in every overflow branch of the UseHint correctness proof.
The BalancedDecomp structure codifies both the divisibility
(hqm1 : alpha * m = modulus - 1) and the ring-unit consequence
(hunit : IsUnit (alpha : Coeff)), so no proof that works from
ctx : BalancedDecomp ever needs to re-derive these facts.
Representation choices
High = Power2High = Rq. The FIPS spec treats high-order representatives
as plain integers. This file keeps them in Rq — the negacyclic polynomial ring
— so that the verifier equation
stays in a single ring
without any cast overhead. The Coeff cast to/from ℕ and ℤ appears frequently
in lemma statements.
Hint = Vector Bool ringDegree. One Bool per coefficient, matching Algorithm 39’s
definition exactly.
The BalancedDecomp abstraction
Most non-trivial lemmas in the file are proved in terms of a private structure
BalancedDecomp (alpha m : ℕ) that bundles the arithmetic properties needed about
the modulus, stride, and number of buckets:
leanprivate structure BalancedDecomp (alpha m : ℕ) : Prop where
hα : 0 < alpha
h2α : 2 * (alpha / 2) = alpha -- alpha is even
hqm1 : alpha * m = modulus - 1 -- alpha divides q − 1
hunit : IsUnit (alpha : Coeff) -- alpha is invertible mod q
...
The constructor BalancedDecomp.ofApproved discharges all obligations for the
three approved ML-DSA parameter sets (ml-dsa-44, ml-dsa-65, ml-dsa-87) with
decide, keeping proof obligations entirely out of the caller lemmas.
This pattern means every theorem stated over ctx : BalancedDecomp alpha m
works for any future parameter set satisfying the conditions, not just the three
currently approved ones.
Theorems proved
Decomposition identities
leantheorem concretePower2Round_high_low_decomp :
power2RoundShift (power2RoundHigh r) + power2RoundLow r = r
theorem concreteRounding_high_low_decomp (p : Params) :
highBitsShift p (highBits p r) + lowBits p r = r
The split is lossless: exactly
reconstructs the original polynomial. Proofs reduce to natCast_div_add_mod
on each coefficient.
Norm bounds on the low part
leantheorem concretePower2Round_bound :
cInfNorm (r − power2RoundShift (power2RoundHigh r)) ≤ 2^(droppedBits − 1)
theorem concreteRounding_lowBits_bound (p) (hp : p.isApproved) :
cInfNorm (lowBits p r) ≤ p.gamma2
In mathematical terms: for the Decompose low part, and for Power2Round. These establish the smallness property used in the rejection-sampling validity check.
Stability of HighBits under small perturbations
leantheorem concreteRounding_hide_low_of_isApproved (p) (hp) (r s : Rq) (b : ℕ) :
cInfNorm s ≤ b →
cInfNorm (lowBits p r) + b < p.gamma2 →
highBits p (r + s) = highBits p r
If and , then . Adding does not push past any rounding boundary, so the high bits are unchanged. This is the property that lets the verifier reconstruct faithfully.
Hint correctness
leantheorem concreteRounding_useHint_correct_of_isApproved (p) (hp) (z r : Rq) :
cInfNorm z ≤ p.gamma2 →
useHint p (makeHint p z r) r = highBits p (r + z)
When , applying the hint to recovers
exactly . The proof case-splits on whether
overflows the rounding interval positively, negatively, or stays
in range, then applies highBitsCoeff_eq_of_pos_overflow,
highBitsCoeff_eq_of_neg_overflow, and highBitsCoeff_eq_of_repr respectively.
UseHint error bound
leantheorem concreteRounding_useHint_bound_of_isApproved (p) (hp) (r : Rq) (h : Hint) :
cInfNorm (r − highBitsShift p (useHint p h r)) ≤ 2 * p.gamma2 + 1
Even with an arbitrary (possibly adversarial) hint,
Injectivity of the shift map
leantheorem highBitsShift_injective_of_isApproved (p) (hp) :
Function.Injective (highBitsShift p)
The map is injective on Rq because is a unit
in for all approved parameters (from BalancedDecomp.hunit).
Connecting to abstract interfaces
The file closes by packaging the concrete implementations and their proofs into the abstract interfaces used by the rest of the ML-DSA proof stack:
leandef concretePower2RoundOps : MLDSA.Power2RoundOps
def concreteRoundingOps (p) : MLDSA.RoundingOps (2 * p.gamma2)
theorem concretePower2RoundLaws : Power2RoundOps.Laws concretePower2RoundOps cInfNorm
theorem concreteRoundingLaws_of_isApproved : RoundingOps.Laws (concreteRoundingOps p) cInfNorm
Higher-level ML-DSA proofs (signing correctness, verification correctness) import
the abstract Laws typeclass and remain parameter-agnostic.