← notebook

Why Lean?

My experience

UX Issues in Proving Assistants

Lean 4 / Mathlib

In the following code snippet, we show a non-constructive proof. We show that a irrational power of a irrational could be a rational number. However, we never tell you whether √2 ^ √2 is rational or not in this proof. “by_cases” here uses the p ∨ ¬p Law of exclusion of middle.

You can run the code here

leanimport Mathlib

theorem irrational_pow_irrational_rational :
     (a b : ℝ), Irrational a ∧ Irrational b ∧ ¬ Irrational (a ^ b) := by
  by_cases h : Irrational (√2 ^ √2)
  · -- Case: √2 ^ √2 is irrational → use it as base
    refine ⟨√2 ^ √2, √2, h, irrational_sqrt_two, ?_⟩
    show ¬Irrational ((√2 ^ √2) ^ √2)
    rw[
      ← Real.rpow_mul (Real.sqrt_nonneg 2),
      Real.mul_self_sqrt (Nat.ofNat_nonneg 2),
      Real.rpow_ofNat,
      Real.sq_sqrt (Nat.ofNat_nonneg 2)
    ]
    exact not_irrational_ofNat 2
  · -- Case: √2 ^ √2 is rational → we're already done
    exact ⟨√2, √2, irrational_sqrt_two, irrational_sqrt_two, h⟩

Rocq/Coq

The same proof is much longer in Rocq.

High level things aside, Lean 4’s devEx is much better than Rocq. Lean 4 is very easy to install with VSCode extensions. And here are some issues I encountered when setting up Rocq.

Cryptography

We’re seeing a trend that people move on beyond pens, papers, Latex, and human review, then craft machine verifiable math proofs.

I’m not sure if this is already helpful in software implementation side. AI says yes, but I’m now incompetent to verify the claim thus I’m skeptical.

VCV-io: Library for Cryptography Security Proofs

Where Software Engineering is Going?

These are reflections after reading the case studies in the next section.

Traditional programming languages are optimized for humans — English-speaking humans especially. But now that agents provide cheap and capable coding skills, the organization is changing. From the point of view of Rich Sutton’s “The Bitter Lesson,” we should do things that favor agent productivity.

The way we channel our intents to implementation is changing. Originally we tried to speak English in a way logical enough that raw machines could understand and turn it into low-level data/compute operations.

Now, since implementations will grow beyond human reasoning, we speak the language of math to solidify our intents. Smart machines write opcodes directly to skip “compilation,” and that favors math.

It also seems the software input/output bug is solved — or must be solved — due to external threats. People now move a level up. We model the attackers and their mental state. This level of modeling used to belong to the academic world and paper writing. But the trend looks like it can cover implementation.

Where Will Vulnerabilities Go?

The arms race dynamic of formal verification doesn’t eliminate vulnerabilities — it relocates them. Each layer you formally verify pushes the attack surface to the boundary above or below it. The clean verification table is the canonical illustration: the constraint system is proven correct, but the serialization to the prover is unverified. The adversary simply targets the seam. This is close to a conservation law: verification hardens one layer, and the boundary becomes the most attractive target.

Concretely, the seams that remain:

Case studies

Lean-zip: Formal Verification of a Compression Library in Lean 4

Background

Highlights

lean/-- Unified DEFLATE roundtrip: inflate ∘ deflateRaw = identity.
    This is the Phase B4 capstone theorem from PLAN.md. Generalized to any
    `maxOutputSize` large enough to hold the input. -/
theorem inflate_deflateRaw (data : ByteArray) (level : UInt8)
    (maxOutputSize : Nat) (hsize : data.size < maxOutputSize) :
    Zip.Native.Inflate.inflate (deflateRaw data level) maxOutputSize = .ok data := by
  unfold deflateRaw
  split
  · exact inflate_deflateStoredPure data _ (by omega)
  · split
    · exact inflate_deflateFixedIter data _ (by omega)
    · split
      · exact inflate_deflateLazyIter data _ hsize
      · exact inflate_deflateDynamic data _ (by omega)

Claude Code investigation shows:

Clean: Formal Verification of ZK Circuits

Background

Highlights

lean-- Circom original (embedded as a comment):
-- inv <-- in!=0 ? 1/in : 0;
-- out <== -in*inv +1;
-- in*out === 0;

def main (input : Expression (F p)) := do
  let inv ← witness fun env =>
    let x := input.eval env
    if x ≠ 0 then x⁻¹ else 0
  let out <== -input * inv + 1
  input * out === 0
  return out

def circuit : FormalCircuit (F p) field field where
  main
  localLength _ := 2

  Spec input output :=
    output = (if input = 0 then 1 else 0)

  soundness := by
    circuit_proof_start
    simp only [id_eq, h_holds]
    split_ifs with h_ifs
    . simp only [h_ifs, zero_mul, neg_zero, zero_add]
    . rw [neg_add_eq_zero]
      have h1 := h_holds.left
      have h2 := h_holds.right
      rw [h1] at h2
      simp only [id_eq, mul_eq_zero] at h2
      cases h2
      case neg.inl hl => contradiction
      case neg.inr hr =>
        rw [neg_add_eq_zero] at hr
        exact hr

  completeness := by
    circuit_proof_start
    cases h_env with
    | intro left right =>
      simp only [left, id_eq, ite_not, mul_ite, mul_zero] at right
      simp only [id_eq, right, left, ite_not, mul_ite, mul_zero, mul_eq_zero, true_and]
      split_ifs <;> aesop

Comparators.lean#L28–L67

leanstructure FormalCircuit (F : Type) [Field F] (Input Output : TypeMap) where
  main        : Var Input F → Circuit F (Var Output F)
  Assumptions : Input F → Prop
  Spec        : Input F → Output F → Prop
  soundness   : Soundness F ...
  completeness : Completeness F ...

Basic.lean#L298–L303

leandef Soundness ... :=
   offset : ℕ,  env : Environment F,
   input_var : Var Input F,  input : Input F, eval env input_var = input →
  Assumptions input →
  ConstraintsHold.Soundness env (circuit.main input_var |>.operations offset) →
  let output := eval env (circuit.output input_var offset)
  Spec input output ∧ ...

Basic.lean#L259–L271

leandef assertBool : FormalAssertion (F p) field where
  main (x : Expression (F p)) := assertZero (x * (x - 1))
  Spec (x : F p) := IsBool x
  soundness   := by circuit_proof_all [IsBool.iff_mul_sub_one, sub_eq_add_neg]
  completeness := by circuit_proof_all [IsBool.iff_mul_sub_one, sub_eq_add_neg]

Boolean.lean#L201–L208

leanhave : (2^n * x).val = 2^n * x.val := by
  rw [ZMod.val_mul_of_lt (by linarith), h_mul_x]

ByteDecomposition.lean#L77

leandef main (input : Expression (F p) × Expression (F p)) := do
  let diff := input.1 - input.2
  let out ← IsZero.circuit diff    -- treated as a verified black box
  return out

soundness := by
  circuit_proof_start [IsZero.circuit]
  rw [← h_input]
  ...

Comparators.lean#L83–L109

Layer Verified?
Lean kernel (type checker) Yes
FormalCircuit.soundness / completeness Yes
toJson serialization No
Rust backend AST interpretation No
Plonky3 proof system soundness No

evm-asm: Formally Verified EVM as RISC-V Assembly

Background

Highlights

leantheorem evm_add_stack_spec_within (sp base : Word) (a b : EvmWord) ... :
    cpsTripleWithin 30 base (base + 120) (evm_add_code base)
      ((.x12 ↦ᵣ sp) ** ... ** evmWordIs sp a ** evmWordIs (sp + 32) b)
      ((.x12 ↦ᵣ (sp + 32)) ** ... ** evmWordIs sp a ** evmWordIs (sp + 32) (a + b))

Add/Spec.lean#L74–L128

leandef cpsTripleWithin (nSteps : Nat) (entry exit_ : Word) (cr : CodeReq)
    (P Q : Assertion) : Prop :=
   (R : Assertion), R.pcFree →  s, cr.SatisfiedBy s →
    (P ** R).holdsFor s → s.pc = entry →
     k, k ≤ nSteps ∧  s', stepN k s = some s' ∧
      s'.pc = exit_ ∧ (Q ** R).holdsFor s'

Rv64/CPSSpec.lean#L45–L48

Dimension geth / reth / besu evm-asm
Correctness evidence Passes ethereum/tests vectors Machine-checked proof for all inputs
Spec EIPs + Yellow Paper (informal) Lean types + cpsTripleWithin + pure EL spec
Coverage Test-driven (finite vectors) Universal (∀ register values, ∀ memory layouts)
Trust base Rust compiler + std Lean kernel + Sail RISC-V model

evm-smith: What if you can bring Ethereum contracts to final form?

Write EVM bytecode directly and skip HLL(High level language) like Solidity/Vyper directly.

Solvency property: Sum of all users’ balance is less or equal to contract’s ETH balance.

leandef WethInv (σ : AccountMap .EVM) (C : AccountAddress) : Prop :=
  storageSum σ C ≤ balanceOf σ C

See Leo’s intro post for more.

Background

Highlights