数组与停机性

为了编写高效的代码，选择合适的数据结构非常重要。链表有它的用途：在某些应用程序中，共享列表的尾部非常重要。但是，大多数可变长有序数据集合的用例都能由数组更好地提供服务，数组既有较少的内存开销，又有更好的局部性。

然而，数组相对于列表来说有两个缺点：

数组是通过索引访问的，而非通过模式匹配，这为维护安全性增加了证明义务。
从左到右处理整个数组的循环是一个尾递归函数，但它没有会在每次调用时递减的参数。

高效地使用数组需要知道如何向 Lean 证明数组索引在范围内，以及如何证明接近数组大小的数组索引也会使程序停机。这两个都使用不等式命题，而非命题等式表示。

不等式

由于不同的类型有不同的序概念，不等式需要由两个类来控制，分别称为 LE 和 LT。标准类型类一节中的表格描述了这些类与语法的关系：

表达式	脱糖结果	类名
`x < y`	`LT.lt x y`	`LT`
`x ≤ y`	`LE.le x y`	`LE`
`x > y`	`LT.lt y x`	`LT`
`x ≥ y`	`LE.le y x`	`LE`

换句话说，一个类型可以定制 < 和 ≤ 运算符的含义，而 > 和 ≥ 可以从 < 和 ≤ 中派生它们的含义。LT 和 LE 类具有返回命题而非 Bool 的方法：

class LE (α : Type u) where
  le : α → α → Prop

class LT (α : Type u) where
  lt : α → α → Prop

Nat 的 LE 实例会委托给 Nat.le：

instance : LE Nat where
  le := Nat.le

定义 Nat.le 需要 Lean 中尚未介绍的一个特性：它是一个归纳定义的关系。

归纳定义的命题、谓词和关系

Nat.le 是一个 归纳定义的关系（Inductively-Defined Relation）。就像 inductive 可以用来创建新的数据类型一样，它也可以用来创建新的命题。当一个命题接受一个参数时，它被称为 谓词（Predicate），它可能对某些潜在参数为真，但并非对所有参数都为真。接受多个参数的命题称为 关系（Relation）。"

每个归纳定义命题的构造子都是证明它的方法。换句话说，命题的声明描述了它为真的不同形式的证据。一个没有参数且只有一个构造子的命题很容易证明：

inductive EasyToProve : Prop where
  | heresTheProof : EasyToProve

证明包括使用其构造子：

theorem fairlyEasy : EasyToProve := by
  constructor

实际上，命题 True 应该总是很容易证明，它的定义就像 EasyToProve：

inductive True : Prop where
  | intro : True

不带参数的归纳定义命题远不如归纳定义的数据类型有趣。这是因为数据本身很有趣——自然数 3 不同于数字 35，而订购了 3 个披萨的人如果 30 分钟后收到 35 个披萨会很沮丧。命题的构造子描述了命题可以为真的方式，但一旦命题被证明，就不需要知道它使用了哪些底层构造子。这就是为什么 Prop 宇宙中最有趣的归纳定义类型带参数的原因。

归纳定义谓词 IsThree 陈述它有三个参数：

inductive IsThree : Nat → Prop where
  | isThree : IsThree 3

这里使用的机制就像索引族，如 HasCol，只不过结果类型是一个可以被证明的命题，而非可以被使用的数据。

使用此谓词，可以证明三确实等于三：

theorem three_is_three : IsThree 3 := by
  constructor

类似地，IsFive 是一个谓词，它陈述了其参数为 5：

inductive IsFive : Nat → Prop where
  | isFive : IsFive 5

如果一个数字是三，那么将它加二的结果应该是五。这可以表示为定理陈述：

theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
  skip

由此产生的目标具有函数类型：

unsolved goals
n : Nat
⊢ IsThree n → IsFive (n + 2)

因此，intro 策略可用于将参数转换为假设：

theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
  intro three

unsolved goals
n : Nat
three : IsThree n
⊢ IsFive (n + 2)

假设 n 为三，则应该可以使用 IsFive 的构造子来完成证明：

theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
  intro three
  constructor

然而，这会产生一个错误：

tactic 'constructor' failed, no applicable constructor found
n : Nat
three : IsThree n
⊢ IsFive (n + 2)

出现此错误是因为 n + 2 与 5 在定义上不相等。在普通的函数定义中，可以对假设 three 使用依值模式匹配来将 n 细化为 3。依值模式匹配的策略等价为 cases，其语法类似于 induction：

theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
  intro three
  cases three with
  | isThree => skip

在剩余情况下，n 已细化为 3：

unsolved goals
case isThree
⊢ IsFive (3 + 2)

由于 3 + 2 在定义上等于 5，因此构造子现在适用了：

theorem three_plus_two_five : IsThree n → IsFive (n + 2) := by
  intro three
  cases three with
  | isThree => constructor

标准假命题 False 没有构造子，因此无法提供直接证据。为 False 提供证据的唯一方法是假设本身不可能，类似于用 nomatch 来标记类型系统认为无法访问的代码。如插曲中的证明一节所述，否定 Not A 是 A → False 的缩写。Not A 也可以写成 ¬A。

四不是三：

theorem four_is_not_three : ¬ IsThree 4 := by
  skip

初始证明目标包含 Not：

unsolved goals
⊢ ¬IsThree 4

可以使用 simp 显示出它实际上是一个函数类型：

theorem four_is_not_three : ¬ IsThree 4 := by
  simp [Not]

unsolved goals
⊢ IsThree 4 → False

因为目标是一个函数类型，所以 intro 可用于将参数转换为假设。无需保留 simp，因为 intro 可以展开 Not 本身的定义：

theorem four_is_not_three : ¬ IsThree 4 := by
  intro h

unsolved goals
h : IsThree 4
⊢ False

在此证明中，cases 策略直接解决了目标：

theorem four_is_not_three : ¬ IsThree 4 := by
  intro h
  cases h

就像对 Vect String 2 的模式匹配不需要包含 Vect.nil 的情况一样，对 IsThree 4 的情况证明不需要包含 isThree 的情况。

自然数不等式

Nat.le 的定义有一个参数和一个索引：

inductive Nat.le (n : Nat) : Nat → Prop
  | refl : Nat.le n n
  | step : Nat.le n m → Nat.le n (m + 1)

参数 n 应该是较小的数字，而索引应该是大于或等于 n 的数字。当两个数字相等时使用 refl 构造子，而当索引大于 n 时使用 step 构造子。

从证据的视角来看，证明 \( n \leq k \) 需要找到一些数字 \( d \) 使得 \( n + d = m \)。在 Lean 中，证明由 \( d \) 个 Nat.le.step 实例包裹的 Nat.le.refl 构造子组成。每个 step 构造子将其索引参数加一，因此 \( d \) 个 step 构造子将 \( d \) 加到较大的数字上。例如，证明四小于或等于七由 refl 周围的三个 step 组成：

theorem four_le_seven : 4 ≤ 7 :=
  open Nat.le in
  step (step (step refl))

严格小于关系通过在左侧数字上加一来定义：

def Nat.lt (n m : Nat) : Prop :=
  Nat.le (n + 1) m

instance : LT Nat where
  lt := Nat.lt

证明四严格小于七由 refl 周围的两个 step 组成：

theorem four_lt_seven : 4 < 7 :=
  open Nat.le in
  step (step refl)

这是因为 4 < 7 等价于 5 ≤ 7。

停机性证明

函数 Array.map 接受一个函数和一个数组，它将接受的函数应用于输入数组的每个元素后，返回产生的新数组。将其写成尾递归函数遵循通常的累加器模式，即将输入委托给一个函数，该函数将输出的数组传递给累加器。累加器用空数组初始化。传递累加器的辅助函数还接受一个参数来跟踪数组中的当前索引，该索引从 0 开始：

def Array.map (f : α → β) (arr : Array α) : Array β :=
  arrayMapHelper f arr Array.empty 0

辅助函数应在每次迭代时检查索引是否仍在范围内。如果是，则应再次循环，将转换后的元素添加到累加器的末尾，并将索引加 1。如果不是，则应终止并返回累加器。此代码的最初实现会失败，因为 Lean 无法证明数组索引有效：

def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β :=
  if i < arr.size then
    arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1)
  else soFar

failed to prove index is valid, possible solutions:
  - Use `have`-expressions to prove the index is valid
  - Use `a[i]!` notation instead, runtime check is perfomed, and 'Panic' error message is produced if index is not valid
  - Use `a[i]?` notation instead, result is an `Option` type
  - Use `a[i]'h` notation instead, where `h` is a proof that index is valid
α : Type ?u.1704
β : Type ?u.1707
f : α → β
arr : Array α
soFar : Array β
i : Nat
⊢ i < Array.size arr

然而，条件表达式已经检查了有效数组索引所要求的精确条件（即 i < arr.size）。为 if 添加一个名称可以解决此问题，因为它添加了一个前提供数组索引策略使用：

def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β :=
  if inBounds : i < arr.size then
    arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1)
  else soFar

然而，Lean 不接受修改后的程序，因为递归调用不是针对输入构造子之一的参数进行的。实际上，累加器和索引都在增长，而非缩小：

fail to show termination for
  arrayMapHelper
with errors
argument #6 was not used for structural recursion
  failed to eliminate recursive application
    arrayMapHelper f✝ arr (Array.push soFar (f✝ arr[i])) (i + 1)

structural recursion cannot be used

failed to prove termination, use `termination_by` to specify a well-founded relation

尽管如此，此函数仍然会停机，因此简单地将其标记为 partial 非常不妥。

为什么 arrayMapHelper 会停机？每次迭代都会检查索引 i 是否仍在数组 arr 的范围内。如果是，则 i 将增加并且循环将重复。如果不是，则程序将停机。因为 arr.size 是一个有限数，所以 i 只可以增加有限次。即使函数的每个参数在每次调用时都不会减少，arr.size - i 也会减小到零。

可以通过在定义的末尾提供 termination_by 子句来指示 Lean 使用另一个表达式判定停机。 termination_by 子句有两个组成部分：函数参数的名称和使用这些名称的表达式，该表达式应在每次调用时减少。对于 arrayMapHelper，最终定义如下所示：

def arrayMapHelper (f : α → β) (arr : Array α) (soFar : Array β) (i : Nat) : Array β :=
  if inBounds : i < arr.size then
    arrayMapHelper f arr (soFar.push (f arr[i])) (i + 1)
  else soFar
termination_by arrayMapHelper _ arr _ i _ => arr.size - i

类似的停机证明可用于编写 Array.find，这是一个在数组中查找满足布尔函数的第一个元素，并返回该元素及其索引的函数：

def Array.find (arr : Array α) (p : α → Bool) : Option (Nat × α) :=
  findHelper arr p 0

同样，辅助函数会停机，因为随着 i 的增加，arr.size - i 会减少：

def findHelper (arr : Array α) (p : α → Bool) (i : Nat) : Option (Nat × α) :=
  if h : i < arr.size then
    let x := arr[i]
    if p x then
      some (i, x)
    else findHelper arr p (i + 1)
  else none
termination_by findHelper arr p i => arr.size - i

并非所有停机参数都像这个参数一样简单。但是，在所有停机证明中，都会出现「基于函数的参数找出在每次调用时都会减少的某个表达式」这种基本结构。有时，为了弄清楚函数为何停机，可能需要一点创造力，有时 Lean 需要额外的证明才能接受停机参数。

练习

使用尾递归累加器传递函数和 termination_by 子句在数组上实现 ForM (Array α) 实例。
使用 不需要 termination_by 子句的尾递归累加器传递函数实现一个用于反转数组的函数。
使用恒等单子中的 for ... in ... 循环重新实现 Array.map、Array.find 和 ForM 实例，并比较结果代码。
使用恒等单子中的 for ... in ... 循环重新实现数组反转。将其与尾递归函数的版本进行比较。

Lean 函数式编程

数组与停机性

不等式

归纳定义的命题、谓词和关系

自然数不等式

停机性证明

练习