6. 结构体(Structures)

现代数学广泛使用了代数结构,这些代数结构封装了可在不同环境中实例化的模式,而且往往有多种方法来对它们进行定义和实例化。

因此,Lean 提供了相应的方法来形式化定义这些结构并对其进行操作。 此前你已经接触过一些代数结构的示例,比如在 Chapter 2 中的环(rings)和格(lattices)。 本章将解释之前出现过的方括号语法,比如 [Ring α][Lattice α],并介绍如何创建和使用自定义的代数结构。

如需了解更多技术细节,可以参考 Theorem Proving in Lean, 以及 Anne Baanen 的这篇论文 Use and abuse of instance parameters in the Lean mathematical library

6.1. 定义结构体

广义上来说,结构体是对特定形式数据集合的约定,包括包含数据的形式以及这些数据要满足的一些约束条件。 而结构体的实例则是某一组满足约束的具体数据。例如,我们可以规定一个点是由三个实数组成的三元组:

@[ext]
structure Point where
  x : 
  y : 
  z : 

上面用到的 @[ext] 注解让 Lean 自动生成一个定理,内容是当该结构体的两个实例各组成部分对应相同时,这两个实例相等。该属性也称为外延性(extensionality)。

#check Point.ext

example (a b : Point) (hx : a.x = b.x) (hy : a.y = b.y) (hz : a.z = b.z) : a = b := by
  ext
  repeat' assumption

接着我们定义一个 Point 结构体的实例。 Lean 提供多种实例化方式来达成目的。

def myPoint1 : Point where
  x := 2
  y := -1
  z := 4

def myPoint2 : Point :=
  2, -1, 4

def myPoint3 :=
  Point.mk 2 (-1) 4

第一个例子中,我们明确地指明了结构体的各个字段。 在定义 myPoint3 时用到的函数 Point.mk 叫做 Point 结构体的构造函数(constructor),用于构造结构体成员。 你也可以为构造函数指定一个不同的名字,比如 build

structure Point' where build ::
  x : 
  y : 
  z : 

#check Point'.build 2 (-1) 4

接下来的两个例子展示了如何定义结构体的函数。 第二个例子中明确指出了构造函数 Point.mk 中的字段名,而第一个例子则使用更为简洁的匿名构造函数。 Lean 能根据 add 的目标类型推断出所需的构造函数。 通常,我们会将与结构体(例如这里的 Point)相关的定义和定理放在一个同名的命名空间(namespace)中。 在下面的示例中,由于我们启用了 Point 命名空间,所以 add 的完整名称实际是 Point.add。 当命名空间没被启用时,就得使用完整名称。 不过这时也可以使用匿名投影记号(anonymous projection notation),它允许我们用 a.add b 代替 Point.add a b。 因为 a 的类型是 Point,Lean 能在没有开启对应命名空间的时候将 a.add b 推断为 Point.add a b

namespace Point

def add (a b : Point) : Point :=
  a.x + b.x, a.y + b.y, a.z + b.z

def add' (a b : Point) : Point where
  x := a.x + b.x
  y := a.y + b.y
  z := a.z + b.z

#check add myPoint1 myPoint2
#check myPoint1.add myPoint2

end Point

#check Point.add myPoint1 myPoint2
#check myPoint1.add myPoint2

接下来我们继续在相关命名空间中添加定义,但在引用的代码片段中会省略开关命名空间相关的指令。 在证明加法函数性质时,可使用 rw 来展开定义,并用 ext 将结构体两个实例之间的等号转化为它们各个组成部分之间的等号。 下面的代码使用 protected 关键字,使得在命名空间打开的情况下定理的名字依然是 Point.add_comm。 当我们希望避免与泛型定理如 add_comm 产生歧义时,这样做是有帮助的。

protected theorem add_comm (a b : Point) : add a b = add b a := by
  rw [add, add]
  ext <;> dsimp
  repeat' apply add_comm

example (a b : Point) : add a b = add b a := by simp [add, add_comm]

因为 Lean 能在内部自动展开定义并简化投影,所以有时我们需要的等式在定义上就自动成立。

theorem add_x (a b : Point) : (a.add b).x = a.x + b.x :=
  rfl

与在 Section 5.2 中定义递归函数类似,我们定义函数时也可以使用模式匹配。 下面定义的 addAltaddAlt' 本质上是相同的,形式上的的区别在于我们在后者的定义中使用了匿名构造函数。 虽然以此方式定义函数有时会显得更加简洁,并且结构化 η-规约(structural eta-reduction)也确保了这种简写在定义上的等价性,但这么做可能会为后续的证明带来不便。

def addAlt : Point  Point  Point
  | Point.mk x₁ y₁ z₁, Point.mk x₂ y₂ z₂ => x₁ + x₂, y₁ + y₂, z₁ + z₂

def addAlt' : Point  Point  Point
  | x₁, y₁, z₁⟩, x₂, y₂, z₂ => x₁ + x₂, y₁ + y₂, z₁ + z₂

theorem addAlt_x (a b : Point) : (a.addAlt b).x = a.x + b.x := by
  rfl

theorem addAlt_comm (a b : Point) : addAlt a b = addAlt b a := by
  rw [addAlt, addAlt]
  -- the same proof still works, but the goal view here is harder to read
  ext <;> dsimp
  repeat' apply add_comm

数学构造通常涉及将捆绑的信息拆分开来,并以不同的方式重新组合。 这也是 Lean 和 Mathlib 提供如此多方法来高效处理这些操作的原因。 作为练习,请尝试证明 Point.add 是满足结合律的。 然后请定义点的标量乘法并证明其满足对加法的分配律。

protected theorem add_assoc (a b c : Point) : (a.add b).add c = a.add (b.add c) := by
  sorry

def smul (r : ) (a : Point) : Point :=
  sorry

theorem smul_distrib (r : ) (a b : Point) :
    (smul r a).add (smul r b) = smul r (a.add b) := by
  sorry

使用结构体是走向代数抽象的第一步。 我们目前还没有方法将 Point.add 与泛型的 + 符号联系起来, 抑或是将 Point.add_commPoint.add_assoc 与泛型的 add_commadd_assoc 定理联系起来。 这些任务属于结构体在代数层面的应用,我们将在下一节介绍如何具体实现。 对现在来说,我们只需要把结构体视作一种捆绑对象和信息的方法。

在结构体中我们不仅可以指定数据类型,还可以指定数据需要满足的约束。 在 Lean 中,后者被表示为类型 Prop 的字段。 例如,标准2-单纯形(standard 2-simplex)定义为满足 \(x ≥ 0\)\(y ≥ 0\)\(z ≥ 0\)\(x + y + z = 1\) 的点 \((x, y, z)\) 的集合。 如果你不熟悉这个概念,可以画个图,其实这个集合就是三维空间中以 \((1, 0, 0)\)\((0, 1, 0)\),和 \((0, 0, 1)\) 为顶点的等边三角形以及其内部。 在 Lean 中可以这样形式化表示:

structure StandardTwoSimplex where
  x : 
  y : 
  z : 
  x_nonneg : 0  x
  y_nonneg : 0  y
  z_nonneg : 0  z
  sum_eq : x + y + z = 1

请注意,最后四个字段用到的 xyz 指的就是前三个字段。 我们可以定义一个从2-单纯形到其自身的映射,该映射交换 xy

def swapXy (a : StandardTwoSimplex) : StandardTwoSimplex
    where
  x := a.y
  y := a.x
  z := a.z
  x_nonneg := a.y_nonneg
  y_nonneg := a.x_nonneg
  z_nonneg := a.z_nonneg
  sum_eq := by rw [add_comm a.y a.x, a.sum_eq]

有趣的来了,我们可以计算单纯形内两个点的中点。 为此需要先在文件开头加上 noncomputable section 语句以便使用实数上的除法。

noncomputable section

def midpoint (a b : StandardTwoSimplex) : StandardTwoSimplex
    where
  x := (a.x + b.x) / 2
  y := (a.y + b.y) / 2
  z := (a.z + b.z) / 2
  x_nonneg := div_nonneg (add_nonneg a.x_nonneg b.x_nonneg) (by norm_num)
  y_nonneg := div_nonneg (add_nonneg a.y_nonneg b.y_nonneg) (by norm_num)
  z_nonneg := div_nonneg (add_nonneg a.z_nonneg b.z_nonneg) (by norm_num)
  sum_eq := by field_simp; linarith [a.sum_eq, b.sum_eq]

上面的代码中,对于 x_nonnegy_nonnegz_nonneg,我们用简洁的证明项即可建立。 而对于 sum_eq ,我们选择使用 by 在策略模式下进行证明。

给定参数 \(\lambda\) 满足 \(0 \le \lambda \le 1\), 可定义标准2-单纯形中 \(a\)\(b\) 两点的加权平均为 \(\lambda a + (1 - \lambda) b\)。 请尝试参考前面的 midpoint 形式化定义该加权平均。

def weightedAverage (lambda : Real) (lambda_nonneg : 0  lambda) (lambda_le : lambda  1)
    (a b : StandardTwoSimplex) : StandardTwoSimplex :=
  sorry

结构体还可以依赖于参数。 例如,可以将标准2-单纯形推广到任意维数 \(n\) 下的 \(n\) -单纯形。 目前阶段,你不需要对 Fin n 了解太多,只需知道它有 \(n\) 个元素,并且 Lean 直到如何在其上进行就和操作即可。

open BigOperators

structure StandardSimplex (n : ) where
  V : Fin n  
  NonNeg :  i : Fin n, 0  V i
  sum_eq_one : ( i, V i) = 1

namespace StandardSimplex

def midpoint (n : ) (a b : StandardSimplex n) : StandardSimplex n
    where
  V i := (a.V i + b.V i) / 2
  NonNeg := by
    intro i
    apply div_nonneg
    · linarith [a.NonNeg i, b.NonNeg i]
    norm_num
  sum_eq_one := by
    simp [div_eq_mul_inv,  Finset.sum_mul, Finset.sum_add_distrib,
      a.sum_eq_one, b.sum_eq_one]
    field_simp

end StandardSimplex

作为额外练习,请尝试定义一下标准 :math`n` -单纯形中两点的加权平均。 可以使用 Finset.sum_add_distribFinset.mul_sum 来实现相关的求和操作。

前面我们已经了解到结构体可以用来将数据和属性打包在一起。 有趣的是,结构体也可以用来打包脱离具体数据的抽象属性。 例如,下面的结构体 IsLinear,将线性性(linearity)的两个主要组成部分打包在了一起。

structure IsLinear (f :   ) where
  is_additive :  x y, f (x + y) = f x + f y
  preserves_mul :  x c, f (c * x) = c * f x

section
variable (f :   ) (linf : IsLinear f)

#check linf.is_additive
#check linf.preserves_mul

end

值得一提的是,结构体并不是打包数据的唯一方法。 前面提到的 Point 数据结构也可以用泛型乘积类型来定义,而 IsLinear 可以使用简单的 and 来定义。

def Point'' :=
   ×  × 

def IsLinear' (f :   ) :=
  ( x y, f (x + y) = f x + f y)   x c, f (c * x) = c * f x

泛型类型构造甚至可以替代包含依赖关系的结构体。 例如,子类型(subtype)构造结合了一段数据以及一条属性。 你可以将下面例子中的 PReal 看作正实数类型。 每个 x : PReal 都有两个分量:它的值,以及“它是正的”这一属性。 你可以分别用 x.valx.property 来访问这两个分量,其中前者的类型为 ,后者是一个属性 0 < x.val

def PReal :=
  { y :  // 0 < y }

section
variable (x : PReal)

#check x.val
#check x.property
#check x.1
#check x.2

end

我们也可以用子类型来定义前面的标准2-单纯形,以及任意维数 \(n\) 下的标准 \(n\) -单纯形。

def StandardTwoSimplex' :=
  { p :  ×  ×  // 0  p.1  0  p.2.1  0  p.2.2  p.1 + p.2.1 + p.2.2 = 1 }

def StandardSimplex' (n : ) :=
  { v : Fin n   // ( i : Fin n, 0  v i)  ( i, v i) = 1 }

类似地,Sigma 类型是有序对的推广,其中第二个分量的类型依赖于第一个分量。

def StdSimplex := Σ n : , StandardSimplex n

section
variable (s : StdSimplex)

#check s.fst
#check s.snd

#check s.1
#check s.2

end

给定 s : StdSimplex,它的第一个分量 s.fst 是个自然数,第二个分量是对应维数下标准单纯形中的一个元素 StandardSimplex s.fst。 Sigma 类型与子类型的区别在于 Sigma 类型的第二个分量是数据而非属性。

尽管可以使用乘积类型、子类型、 Sigma 类型等替代结构体,但使用结构体其实有许多好处。 定义结构体可以抽象出底层的表达,且能为成员提供自定义的名称以供访问。 这使得证明更加健壮: 对于那些仅依赖结构体接口的证明,通常只需用新接口替换旧接口,就能让证明代码在结构体定义变更后依然有效。 此外,正如我们即将看到的,Lean 支持将各种结构体编织成一个丰富的、相互关联的层次体系,以管理它们之间的交互关系。

6.2. 代数结构

To clarify what we mean by the phrase algebraic structure, it will help to consider some examples.

  1. A partially ordered set consists of a set \(P\) and a binary relation \(\le\) on \(P\) that is transitive and reflexive.

  2. A group consists of a set \(G\) with an associative binary operation, an identity element \(1\), and a function \(g \mapsto g^{-1}\) that returns an inverse for each \(g\) in \(G\). A group is abelian or commutative if the operation is commutative.

  3. A lattice is a partially ordered set with meets and joins.

  4. A ring consists of an (additively written) abelian group \((R, +, 0, x \mapsto -x)\) together with an associative multiplication operation \(\cdot\) and an identity \(1\), such that multiplication distributes over addition. A ring is commutative if the multiplication is commutative.

  5. An ordered ring \((R, +, 0, -, \cdot, 1, \le)\) consists of a ring together with a partial order on its elements, such that \(a \le b\) implies \(a + c \le b + c\) for every \(a\), \(b\), and \(c\) in \(R\), and \(0 \le a\) and \(0 \le b\) implies \(0 \le a b\) for every \(a\) and \(b\) in \(R\).

  6. A metric space consists of a set \(X\) and a function \(d : X \times X \to \mathbb{R}\) such that the following hold:

    • \(d(x, y) \ge 0\) for every \(x\) and \(y\) in \(X\).

    • \(d(x, y) = 0\) if and only if \(x = y\).

    • \(d(x, y) = d(y, x)\) for every \(x\) and \(y\) in \(X\).

    • \(d(x, z) \le d(x, y) + d(y, z)\) for every \(x\), \(y\), and \(z\) in \(X\).

  7. A topological space consists of a set \(X\) and a collection \(\mathcal T\) of subsets of \(X\), called the open subsets of \(X\), such that the following hold:

    • The empty set and \(X\) are open.

    • The intersection of two open sets is open.

    • An arbitrary union of open sets is open.

In each of these examples, the elements of the structure belong to a set, the carrier set, that sometimes stands proxy for the entire structure. For example, when we say “let \(G\) be a group” and then “let \(g \in G\),” we are using \(G\) to stand for both the structure and its carrier. Not every algebraic structure is associated with a single carrier set in this way. For example, a bipartite graph involves a relation between two sets, as does a Galois connection, A category also involves two sets of interest, commonly called the objects and the morphisms.

The examples indicate some of the things that a proof assistant has to do in order to support algebraic reasoning. First, it needs to recognize concrete instances of structures. The number systems \(\mathbb{Z}\), \(\mathbb{Q}\), and \(\mathbb{R}\) are all ordered rings, and we should be able to apply a generic theorem about ordered rings in any of these instances. Sometimes a concrete set may be an instance of a structure in more than one way. For example, in addition to the usual topology on \(\mathbb{R}\), which forms the basis for real analysis, we can also consider the discrete topology on \(\mathbb{R}\), in which every set is open.

Second, a proof assistant needs to support generic notation on structures. In Lean, the notation * is used for multiplication in all the usual number systems, as well as for multiplication in generic groups and rings. When we use an expression like f x * y, Lean has to use information about the types of f, x, and y to determine which multiplication we have in mind.

Third, it needs to deal with the fact that structures can inherit definitions, theorems, and notation from other structures in various ways. Some structures extend others by adding more axioms. A commutative ring is still a ring, so any definition that makes sense in a ring also makes sense in a commutative ring, and any theorem that holds in a ring also holds in a commutative ring. Some structures extend others by adding more data. For example, the additive part of any ring is an additive group. The ring structure adds a multiplication and an identity, as well as axioms that govern them and relate them to the additive part. Sometimes we can define one structure in terms of another. Any metric space has a canonical topology associated with it, the metric space topology, and there are various topologies that can be associated with any linear ordering.

Finally, it is important to keep in mind that mathematics allows us to use functions and operations to define structures in the same way we use functions and operations to define numbers. Products and powers of groups are again groups. For every \(n\), the integers modulo \(n\) form a ring, and for every \(k > 0\), the \(k \times k\) matrices of polynomials with coefficients in that ring again form a ring. Thus we can calculate with structures just as easily as we can calculate with their elements. This means that algebraic structures lead dual lives in mathematics, as containers for collections of objects and as objects in their own right. A proof assistant has to accommodate this dual role.

When dealing with elements of a type that has an algebraic structure associated with it, a proof assistant needs to recognize the structure and find the relevant definitions, theorems, and notation. All this should sound like a lot of work, and it is. But Lean uses a small collection of fundamental mechanisms to carry out these tasks. The goal of this section is to explain these mechanisms and show you how to use them.

The first ingredient is almost too obvious to mention: formally speaking, algebraic structures are structures in the sense of Section 6.1. An algebraic structure is a specification of a bundle of data satisfying some axiomatic hypotheses, and we saw in Section 6.1 that this is exactly what the structure command is designed to accommodate. It’s a marriage made in heaven!

Given a data type α, we can define the group structure on α as follows.

structure Group₁ (α : Type*) where
  mul : α  α  α
  one : α
  inv : α  α
  mul_assoc :  x y z : α, mul (mul x y) z = mul x (mul y z)
  mul_one :  x : α, mul x one = x
  one_mul :  x : α, mul one x = x
  inv_mul_cancel :  x : α, mul (inv x) x = one

Notice that the type α is a parameter in the definition of group₁. So you should think of an object struc : Group₁ α as being a group structure on α. We saw in Section 2.2 that the counterpart mul_inv_cancel to inv_mul_cancel follows from the other group axioms, so there is no need to add it to the definition.

This definition of a group is similar to the definition of Group in Mathlib, and we have chosen the name Group₁ to distinguish our version. If you write #check Group and ctrl-click on the definition, you will see that the Mathlib version of Group is defined to extend another structure; we will explain how to do that later. If you type #print Group you will also see that the Mathlib version of Group has a number of extra fields. For reasons we will explain later, sometimes it is useful to add redundant information to a structure, so that there are additional fields for objects and functions that can be defined from the core data. Don’t worry about that for now. Rest assured that our simplified version Group₁ is morally the same as the definition of a group that Mathlib uses.

It is sometimes useful to bundle the type together with the structure, and Mathlib also contains a definition of a GroupCat structure that is equivalent to the following:

structure Group₁Cat where
  α : Type*
  str : Group₁ α

The Mathlib version is found in Mathlib.Algebra.Category.GroupCat.Basic, and you can #check it if you add this to the imports at the beginning of the examples file.

For reasons that will become clearer below, it is more often useful to keep the type α separate from the structure Group α. We refer to the two objects together as a partially bundled structure, since the representation combines most, but not all, of the components into one structure. It is common in Mathlib to use capital roman letters like G for a type when it is used as the carrier type for a group.

Let’s construct a group, which is to say, an element of the Group₁ type. For any pair of types α and β, Mathlib defines the type Equiv α β of equivalences between α and β. Mathlib also defines the suggestive notation α β for this type. An element f : α β is a bijection between α and β represented by four components: a function f.toFun from α to β, the inverse function f.invFun from β to α, and two properties that specify these functions are indeed inverse to one another.

variable (α β γ : Type*)
variable (f : α  β) (g : β  γ)

#check Equiv α β
#check (f.toFun : α  β)
#check (f.invFun : β  α)
#check (f.right_inv :  x : β, f (f.invFun x) = x)
#check (f.left_inv :  x : α, f.invFun (f x) = x)
#check (Equiv.refl α : α  α)
#check (f.symm : β  α)
#check (f.trans g : α  γ)

Notice the creative naming of the last three constructions. We think of the identity function Equiv.refl, the inverse operation Equiv.symm, and the composition operation Equiv.trans as explicit evidence that the property of being in bijective correspondence is an equivalence relation.

Notice also that f.trans g requires composing the forward functions in reverse order. Mathlib has declared a coercion from Equiv α β to the function type α β, so we can omit writing .toFun and have Lean insert it for us.

example (x : α) : (f.trans g).toFun x = g.toFun (f.toFun x) :=
  rfl

example (x : α) : (f.trans g) x = g (f x) :=
  rfl

example : (f.trans g : α  γ) = g  f :=
  rfl

Mathlib also defines the type perm α of equivalences between α and itself.

example (α : Type*) : Equiv.Perm α = (α  α) :=
  rfl

It should be clear that Equiv.Perm α forms a group under composition of equivalences. We orient things so that mul f g is equal to g.trans f, whose forward function is f g. In other words, multiplication is what we ordinarily think of as composition of the bijections. Here we define this group:

def permGroup {α : Type*} : Group₁ (Equiv.Perm α)
    where
  mul f g := Equiv.trans g f
  one := Equiv.refl α
  inv := Equiv.symm
  mul_assoc f g h := (Equiv.trans_assoc _ _ _).symm
  one_mul := Equiv.trans_refl
  mul_one := Equiv.refl_trans
  inv_mul_cancel := Equiv.self_trans_symm

In fact, Mathlib defines exactly this Group structure on Equiv.Perm α in the file GroupTheory.Perm.Basic. As always, you can hover over the theorems used in the definition of permGroup to see their statements, and you can jump to their definitions in the original file to learn more about how they are implemented.

In ordinary mathematics, we generally think of notation as independent of structure. For example, we can consider groups \((G_1, \cdot, 1, \cdot^{-1})\), \((G_2, \circ, e, i(\cdot))\), and \((G_3, +, 0, -)\). In the first case, we write the binary operation as \(\cdot\), the identity at \(1\), and the inverse function as \(x \mapsto x^{-1}\). In the second and third cases, we use the notational alternatives shown. When we formalize the notion of a group in Lean, however, the notation is more tightly linked to the structure. In Lean, the components of any Group are named mul, one, and inv, and in a moment we will see how multiplicative notation is set up to refer to them. If we want to use additive notation, we instead use an isomorphic structure AddGroup (the structure underlying additive groups). Its components are named add, zero, and neg, and the associated notation is what you would expect it to be.

Recall the type Point that we defined in Section 6.1, and the addition function that we defined there. These definitions are reproduced in the examples file that accompanies this section. As an exercise, define an AddGroup₁ structure that is similar to the Group₁ structure we defined above, except that it uses the additive naming scheme just described. Define negation and a zero on the Point data type, and define the AddGroup₁ structure on Point.

structure AddGroup₁ (α : Type*) where
  (add : α  α  α)
  -- fill in the rest
@[ext]
structure Point where
  x : 
  y : 
  z : 

namespace Point

def add (a b : Point) : Point :=
  a.x + b.x, a.y + b.y, a.z + b.z

def neg (a : Point) : Point := sorry

def zero : Point := sorry

def addGroupPoint : AddGroup₁ Point := sorry

end Point

We are making progress. Now we know how to define algebraic structures in Lean, and we know how to define instances of those structures. But we also want to associate notation with structures so that we can use it with each instance. Moreover, we want to arrange it so that we can define an operation on a structure and use it with any particular instance, and we want to arrange it so that we can prove a theorem about a structure and use it with any instance.

In fact, Mathlib is already set up to use generic group notation, definitions, and theorems for Equiv.Perm α.

variable {α : Type*} (f g : Equiv.Perm α) (n : )

#check f * g
#check mul_assoc f g g⁻¹

-- group power, defined for any group
#check g ^ n

example : f * g * g⁻¹ = f := by rw [mul_assoc, mul_inv_cancel, mul_one]

example : f * g * g⁻¹ = f :=
  mul_inv_cancel_right f g

example {α : Type*} (f g : Equiv.Perm α) : g.symm.trans (g.trans f) = f :=
  mul_inv_cancel_right f g

You can check that this is not the case for the additive group structure on Point that we asked you to define above. Our task now is to understand that magic that goes on under the hood in order to make the examples for Equiv.Perm α work the way they do.

The issue is that Lean needs to be able to find the relevant notation and the implicit group structure, using the information that is found in the expressions that we type. Similarly, when we write x + y with expressions x and y that have type , Lean needs to interpret the + symbol as the relevant addition function on the reals. It also has to recognize the type as an instance of a commutative ring, so that all the definitions and theorems for a commutative ring are available. For another example, continuity is defined in Lean relative to any two topological spaces. When we have f : and we write Continuous f, Lean has to find the relevant topologies on and .

The magic is achieved with a combination of three things.

  1. Logic. A definition that should be interpreted in any group takes, as arguments, the type of the group and the group structure as arguments. Similarly, a theorem about the elements of an arbitrary group begins with universal quantifiers over the type of the group and the group structure.

  2. Implicit arguments. The arguments for the type and the structure are generally left implicit, so that we do not have to write them or see them in the Lean information window. Lean fills the information in for us silently.

  3. Type class inference. Also known as class inference, this is a simple but powerful mechanism that enables us to register information for Lean to use later on. When Lean is called on to fill in implicit arguments to a definition, theorem, or piece of notation, it can make use of information that has been registered.

Whereas an annotation (grp : Group G) tells Lean that it should expect to be given that argument explicitly and the annotation {grp : Group G} tells Lean that it should try to figure it out from contextual cues in the expression, the annotation [grp : Group G] tells Lean that the corresponding argument should be synthesized using type class inference. Since the whole point to the use of such arguments is that we generally do not need to refer to them explicitly, Lean allows us to write [Group G] and leave the name anonymous. You have probably already noticed that Lean chooses names like _inst_1 automatically. When we use the anonymous square-bracket annotation with the variables command, then as long as the variables are still in scope, Lean automatically adds the argument [Group G] to any definition or theorem that mentions G.

How do we register the information that Lean needs to use to carry out the search? Returning to our group example, we need only make two changes. First, instead of using the structure command to define the group structure, we use the keyword class to indicate that it is a candidate for class inference. Second, instead of defining particular instances with def, we use the keyword instance to register the particular instance with Lean. As with the names of class variables, we are allowed to leave the name of an instance definition anonymous, since in general we intend Lean to find it and put it to use without troubling us with the details.

class Group₂ (α : Type*) where
  mul : α  α  α
  one : α
  inv : α  α
  mul_assoc :  x y z : α, mul (mul x y) z = mul x (mul y z)
  mul_one :  x : α, mul x one = x
  one_mul :  x : α, mul one x = x
  inv_mul_cancel :  x : α, mul (inv x) x = one

instance {α : Type*} : Group₂ (Equiv.Perm α) where
  mul f g := Equiv.trans g f
  one := Equiv.refl α
  inv := Equiv.symm
  mul_assoc f g h := (Equiv.trans_assoc _ _ _).symm
  one_mul := Equiv.trans_refl
  mul_one := Equiv.refl_trans
  inv_mul_cancel := Equiv.self_trans_symm

The following illustrates their use.

#check Group₂.mul

def mySquare {α : Type*} [Group₂ α] (x : α) :=
  Group₂.mul x x

#check mySquare

section
variable {β : Type*} (f g : Equiv.Perm β)

example : Group₂.mul f g = g.trans f :=
  rfl

example : mySquare f = f.trans f :=
  rfl

end

The #check command shows that Group₂.mul has an implicit argument [Group₂ α] that we expect to be found by class inference, where α is the type of the arguments to Group₂.mul. In other words, : Type*} is the implicit argument for the type of the group elements and [Group₂ α] is the implicit argument for the group structure on α. Similarly, when we define a generic squaring function my_square for Group₂, we use an implicit argument : Type*} for the type of the elements and an implicit argument [Group₂ α] for the Group₂ structure.

In the first example, when we write Group₂.mul f g, the type of f and g tells Lean that in the argument α to Group₂.mul has to be instantiated to Equiv.Perm β. That means that Lean has to find an element of Group₂ (Equiv.Perm β). The previous instance declaration tells Lean exactly how to do that. Problem solved!

This simple mechanism for registering information so that Lean can find it when it needs it is remarkably useful. Here is one way it comes up. In Lean’s foundation, a data type α may be empty. In a number of applications, however, it is useful to know that a type has at least one element. For example, the function List.headI, which returns the first element of a list, can return the default value when the list is empty. To make that work, the Lean library defines a class Inhabited α, which does nothing more than store a default value. We can show that the Point type is an instance:

instance : Inhabited Point where default := 0, 0, 0

#check (default : Point)

example : ([] : List Point).headI = default :=
  rfl

The class inference mechanism is also used for generic notation. The expression x + y is an abbreviation for Add.add x y where—you guessed it—Add α is a class that stores a binary function on α. Writing x + y tells Lean to find a registered instance of [Add.add α] and use the corresponding function. Below, we register the addition function for Point.

instance : Add Point where add := Point.add

section
variable (x y : Point)

#check x + y

example : x + y = Point.add x y :=
  rfl

end

In this way, we can assign the notation + to binary operations on other types as well.

But we can do even better. We have seen that * can be used in any group, + can be used in any additive group, and both can be used in any ring. When we define a new instance of a ring in Lean, we don’t have to define + and * for that instance, because Lean knows that these are defined for every ring. We can use this method to specify notation for our Group₂ class:

instance hasMulGroup₂ {α : Type*} [Group₂ α] : Mul α :=
  Group₂.mul

instance hasOneGroup₂ {α : Type*} [Group₂ α] : One α :=
  Group₂.one

instance hasInvGroup₂ {α : Type*} [Group₂ α] : Inv α :=
  Group₂.inv

section
variable {α : Type*} (f g : Equiv.Perm α)

#check f * 1 * g⁻¹

def foo : f * 1 * g⁻¹ = g.symm.trans ((Equiv.refl α).trans f) :=
  rfl

end

In this case, we have to supply names for the instances, because Lean has a hard time coming up with good defaults. What makes this approach work is that Lean carries out a recursive search. According to the instances we have declared, Lean can find an instance of Mul (Equiv.Perm α) by finding an instance of Group₂ (Equiv.Perm α), and it can find an instance of Group₂ (Equiv.Perm α) because we have provided one. Lean is capable of finding these two facts and chaining them together.

The example we have just given is dangerous, because Lean’s library also has an instance of Group (Equiv.Perm α), and multiplication is defined on any group. So it is ambiguous as to which instance is found. In fact, Lean favors more recent declarations unless you explicitly specify a different priority. Also, there is another way to tell Lean that one structure is an instance of another, using the extends keyword. This is how Mathlib specifies that, for example, every commutative ring is a ring. You can find more information in Section 7 and in a section on class inference in Theorem Proving in Lean.

In general, it is a bad idea to specify a value of * for an instance of an algebraic structure that already has the notation defined. Redefining the notion of Group in Lean is an artificial example. In this case, however, both interpretations of the group notation unfold to Equiv.trans, Equiv.refl, and Equiv.symm, in the same way.

As a similarly artificial exercise, define a class AddGroup₂ in analogy to Group₂. Define the usual notation for addition, negation, and zero on any AddGroup₂ using the classes Add, Neg, and Zero. Then show Point is an instance of AddGroup₂. Try it out and make sure that the additive group notation works for elements of Point.

class AddGroup₂ (α : Type*) where
  add : α  α  α
  -- fill in the rest

It is not a big problem that we have already declared instances Add, Neg, and Zero for Point above. Once again, the two ways of synthesizing the notation should come up with the same answer.

Class inference is subtle, and you have to be careful when using it, because it configures automation that invisibly governs the interpretation of the expressions we type. When used wisely, however, class inference is a powerful tool. It is what makes algebraic reasoning possible in Lean.

6.3. Building the Gaussian Integers

We will now illustrate the use of the algebraic hierarchy in Lean by building an important mathematical object, the Gaussian integers, and showing that it is a Euclidean domain. In other words, according to the terminology we have been using, we will define the Gaussian integers and show that they are an instance of the Euclidean domain structure.

In ordinary mathematical terms, the set of Gaussian integers \(\Bbb{Z}[i]\) is the set of complex numbers \(\{ a + b i \mid a, b \in \Bbb{Z}\}\). But rather than define them as a subset of the complex numbers, our goal here is to define them as a data type in their own right. We do this by representing a Gaussian integer as a pair of integers, which we think of as the real and imaginary parts.

@[ext]
structure GaussInt where
  re : 
  im : 

We first show that the Gaussian integers have the structure of a ring, with 0 defined to be ⟨0, 0⟩, 1 defined to be ⟨1, 0⟩, and addition defined pointwise. To work out the definition of multiplication, remember that we want the element \(i\), represented by ⟨0, 1⟩, to be a square root of \(-1\). Thus we want

\[\begin{split}(a + bi) (c + di) & = ac + bci + adi + bd i^2 \\ & = (ac - bd) + (bc + ad)i.\end{split}\]

This explains the definition of Mul below.

instance : Zero GaussInt :=
  ⟨⟨0, 0⟩⟩

instance : One GaussInt :=
  ⟨⟨1, 0⟩⟩

instance : Add GaussInt :=
  fun x y  x.re + y.re, x.im + y.im⟩⟩

instance : Neg GaussInt :=
  fun x  -x.re, -x.im⟩⟩

instance : Mul GaussInt :=
  fun x y  x.re * y.re - x.im * y.im, x.re * y.im + x.im * y.re⟩⟩

As noted in Section 6.1, it is a good idea to put all the definitions related to a data type in a namespace with the same name. Thus in the Lean files associated with this chapter, these definitions are made in the GaussInt namespace.

Notice that here we are defining the interpretations of the notation 0, 1, +, -, and * directly, rather than naming them GaussInt.zero and the like and assigning the notation to those. It is often useful to have an explicit name for the definitions, for example, to use with simp and rewrite.

theorem zero_def : (0 : GaussInt) = 0, 0 :=
  rfl

theorem one_def : (1 : GaussInt) = 1, 0 :=
  rfl

theorem add_def (x y : GaussInt) : x + y = x.re + y.re, x.im + y.im :=
  rfl

theorem neg_def (x : GaussInt) : -x = -x.re, -x.im :=
  rfl

theorem mul_def (x y : GaussInt) :
    x * y = x.re * y.re - x.im * y.im, x.re * y.im + x.im * y.re :=
  rfl

It is also useful to name the rules that compute the real and imaginary parts, and to declare them to the simplifier.

@[simp]
theorem zero_re : (0 : GaussInt).re = 0 :=
  rfl

@[simp]
theorem zero_im : (0 : GaussInt).im = 0 :=
  rfl

@[simp]
theorem one_re : (1 : GaussInt).re = 1 :=
  rfl

@[simp]
theorem one_im : (1 : GaussInt).im = 0 :=
  rfl

@[simp]
theorem add_re (x y : GaussInt) : (x + y).re = x.re + y.re :=
  rfl

@[simp]
theorem add_im (x y : GaussInt) : (x + y).im = x.im + y.im :=
  rfl

@[simp]
theorem neg_re (x : GaussInt) : (-x).re = -x.re :=
  rfl

@[simp]
theorem neg_im (x : GaussInt) : (-x).im = -x.im :=
  rfl

@[simp]
theorem mul_re (x y : GaussInt) : (x * y).re = x.re * y.re - x.im * y.im :=
  rfl

@[simp]
theorem mul_im (x y : GaussInt) : (x * y).im = x.re * y.im + x.im * y.re :=
  rfl

It is now surprisingly easy to show that the Gaussian integers are an instance of a commutative ring. We are putting the structure concept to good use. Each particular Gaussian integer is an instance of the GaussInt structure, whereas the type GaussInt itself, together with the relevant operations, is an instance of the CommRing structure. The CommRing structure, in turn, extends the notational structures Zero, One, Add, Neg, and Mul.

If you type instance : CommRing GaussInt := _, click on the light bulb that appears in VS Code, and then ask Lean to fill in a skeleton for the structure definition, you will see a scary number of entries. Jumping to the definition of the structure, however, shows that many of the fields have default definitions that Lean will fill in for you automatically. The essential ones appear in the definition below. A special case are nsmul and zsmul which should be ignored for now and will be explained in the next chapter. In each case, the relevant identity is proved by unfolding definitions, using the ext tactic to reduce the identities to their real and imaginary components, simplifying, and, if necessary, carrying out the relevant ring calculation in the integers. Note that we could easily avoid repeating all this code, but this is not the topic of the current discussion.

instance instCommRing : CommRing GaussInt where
  zero := 0
  one := 1
  add := (· + ·)
  neg x := -x
  mul := (· * ·)
  nsmul := nsmulRec
  zsmul := zsmulRec
  add_assoc := by
    intros
    ext <;> simp <;> ring
  zero_add := by
    intro
    ext <;> simp
  add_zero := by
    intro
    ext <;> simp
  neg_add_cancel := by
    intro
    ext <;> simp
  add_comm := by
    intros
    ext <;> simp <;> ring
  mul_assoc := by
    intros
    ext <;> simp <;> ring
  one_mul := by
    intro
    ext <;> simp
  mul_one := by
    intro
    ext <;> simp
  left_distrib := by
    intros
    ext <;> simp <;> ring
  right_distrib := by
    intros
    ext <;> simp <;> ring
  mul_comm := by
    intros
    ext <;> simp <;> ring
  zero_mul := by
    intros
    ext <;> simp
  mul_zero := by
    intros
    ext <;> simp

Lean’s library defines the class of nontrivial types to be types with at least two distinct elements. In the context of a ring, this is equivalent to saying that the zero is not equal to the one. Since some common theorems depend on that fact, we may as well establish it now.

instance : Nontrivial GaussInt := by
  use 0, 1
  rw [Ne, GaussInt.ext_iff]
  simp

We will now show that the Gaussian integers have an important additional property. A Euclidean domain is a ring \(R\) equipped with a norm function \(N : R \to \mathbb{N}\) with the following two properties:

  • For every \(a\) and \(b \ne 0\) in \(R\), there are \(q\) and \(r\) in \(R\) such that \(a = bq + r\) and either \(r = 0\) or N(r) < N(b).

  • For every \(a\) and \(b \ne 0\), \(N(a) \le N(ab)\).

The ring of integers \(\Bbb{Z}\) with \(N(a) = |a|\) is an archetypal example of a Euclidean domain. In that case, we can take \(q\) to be the result of integer division of \(a\) by \(b\) and \(r\) to be the remainder. These functions are defined in Lean so that the satisfy the following:

example (a b : ) : a = b * (a / b) + a % b :=
  Eq.symm (Int.ediv_add_emod a b)

example (a b : ) : b  0  0  a % b :=
  Int.emod_nonneg a

example (a b : ) : b  0  a % b < |b| :=
  Int.emod_lt a

In an arbitrary ring, an element \(a\) is said to be a unit if it divides \(1\). A nonzero element \(a\) is said to be irreducible if it cannot be written in the form \(a = bc\) where neither \(b\) nor \(c\) is a unit. In the integers, every irreducible element \(a\) is prime, which is to say, whenever \(a\) divides a product \(bc\), it divides either \(b\) or \(c\). But in other rings this property can fail. In the ring \(\Bbb{Z}[\sqrt{-5}]\), we have

\[6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}),\]

and the elements \(2\), \(3\), \(1 + \sqrt{-5}\), and \(1 - \sqrt{-5}\) are all irreducible, but they are not prime. For example, \(2\) divides the product \((1 + \sqrt{-5})(1 - \sqrt{-5})\), but it does not divide either factor. In particular, we no longer have unique factorization: the number \(6\) can be factored into irreducible elements in more than one way.

In contrast, every Euclidean domain is a unique factorization domain, which implies that every irreducible element is prime. The axioms for a Euclidean domain imply that one can write any nonzero element as a finite product of irreducible elements. They also imply that one can use the Euclidean algorithm to find a greatest common divisor of any two nonzero elements a and b, i.e.~an element that is divisible by any other common divisor. This, in turn, implies that factorization into irreducible elements is unique up to multiplication by units.

We now show that the Gaussian integers are a Euclidean domain with the norm defined by \(N(a + bi) = (a + bi)(a - bi) = a^2 + b^2\). The Gaussian integer \(a - bi\) is called the conjugate of \(a + bi\). It is not hard to check that for any complex numbers \(x\) and \(y\), we have \(N(xy) = N(x)N(y)\).

To see that this definition of the norm makes the Gaussian integers a Euclidean domain, only the first property is challenging. Suppose we want to write \(a + bi = (c + di) q + r\) for suitable \(q\) and \(r\). Treating \(a + bi\) and \(c + di\) are complex numbers, carry out the division

\[\frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{(c + di)(c-di)} = \frac{ac + bd}{c^2 + d^2} + \frac{bc -ad}{c^2+d^2} i.\]

The real and imaginary parts might not be integers, but we can round them to the nearest integers \(u\) and \(v\). We can then express the right-hand side as \((u + vi) + (u' + v'i)\), where \(u' + v'i\) is the part left over. Note that we have \(|u'| \le 1/2\) and \(|v'| \le 1/2\), and hence

\[N(u' + v' i) = (u')^2 + (v')^2 \le 1/4 + 1/4 \le 1/2.\]

Multiplying through by \(c + di\), we have

\[a + bi = (c + di) (u + vi) + (c + di) (u' + v'i).\]

Setting \(q = u + vi\) and \(r = (c + di) (u' + v'i)\), we have \(a + bi = (c + di) q + r\), and we only need to bound \(N(r)\):

\[N(r) = N(c + di)N(u' + v'i) \le N(c + di) \cdot 1/2 < N(c + di).\]

The argument we just carried out requires viewing the Gaussian integers as a subset of the complex numbers. One option for formalizing it in Lean is therefore to embed the Gaussian integers in the complex numbers, embed the integers in the Gaussian integers, define the rounding function from the real numbers to the integers, and take great care to pass back and forth between these number systems appropriately. In fact, this is exactly the approach that is followed in Mathlib, where the Gaussian integers themselves are constructed as a special case of a ring of quadratic integers. See the file GaussianInt.lean.

Here we will instead carry out an argument that stays in the integers. This illustrates an choice one commonly faces when formalizing mathematics. Given an argument that requires concepts or machinery that is not already in the library, one has two choices: either formalizes the concepts or machinery needed, or adapt the argument to make use of concepts and machinery you already have. The first choice is generally a good investment of time when the results can be used in other contexts. Pragmatically speaking, however, sometimes seeking a more elementary proof is more efficient.

The usual quotient-remainder theorem for the integers says that for every \(a\) and nonzero \(b\), there are \(q\) and \(r\) such that \(a = b q + r\) and \(0 \le r < b\). Here we will make use of the following variation, which says that there are \(q'\) and \(r'\) such that \(a = b q' + r'\) and \(|r'| \le b/2\). You can check that if the value of \(r\) in the first statement satisfies \(r \le b/2\), we can take \(q' = q\) and \(r' = r\), and otherwise we can take \(q' = q + 1\) and \(r' = r - b\). We are grateful to Heather Macbeth for suggesting the following more elegant approach, which avoids definition by cases. We simply add b / 2 to a before dividing and then subtract it from the remainder.

def div' (a b : ) :=
  (a + b / 2) / b

def mod' (a b : ) :=
  (a + b / 2) % b - b / 2

theorem div'_add_mod' (a b : ) : b * div' a b + mod' a b = a := by
  rw [div', mod']
  linarith [Int.ediv_add_emod (a + b / 2) b]

theorem abs_mod'_le (a b : ) (h : 0 < b) : |mod' a b|  b / 2 := by
  rw [mod', abs_le]
  constructor
  · linarith [Int.emod_nonneg (a + b / 2) h.ne']
  have := Int.emod_lt_of_pos (a + b / 2) h
  have := Int.ediv_add_emod b 2
  have := Int.emod_lt_of_pos b zero_lt_two
  revert this; intro this -- FIXME, this should not be needed
  linarith

Note the use of our old friend, linarith. We will also need to express mod' in terms of div'.

theorem mod'_eq (a b : ) : mod' a b = a - b * div' a b := by linarith [div'_add_mod' a b]

We will use the fact that \(x^2 + y^2\) is equal to zero if and only if \(x\) and \(y\) are both zero. As an exercise, we ask you to prove that this holds in any ordered ring.

theorem sq_add_sq_eq_zero {α : Type*} [LinearOrderedRing α] (x y : α) :
    x ^ 2 + y ^ 2 = 0  x = 0  y = 0 := by
  sorry

We will put all the remaining definitions and theorems in this section in the GaussInt namespace. First, we define the norm function and ask you to establish some of its properties. The proofs are all short.

def norm (x : GaussInt) :=
  x.re ^ 2 + x.im ^ 2

@[simp]
theorem norm_nonneg (x : GaussInt) : 0  norm x := by
  sorry
theorem norm_eq_zero (x : GaussInt) : norm x = 0  x = 0 := by
  sorry
theorem norm_pos (x : GaussInt) : 0 < norm x  x  0 := by
  sorry
theorem norm_mul (x y : GaussInt) : norm (x * y) = norm x * norm y := by
  sorry

Next we define the conjugate function:

def conj (x : GaussInt) : GaussInt :=
  x.re, -x.im

@[simp]
theorem conj_re (x : GaussInt) : (conj x).re = x.re :=
  rfl

@[simp]
theorem conj_im (x : GaussInt) : (conj x).im = -x.im :=
  rfl

theorem norm_conj (x : GaussInt) : norm (conj x) = norm x := by simp [norm]

Finally, we define division for the Gaussian integers with the notation x / y, that rounds the complex quotient to the nearest Gaussian integer. We use our bespoke Int.div' for that purpose. As we calculated above, if x is \(a + bi\) and y is \(c + di\), then the real and imaginary parts of x / y are the nearest integers to

\[\frac{ac + bd}{c^2 + d^2} \quad \text{and} \quad \frac{bc -ad}{c^2+d^2},\]

respectively. Here the numerators are the real and imaginary parts of \((a + bi) (c - di)\), and the denominators are both equal to the norm of \(c + di\).

instance : Div GaussInt :=
  fun x y  Int.div' (x * conj y).re (norm y), Int.div' (x * conj y).im (norm y)⟩⟩

Having defined x / y, We define x % y to be the remainder, x - (x / y) * y. As above, we record the definitions in the theorems div_def and mod_def so that we can use them with simp and rewrite.

instance : Mod GaussInt :=
  fun x y  x - y * (x / y)⟩

theorem div_def (x y : GaussInt) :
    x / y = Int.div' (x * conj y).re (norm y), Int.div' (x * conj y).im (norm y)⟩ :=
  rfl

theorem mod_def (x y : GaussInt) : x % y = x - y * (x / y) :=
  rfl

These definitions immediately yield x = y * (x / y) + x % y for every x and y, so all we need to do is show that the norm of x % y is less than the norm of y when y is not zero.

We just defined the real and imaginary parts of x / y to be div' (x * conj y).re (norm y) and div' (x * conj y).im (norm y), respectively. Calculating, we have

(x % y) * conj y = (x - x / y * y) * conj y = x * conj y - x / y * (y * conj y)

The real and imaginary parts of the right-hand side are exactly mod' (x * conj y).re (norm y) and mod' (x * conj y).im (norm y). By the properties of div' and mod', these are guaranteed to be less than or equal to norm y / 2. So we have

norm ((x % y) * conj y) (norm y / 2)^2 + (norm y / 2)^2 (norm y / 2) * norm y.

On the other hand, we have

norm ((x % y) * conj y) = norm (x % y) * norm (conj y) = norm (x % y) * norm y.

Dividing through by norm y we have norm (x % y) (norm y) / 2 < norm y, as required.

This messy calculation is carried out in the next proof. We encourage you to step through the details and see if you can find a nicer argument.

theorem norm_mod_lt (x : GaussInt) {y : GaussInt} (hy : y  0) :
    (x % y).norm < y.norm := by
  have norm_y_pos : 0 < norm y := by rwa [norm_pos]
  have H1 : x % y * conj y = Int.mod' (x * conj y).re (norm y), Int.mod' (x * conj y).im (norm y)⟩
  · ext <;> simp [Int.mod'_eq, mod_def, div_def, norm] <;> ring
  have H2 : norm (x % y) * norm y  norm y / 2 * norm y
  · calc
      norm (x % y) * norm y = norm (x % y * conj y) := by simp only [norm_mul, norm_conj]
      _ = |Int.mod' (x.re * y.re + x.im * y.im) (norm y)| ^ 2
          + |Int.mod' (-(x.re * y.im) + x.im * y.re) (norm y)| ^ 2 := by simp [H1, norm, sq_abs]
      _  (y.norm / 2) ^ 2 + (y.norm / 2) ^ 2 := by gcongr <;> apply Int.abs_mod'_le _ _ norm_y_pos
      _ = norm y / 2 * (norm y / 2 * 2) := by ring
      _  norm y / 2 * norm y := by gcongr; apply Int.ediv_mul_le; norm_num
  calc norm (x % y)  norm y / 2 := le_of_mul_le_mul_right H2 norm_y_pos
    _ < norm y := by
        apply Int.ediv_lt_of_lt_mul
        · norm_num
        · linarith

We are in the home stretch. Our norm function maps Gaussian integers to nonnegative integers. We need a function that maps Gaussian integers to natural numbers, and we obtain that by composing norm with the function Int.natAbs, which maps integers to the natural numbers. The first of the next two lemmas establishes that mapping the norm to the natural numbers and back to the integers does not change the value. The second one re-expresses the fact that the norm is decreasing.

theorem coe_natAbs_norm (x : GaussInt) : (x.norm.natAbs : ) = x.norm :=
  Int.natAbs_of_nonneg (norm_nonneg _)

theorem natAbs_norm_mod_lt (x y : GaussInt) (hy : y  0) :
    (x % y).norm.natAbs < y.norm.natAbs := by
  apply Int.ofNat_lt.1
  simp only [Int.natCast_natAbs, abs_of_nonneg, norm_nonneg]
  apply norm_mod_lt x hy

We also need to establish the second key property of the norm function on a Euclidean domain.

theorem not_norm_mul_left_lt_norm (x : GaussInt) {y : GaussInt} (hy : y  0) :
    ¬(norm (x * y)).natAbs < (norm x).natAbs := by
  apply not_lt_of_ge
  rw [norm_mul, Int.natAbs_mul]
  apply le_mul_of_one_le_right (Nat.zero_le _)
  apply Int.ofNat_le.1
  rw [coe_natAbs_norm]
  exact Int.add_one_le_of_lt ((norm_pos _).mpr hy)

We can now put it together to show that the Gaussian integers are an instance of a Euclidean domain. We use the quotient and remainder function we have defined. The Mathlib definition of a Euclidean domain is more general than the one above in that it allows us to show that remainder decreases with respect to any well-founded measure. Comparing the values of a norm function that returns natural numbers is just one instance of such a measure, and in that case, the required properties are the theorems natAbs_norm_mod_lt and not_norm_mul_left_lt_norm.

instance : EuclideanDomain GaussInt :=
  { GaussInt.instCommRing with
    quotient := (· / ·)
    remainder := (· % ·)
    quotient_mul_add_remainder_eq :=
      fun x y  by simp only; rw [mod_def, add_comm] ; ring
    quotient_zero := fun x  by
      simp [div_def, norm, Int.div']
      rfl
    r := (measure (Int.natAbs  norm)).1
    r_wellFounded := (measure (Int.natAbs  norm)).2
    remainder_lt := natAbs_norm_mod_lt
    mul_left_not_lt := not_norm_mul_left_lt_norm }

An immediate payoff is that we now know that, in the Gaussian integers, the notions of being prime and being irreducible coincide.

example (x : GaussInt) : Irreducible x  Prime x :=
  irreducible_iff_prime