annotate doc/tlc.ur @ 1506:44fda91f5fa0

Tutorial: folders
author Adam Chlipala <adam@chlipala.net>
date Sun, 17 Jul 2011 11:51:05 -0400
parents 8c851e5508a7
children ca8c8b8cc477
rev   line source
adam@1504 1 (* Chapter 2: Type-Level Computation *)
adam@1504 2
adam@1505 3 (* begin hide *)
adam@1505 4 val show_string = mkShow (fn s => "\"" ^ s ^ "\"")
adam@1505 5 (* end *)
adam@1505 6
adam@1504 7 (* This tutorial by <a href="http://adam.chlipala.net/">Adam Chlipala</a> is licensed under a <a href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License</a>. *)
adam@1504 8
adam@1504 9 (* The last chapter reviewed some Ur features imported from ML and Haskell. This chapter explores uncharted territory, introducing the features that make Ur unique. *)
adam@1504 10
adam@1504 11 (* * Names and Records *)
adam@1504 12
adam@1504 13 (* Last chapter, we met Ur's basic record features, including record construction and field projection. *)
adam@1504 14
adam@1504 15 val r = { A = 0, B = 1.2, C = "hi"}
adam@1504 16
adam@1504 17 (* begin eval *)
adam@1504 18 r.B
adam@1504 19 (* end *)
adam@1504 20
adam@1504 21 (* Our first taste of Ur's novel expressive power is with the following function, which implements record field projection in a completely generic way. *)
adam@1504 22
adam@1504 23 fun project [nm :: Name] [t ::: Type] [ts ::: {Type}] [[nm] ~ ts] (r : $([nm = t] ++ ts)) : t = r.nm
adam@1504 24
adam@1504 25 (* begin eval *)
adam@1504 26 project [#B] r
adam@1504 27 (* end *)
adam@1504 28
adam@1504 29 (* This function introduces a slew of essential features. First, we see type parameters with explicit kind annotations. Formal parameter syntax like <tt>[a :: K]</tt> declares an <b>explicit</b> parameter <tt>a</tt> of kind <tt>K</tt>. Explicit parameters must be passed explicitly at call sites. In contrast, implicit parameters, declared like <tt>[a ::: K]</tt>, are inferred in the usual way.<br>
adam@1504 30 <br>
adam@1504 31 Two new kinds appear in this example. We met the basic kind <tt>Type</tt> in a previous example. Here we meet <tt>Name</tt>, the kind of record field names; and <tt>{Type}</tt> the type of finite maps from field names to types, where we'll generally refer to this notion of "finite map" by the name <b>record</b>, as it will be clear from context whether we're discussing type-level or value-level records. That is, in this case, we are referring to names and records <b>at the level of types</b> that <b>exist only at compile time</b>! By the way, the kind <tt>{Type}</tt> is one example of the general <tt>{K}</tt> kind form, which refers to records with fields of kind <tt>K</tt>.<br>
adam@1504 32 <br>
adam@1504 33 The English description of <tt>project</tt> is that it projects a field with name <tt>nm</tt> and type <tt>t</tt> out of a record <tt>r</tt> whose other fields are described by type-level record <tt>ts</tt>. We make all this formal by assigning <tt>r</tt> a type that first builds the singleton record <tt>[nm = t]</tt> that maps <tt>nm</tt> to <tt>t</tt>, and then concatenating this record with the remaining field information in <tt>ts</tt>. The <tt>$</tt> operator translates a type-level record (of kind <tt>{Type}</tt>) into a record type (of kind <tt>Type</tt>).<br>
adam@1504 34 <br>
adam@1504 35 The type annotation on <tt>r</tt> uses the record concatenation operator <tt>++</tt>. Ur enforces that any concatenation happens between records that share no field names. Otherwise, we'd need to resolve field name ambiguity in some predictable way, which would force us to treat <tt>++</tt> as non-commutative, if we are to maintain the nice modularity properties of polymorphism. However, treating <tt>++</tt> as commutative, and treating records as equal up to field permutation in general, are very convenient for type inference and general programmer experience. Thus, we enforce disjointness to keep things simple.<br>
adam@1504 36 <br>
adam@1504 37 For a polymorphic function like <tt>project</tt>, the compiler doesn't know which fields a type-level record variable like <tt>ts</tt> contains. To enable self-contained type-checking, we need to declare some constraints about field disjointness. That's exactly the meaning of syntax like <tt>[r1 ~ r2]</tt>, which asserts disjointness of two type-level records. The disjointness clause for <tt>project</tt> asserts that the name <tt>nm</tt> is not used by <tt>ts</tt>. The syntax <tt>[nm]</tt> is shorthand for <tt>[nm = ()]</tt>, which defines a singleton record of kind <tt>{Unit}</tt>, where <tt>Unit</tt> is the degenerate kind inhabited only by the constructor <tt>()</tt>.<br>
adam@1504 38 <br>
adam@1504 39 The last piece of this puzzle is the easiest. In the example call to <tt>project</tt>, we see that the only parameters passed are the one explicit constructor parameter <tt>nm</tt> and the value-level parameter <tt>r</tt>. The rest are inferred, and the disjointness proof obligation is discharged automatically. The syntax <tt>#A</tt> denotes the constructor standing for first-class field name <tt>A</tt>, and we pass all constructor parameters to value-level functions within square brackets (which bear no formal relation to the syntax for type-level record literals <tt>[A = c, ..., A = c]</tt>). *)
adam@1505 40
adam@1505 41
adam@1505 42 (* * Basic Type-Level Programming *)
adam@1505 43
adam@1505 44 (* To help us express more interesting operations over records, we will need to do some type-level programming. Ur makes that fairly congenial, since Ur's constructor level includes an embedded copy of the simply-typed lambda calculus. Here are a few examples. *)
adam@1505 45
adam@1505 46 con id = fn t :: Type => t
adam@1505 47
adam@1505 48 val x : id int = 0
adam@1505 49 val x : id float = 1.2
adam@1505 50
adam@1505 51 con pair = fn t :: Type => t * t
adam@1505 52
adam@1505 53 val x : pair int = (0, 1)
adam@1505 54 val x : pair float = (1.2, 2.3)
adam@1505 55
adam@1505 56 con compose = fn (f :: Type -> Type) (g :: Type -> Type) (t :: Type) => f (g t)
adam@1505 57
adam@1505 58 val x : compose pair pair int = ((0, 1), (2, 3))
adam@1505 59
adam@1505 60 con fst = fn t :: (Type * Type) => t.1
adam@1505 61 con snd = fn t :: (Type * Type) => t.2
adam@1505 62
adam@1505 63 con p = (int, float)
adam@1505 64 val x : fst p = 0
adam@1505 65 val x : snd p = 1.2
adam@1505 66
adam@1505 67 con mp = fn (f :: Type -> Type) (t1 :: Type, t2 :: Type) => (f t1, f t2)
adam@1505 68
adam@1505 69 val x : fst (mp pair p) = (1, 2)
adam@1505 70
adam@1505 71 (* Actually, Ur's constructor level goes further than merely including a copy of the simply-typed lambda calculus with tuples. We also effectively import classic <b>let-polymorphism</b>, via <b>kind polymorphism</b>, which we can use to make some of the definitions above more generic. *)
adam@1505 72
adam@1505 73 con fst = K1 ==> K2 ==> fn t :: (K1 * K2) => t.1
adam@1505 74 con snd = K1 ==> K2 ==> fn t :: (K1 * K2) => t.2
adam@1505 75
adam@1505 76 con twoFuncs :: ((Type -> Type) * (Type -> Type)) = (id, compose pair pair)
adam@1505 77
adam@1505 78 val x : fst twoFuncs int = 0
adam@1505 79 val x : snd twoFuncs int = ((1, 2), (3, 4))
adam@1505 80
adam@1505 81
adam@1505 82 (* * Type-Level Map *)
adam@1505 83
adam@1505 84 (* The examples from the same section may seem cute but not especially useful. In this section, we meet <tt>map</tt>, the real workhorse of Ur's type-level computation. We'll use it to type some useful operations over value-level records. A few more pieces will be necessary before getting there, so we'll start just by showing how interesting type-level operations on records may be built from <tt>map</tt>. *)
adam@1505 85
adam@1505 86 con r = [A = int, B = float, C = string]
adam@1505 87
adam@1505 88 con optionify = map option
adam@1505 89 val x : $(optionify r) = {A = Some 1, B = None, C = Some "hi"}
adam@1505 90
adam@1505 91 con pairify = map pair
adam@1505 92 val x : $(pairify r) = {A = (1, 2), B = (3.0, 4.0), C = ("5", "6")}
adam@1505 93
adam@1505 94 con stringify = map (fn _ => string)
adam@1505 95 val x : $(stringify r) = {A = "1", B = "2", C = "3"}
adam@1505 96
adam@1505 97 (* We'll also give our first hint at the cleverness within Ur's type inference engine. The following definition type-checks, despite the fact that doing so requires applying several algebraic identities about <tt>map</tt> and <tt>++</tt>. This is the first point where we see a clear advantage of Ur over the type-level computation facilities that have become popular in GHC Haskell. *)
adam@1505 98
adam@1505 99 fun concat [f :: Type -> Type] [r1 :: {Type}] [r2 :: {Type}] [r1 ~ r2]
adam@1505 100 (r1 : $(map f r1)) (r2 : $(map f r2)) : $(map f (r1 ++ r2)) = r1 ++ r2
adam@1505 101
adam@1505 102
adam@1505 103 (* * First-Class Polymorphism *)
adam@1505 104
adam@1505 105 (* The idea of <b>first-class polymorphism</b> or <b>impredicative polymorphism</b> has also become popular in GHC Haskell. This feature, which has a long history in type theory, is also central to Ur's metaprogramming facilities. First-class polymorphism goes beyond Hindley-Milner's let-polymorphism to allow arguments to functions to themselves be polymorphic. Among other things, this enables the classic example of Church encodings, as for the natural numbers in this example. *)
adam@1505 106
adam@1505 107 type nat = t :: Type -> t -> (t -> t) -> t
adam@1505 108 val zero : nat = fn [t :: Type] (z : t) (s : t -> t) => z
adam@1505 109 fun succ (n : nat) : nat = fn [t :: Type] (z : t) (s : t -> t) => s (n [t] z s)
adam@1505 110
adam@1505 111 val one = succ zero
adam@1505 112 val two = succ one
adam@1505 113 val three = succ two
adam@1505 114
adam@1505 115 (* begin eval *)
adam@1505 116 three [int] 0 (plus 1)
adam@1505 117 (* end *)
adam@1505 118
adam@1505 119 (* begin eval *)
adam@1505 120 three [string] "" (strcat "!")
adam@1505 121 (* end *)
adam@1506 122
adam@1506 123
adam@1506 124 (* * Folders *)
adam@1506 125
adam@1506 126 (* We're almost ready to implement some more polymorphic operations on records. The key missing piece is <b>folders</b>; more specifically, the type family <tt>folder</tt> that allows iteration over the fields of type-level records. The Ur standard library exposes <tt>folder</tt> abstractly, along with the following key operation over it. Don't mind the clutter at the end of this definition as we rebind the function <tt>fold</tt> from the default-open module <tt>Top</tt>, as we must include an explicit kind-polymorphic binder to appease the associated let-polymorphism. *)
adam@1506 127
adam@1506 128 val fold : K --> tf :: ({K} -> Type)
adam@1506 129 -> (nm :: Name -> v :: K -> r :: {K} -> [[nm] ~ r] =>
adam@1506 130 tf r -> tf ([nm = v] ++ r))
adam@1506 131 -> tf []
adam@1506 132 -> r ::: {K} -> folder r -> tf r
adam@1506 133 = K ==> Top.fold
adam@1506 134
adam@1506 135 (* The type is a bit of a mouthful. We can describe the function arguments in order. First, <tt>K</tt> is the kind of data associated with fields in the record we will be iterating over. Next, <tt>tf</tt> describes the type of an accumulator, much as for standard "fold" operations over lists. The difference here is that the accumulator description is not a mere type, but rather a <b>type-level function</b> that returns a type given a properly kinded record. When we begin iterating over a record <tt>r</tt>, the accumulator has type <tt>tf []</tt> (where <tt>[]</tt> is the empty record), and when we finish iterating, the accumulator has type <tt>tf r</tt>. As we step through the fields of the record, we add each one to the argument we keep passing to <tt>tf</tt> to determine accumulator types.<br>
adam@1506 136 <br>
adam@1506 137 The next arguments of <tt>fold</tt> are much like for normal list fold functions: a step function and an initial value. The latter has type <tt>tf []</tt>, just as we expect from the explanation in the last paragraph. The final arguments are <tt>r</tt>, the record we fold over; and a <tt>folder</tt> for it. The function return type follows last paragraph's explanation of accmulator typing.<br>
adam@1506 138 <br>
adam@1506 139 We've left a big unexplained piece: the type of the step function. In order, its arguments are <tt>nm</tt>, the current field being processed; <tt>v</tt>, the data associated with that field; <tt>r</tt>, the portion of the input record that we had already stepped through before this point; a proof that the name <tt>nm</tt> didn't already occur in <tt>r</tt>; and the accumulator, typed to show that the set of fields we've already visited is exactly <tt>r</tt>. The return type of the step function is another accumulator type, extended to show that now we've visited <tt>nm</tt>, too.<br>
adam@1506 140 <br>
adam@1506 141 Here's a simple example usage, where we write a function to count the number of fields in a type-level record of types. *)
adam@1506 142
adam@1506 143 fun countFields [ts :: {Type}] (fl : folder ts) : int =
adam@1506 144 @fold [fn _ => int] (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] n => n + 1) 0 fl
adam@1506 145
adam@1506 146 (* We preface <tt>fold</tt> with <tt>@</tt>, to disable inference of folders, since we have one we'd like to pass explicitly. The accumulator type family we use is a simple one that ignores its argument and always returns <tt>int</tt>; at every stage of our traversal of input record <tt>ts</tt>, we keep an integer as our sole state, and the type of this state doesn't depend on which record fields we've visited. The step function binds each type parameter with the notation <tt>::_</tt>, for an explicit parameter whose kind should be inferred.<br>
adam@1506 147 <br>
adam@1506 148 The function <tt>countFields</tt> is a lot easier to use than it is to define! Here's an example invocation, where we see that the appropriate folder is inferred. *)
adam@1506 149
adam@1506 150 (* begin eval *)
adam@1506 151 countFields [[A = int, B = float, C = string]]
adam@1506 152 (* end *)
adam@1506 153
adam@1506 154 (* If folders are generally inferred, why bother requiring that they be passed around? The answer has to do with Ur's general rule that type-level records are considered equivalent modulo permutation. As a result, there is no unique traversal order for a record, in general. The programmer has freedom in constructing folders that embody different permutations, using the functions exposed from the module <tt>Folder</tt> (see the top of <tt>lib/ur/top.urs</tt> in the Ur/Web distribution). Still, in most cases, the order in which fields are written in the source code provides an unambiguous clue about desired ordering. Thus, by default, folder parameters are implicit, and they are inferred to follow the order of fields in program text.<br>
adam@1506 155 <br>
adam@1506 156 Let's implement a more ambitious traversal. We will take in a record whose fields all contain <tt>option</tt> types, and we will determine if every field contains a <tt>Some</tt>. If so, we return <tt>Some</tt> of a "de-optioned" version; if not, we return <tt>None</tt>. *)
adam@1506 157
adam@1506 158 fun join [ts ::: {Type}] (fl : folder ts) (r : $(map option ts)) : option $ts =
adam@1506 159 @fold [fn ts => $(map option ts) -> option $ts]
adam@1506 160 (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] (f : $(map option r) -> option $r) =>
adam@1506 161 fn r : $(map option ([nm = v] ++ r)) =>
adam@1506 162 case r.nm of
adam@1506 163 None => None
adam@1506 164 | Some v =>
adam@1506 165 case f (r -- nm) of
adam@1506 166 None => None
adam@1506 167 | Some vs => Some ({nm = v} ++ vs))
adam@1506 168 (fn _ : $(map option []) => Some {}) fl r
adam@1506 169
adam@1506 170 (* Rather than take in an arbitrary record type and add some sort of constraint requiring that it contain only <tt>option</tt> types, the Ur way is to <b>construct</b> a record type with computation over some more primitive inputs, such that the process (A) is guaranteed to construct only records satisfying the constraint and (B) is capable, given the proper inputs, of constructing any record satisfying the original constraint.<br>
adam@1506 171 <br>
adam@1506 172 This use of folding involves an accumulator type that <i>is</i> record-dependent. In particular, as we traverse the record, we are building up a "de-optioning" function. To implement the step function, we rely on the record projection form <tt>r.nm</tt> and the record field removal form <tt>r -- nm</tt>, both of which work fine with variables standing for unknown field names. To extend the output record with a new mapping for field <tt>nm</tt>, we use concatenation <tt>++</tt> with a singleton record literal.<br>
adam@1506 173 <br>
adam@1506 174 Like for the last example, <tt>join</tt> is much easier to use than to implement! The simple invocations below use Ur's <b>reverse-engineering unification</b> to deduce the value of parameter <tt>ts</tt> from the type of parameter <tt>r</tt>. Also, as before, the folder argument is inferred. *)
adam@1506 175
adam@1506 176 (* begin hide *)
adam@1506 177 fun show_option [t] (_ : show t) : show (option t) =
adam@1506 178 mkShow (fn x => case x of
adam@1506 179 None => "None"
adam@1506 180 | Some x => "Some(" ^ show x ^ ")")
adam@1506 181
adam@1506 182 val show_r = mkShow (fn r : {A : int, B : string} =>
adam@1506 183 "{A = " ^ show r.A ^ ", B = " ^ show r.B ^ "}")
adam@1506 184 (* end *)
adam@1506 185
adam@1506 186 (* begin eval *)
adam@1506 187 join {A = Some 1, B = Some "X"}
adam@1506 188 (* end *)
adam@1506 189
adam@1506 190 (* begin eval *)
adam@1506 191 join {A = Some 1, B = None : option string}
adam@1506 192 (* end *)
adam@1506 193
adam@1506 194 (* The Ur/Web standard library includes many variations on <tt>fold</tt> that encapsulate common traversal patterns. For instance, <tt>foldR</tt> captures the idea of folding over a value-level record, and we can use it to simplify the definition of <tt>join</tt>: *)
adam@1506 195
adam@1506 196 fun join [ts ::: {Type}] (fl : folder ts) (r : $(map option ts)) : option $ts =
adam@1506 197 @foldR [option] [fn ts => option $ts]
adam@1506 198 (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] v vs =>
adam@1506 199 case (v, vs) of
adam@1506 200 (Some v, Some vs) => Some ({nm = v} ++ vs)
adam@1506 201 | _ => None)
adam@1506 202 (Some {}) fl r
adam@1506 203
adam@1506 204 (* See <tt>lib/ur/top.urs</tt> for the types of <tt>foldR</tt> and some other handy folding functions. *)