annotate doc/manual.tex @ 554:193fe8836419

Intro
author Adam Chlipala <adamc@hcoop.net>
date Sun, 07 Dec 2008 15:10:59 -0500
parents effd620d90da
children 0b2cf25a5eba
rev   line source
adamc@524 1 \documentclass{article}
adamc@554 2 \usepackage{fullpage,amsmath,amssymb,proof,url}
adamc@524 3
adamc@524 4 \newcommand{\cd}[1]{\texttt{#1}}
adamc@524 5 \newcommand{\mt}[1]{\mathsf{#1}}
adamc@524 6
adamc@524 7 \newcommand{\rc}{+ \hspace{-.075in} + \;}
adamc@527 8 \newcommand{\rcut}{\; \texttt{--} \;}
adamc@527 9 \newcommand{\rcutM}{\; \texttt{---} \;}
adamc@524 10
adamc@524 11 \begin{document}
adamc@524 12
adamc@524 13 \title{The Ur/Web Manual}
adamc@524 14 \author{Adam Chlipala}
adamc@524 15
adamc@524 16 \maketitle
adamc@524 17
adamc@540 18 \tableofcontents
adamc@540 19
adamc@554 20
adamc@554 21 \section{Introduction}
adamc@554 22
adamc@554 23 \emph{Ur} is a programming language designed to introduce richer type system features into functional programming in the tradition of ML and Haskell. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of \emph{metaprogramming} based on \emph{row types}.
adamc@554 24
adamc@554 25 \emph{Ur/Web} is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs ``don't go wrong'' in a very broad sense. Not only do they not crash during particular page generations, but they also may not:
adamc@554 26
adamc@554 27 \begin{itemize}
adamc@554 28 \item Suffer from any kinds of code-injection attacks
adamc@554 29 \item Return invalid HTML
adamc@554 30 \item Contain dead intra-application links
adamc@554 31 \item Have mismatches between HTML forms and the fields expected by their handlers
adamc@554 32 \item Attempt invalid SQL queries
adamc@554 33 \item Use improper marshaling or unmarshaling in communication with SQL databases
adamc@554 34 \end{itemize}
adamc@554 35
adamc@554 36 This type safety is just the foundation of the Ur/Web methodology. It is also possible to use metaprogramming to build significant application pieces by analysis of type structure. For instance, the demo includes an ML-style functor for building an admin interface for an arbitrary SQL table. The type system guarantees that the admin interface sub-application that comes out will always be free of the above-listed bugs, no matter which well-typed table description is given as input.
adamc@554 37
adamc@554 38 The Ur/Web compiler also produces very efficient object code that does not use garbage collection. These compiled programs will often be even more efficient than what most programmers would bother to write in C.
adamc@554 39
adamc@554 40 \medskip
adamc@554 41
adamc@554 42 The official web site for Ur is:
adamc@554 43 \begin{center}
adamc@554 44 \url{http://www.impredicative.com/ur/}
adamc@554 45 \end{center}
adamc@554 46
adamc@529 47 \section{Ur Syntax}
adamc@529 48
adamc@529 49 In this section, we describe the syntax of Ur, deferring to a later section discussion of most of the syntax specific to SQL and XML. The sole exceptions are the declaration forms for tables, sequences, and cookies.
adamc@524 50
adamc@524 51 \subsection{Lexical Conventions}
adamc@524 52
adamc@524 53 We give the Ur language definition in \LaTeX $\;$ math mode, since that is prettier than monospaced ASCII. The corresponding ASCII syntax can be read off directly. Here is the key for mapping math symbols to ASCII character sequences.
adamc@524 54
adamc@524 55 \begin{center}
adamc@524 56 \begin{tabular}{rl}
adamc@524 57 \textbf{\LaTeX} & \textbf{ASCII} \\
adamc@524 58 $\to$ & \cd{->} \\
adamc@524 59 $\times$ & \cd{*} \\
adamc@524 60 $\lambda$ & \cd{fn} \\
adamc@524 61 $\Rightarrow$ & \cd{=>} \\
adamc@529 62 $\neq$ & \cd{<>} \\
adamc@529 63 $\leq$ & \cd{<=} \\
adamc@529 64 $\geq$ & \cd{>=} \\
adamc@524 65 \\
adamc@524 66 $x$ & Normal textual identifier, not beginning with an uppercase letter \\
adamc@525 67 $X$ & Normal textual identifier, beginning with an uppercase letter \\
adamc@524 68 \end{tabular}
adamc@524 69 \end{center}
adamc@524 70
adamc@525 71 We often write syntax like $e^*$ to indicate zero or more copies of $e$, $e^+$ to indicate one or more copies, and $e,^*$ and $e,^+$ to indicate multiple copies separated by commas. Another separator may be used in place of a comma. The $e$ term may be surrounded by parentheses to indicate grouping; those parentheses should not be included in the actual ASCII.
adamc@524 72
adamc@526 73 We write $\ell$ for literals of the primitive types, for the most part following C conventions. There are $\mt{int}$, $\mt{float}$, and $\mt{string}$ literals.
adamc@526 74
adamc@527 75 This version of the manual doesn't include operator precedences; see \texttt{src/urweb.grm} for that.
adamc@527 76
adamc@552 77 \subsection{\label{core}Core Syntax}
adamc@524 78
adamc@524 79 \emph{Kinds} classify types and other compile-time-only entities. Each kind in the grammar is listed with a description of the sort of data it classifies.
adamc@524 80 $$\begin{array}{rrcll}
adamc@524 81 \textrm{Kinds} & \kappa &::=& \mt{Type} & \textrm{proper types} \\
adamc@525 82 &&& \mt{Unit} & \textrm{the trivial constructor} \\
adamc@525 83 &&& \mt{Name} & \textrm{field names} \\
adamc@525 84 &&& \kappa \to \kappa & \textrm{type-level functions} \\
adamc@525 85 &&& \{\kappa\} & \textrm{type-level records} \\
adamc@525 86 &&& (\kappa\times^+) & \textrm{type-level tuples} \\
adamc@529 87 &&& \_\_ & \textrm{wildcard} \\
adamc@525 88 &&& (\kappa) & \textrm{explicit precedence} \\
adamc@524 89 \end{array}$$
adamc@524 90
adamc@524 91 Ur supports several different notions of functions that take types as arguments. These arguments can be either implicit, causing them to be inferred at use sites; or explicit, forcing them to be specified manually at use sites. There is a common explicitness annotation convention applied at the definitions of and in the types of such functions.
adamc@524 92 $$\begin{array}{rrcll}
adamc@524 93 \textrm{Explicitness} & ? &::=& :: & \textrm{explicit} \\
adamc@525 94 &&& \; ::: & \textrm{implicit}
adamc@524 95 \end{array}$$
adamc@524 96
adamc@524 97 \emph{Constructors} are the main class of compile-time-only data. They include proper types and are classified by kinds.
adamc@524 98 $$\begin{array}{rrcll}
adamc@524 99 \textrm{Constructors} & c, \tau &::=& (c) :: \kappa & \textrm{kind annotation} \\
adamc@530 100 &&& \hat{x} & \textrm{constructor variable} \\
adamc@524 101 \\
adamc@525 102 &&& \tau \to \tau & \textrm{function type} \\
adamc@525 103 &&& x \; ? \; \kappa \to \tau & \textrm{polymorphic function type} \\
adamc@525 104 &&& \$ c & \textrm{record type} \\
adamc@524 105 \\
adamc@525 106 &&& c \; c & \textrm{type-level function application} \\
adamc@530 107 &&& \lambda x \; :: \; \kappa \Rightarrow c & \textrm{type-level function abstraction} \\
adamc@524 108 \\
adamc@525 109 &&& () & \textrm{type-level unit} \\
adamc@525 110 &&& \#X & \textrm{field name} \\
adamc@524 111 \\
adamc@525 112 &&& [(c = c)^*] & \textrm{known-length type-level record} \\
adamc@525 113 &&& c \rc c & \textrm{type-level record concatenation} \\
adamc@525 114 &&& \mt{fold} & \textrm{type-level record fold} \\
adamc@524 115 \\
adamc@525 116 &&& (c^+) & \textrm{type-level tuple} \\
adamc@525 117 &&& c.n & \textrm{type-level tuple projection ($n \in \mathbb N^+$)} \\
adamc@524 118 \\
adamc@525 119 &&& \lambda [c \sim c] \Rightarrow c & \textrm{guarded constructor} \\
adamc@524 120 \\
adamc@529 121 &&& \_ :: \kappa & \textrm{wildcard} \\
adamc@525 122 &&& (c) & \textrm{explicit precedence} \\
adamc@530 123 \\
adamc@530 124 \textrm{Qualified uncapitalized variables} & \hat{x} &::=& x & \textrm{not from a module} \\
adamc@530 125 &&& M.x & \textrm{projection from a module} \\
adamc@525 126 \end{array}$$
adamc@525 127
adamc@525 128 Modules of the module system are described by \emph{signatures}.
adamc@525 129 $$\begin{array}{rrcll}
adamc@525 130 \textrm{Signatures} & S &::=& \mt{sig} \; s^* \; \mt{end} & \textrm{constant} \\
adamc@525 131 &&& X & \textrm{variable} \\
adamc@525 132 &&& \mt{functor}(X : S) : S & \textrm{functor} \\
adamc@529 133 &&& S \; \mt{where} \; \mt{con} \; x = c & \textrm{concretizing an abstract constructor} \\
adamc@525 134 &&& M.X & \textrm{projection from a module} \\
adamc@525 135 \\
adamc@525 136 \textrm{Signature items} & s &::=& \mt{con} \; x :: \kappa & \textrm{abstract constructor} \\
adamc@525 137 &&& \mt{con} \; x :: \kappa = c & \textrm{concrete constructor} \\
adamc@528 138 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529 139 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@525 140 &&& \mt{val} \; x : \tau & \textrm{value} \\
adamc@525 141 &&& \mt{structure} \; X : S & \textrm{sub-module} \\
adamc@525 142 &&& \mt{signature} \; X = S & \textrm{sub-signature} \\
adamc@525 143 &&& \mt{include} \; S & \textrm{signature inclusion} \\
adamc@525 144 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@525 145 &&& \mt{class} \; x & \textrm{abstract type class} \\
adamc@525 146 &&& \mt{class} \; x = c & \textrm{concrete type class} \\
adamc@525 147 \\
adamc@525 148 \textrm{Datatype constructors} & dc &::=& X & \textrm{nullary constructor} \\
adamc@525 149 &&& X \; \mt{of} \; \tau & \textrm{unary constructor} \\
adamc@524 150 \end{array}$$
adamc@524 151
adamc@526 152 \emph{Patterns} are used to describe structural conditions on expressions, such that expressions may be tested against patterns, generating assignments to pattern variables if successful.
adamc@526 153 $$\begin{array}{rrcll}
adamc@526 154 \textrm{Patterns} & p &::=& \_ & \textrm{wildcard} \\
adamc@526 155 &&& x & \textrm{variable} \\
adamc@526 156 &&& \ell & \textrm{constant} \\
adamc@526 157 &&& \hat{X} & \textrm{nullary constructor} \\
adamc@526 158 &&& \hat{X} \; p & \textrm{unary constructor} \\
adamc@526 159 &&& \{(x = p,)^*\} & \textrm{rigid record pattern} \\
adamc@526 160 &&& \{(x = p,)^+, \ldots\} & \textrm{flexible record pattern} \\
adamc@527 161 &&& (p) & \textrm{explicit precedence} \\
adamc@526 162 \\
adamc@529 163 \textrm{Qualified capitalized variables} & \hat{X} &::=& X & \textrm{not from a module} \\
adamc@526 164 &&& M.X & \textrm{projection from a module} \\
adamc@526 165 \end{array}$$
adamc@526 166
adamc@527 167 \emph{Expressions} are the main run-time entities, corresponding to both ``expressions'' and ``statements'' in mainstream imperative languages.
adamc@527 168 $$\begin{array}{rrcll}
adamc@527 169 \textrm{Expressions} & e &::=& e : \tau & \textrm{type annotation} \\
adamc@529 170 &&& \hat{x} & \textrm{variable} \\
adamc@529 171 &&& \hat{X} & \textrm{datatype constructor} \\
adamc@527 172 &&& \ell & \textrm{constant} \\
adamc@527 173 \\
adamc@527 174 &&& e \; e & \textrm{function application} \\
adamc@527 175 &&& \lambda x : \tau \Rightarrow e & \textrm{function abstraction} \\
adamc@527 176 &&& e [c] & \textrm{polymorphic function application} \\
adamc@527 177 &&& \lambda x \; ? \; \kappa \Rightarrow e & \textrm{polymorphic function abstraction} \\
adamc@527 178 \\
adamc@527 179 &&& \{(c = e,)^*\} & \textrm{known-length record} \\
adamc@527 180 &&& e.c & \textrm{record field projection} \\
adamc@527 181 &&& e \rc e & \textrm{record concatenation} \\
adamc@527 182 &&& e \rcut c & \textrm{removal of a single record field} \\
adamc@527 183 &&& e \rcutM c & \textrm{removal of multiple record fields} \\
adamc@527 184 &&& \mt{fold} & \textrm{fold over fields of a type-level record} \\
adamc@527 185 \\
adamc@527 186 &&& \mt{let} \; ed^* \; \mt{in} \; e \; \mt{end} & \textrm{local definitions} \\
adamc@527 187 \\
adamc@527 188 &&& \mt{case} \; e \; \mt{of} \; (p \Rightarrow e|)^+ & \textrm{pattern matching} \\
adamc@527 189 \\
adamc@527 190 &&& \lambda [c \sim c] \Rightarrow e & \textrm{guarded expression} \\
adamc@527 191 \\
adamc@527 192 &&& \_ & \textrm{wildcard} \\
adamc@527 193 &&& (e) & \textrm{explicit precedence} \\
adamc@527 194 \\
adamc@527 195 \textrm{Local declarations} & ed &::=& \cd{val} \; x : \tau = e & \textrm{non-recursive value} \\
adamc@527 196 &&& \cd{val} \; \cd{rec} \; (x : \tau = e \; \cd{and})^+ & \textrm{mutually-recursive values} \\
adamc@527 197 \end{array}$$
adamc@527 198
adamc@528 199 \emph{Declarations} primarily bring new symbols into context.
adamc@528 200 $$\begin{array}{rrcll}
adamc@528 201 \textrm{Declarations} & d &::=& \mt{con} \; x :: \kappa = c & \textrm{constructor synonym} \\
adamc@528 202 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529 203 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@528 204 &&& \mt{val} \; x : \tau = e & \textrm{value} \\
adamc@528 205 &&& \mt{val} \; \cd{rec} \; (x : \tau = e \; \mt{and})^+ & \textrm{mutually-recursive values} \\
adamc@528 206 &&& \mt{structure} \; X : S = M & \textrm{module definition} \\
adamc@528 207 &&& \mt{signature} \; X = S & \textrm{signature definition} \\
adamc@528 208 &&& \mt{open} \; M & \textrm{module inclusion} \\
adamc@528 209 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@528 210 &&& \mt{open} \; \mt{constraints} \; M & \textrm{inclusion of just the constraints from a module} \\
adamc@528 211 &&& \mt{table} \; x : c & \textrm{SQL table} \\
adamc@528 212 &&& \mt{sequence} \; x & \textrm{SQL sequence} \\
adamc@535 213 &&& \mt{cookie} \; x : \tau & \textrm{HTTP cookie} \\
adamc@528 214 &&& \mt{class} \; x = c & \textrm{concrete type class} \\
adamc@528 215 \\
adamc@529 216 \textrm{Modules} & M &::=& \mt{struct} \; d^* \; \mt{end} & \textrm{constant} \\
adamc@529 217 &&& X & \textrm{variable} \\
adamc@529 218 &&& M.X & \textrm{projection} \\
adamc@529 219 &&& M(M) & \textrm{functor application} \\
adamc@529 220 &&& \mt{functor}(X : S) : S = M & \textrm{functor abstraction} \\
adamc@528 221 \end{array}$$
adamc@528 222
adamc@528 223 There are two kinds of Ur files. A file named $M\texttt{.ur}$ is an \emph{implementation file}, and it should contain a sequence of declarations $d^*$. A file named $M\texttt{.urs}$ is an \emph{interface file}; it must always have a matching $M\texttt{.ur}$ and should contain a sequence of signature items $s^*$. When both files are present, the overall effect is the same as a monolithic declaration $\mt{structure} \; M : \mt{sig} \; s^* \; \mt{end} = \mt{struct} \; d^* \; \mt{end}$. When no interface file is included, the overall effect is similar, with a signature for module $M$ being inferred rather than just checked against an interface.
adamc@527 224
adamc@529 225 \subsection{Shorthands}
adamc@529 226
adamc@529 227 There are a variety of derived syntactic forms that elaborate into the core syntax from the last subsection. We will present the additional forms roughly following the order in which we presented the constructs that they elaborate into.
adamc@529 228
adamc@529 229 In many contexts where record fields are expected, like in a projection $e.c$, a constant field may be written as simply $X$, rather than $\#X$.
adamc@529 230
adamc@529 231 A record type may be written $\{(c = c,)^*\}$, which elaborates to $\$[(c = c,)^*]$.
adamc@529 232
adamc@533 233 The notation $[c_1, \ldots, c_n]$ is shorthand for $[c_1 = (), \ldots, c_n = ()]$.
adamc@533 234
adamc@529 235 A tuple type $(\tau_1, \ldots, \tau_n)$ expands to a record type $\{1 = \tau_1, \ldots, n = \tau_n\}$, with natural numbers as field names. A tuple pattern $(p_1, \ldots, p_n)$ expands to a rigid record pattern $\{1 = p_1, \ldots, n = p_n\}$. Positive natural numbers may be used in most places where field names would be allowed.
adamc@529 236
adamc@529 237 In general, several adjacent $\lambda$ forms may be combined into one, and kind and type annotations may be omitted, in which case they are implicitly included as wildcards. More formally, for constructor-level abstractions, we can define a new non-terminal $b ::= x \mid (x :: \kappa) \mid [c \sim c]$ and allow composite abstractions of the form $\lambda b^+ \Rightarrow c$, elaborating into the obvious sequence of one core $\lambda$ per element of $b^+$.
adamc@529 238
adamc@529 239 For any signature item or declaration that defines some entity to be equal to $A$ with classification annotation $B$ (e.g., $\mt{val} \; x : B = A$), $B$ and the preceding colon (or similar punctuation) may be omitted, in which case it is filled in as a wildcard.
adamc@529 240
adamc@529 241 A signature item or declaration $\mt{type} \; x$ or $\mt{type} \; x = \tau$ is elaborated into $\mt{con} \; x :: \mt{Type}$ or $\mt{con} \; x :: \mt{Type} = \tau$, respectively.
adamc@529 242
adamc@529 243 A signature item or declaration $\mt{class} \; x = \lambda y :: \mt{Type} \Rightarrow c$ may be abbreviated $\mt{class} \; x \; y = c$.
adamc@529 244
adamc@529 245 Handling of implicit and explicit constructor arguments may be tweaked with some prefixes to variable references. An expression $@x$ is a version of $x$ where all implicit constructor arguments have been made explicit. An expression $@@x$ achieves the same effect, additionally halting automatic resolution of type class instances. The same syntax works for variables projected out of modules and for capitalized variables (datatype constructors).
adamc@529 246
adamc@529 247 At the expression level, an analogue is available of the composite $\lambda$ form for constructors. We define the language of binders as $b ::= x \mid (x : \tau) \mid (x \; ? \; \kappa) \mid [c \sim c]$. A lone variable $x$ as a binder stands for an expression variable of unspecified type.
adamc@529 248
adamc@529 249 A $\mt{val}$ or $\mt{val} \; \mt{rec}$ declaration may include expression binders before the equal sign, following the binder grammar from the last paragraph. Such declarations are elaborated into versions that add additional $\lambda$s to the fronts of the righthand sides, as appropriate. The keyword $\mt{fun}$ is a synonym for $\mt{val} \; \mt{rec}$.
adamc@529 250
adamc@529 251 A signature item $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2$. A declaration $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2 = M$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2 = \mt{functor}(X_2 : S_1) : S_2 = M$.
adamc@529 252
adamc@529 253 A declaration $\mt{table} \; x : \{(c = c,)^*\}$ is elaborated into $\mt{table} \; x : [(c = c,)^*]$
adamc@529 254
adamc@529 255 The syntax $\mt{where} \; \mt{type}$ is an alternate form of $\mt{where} \; \mt{con}$.
adamc@529 256
adamc@529 257 The syntax $\mt{if} \; e \; \mt{then} \; e_1 \; \mt{else} \; e_2$ expands to $\mt{case} \; e \; \mt{of} \; \mt{Basis}.\mt{True} \Rightarrow e_1 \mid \mt{Basis}.\mt{False} \Rightarrow e_2$.
adamc@529 258
adamc@529 259 There are infix operator syntaxes for a number of functions defined in the $\mt{Basis}$ module. There is $=$ for $\mt{eq}$, $\neq$ for $\mt{neq}$, $-$ for $\mt{neg}$ (as a prefix operator) and $\mt{minus}$, $+$ for $\mt{plus}$, $\times$ for $\mt{times}$, $/$ for $\mt{div}$, $\%$ for $\mt{mod}$, $<$ for $\mt{lt}$, $\leq$ for $\mt{le}$, $>$ for $\mt{gt}$, and $\geq$ for $\mt{ge}$.
adamc@529 260
adamc@529 261 A signature item $\mt{table} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_table} \; c$. $\mt{sequence} \; x$ is short for $\mt{val} \; x : \mt{Basis}.\mt{sql\_sequence}$, and $\mt{cookie} \; x : \tau$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{http\_cookie} \; \tau$.
adamc@529 262
adamc@530 263
adamc@530 264 \section{Static Semantics}
adamc@530 265
adamc@530 266 In this section, we give a declarative presentation of Ur's typing rules and related judgments. Inference is the subject of the next section; here, we assume that an oracle has filled in all wildcards with concrete values.
adamc@530 267
adamc@530 268 Since there is significant mutual recursion among the judgments, we introduce them all before beginning to give rules. We use the same variety of contexts throughout this section, implicitly introducing new sorts of context entries as needed.
adamc@530 269 \begin{itemize}
adamc@530 270 \item $\Gamma \vdash c :: \kappa$ assigns a kind to a constructor in a context.
adamc@530 271 \item $\Gamma \vdash c \sim c$ proves the disjointness of two record constructors; that is, that they share no field names. We overload the judgment to apply to pairs of field names as well.
adamc@531 272 \item $\Gamma \vdash c \hookrightarrow C$ proves that record constructor $c$ decomposes into set $C$ of field names and record constructors.
adamc@530 273 \item $\Gamma \vdash c \equiv c$ proves the computational equivalence of two constructors. This is often called a \emph{definitional equality} in the world of type theory.
adamc@530 274 \item $\Gamma \vdash e : \tau$ is a standard typing judgment.
adamc@534 275 \item $\Gamma \vdash p \leadsto \Gamma; \tau$ combines typing of patterns with calculation of which new variables they bind.
adamc@537 276 \item $\Gamma \vdash d \leadsto \Gamma$ expresses how a declaration modifies a context. We overload this judgment to apply to sequences of declarations, as well as to signature items and sequences of signature items.
adamc@537 277 \item $\Gamma \vdash S \equiv S$ is the signature equivalence judgment.
adamc@536 278 \item $\Gamma \vdash S \leq S$ is the signature compatibility judgment. We write $\Gamma \vdash S$ as shorthand for $\Gamma \vdash S \leq S$.
adamc@530 279 \item $\Gamma \vdash M : S$ is the module signature checking judgment.
adamc@537 280 \item $\mt{proj}(M, \overline{s}, V)$ is a partial function for projecting a signature item from $\overline{s}$, given the module $M$ that we project from. $V$ may be $\mt{con} \; x$, $\mt{datatype} \; x$, $\mt{val} \; x$, $\mt{signature} \; X$, or $\mt{structure} \; X$. The parameter $M$ is needed because the projected signature item may refer to other items from $\overline{s}$.
adamc@539 281 \item $\mt{selfify}(M, \overline{s})$ adds information to signature items $\overline{s}$ to reflect the fact that we are concerned with the particular module $M$. This function is overloaded to work over individual signature items as well.
adamc@530 282 \end{itemize}
adamc@530 283
adamc@530 284 \subsection{Kinding}
adamc@530 285
adamc@530 286 $$\infer{\Gamma \vdash (c) :: \kappa :: \kappa}{
adamc@530 287 \Gamma \vdash c :: \kappa
adamc@530 288 }
adamc@530 289 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530 290 x :: \kappa \in \Gamma
adamc@530 291 }
adamc@530 292 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530 293 x :: \kappa = c \in \Gamma
adamc@530 294 }$$
adamc@530 295
adamc@530 296 $$\infer{\Gamma \vdash M.x :: \kappa}{
adamc@537 297 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 298 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = \kappa
adamc@530 299 }
adamc@530 300 \quad \infer{\Gamma \vdash M.x :: \kappa}{
adamc@537 301 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 302 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@530 303 }$$
adamc@530 304
adamc@530 305 $$\infer{\Gamma \vdash \tau_1 \to \tau_2 :: \mt{Type}}{
adamc@530 306 \Gamma \vdash \tau_1 :: \mt{Type}
adamc@530 307 & \Gamma \vdash \tau_2 :: \mt{Type}
adamc@530 308 }
adamc@530 309 \quad \infer{\Gamma \vdash x \; ? \: \kappa \to \tau :: \mt{Type}}{
adamc@530 310 \Gamma, x :: \kappa \vdash \tau :: \mt{Type}
adamc@530 311 }
adamc@530 312 \quad \infer{\Gamma \vdash \$c :: \mt{Type}}{
adamc@530 313 \Gamma \vdash c :: \{\mt{Type}\}
adamc@530 314 }$$
adamc@530 315
adamc@530 316 $$\infer{\Gamma \vdash c_1 \; c_2 :: \kappa_2}{
adamc@530 317 \Gamma \vdash c_1 :: \kappa_1 \to \kappa_2
adamc@530 318 & \Gamma \vdash c_2 :: \kappa_1
adamc@530 319 }
adamc@530 320 \quad \infer{\Gamma \vdash \lambda x \; :: \; \kappa_1 \Rightarrow c :: \kappa_1 \to \kappa_2}{
adamc@530 321 \Gamma, x :: \kappa_1 \vdash c :: \kappa_2
adamc@530 322 }$$
adamc@530 323
adamc@530 324 $$\infer{\Gamma \vdash () :: \mt{Unit}}{}
adamc@530 325 \quad \infer{\Gamma \vdash \#X :: \mt{Name}}{}$$
adamc@530 326
adamc@530 327 $$\infer{\Gamma \vdash [\overline{c_i = c'_i}] :: \{\kappa\}}{
adamc@530 328 \forall i: \Gamma \vdash c_i : \mt{Name}
adamc@530 329 & \Gamma \vdash c'_i :: \kappa
adamc@530 330 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@530 331 }
adamc@530 332 \quad \infer{\Gamma \vdash c_1 \rc c_2 :: \{\kappa\}}{
adamc@530 333 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530 334 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@530 335 & \Gamma \vdash c_1 \sim c_2
adamc@530 336 }$$
adamc@530 337
adamc@530 338 $$\infer{\Gamma \vdash \mt{fold} :: ((\mt{Name} \to \kappa_1 \to \kappa_2 \to \kappa_2) \to \kappa_2 \to \{\kappa_1\} \to \kappa_2}{}$$
adamc@530 339
adamc@530 340 $$\infer{\Gamma \vdash (\overline c) :: (k_1 \times \ldots \times k_n)}{
adamc@530 341 \forall i: \Gamma \vdash c_i :: k_i
adamc@530 342 }
adamc@530 343 \quad \infer{\Gamma \vdash c.i :: k_i}{
adamc@530 344 \Gamma \vdash c :: (k_1 \times \ldots \times k_n)
adamc@530 345 }$$
adamc@530 346
adamc@530 347 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow c :: \kappa}{
adamc@530 348 \Gamma \vdash c_1 :: \{\kappa'\}
adamc@530 349 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@530 350 & \Gamma, c_1 \sim c_2 \vdash c :: \kappa
adamc@530 351 }$$
adamc@530 352
adamc@531 353 \subsection{Record Disjointness}
adamc@531 354
adamc@531 355 We will use a keyword $\mt{map}$ as a shorthand, such that, for $f$ of kind $\kappa \to \kappa'$, $\mt{map} \; f$ stands for $\mt{fold} \; (\lambda (x_1 :: \mt{Name}) (x_2 :: \kappa) (x_3 :: \{\kappa'\}) \Rightarrow [x_1 = f \; x_2] \rc x_3) \; []$.
adamc@531 356
adamc@531 357 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531 358 \Gamma \vdash c_1 \hookrightarrow c'_1
adamc@531 359 & \Gamma \vdash c_2 \hookrightarrow c'_2
adamc@531 360 & \forall c''_1 \in c'_1, c''_2 \in c'_2: \Gamma \vdash c''_1 \sim c''_2
adamc@531 361 }
adamc@531 362 \quad \infer{\Gamma \vdash X \sim X'}{
adamc@531 363 X \neq X'
adamc@531 364 }$$
adamc@531 365
adamc@531 366 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531 367 c'_1 \sim c'_2 \in \Gamma
adamc@531 368 & \Gamma \vdash c'_1 \hookrightarrow c''_1
adamc@531 369 & \Gamma \vdash c'_2 \hookrightarrow c''_2
adamc@531 370 & c_1 \in c''_1
adamc@531 371 & c_2 \in c''_2
adamc@531 372 }$$
adamc@531 373
adamc@531 374 $$\infer{\Gamma \vdash c \hookrightarrow \{c\}}{}
adamc@531 375 \quad \infer{\Gamma \vdash [\overline{c = c'}] \hookrightarrow \{\overline{c}\}}{}
adamc@531 376 \quad \infer{\Gamma \vdash c_1 \rc c_2 \hookrightarrow C_1 \cup C_2}{
adamc@531 377 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@531 378 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@531 379 }
adamc@531 380 \quad \infer{\Gamma \vdash c \hookrightarrow C}{
adamc@531 381 \Gamma \vdash c \equiv c'
adamc@531 382 & \Gamma \vdash c' \hookrightarrow C
adamc@531 383 }
adamc@531 384 \quad \infer{\Gamma \vdash \mt{map} \; f \; c \hookrightarrow C}{
adamc@531 385 \Gamma \vdash c \hookrightarrow C
adamc@531 386 }$$
adamc@531 387
adamc@541 388 \subsection{\label{definitional}Definitional Equality}
adamc@532 389
adamc@532 390 We use $\mathcal C$ to stand for a one-hole context that, when filled, yields a constructor. The notation $\mathcal C[c]$ plugs $c$ into $\mathcal C$. We omit the standard definition of one-hole contexts. We write $[x \mapsto c_1]c_2$ for capture-avoiding substitution of $c_1$ for $x$ in $c_2$.
adamc@532 391
adamc@532 392 $$\infer{\Gamma \vdash c \equiv c}{}
adamc@532 393 \quad \infer{\Gamma \vdash c_1 \equiv c_2}{
adamc@532 394 \Gamma \vdash c_2 \equiv c_1
adamc@532 395 }
adamc@532 396 \quad \infer{\Gamma \vdash c_1 \equiv c_3}{
adamc@532 397 \Gamma \vdash c_1 \equiv c_2
adamc@532 398 & \Gamma \vdash c_2 \equiv c_3
adamc@532 399 }
adamc@532 400 \quad \infer{\Gamma \vdash \mathcal C[c_1] \equiv \mathcal C[c_2]}{
adamc@532 401 \Gamma \vdash c_1 \equiv c_2
adamc@532 402 }$$
adamc@532 403
adamc@532 404 $$\infer{\Gamma \vdash x \equiv c}{
adamc@532 405 x :: \kappa = c \in \Gamma
adamc@532 406 }
adamc@532 407 \quad \infer{\Gamma \vdash M.x \equiv c}{
adamc@537 408 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 409 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@532 410 }
adamc@532 411 \quad \infer{\Gamma \vdash (\overline c).i \equiv c_i}{}$$
adamc@532 412
adamc@532 413 $$\infer{\Gamma \vdash (\lambda x :: \kappa \Rightarrow c) \; c' \equiv [x \mapsto c'] c}{}
adamc@532 414 \quad \infer{\Gamma \vdash c_1 \rc c_2 \equiv c_2 \rc c_1}{}
adamc@532 415 \quad \infer{\Gamma \vdash c_1 \rc (c_2 \rc c_3) \equiv (c_1 \rc c_2) \rc c_3}{}$$
adamc@532 416
adamc@532 417 $$\infer{\Gamma \vdash [] \rc c \equiv c}{}
adamc@532 418 \quad \infer{\Gamma \vdash [\overline{c_1 = c'_1}] \rc [\overline{c_2 = c'_2}] \equiv [\overline{c_1 = c'_1}, \overline{c_2 = c'_2}]}{}$$
adamc@532 419
adamc@532 420 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow c \equiv c}{
adamc@532 421 \Gamma \vdash c_1 \sim c_2
adamc@532 422 }
adamc@532 423 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; [] \equiv i}{}
adamc@532 424 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; ([c_1 = c_2] \rc c) \equiv f \; c_1 \; c_2 \; (\mt{fold} \; f \; i \; c)}{}$$
adamc@532 425
adamc@532 426 $$\infer{\Gamma \vdash \mt{map} \; (\lambda x \Rightarrow x) \; c \equiv c}{}
adamc@532 427 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; (\mt{map} \; f' \; c)
adamc@532 428 \equiv \mt{fold} \; (\lambda (x_1 :: \mt{Name}) (x_2 :: \kappa) \Rightarrow f \; x_1 \; (f' \; x_2)) \; i \; c}{}$$
adamc@532 429
adamc@532 430 $$\infer{\Gamma \vdash \mt{map} \; f \; (c_1 \rc c_2) \equiv \mt{map} \; f \; c_1 \rc \mt{map} \; f \; c_2}{}$$
adamc@531 431
adamc@534 432 \subsection{Expression Typing}
adamc@533 433
adamc@533 434 We assume the existence of a function $T$ assigning types to literal constants. It maps integer constants to $\mt{Basis}.\mt{int}$, float constants to $\mt{Basis}.\mt{float}$, and string constants to $\mt{Basis}.\mt{string}$.
adamc@533 435
adamc@533 436 We also refer to a function $\mathcal I$, such that $\mathcal I(\tau)$ ``uses an oracle'' to instantiate all constructor function arguments at the beginning of $\tau$ that are marked implicit; i.e., replace $x_1 ::: \kappa_1 \to \ldots \to x_n ::: \kappa_n \to \tau$ with $[x_1 \mapsto c_1]\ldots[x_n \mapsto c_n]\tau$, where the $c_i$s are inferred and $\tau$ does not start like $x ::: \kappa \to \tau'$.
adamc@533 437
adamc@533 438 $$\infer{\Gamma \vdash e : \tau : \tau}{
adamc@533 439 \Gamma \vdash e : \tau
adamc@533 440 }
adamc@533 441 \quad \infer{\Gamma \vdash e : \tau}{
adamc@533 442 \Gamma \vdash e : \tau'
adamc@533 443 & \Gamma \vdash \tau' \equiv \tau
adamc@533 444 }
adamc@533 445 \quad \infer{\Gamma \vdash \ell : T(\ell)}{}$$
adamc@533 446
adamc@533 447 $$\infer{\Gamma \vdash x : \mathcal I(\tau)}{
adamc@533 448 x : \tau \in \Gamma
adamc@533 449 }
adamc@533 450 \quad \infer{\Gamma \vdash M.x : \mathcal I(\tau)}{
adamc@537 451 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 452 & \mt{proj}(M, \overline{s}, \mt{val} \; x) = \tau
adamc@533 453 }
adamc@533 454 \quad \infer{\Gamma \vdash X : \mathcal I(\tau)}{
adamc@533 455 X : \tau \in \Gamma
adamc@533 456 }
adamc@533 457 \quad \infer{\Gamma \vdash M.X : \mathcal I(\tau)}{
adamc@537 458 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 459 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \tau
adamc@533 460 }$$
adamc@533 461
adamc@533 462 $$\infer{\Gamma \vdash e_1 \; e_2 : \tau_2}{
adamc@533 463 \Gamma \vdash e_1 : \tau_1 \to \tau_2
adamc@533 464 & \Gamma \vdash e_2 : \tau_1
adamc@533 465 }
adamc@533 466 \quad \infer{\Gamma \vdash \lambda x : \tau_1 \Rightarrow e : \tau_1 \to \tau_2}{
adamc@533 467 \Gamma, x : \tau_1 \vdash e : \tau_2
adamc@533 468 }$$
adamc@533 469
adamc@533 470 $$\infer{\Gamma \vdash e [c] : [x \mapsto c]\tau}{
adamc@533 471 \Gamma \vdash e : x :: \kappa \to \tau
adamc@533 472 & \Gamma \vdash c :: \kappa
adamc@533 473 }
adamc@533 474 \quad \infer{\Gamma \vdash \lambda x \; ? \; \kappa \Rightarrow e : x \; ? \; \kappa \to \tau}{
adamc@533 475 \Gamma, x :: \kappa \vdash e : \tau
adamc@533 476 }$$
adamc@533 477
adamc@533 478 $$\infer{\Gamma \vdash \{\overline{c = e}\} : \{\overline{c : \tau}\}}{
adamc@533 479 \forall i: \Gamma \vdash c_i :: \mt{Name}
adamc@533 480 & \Gamma \vdash e_i : \tau_i
adamc@533 481 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@533 482 }
adamc@533 483 \quad \infer{\Gamma \vdash e.c : \tau}{
adamc@533 484 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533 485 }
adamc@533 486 \quad \infer{\Gamma \vdash e_1 \rc e_2 : \$(c_1 \rc c_2)}{
adamc@533 487 \Gamma \vdash e_1 : \$c_1
adamc@533 488 & \Gamma \vdash e_2 : \$c_2
adamc@533 489 }$$
adamc@533 490
adamc@533 491 $$\infer{\Gamma \vdash e \rcut c : \$c'}{
adamc@533 492 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533 493 }
adamc@533 494 \quad \infer{\Gamma \vdash e \rcutM c : \$c'}{
adamc@533 495 \Gamma \vdash e : \$(c \rc c')
adamc@533 496 }$$
adamc@533 497
adamc@533 498 $$\infer{\Gamma \vdash \mt{fold} : \begin{array}{c}
adamc@533 499 x_1 :: (\{\kappa\} \to \tau)
adamc@533 500 \to (x_2 :: \mt{Name} \to x_3 :: \kappa \to x_4 :: \{\kappa\} \to \lambda [[x_2] \sim x_4]
adamc@533 501 \Rightarrow x_1 \; x_4 \to x_1 \; ([x_2 = x_3] \rc x_4)) \\
adamc@533 502 \to x_1 \; [] \to x_5 :: \{\kappa\} \to x_1 \; x_5
adamc@533 503 \end{array}}{}$$
adamc@533 504
adamc@533 505 $$\infer{\Gamma \vdash \mt{let} \; \overline{ed} \; \mt{in} \; e \; \mt{end} : \tau}{
adamc@533 506 \Gamma \vdash \overline{ed} \leadsto \Gamma'
adamc@533 507 & \Gamma' \vdash e : \tau
adamc@533 508 }
adamc@533 509 \quad \infer{\Gamma \vdash \mt{case} \; e \; \mt{of} \; \overline{p \Rightarrow e} : \tau}{
adamc@533 510 \forall i: \Gamma \vdash p_i \leadsto \Gamma_i, \tau'
adamc@533 511 & \Gamma_i \vdash e_i : \tau
adamc@533 512 }$$
adamc@533 513
adamc@533 514 $$\infer{\Gamma \vdash [c_1 \sim c_2] \Rightarrow e : [c_1 \sim c_2] \Rightarrow \tau}{
adamc@533 515 \Gamma \vdash c_1 :: \{\kappa\}
adamc@533 516 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@533 517 & \Gamma, c_1 \sim c_2 \vdash e : \tau
adamc@533 518 }$$
adamc@533 519
adamc@534 520 \subsection{Pattern Typing}
adamc@534 521
adamc@534 522 $$\infer{\Gamma \vdash \_ \leadsto \Gamma; \tau}{}
adamc@534 523 \quad \infer{\Gamma \vdash x \leadsto \Gamma, x : \tau; \tau}{}
adamc@534 524 \quad \infer{\Gamma \vdash \ell \leadsto \Gamma; T(\ell)}{}$$
adamc@534 525
adamc@534 526 $$\infer{\Gamma \vdash X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534 527 X : \overline{x ::: \mt{Type}} \to \tau \in \Gamma
adamc@534 528 & \textrm{$\tau$ not a function type}
adamc@534 529 }
adamc@534 530 \quad \infer{\Gamma \vdash X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534 531 X : \overline{x ::: \mt{Type}} \to \tau'' \to \tau \in \Gamma
adamc@534 532 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534 533 }$$
adamc@534 534
adamc@534 535 $$\infer{\Gamma \vdash M.X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537 536 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 537 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau
adamc@534 538 & \textrm{$\tau$ not a function type}
adamc@534 539 }$$
adamc@534 540
adamc@534 541 $$\infer{\Gamma \vdash M.X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537 542 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 543 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau'' \to \tau
adamc@534 544 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534 545 }$$
adamc@534 546
adamc@534 547 $$\infer{\Gamma \vdash \{\overline{x = p}\} \leadsto \Gamma_n; \{\overline{x = \tau}\}}{
adamc@534 548 \Gamma_0 = \Gamma
adamc@534 549 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534 550 }
adamc@534 551 \quad \infer{\Gamma \vdash \{\overline{x = p}, \ldots\} \leadsto \Gamma_n; \$([\overline{x = \tau}] \rc c)}{
adamc@534 552 \Gamma_0 = \Gamma
adamc@534 553 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534 554 }$$
adamc@534 555
adamc@535 556 \subsection{Declaration Typing}
adamc@535 557
adamc@535 558 We use an auxiliary judgment $\overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'$, expressing the enrichment of $\Gamma$ with the types of the datatype constructors $\overline{dc}$, when they are known to belong to datatype $x$ with type parameters $\overline{y}$.
adamc@535 559
adamc@541 560 This is the first judgment where we deal with type classes, for the $\mt{class}$ declaration form. We will omit their special handling in this formal specification. Section \ref{typeclasses} gives an informal description of how type classes influence type inference.
adamc@535 561
adamc@537 562 We presuppose the existence of a function $\mathcal O$, where $\mathcal(M, \overline{s})$ implements the $\mt{open}$ declaration by producing a context with the appropriate entry for each available component of module $M$ with signature items $\overline{s}$. Where possible, $\mathcal O$ uses ``transparent'' entries (e.g., an abstract type $M.x$ is mapped to $x :: \mt{Type} = M.x$), so that the relationship with $M$ is maintained. A related function $\mathcal O_c$ builds a context containing the disjointness constraints found in $S$.
adamc@537 563
adamc@537 564 We write $\kappa_1^n \to \kappa$ as a shorthand, where $\kappa_1^0 \to \kappa = \kappa$ and $\kappa_1^{n+1} \to \kappa_2 = \kappa_1 \to (\kappa_1^n \to \kappa_2)$. We write $\mt{len}(\overline{y})$ for the length of vector $\overline{y}$ of variables.
adamc@535 565
adamc@535 566 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535 567 \quad \infer{\Gamma \vdash d, \overline{d} \leadsto \Gamma''}{
adamc@535 568 \Gamma \vdash d \leadsto \Gamma'
adamc@535 569 & \Gamma' \vdash \overline{d} \leadsto \Gamma''
adamc@535 570 }$$
adamc@535 571
adamc@535 572 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@535 573 \Gamma \vdash c :: \kappa
adamc@535 574 }
adamc@535 575 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@535 576 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@535 577 }$$
adamc@535 578
adamc@535 579 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537 580 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 581 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@535 582 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@535 583 }$$
adamc@535 584
adamc@535 585 $$\infer{\Gamma \vdash \mt{val} \; x : \tau = e \leadsto \Gamma, x : \tau}{
adamc@535 586 \Gamma \vdash e : \tau
adamc@535 587 }$$
adamc@535 588
adamc@535 589 $$\infer{\Gamma \vdash \mt{val} \; \mt{rec} \; \overline{x : \tau = e} \leadsto \Gamma, \overline{x : \tau}}{
adamc@535 590 \forall i: \Gamma, \overline{x : \tau} \vdash e_i : \tau_i
adamc@535 591 & \textrm{$e_i$ starts with an expression $\lambda$, optionally preceded by constructor and disjointness $\lambda$s}
adamc@535 592 }$$
adamc@535 593
adamc@535 594 $$\infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : S}{
adamc@535 595 \Gamma \vdash M : S
adamc@539 596 & \textrm{ ($M$ not a $\mt{struct} \; \ldots \; \mt{end}$)}
adamc@535 597 }
adamc@539 598 \quad \infer{\Gamma \vdash \mt{structure} \; X : S = \mt{struct} \; \overline{d} \; \mt{end} \leadsto \Gamma, X : \mt{selfify}(X, \overline{s})}{
adamc@539 599 \Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \overline{s} \; \mt{end}
adamc@539 600 }$$
adamc@539 601
adamc@539 602 $$\infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@535 603 \Gamma \vdash S
adamc@535 604 }$$
adamc@535 605
adamc@537 606 $$\infer{\Gamma \vdash \mt{open} \; M \leadsto \Gamma, \mathcal O(M, \overline{s})}{
adamc@537 607 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535 608 }$$
adamc@535 609
adamc@535 610 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma}{
adamc@535 611 \Gamma \vdash c_1 :: \{\kappa\}
adamc@535 612 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@535 613 & \Gamma \vdash c_1 \sim c_2
adamc@535 614 }
adamc@537 615 \quad \infer{\Gamma \vdash \mt{open} \; \mt{constraints} \; M \leadsto \Gamma, \mathcal O_c(M, \overline{s})}{
adamc@537 616 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535 617 }$$
adamc@535 618
adamc@535 619 $$\infer{\Gamma \vdash \mt{table} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_table} \; c}{
adamc@535 620 \Gamma \vdash c :: \{\mt{Type}\}
adamc@535 621 }
adamc@535 622 \quad \infer{\Gamma \vdash \mt{sequence} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_sequence}}{}$$
adamc@535 623
adamc@535 624 $$\infer{\Gamma \vdash \mt{cookie} \; x : \tau \leadsto \Gamma, x : \mt{Basis}.\mt{http\_cookie} \; \tau}{
adamc@535 625 \Gamma \vdash \tau :: \mt{Type}
adamc@535 626 }$$
adamc@535 627
adamc@535 628 $$\infer{\Gamma \vdash \mt{class} \; x = c \leadsto \Gamma, x :: \mt{Type} \to \mt{Type} = c}{
adamc@535 629 \Gamma \vdash c :: \mt{Type} \to \mt{Type}
adamc@535 630 }$$
adamc@535 631
adamc@535 632 $$\infer{\overline{y}; x; \Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535 633 \quad \infer{\overline{y}; x; \Gamma \vdash X \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to x \; \overline{y}}{
adamc@535 634 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535 635 }
adamc@535 636 \quad \infer{\overline{y}; x; \Gamma \vdash X \; \mt{of} \; \tau \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to \tau \to x \; \overline{y}}{
adamc@535 637 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535 638 }$$
adamc@535 639
adamc@537 640 \subsection{Signature Item Typing}
adamc@537 641
adamc@537 642 We appeal to a signature item analogue of the $\mathcal O$ function from the last subsection.
adamc@537 643
adamc@537 644 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@537 645 \quad \infer{\Gamma \vdash s, \overline{s} \leadsto \Gamma''}{
adamc@537 646 \Gamma \vdash s \leadsto \Gamma'
adamc@537 647 & \Gamma' \vdash \overline{s} \leadsto \Gamma''
adamc@537 648 }$$
adamc@537 649
adamc@537 650 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}
adamc@537 651 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@537 652 \Gamma \vdash c :: \kappa
adamc@537 653 }
adamc@537 654 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@537 655 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@537 656 }$$
adamc@537 657
adamc@537 658 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537 659 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 660 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 661 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@537 662 }$$
adamc@537 663
adamc@537 664 $$\infer{\Gamma \vdash \mt{val} \; x : \tau \leadsto \Gamma, x : \tau}{
adamc@537 665 \Gamma \vdash \tau :: \mt{Type}
adamc@537 666 }$$
adamc@537 667
adamc@537 668 $$\infer{\Gamma \vdash \mt{structure} \; X : S \leadsto \Gamma, X : S}{
adamc@537 669 \Gamma \vdash S
adamc@537 670 }
adamc@537 671 \quad \infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@537 672 \Gamma \vdash S
adamc@537 673 }$$
adamc@537 674
adamc@537 675 $$\infer{\Gamma \vdash \mt{include} \; S \leadsto \Gamma, \mathcal O(\overline{s})}{
adamc@537 676 \Gamma \vdash S
adamc@537 677 & \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 678 }$$
adamc@537 679
adamc@537 680 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma, c_1 \sim c_2}{
adamc@537 681 \Gamma \vdash c_1 :: \{\kappa\}
adamc@537 682 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@537 683 }$$
adamc@537 684
adamc@537 685 $$\infer{\Gamma \vdash \mt{class} \; x = c \leadsto \Gamma, x :: \mt{Type} \to \mt{Type} = c}{
adamc@537 686 \Gamma \vdash c :: \mt{Type} \to \mt{Type}
adamc@537 687 }
adamc@537 688 \quad \infer{\Gamma \vdash \mt{class} \; x \leadsto \Gamma, x :: \mt{Type} \to \mt{Type}}{}$$
adamc@537 689
adamc@536 690 \subsection{Signature Compatibility}
adamc@536 691
adamc@537 692 To simplify the judgments in this section, we assume that all signatures are alpha-varied as necessary to avoid including mmultiple bindings for the same identifier. This is in addition to the usual alpha-variation of locally-bound variables.
adamc@537 693
adamc@537 694 We rely on a judgment $\Gamma \vdash \overline{s} \leq s'$, which expresses the occurrence in signature items $\overline{s}$ of an item compatible with $s'$. We also use a judgment $\Gamma \vdash \overline{dc} \leq \overline{dc}$, which expresses compatibility of datatype definitions.
adamc@537 695
adamc@536 696 $$\infer{\Gamma \vdash S \equiv S}{}
adamc@536 697 \quad \infer{\Gamma \vdash S_1 \equiv S_2}{
adamc@536 698 \Gamma \vdash S_2 \equiv S_1
adamc@536 699 }
adamc@536 700 \quad \infer{\Gamma \vdash X \equiv S}{
adamc@536 701 X = S \in \Gamma
adamc@536 702 }
adamc@536 703 \quad \infer{\Gamma \vdash M.X \equiv S}{
adamc@537 704 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 705 & \mt{proj}(M, \overline{s}, \mt{signature} \; X) = S
adamc@536 706 }$$
adamc@536 707
adamc@536 708 $$\infer{\Gamma \vdash S \; \mt{where} \; \mt{con} \; x = c \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa = c \; \overline{s_2} \; \mt{end}}{
adamc@536 709 \Gamma \vdash S \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa \; \overline{s_2} \; \mt{end}
adamc@536 710 & \Gamma \vdash c :: \kappa
adamc@537 711 }
adamc@537 712 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s^1} \; \mt{include} \; S \; \overline{s^2} \; \mt{end} \equiv \mt{sig} \; \overline{s^1} \; \overline{s} \; \overline{s^2} \; \mt{end}}{
adamc@537 713 \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@536 714 }$$
adamc@536 715
adamc@536 716 $$\infer{\Gamma \vdash S_1 \leq S_2}{
adamc@536 717 \Gamma \vdash S_1 \equiv S_2
adamc@536 718 }
adamc@536 719 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \mt{end}}{}
adamc@537 720 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; s' \; \overline{s'} \; \mt{end}}{
adamc@537 721 \Gamma \vdash \overline{s} \leq s'
adamc@537 722 & \Gamma \vdash s' \leadsto \Gamma'
adamc@537 723 & \Gamma' \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \overline{s'} \; \mt{end}
adamc@537 724 }$$
adamc@537 725
adamc@537 726 $$\infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537 727 \Gamma \vdash s \leq s'
adamc@537 728 }
adamc@537 729 \quad \infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537 730 \Gamma \vdash s \leadsto \Gamma'
adamc@537 731 & \Gamma' \vdash \overline{s} \leq s'
adamc@536 732 }$$
adamc@536 733
adamc@536 734 $$\infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 \leq \mt{functor} (X : S'_1) : S'_2}{
adamc@536 735 \Gamma \vdash S'_1 \leq S_1
adamc@536 736 & \Gamma, X : S'_1 \vdash S_2 \leq S'_2
adamc@536 737 }$$
adamc@536 738
adamc@537 739 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@537 740 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}
adamc@537 741 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{con} \; x :: \mt{Type}}{}$$
adamc@537 742
adamc@537 743 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(y)} \to \mt{Type}}{
adamc@537 744 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 745 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 746 }$$
adamc@537 747
adamc@537 748 $$\infer{\Gamma \vdash \mt{class} \; x \leq \mt{con} \; x :: \mt{Type} \to \mt{Type}}{}
adamc@537 749 \quad \infer{\Gamma \vdash \mt{class} \; x = c \leq \mt{con} \; x :: \mt{Type} \to \mt{Type}}{}$$
adamc@537 750
adamc@537 751 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \mt{\kappa} = c_2}{
adamc@537 752 \Gamma \vdash c_1 \equiv c_2
adamc@537 753 }
adamc@537 754 \quad \infer{\Gamma \vdash \mt{class} \; x = c_1 \leq \mt{con} \; x :: \mt{Type} \to \mt{Type} = c_2}{
adamc@537 755 \Gamma \vdash c_1 \equiv c_2
adamc@537 756 }$$
adamc@537 757
adamc@537 758 $$\infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537 759 \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537 760 }$$
adamc@537 761
adamc@537 762 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537 763 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 764 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 765 & \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537 766 }$$
adamc@537 767
adamc@537 768 $$\infer{\Gamma \vdash \cdot \leq \cdot}{}
adamc@537 769 \quad \infer{\Gamma \vdash X; \overline{dc} \leq X; \overline{dc'}}{
adamc@537 770 \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537 771 }
adamc@537 772 \quad \infer{\Gamma \vdash X \; \mt{of} \; \tau_1; \overline{dc} \leq X \; \mt{of} \; \tau_2; \overline{dc'}}{
adamc@537 773 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537 774 & \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537 775 }$$
adamc@537 776
adamc@537 777 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x = \mt{datatype} \; M'.z'}{
adamc@537 778 \Gamma \vdash M.z \equiv M'.z'
adamc@537 779 }$$
adamc@537 780
adamc@537 781 $$\infer{\Gamma \vdash \mt{val} \; x : \tau_1 \leq \mt{val} \; x : \tau_2}{
adamc@537 782 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537 783 }
adamc@537 784 \quad \infer{\Gamma \vdash \mt{structure} \; X : S_1 \leq \mt{structure} \; X : S_2}{
adamc@537 785 \Gamma \vdash S_1 \leq S_2
adamc@537 786 }
adamc@537 787 \quad \infer{\Gamma \vdash \mt{signature} \; X = S_1 \leq \mt{signature} \; X = S_2}{
adamc@537 788 \Gamma \vdash S_1 \leq S_2
adamc@537 789 & \Gamma \vdash S_2 \leq S_1
adamc@537 790 }$$
adamc@537 791
adamc@537 792 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leq \mt{constraint} \; c'_1 \sim c'_2}{
adamc@537 793 \Gamma \vdash c_1 \equiv c'_1
adamc@537 794 & \Gamma \vdash c_2 \equiv c'_2
adamc@537 795 }$$
adamc@537 796
adamc@537 797 $$\infer{\Gamma \vdash \mt{class} \; x \leq \mt{class} \; x}{}
adamc@537 798 \quad \infer{\Gamma \vdash \mt{class} \; x = c \leq \mt{class} \; x}{}
adamc@537 799 \quad \infer{\Gamma \vdash \mt{class} \; x = c_1 \leq \mt{class} \; x = c_2}{
adamc@537 800 \Gamma \vdash c_1 \equiv c_2
adamc@537 801 }$$
adamc@537 802
adamc@538 803 \subsection{Module Typing}
adamc@538 804
adamc@538 805 We use a helper function $\mt{sigOf}$, which converts declarations and sequences of declarations into their principal signature items and sequences of signature items, respectively.
adamc@538 806
adamc@538 807 $$\infer{\Gamma \vdash M : S}{
adamc@538 808 \Gamma \vdash M : S'
adamc@538 809 & \Gamma \vdash S' \leq S
adamc@538 810 }
adamc@538 811 \quad \infer{\Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \mt{sigOf}(\overline{d}) \; \mt{end}}{
adamc@538 812 \Gamma \vdash \overline{d} \leadsto \Gamma'
adamc@538 813 }
adamc@538 814 \quad \infer{\Gamma \vdash X : S}{
adamc@538 815 X : S \in \Gamma
adamc@538 816 }$$
adamc@538 817
adamc@538 818 $$\infer{\Gamma \vdash M.X : S}{
adamc@538 819 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@538 820 & \mt{proj}(M, \overline{s}, \mt{structure} \; X) = S
adamc@538 821 }$$
adamc@538 822
adamc@538 823 $$\infer{\Gamma \vdash M_1(M_2) : [X \mapsto M_2]S_2}{
adamc@538 824 \Gamma \vdash M_1 : \mt{functor}(X : S_1) : S_2
adamc@538 825 & \Gamma \vdash M_2 : S_1
adamc@538 826 }
adamc@538 827 \quad \infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 = M : \mt{functor} (X : S_1) : S_2}{
adamc@538 828 \Gamma \vdash S_1
adamc@538 829 & \Gamma, X : S_1 \vdash S_2
adamc@538 830 & \Gamma, X : S_1 \vdash M : S_2
adamc@538 831 }$$
adamc@538 832
adamc@538 833 \begin{eqnarray*}
adamc@538 834 \mt{sigOf}(\cdot) &=& \cdot \\
adamc@538 835 \mt{sigOf}(s \; \overline{s'}) &=& \mt{sigOf}(s) \; \mt{sigOf}(\overline{s'}) \\
adamc@538 836 \\
adamc@538 837 \mt{sigOf}(\mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@538 838 \mt{sigOf}(\mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \overline{dc} \\
adamc@538 839 \mt{sigOf}(\mt{datatype} \; x = \mt{datatype} \; M.z) &=& \mt{datatype} \; x = \mt{datatype} \; M.z \\
adamc@538 840 \mt{sigOf}(\mt{val} \; x : \tau = e) &=& \mt{val} \; x : \tau \\
adamc@538 841 \mt{sigOf}(\mt{val} \; \mt{rec} \; \overline{x : \tau = e}) &=& \overline{\mt{val} \; x : \tau} \\
adamc@538 842 \mt{sigOf}(\mt{structure} \; X : S = M) &=& \mt{structure} \; X : S \\
adamc@538 843 \mt{sigOf}(\mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@538 844 \mt{sigOf}(\mt{open} \; M) &=& \mt{include} \; S \textrm{ (where $\Gamma \vdash M : S$)} \\
adamc@538 845 \mt{sigOf}(\mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@538 846 \mt{sigOf}(\mt{open} \; \mt{constraints} \; M) &=& \cdot \\
adamc@538 847 \mt{sigOf}(\mt{table} \; x : c) &=& \mt{table} \; x : c \\
adamc@538 848 \mt{sigOf}(\mt{sequence} \; x) &=& \mt{sequence} \; x \\
adamc@538 849 \mt{sigOf}(\mt{cookie} \; x : \tau) &=& \mt{cookie} \; x : \tau \\
adamc@538 850 \mt{sigOf}(\mt{class} \; x = c) &=& \mt{class} \; x = c \\
adamc@538 851 \end{eqnarray*}
adamc@538 852
adamc@539 853 \begin{eqnarray*}
adamc@539 854 \mt{selfify}(M, \cdot) &=& \cdot \\
adamc@539 855 \mt{selfify}(M, s \; \overline{s'}) &=& \mt{selfify}(M, \sigma, s) \; \mt{selfify}(M, \overline{s'}) \\
adamc@539 856 \\
adamc@539 857 \mt{selfify}(M, \mt{con} \; x :: \kappa) &=& \mt{con} \; x :: \kappa = M.x \\
adamc@539 858 \mt{selfify}(M, \mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@539 859 \mt{selfify}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \mt{datatype} \; M.x \\
adamc@539 860 \mt{selfify}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z) &=& \mt{datatype} \; x = \mt{datatype} \; M'.z \\
adamc@539 861 \mt{selfify}(M, \mt{val} \; x : \tau) &=& \mt{val} \; x : \tau \\
adamc@539 862 \mt{selfify}(M, \mt{structure} \; X : S) &=& \mt{structure} \; X : \mt{selfify}(M.X, \overline{s}) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}$)} \\
adamc@539 863 \mt{selfify}(M, \mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@539 864 \mt{selfify}(M, \mt{include} \; S) &=& \mt{include} \; S \\
adamc@539 865 \mt{selfify}(M, \mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@539 866 \mt{selfify}(M, \mt{class} \; x) &=& \mt{class} \; x = M.x \\
adamc@539 867 \mt{selfify}(M, \mt{class} \; x = c) &=& \mt{class} \; x = c \\
adamc@539 868 \end{eqnarray*}
adamc@539 869
adamc@540 870 \subsection{Module Projection}
adamc@540 871
adamc@540 872 \begin{eqnarray*}
adamc@540 873 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \\
adamc@540 874 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa, c) \\
adamc@540 875 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{con} \; x) &=& \mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type} \\
adamc@540 876 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& (\mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type}, M'.z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 877 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})$)} \\
adamc@540 878 \mt{proj}(M, \mt{class} \; x \; \overline{s}, \mt{con} \; x) &=& \mt{Type} \to \mt{Type} \\
adamc@540 879 \mt{proj}(M, \mt{class} \; x = c \; \overline{s}, \mt{con} \; x) &=& (\mt{Type} \to \mt{Type}, c) \\
adamc@540 880 \\
adamc@540 881 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{datatype} \; x) &=& (\overline{y}, \overline{dc}) \\
adamc@540 882 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& \mt{proj}(M', \overline{s'}, \mt{datatype} \; z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540 883 \\
adamc@540 884 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, \mt{val} \; x) &=& \tau \\
adamc@540 885 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $X \in \overline{dc}$)} \\
adamc@540 886 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540 887 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 888 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \in \overline{dc}$)} \\
adamc@540 889 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 890 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X : \tau \in \overline{dc}$)} \\
adamc@540 891 \\
adamc@540 892 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, \mt{structure} \; X) &=& S \\
adamc@540 893 \\
adamc@540 894 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, \mt{signature} \; X) &=& S \\
adamc@540 895 \\
adamc@540 896 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 897 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 898 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 899 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 900 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540 901 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540 902 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540 903 \mt{proj}(M, \mt{include} \; S \; \overline{s}, V) &=& \mt{proj}(M, \overline{s'} \; \overline{s}, V) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540 904 \mt{proj}(M, \mt{constraint} \; c_1 \sim c_2 \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540 905 \mt{proj}(M, \mt{class} \; x \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 906 \mt{proj}(M, \mt{class} \; x = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 907 \end{eqnarray*}
adamc@540 908
adamc@541 909
adamc@541 910 \section{Type Inference}
adamc@541 911
adamc@541 912 The Ur/Web compiler uses \emph{heuristic type inference}, with no claims of completeness with respect to the declarative specification of the last section. The rules in use seem to work well in practice. This section summarizes those rules, to help Ur programmers predict what will work and what won't.
adamc@541 913
adamc@541 914 \subsection{Basic Unification}
adamc@541 915
adamc@541 916 Type-checkers for languages based on the Hindly-Milner type discipline, like ML and Haskell, take advantage of \emph{principal typing} properties, making complete type inference relatively straightforward. Inference algorithms are traditionally implemented using type unification variables, at various points asserting equalities between types, in the process discovering the values of type variables. The Ur/Web compiler uses the same basic strategy, but the complexity of the type system rules out easy completeness.
adamc@541 917
adamc@541 918 Type-checking can require evaluating recursive functional programs, thanks to the type-level $\mt{fold}$ operator. When a unification variable appears in such a type, the next step of computation can be undetermined. The value of that variable might be determined later, but this would be ``too late'' for the unification problems generated at the first occurrence. This is the essential source of incompletness.
adamc@541 919
adamc@541 920 Nonetheless, the unification engine tends to do reasonably well. Unlike in ML, polymorphism is never inferred in definitions; it must be indicated explicitly by writing out constructor-level parameters. By writing these and other annotations, the programmer can generally get the type inference engine to do most of the type reconstruction work.
adamc@541 921
adamc@541 922 \subsection{Unifying Record Types}
adamc@541 923
adamc@541 924 The type inference engine tries to take advantage of the algebraic rules governing type-level records, as shown in Section \ref{definitional}. When two constructors of record kind are unified, they are reduce to normal forms, with like terms crossed off from each normal form until, hopefully, nothing remains. This cannot be complete, with the inclusion of unification variables. The type-checker can help you understand what goes wrong when the process fails, as it outputs the unmatched remainders of the two normal forms.
adamc@541 925
adamc@541 926 \subsection{\label{typeclasses}Type Classes}
adamc@541 927
adamc@541 928 Ur includes a type class facility inspired by Haskell's. The current version is very rudimentary, only supporting instances for particular types built up from abstract types and datatypes and type-level application.
adamc@541 929
adamc@541 930 Type classes are integrated with the module system. A type class is just a constructor of kind $\mt{Type} \to \mt{Type}$. By marking such a constructor $c$ as a type class, the programmer instructs the type inference engine to, in each scope, record all values of types $c \; \tau$ as \emph{instances}. Any function argument whose type is of such a form is treated as implicit, to be determined by examining the current instance database.
adamc@541 931
adamc@541 932 The ``dictionary encoding'' often used in Haskell implementations is made explicit in Ur. Type class instances are just properly-typed values, and they can also be considered as ``proofs'' of membership in the class. In some cases, it is useful to pass these proofs around explicitly. An underscore written where a proof is expected will also be inferred, if possible, from the current instance database.
adamc@541 933
adamc@541 934 Just as for types, type classes may be exported from modules, and they may be exported as concrete or abstract. Concrete type classes have their ``real'' definitions exposed, so that client code may add new instances freely. Abstract type classes are useful as ``predicates'' that can be used to enforce invariants, as we will see in some definitions of SQL syntax in the Ur/Web standard library.
adamc@541 935
adamc@541 936 \subsection{Reverse-Engineering Record Types}
adamc@541 937
adamc@541 938 It's useful to write Ur functions and functors that take record constructors as inputs, but these constructors can grow quite long, even though their values are often implied by other arguments. The compiler uses a simple heuristic to infer the values of unification variables that are folded over, yielding known results. Often, as in the case of $\mt{map}$-like folds, the base and recursive cases of a fold produce constructors with different top-level structure. Thus, if the result of the fold is known, examining its top-level structure reveals whether the record being folded over is empty or not. If it's empty, we're done; if it's not empty, we replace a single unification variable with a new constructor formed from three new unification variables, as in $[\alpha = \beta] \rc \gamma$. This process can often be repeated to determine a unification variable fully.
adamc@541 939
adamc@541 940 \subsection{Implicit Arguments in Functor Applications}
adamc@541 941
adamc@541 942 Constructor, constraint, and type class witness members of structures may be omitted, when those structures are used in contexts where their assigned signatures imply how to fill in those missing members. This feature combines well with reverse-engineering to allow for uses of complicated meta-programming functors with little more code than would be necessary to invoke an untyped, ad-hoc code generator.
adamc@541 943
adamc@541 944
adamc@542 945 \section{The Ur Standard Library}
adamc@542 946
adamc@542 947 The built-in parts of the Ur/Web standard library are described by the signature in \texttt{lib/basis.urs} in the distribution. A module $\mt{Basis}$ ascribing to that signature is available in the initial environment, and every program is implicitly prefixed by $\mt{open} \; \mt{Basis}$.
adamc@542 948
adamc@542 949 Additionally, other common functions that are definable within Ur are included in \texttt{lib/top.urs} and \texttt{lib/top.ur}. This $\mt{Top}$ module is also opened implicitly.
adamc@542 950
adamc@542 951 The idea behind Ur is to serve as the ideal host for embedded domain-specific languages. For now, however, the ``generic'' functionality is intermixed with Ur/Web-specific functionality, including in these two library modules. We hope that these generic library components have types that speak for themselves. The next section introduces the Ur/Web-specific elements. Here, we only give the type declarations from the beginning of $\mt{Basis}$.
adamc@542 952 $$\begin{array}{l}
adamc@542 953 \mt{type} \; \mt{int} \\
adamc@542 954 \mt{type} \; \mt{float} \\
adamc@542 955 \mt{type} \; \mt{string} \\
adamc@542 956 \mt{type} \; \mt{time} \\
adamc@542 957 \\
adamc@542 958 \mt{type} \; \mt{unit} = \{\} \\
adamc@542 959 \\
adamc@542 960 \mt{datatype} \; \mt{bool} = \mt{False} \mid \mt{True} \\
adamc@542 961 \\
adamc@542 962 \mt{datatype} \; \mt{option} \; \mt{t} = \mt{None} \mid \mt{Some} \; \mt{of} \; \mt{t}
adamc@542 963 \end{array}$$
adamc@542 964
adamc@542 965
adamc@542 966 \section{The Ur/Web Standard Library}
adamc@542 967
adamc@542 968 \subsection{Transactions}
adamc@542 969
adamc@542 970 Ur is a pure language; we use Haskell's trick to support controlled side effects. The standard library defines a monad $\mt{transaction}$, meant to stand for actions that may be undone cleanly. By design, no other kinds of actions are supported.
adamc@542 971 $$\begin{array}{l}
adamc@542 972 \mt{con} \; \mt{transaction} :: \mt{Type} \to \mt{Type} \\
adamc@542 973 \\
adamc@542 974 \mt{val} \; \mt{return} : \mt{t} ::: \mt{Type} \to \mt{t} \to \mt{transaction} \; \mt{t} \\
adamc@542 975 \mt{val} \; \mt{bind} : \mt{t_1} ::: \mt{Type} \to \mt{t_2} ::: \mt{Type} \to \mt{transaction} \; \mt{t_1} \to (\mt{t_1} \to \mt{transaction} \; \mt{t_2}) \to \mt{transaction} \; \mt{t_2}
adamc@542 976 \end{array}$$
adamc@542 977
adamc@542 978 \subsection{HTTP}
adamc@542 979
adamc@542 980 There are transactions for reading an HTTP header by name and for getting and setting strongly-typed cookies. Cookies may only be created by the $\mt{cookie}$ declaration form, ensuring that they be named consistently based on module structure.
adamc@542 981 $$\begin{array}{l}
adamc@542 982 \mt{val} \; \mt{requestHeader} : \mt{string} \to \mt{transaction} \; (\mt{option} \; \mt{string}) \\
adamc@542 983 \\
adamc@542 984 \mt{con} \; \mt{http\_cookie} :: \mt{Type} \to \mt{Type} \\
adamc@542 985 \mt{val} \; \mt{getCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{transaction} \; (\mt{option} \; \mt{t}) \\
adamc@542 986 \mt{val} \; \mt{setCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit}
adamc@542 987 \end{array}$$
adamc@542 988
adamc@543 989 \subsection{SQL}
adamc@543 990
adamc@543 991 The fundamental unit of interest in the embedding of SQL is tables, described by a type family and creatable only via the $\mt{table}$ declaration form.
adamc@543 992 $$\begin{array}{l}
adamc@543 993 \mt{con} \; \mt{sql\_table} :: \{\mt{Type}\} \to \mt{Type}
adamc@543 994 \end{array}$$
adamc@543 995
adamc@543 996 \subsubsection{Queries}
adamc@543 997
adamc@543 998 A final query is constructed via the $\mt{sql\_query}$ function. Constructor arguments respectively specify the table fields we select (as records mapping tables to the subsets of their fields that we choose) and the (always named) extra expressions that we select.
adamc@543 999 $$\begin{array}{l}
adamc@543 1000 \mt{con} \; \mt{sql\_query} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543 1001 \mt{val} \; \mt{sql\_query} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543 1002 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1003 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1004 \hspace{.1in} \to \{\mt{Rows} : \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}, \\
adamc@543 1005 \hspace{.2in} \mt{OrderBy} : \mt{sql\_order\_by} \; \mt{tables} \; \mt{selectedExps}, \\
adamc@543 1006 \hspace{.2in} \mt{Limit} : \mt{sql\_limit}, \\
adamc@543 1007 \hspace{.2in} \mt{Offset} : \mt{sql\_offset}\} \\
adamc@543 1008 \hspace{.1in} \to \mt{sql\_query} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1009 \end{array}$$
adamc@543 1010
adamc@545 1011 Queries are used by folding over their results inside transactions.
adamc@545 1012 $$\begin{array}{l}
adamc@545 1013 \mt{val} \; \mt{query} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \lambda [\mt{tables} \sim \mt{exps}] \Rightarrow \mt{state} ::: \mt{Type} \to \mt{sql\_query} \; \mt{tables} \; \mt{exps} \\
adamc@545 1014 \hspace{.1in} \to (\$(\mt{exps} \rc \mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: \{\mt{Type}\}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \$\mt{fields}] \rc \mt{acc}) \; [] \; \mt{tables}) \\
adamc@545 1015 \hspace{.2in} \to \mt{state} \to \mt{transaction} \; \mt{state}) \\
adamc@545 1016 \hspace{.1in} \to \mt{state} \to \mt{transaction} \; \mt{state}
adamc@545 1017 \end{array}$$
adamc@545 1018
adamc@543 1019 Most of the complexity of the query encoding is in the type $\mt{sql\_query1}$, which includes simple queries and derived queries based on relational operators. Constructor arguments respectively specify the tables we select from, the subset of fields that we keep from each table for the result rows, and the extra expressions that we select.
adamc@543 1020 $$\begin{array}{l}
adamc@543 1021 \mt{con} \; \mt{sql\_query1} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543 1022 \\
adamc@543 1023 \mt{type} \; \mt{sql\_relop} \\
adamc@543 1024 \mt{val} \; \mt{sql\_union} : \mt{sql\_relop} \\
adamc@543 1025 \mt{val} \; \mt{sql\_intersect} : \mt{sql\_relop} \\
adamc@543 1026 \mt{val} \; \mt{sql\_except} : \mt{sql\_relop} \\
adamc@543 1027 \mt{val} \; \mt{sql\_relop} : \mt{tables1} ::: \{\{\mt{Type}\}\} \\
adamc@543 1028 \hspace{.1in} \to \mt{tables2} ::: \{\{\mt{Type}\}\} \\
adamc@543 1029 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1030 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1031 \hspace{.1in} \to \mt{sql\_relop} \\
adamc@543 1032 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables1} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543 1033 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables2} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543 1034 \hspace{.1in} \to \mt{sql\_query1} \; \mt{selectedFields} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1035 \end{array}$$
adamc@543 1036
adamc@543 1037 $$\begin{array}{l}
adamc@543 1038 \mt{val} \; \mt{sql\_query1} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543 1039 \hspace{.1in} \to \mt{grouped} ::: \{\{\mt{Type}\}\} \\
adamc@543 1040 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1041 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1042 \hspace{.1in} \to \{\mt{From} : \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: \{\mt{Type}\}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_table} \; \mt{fields}] \rc \mt{acc}) \; [] \; \mt{tables}), \\
adamc@543 1043 \hspace{.2in} \mt{Where} : \mt{sql\_exp} \; \mt{tables} \; [] \; [] \; \mt{bool}, \\
adamc@543 1044 \hspace{.2in} \mt{GroupBy} : \mt{sql\_subset} \; \mt{tables} \; \mt{grouped}, \\
adamc@543 1045 \hspace{.2in} \mt{Having} : \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{bool}, \\
adamc@543 1046 \hspace{.2in} \mt{SelectFields} : \mt{sql\_subset} \; \mt{grouped} \; \mt{selectedFields}, \\
adamc@543 1047 \hspace{.2in} \mt {SelectExps} : \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{selectedExps}) \} \\
adamc@543 1048 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1049 \end{array}$$
adamc@543 1050
adamc@543 1051 To encode projection of subsets of fields in $\mt{SELECT}$ clauses, and to encode $\mt{GROUP} \; \mt{BY}$ clauses, we rely on a type family $\mt{sql\_subset}$, capturing what it means for one record of table fields to be a subset of another. The main constructor $\mt{sql\_subset}$ ``proves subset facts'' by requiring a split of a record into kept and dropped parts. The extra constructor $\mt{sql\_subset\_all}$ is a convenience for keeping all fields of a record.
adamc@543 1052 $$\begin{array}{l}
adamc@543 1053 \mt{con} \; \mt{sql\_subset} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@543 1054 \mt{val} \; \mt{sql\_subset} : \mt{keep\_drop} :: \{(\{\mt{Type}\} \times \{\mt{Type}\})\} \\
adamc@543 1055 \hspace{.1in} \to \mt{sql\_subset} \\
adamc@543 1056 \hspace{.2in} (\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\})) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \; [\mt{fields}.1 \sim \mt{fields}.2] \Rightarrow \\
adamc@543 1057 \hspace{.3in} [\mt{nm} = \mt{fields}.1 \rc \mt{fields}.2] \rc \mt{acc}) \; [] \; \mt{keep\_drop}) \\
adamc@543 1058 \hspace{.2in} (\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\})) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{fields}.1] \rc \mt{acc}) \; [] \; \mt{keep\_drop}) \\
adamc@543 1059 \mt{val} \; \mt{sql\_subset\_all} : \mt{tables} :: \{\{\mt{Type}\}\} \to \mt{sql\_subset} \; \mt{tables} \; \mt{tables}
adamc@543 1060 \end{array}$$
adamc@543 1061
adamc@543 1062 SQL expressions are used in several places, including $\mt{SELECT}$, $\mt{WHERE}$, $\mt{HAVING}$, and $\mt{ORDER} \; \mt{BY}$ clauses. They reify a fragment of the standard SQL expression language, while making it possible to inject ``native'' Ur values in some places. The arguments to the $\mt{sql\_exp}$ type family respectively give the unrestricted-availablity table fields, the table fields that may only be used in arguments to aggregate functions, the available selected expressions, and the type of the expression.
adamc@543 1063 $$\begin{array}{l}
adamc@543 1064 \mt{con} \; \mt{sql\_exp} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \to \mt{Type}
adamc@543 1065 \end{array}$$
adamc@543 1066
adamc@543 1067 Any field in scope may be converted to an expression.
adamc@543 1068 $$\begin{array}{l}
adamc@543 1069 \mt{val} \; \mt{sql\_field} : \mt{otherTabs} ::: \{\{\mt{Type}\}\} \to \mt{otherFields} ::: \{\mt{Type}\} \\
adamc@543 1070 \hspace{.1in} \to \mt{fieldType} ::: \mt{Type} \to \mt{agg} ::: \{\{\mt{Type}\}\} \\
adamc@543 1071 \hspace{.1in} \to \mt{exps} ::: \{\mt{Type}\} \\
adamc@543 1072 \hspace{.1in} \to \mt{tab} :: \mt{Name} \to \mt{field} :: \mt{Name} \\
adamc@543 1073 \hspace{.1in} \to \mt{sql\_exp} \; ([\mt{tab} = [\mt{field} = \mt{fieldType}] \rc \mt{otherFields}] \rc \mt{otherTabs}) \; \mt{agg} \; \mt{exps} \; \mt{fieldType}
adamc@543 1074 \end{array}$$
adamc@543 1075
adamc@544 1076 There is an analogous function for referencing named expressions.
adamc@544 1077 $$\begin{array}{l}
adamc@544 1078 \mt{val} \; \mt{sql\_exp} : \mt{tabs} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{t} ::: \mt{Type} \to \mt{rest} ::: \{\mt{Type}\} \to \mt{nm} :: \mt{Name} \\
adamc@544 1079 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tabs} \; \mt{agg} \; ([\mt{nm} = \mt{t}] \rc \mt{rest}) \; \mt{t}
adamc@544 1080 \end{array}$$
adamc@544 1081
adamc@544 1082 Ur values of appropriate types may be injected into SQL expressions.
adamc@544 1083 $$\begin{array}{l}
adamc@544 1084 \mt{class} \; \mt{sql\_injectable} \\
adamc@544 1085 \mt{val} \; \mt{sql\_bool} : \mt{sql\_injectable} \; \mt{bool} \\
adamc@544 1086 \mt{val} \; \mt{sql\_int} : \mt{sql\_injectable} \; \mt{int} \\
adamc@544 1087 \mt{val} \; \mt{sql\_float} : \mt{sql\_injectable} \; \mt{float} \\
adamc@544 1088 \mt{val} \; \mt{sql\_string} : \mt{sql\_injectable} \; \mt{string} \\
adamc@544 1089 \mt{val} \; \mt{sql\_time} : \mt{sql\_injectable} \; \mt{time} \\
adamc@544 1090 \mt{val} \; \mt{sql\_option\_bool} : \mt{sql\_injectable} \; (\mt{option} \; \mt{bool}) \\
adamc@544 1091 \mt{val} \; \mt{sql\_option\_int} : \mt{sql\_injectable} \; (\mt{option} \; \mt{int}) \\
adamc@544 1092 \mt{val} \; \mt{sql\_option\_float} : \mt{sql\_injectable} \; (\mt{option} \; \mt{float}) \\
adamc@544 1093 \mt{val} \; \mt{sql\_option\_string} : \mt{sql\_injectable} \; (\mt{option} \; \mt{string}) \\
adamc@544 1094 \mt{val} \; \mt{sql\_option\_time} : \mt{sql\_injectable} \; (\mt{option} \; \mt{time}) \\
adamc@544 1095 \mt{val} \; \mt{sql\_inject} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@544 1096 \hspace{.1in} \to \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544 1097 \end{array}$$
adamc@544 1098
adamc@544 1099 We have the SQL nullness test, which is necessary because of the strange SQL semantics of equality in the presence of null values.
adamc@544 1100 $$\begin{array}{l}
adamc@544 1101 \mt{val} \; \mt{sql\_is\_null} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1102 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; (\mt{option} \; \mt{t}) \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544 1103 \end{array}$$
adamc@544 1104
adamc@544 1105 We have generic nullary, unary, and binary operators, as well as comparison operators.
adamc@544 1106 $$\begin{array}{l}
adamc@544 1107 \mt{con} \; \mt{sql\_nfunc} :: \mt{Type} \to \mt{Type} \\
adamc@544 1108 \mt{val} \; \mt{sql\_current\_timestamp} : \mt{sql\_nfunc} \; \mt{time} \\
adamc@544 1109 \mt{val} \; \mt{sql\_nfunc} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1110 \hspace{.1in} \to \mt{sql\_nfunc} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \\\end{array}$$
adamc@544 1111
adamc@544 1112 $$\begin{array}{l}
adamc@544 1113 \mt{con} \; \mt{sql\_unary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544 1114 \mt{val} \; \mt{sql\_not} : \mt{sql\_unary} \; \mt{bool} \; \mt{bool} \\
adamc@544 1115 \mt{val} \; \mt{sql\_unary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544 1116 \hspace{.1in} \to \mt{sql\_unary} \; \mt{arg} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res} \\
adamc@544 1117 \end{array}$$
adamc@544 1118
adamc@544 1119 $$\begin{array}{l}
adamc@544 1120 \mt{con} \; \mt{sql\_binary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544 1121 \mt{val} \; \mt{sql\_and} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544 1122 \mt{val} \; \mt{sql\_or} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544 1123 \mt{val} \; \mt{sql\_binary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg_1} ::: \mt{Type} \to \mt{arg_2} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544 1124 \hspace{.1in} \to \mt{sql\_binary} \; \mt{arg_1} \; \mt{arg_2} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_1} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_2} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res}
adamc@544 1125 \end{array}$$
adamc@544 1126
adamc@544 1127 $$\begin{array}{l}
adamc@544 1128 \mt{type} \; \mt{sql\_comparison} \\
adamc@544 1129 \mt{val} \; \mt{sql\_eq} : \mt{sql\_comparison} \\
adamc@544 1130 \mt{val} \; \mt{sql\_ne} : \mt{sql\_comparison} \\
adamc@544 1131 \mt{val} \; \mt{sql\_lt} : \mt{sql\_comparison} \\
adamc@544 1132 \mt{val} \; \mt{sql\_le} : \mt{sql\_comparison} \\
adamc@544 1133 \mt{val} \; \mt{sql\_gt} : \mt{sql\_comparison} \\
adamc@544 1134 \mt{val} \; \mt{sql\_ge} : \mt{sql\_comparison} \\
adamc@544 1135 \mt{val} \; \mt{sql\_comparison} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1136 \hspace{.1in} \to \mt{sql\_comparison} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544 1137 \end{array}$$
adamc@544 1138
adamc@544 1139 Finally, we have aggregate functions. The $\mt{COUNT(\ast)}$ syntax is handled specially, since it takes no real argument. The other aggregate functions are placed into a general type family, using type classes to restrict usage to properly-typed arguments. The key aspect of the $\mt{sql\_aggregate}$ function's type is the shift of aggregate-function-only fields into unrestricted fields.
adamc@544 1140
adamc@544 1141 $$\begin{array}{l}
adamc@544 1142 \mt{val} \; \mt{sql\_count} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{int}
adamc@544 1143 \end{array}$$
adamc@544 1144
adamc@544 1145 $$\begin{array}{l}
adamc@544 1146 \mt{con} \; \mt{sql\_aggregate} :: \mt{Type} \to \mt{Type} \\
adamc@544 1147 \mt{val} \; \mt{sql\_aggregate} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1148 \hspace{.1in} \to \mt{sql\_aggregate} \; \mt{t} \to \mt{sql\_exp} \; \mt{agg} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544 1149 \end{array}$$
adamc@544 1150
adamc@544 1151 $$\begin{array}{l}
adamc@544 1152 \mt{class} \; \mt{sql\_summable} \\
adamc@544 1153 \mt{val} \; \mt{sql\_summable\_int} : \mt{sql\_summable} \; \mt{int} \\
adamc@544 1154 \mt{val} \; \mt{sql\_summable\_float} : \mt{sql\_summable} \; \mt{float} \\
adamc@544 1155 \mt{val} \; \mt{sql\_avg} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544 1156 \mt{val} \; \mt{sql\_sum} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544 1157 \end{array}$$
adamc@544 1158
adamc@544 1159 $$\begin{array}{l}
adamc@544 1160 \mt{class} \; \mt{sql\_maxable} \\
adamc@544 1161 \mt{val} \; \mt{sql\_maxable\_int} : \mt{sql\_maxable} \; \mt{int} \\
adamc@544 1162 \mt{val} \; \mt{sql\_maxable\_float} : \mt{sql\_maxable} \; \mt{float} \\
adamc@544 1163 \mt{val} \; \mt{sql\_maxable\_string} : \mt{sql\_maxable} \; \mt{string} \\
adamc@544 1164 \mt{val} \; \mt{sql\_maxable\_time} : \mt{sql\_maxable} \; \mt{time} \\
adamc@544 1165 \mt{val} \; \mt{sql\_max} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544 1166 \mt{val} \; \mt{sql\_min} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544 1167 \end{array}$$
adamc@544 1168
adamc@544 1169 We wrap up the definition of query syntax with the types used in representing $\mt{ORDER} \; \mt{BY}$, $\mt{LIMIT}$, and $\mt{OFFSET}$ clauses.
adamc@544 1170 $$\begin{array}{l}
adamc@544 1171 \mt{type} \; \mt{sql\_direction} \\
adamc@544 1172 \mt{val} \; \mt{sql\_asc} : \mt{sql\_direction} \\
adamc@544 1173 \mt{val} \; \mt{sql\_desc} : \mt{sql\_direction} \\
adamc@544 1174 \\
adamc@544 1175 \mt{con} \; \mt{sql\_order\_by} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@544 1176 \mt{val} \; \mt{sql\_order\_by\_Nil} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} :: \{\mt{Type}\} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544 1177 \mt{val} \; \mt{sql\_order\_by\_Cons} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1178 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; [] \; \mt{exps} \; \mt{t} \to \mt{sql\_direction} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544 1179 \\
adamc@544 1180 \mt{type} \; \mt{sql\_limit} \\
adamc@544 1181 \mt{val} \; \mt{sql\_no\_limit} : \mt{sql\_limit} \\
adamc@544 1182 \mt{val} \; \mt{sql\_limit} : \mt{int} \to \mt{sql\_limit} \\
adamc@544 1183 \\
adamc@544 1184 \mt{type} \; \mt{sql\_offset} \\
adamc@544 1185 \mt{val} \; \mt{sql\_no\_offset} : \mt{sql\_offset} \\
adamc@544 1186 \mt{val} \; \mt{sql\_offset} : \mt{int} \to \mt{sql\_offset}
adamc@544 1187 \end{array}$$
adamc@544 1188
adamc@545 1189
adamc@545 1190 \subsubsection{DML}
adamc@545 1191
adamc@545 1192 The Ur/Web library also includes an embedding of a fragment of SQL's DML, the Data Manipulation Language, for modifying database tables. Any piece of DML may be executed in a transaction.
adamc@545 1193
adamc@545 1194 $$\begin{array}{l}
adamc@545 1195 \mt{type} \; \mt{dml} \\
adamc@545 1196 \mt{val} \; \mt{dml} : \mt{dml} \to \mt{transaction} \; \mt{unit}
adamc@545 1197 \end{array}$$
adamc@545 1198
adamc@545 1199 Properly-typed records may be used to form $\mt{INSERT}$ commands.
adamc@545 1200 $$\begin{array}{l}
adamc@545 1201 \mt{val} \; \mt{insert} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \\
adamc@545 1202 \hspace{.1in} \to \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; [] \; [] \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{fields}) \to \mt{dml}
adamc@545 1203 \end{array}$$
adamc@545 1204
adamc@545 1205 An $\mt{UPDATE}$ command is formed from a choice of which table fields to leave alone and which to change, along with an expression to use to compute the new value of each changed field and a $\mt{WHERE}$ clause.
adamc@545 1206 $$\begin{array}{l}
adamc@545 1207 \mt{val} \; \mt{update} : \mt{unchanged} ::: \{\mt{Type}\} \to \mt{changed} :: \{\mt{Type}\} \to \lambda [\mt{changed} \sim \mt{unchanged}] \\
adamc@545 1208 \hspace{.1in} \Rightarrow \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{changed}) \\
adamc@545 1209 \hspace{.1in} \to \mt{sql\_table} \; (\mt{changed} \rc \mt{unchanged}) \to \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545 1210 \end{array}$$
adamc@545 1211
adamc@545 1212 A $\mt{DELETE}$ command is formed from a table and a $\mt{WHERE}$ clause.
adamc@545 1213 $$\begin{array}{l}
adamc@545 1214 \mt{val} \; \mt{delete} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \to \mt{sql\_exp} \; [\mt{T} = \mt{fields}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545 1215 \end{array}$$
adamc@545 1216
adamc@546 1217 \subsubsection{Sequences}
adamc@546 1218
adamc@546 1219 SQL sequences are counters with concurrency control, often used to assign unique IDs. Ur/Web supports them via a simple interface. The only way to create a sequence is with the $\mt{sequence}$ declaration form.
adamc@546 1220
adamc@546 1221 $$\begin{array}{l}
adamc@546 1222 \mt{type} \; \mt{sql\_sequence} \\
adamc@546 1223 \mt{val} \; \mt{nextval} : \mt{sql\_sequence} \to \mt{transaction} \; \mt{int}
adamc@546 1224 \end{array}$$
adamc@546 1225
adamc@546 1226
adamc@547 1227 \subsection{XML}
adamc@547 1228
adamc@547 1229 Ur/Web's library contains an encoding of XML syntax and semantic constraints. We make no effort to follow the standards governing XML schemas. Rather, XML fragments are viewed more as values of ML datatypes, and we only track which tags are allowed inside which other tags.
adamc@547 1230
adamc@547 1231 The basic XML type family has arguments respectively indicating the \emph{context} of a fragment, the fields that the fragment expects to be bound on entry (and their types), and the fields that the fragment will bind (and their types). Contexts are a record-based ``poor man's subtyping'' encoding, with each possible set of valid tags corresponding to a different context record. The arguments dealing with field binding are only relevant to HTML forms.
adamc@547 1232 $$\begin{array}{l}
adamc@547 1233 \mt{con} \; \mt{xml} :: \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547 1234 \end{array}$$
adamc@547 1235
adamc@547 1236 We also have a type family of XML tags, indexed respectively by the record of optional attributes accepted by the tag, the context in which the tag may be placed, the context required of children of the tag, which form fields the tag uses, and which fields the tag defines.
adamc@547 1237 $$\begin{array}{l}
adamc@547 1238 \mt{con} \; \mt{tag} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547 1239 \end{array}$$
adamc@547 1240
adamc@547 1241 Literal text may be injected into XML as ``CDATA.''
adamc@547 1242 $$\begin{array}{l}
adamc@547 1243 \mt{val} \; \mt{cdata} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use} ::: \{\mt{Type}\} \to \mt{string} \to \mt{xml} \; \mt{ctx} \; \mt{use} \; []
adamc@547 1244 \end{array}$$
adamc@547 1245
adamc@547 1246 There is a function for producing an XML tree with a particular tag at its root.
adamc@547 1247 $$\begin{array}{l}
adamc@547 1248 \mt{val} \; \mt{tag} : \mt{attrsGiven} ::: \{\mt{Type}\} \to \mt{attrsAbsent} ::: \{\mt{Type}\} \to \mt{ctxOuter} ::: \{\mt{Unit}\} \to \mt{ctxInner} ::: \{\mt{Unit}\} \\
adamc@547 1249 \hspace{.1in} \to \mt{useOuter} ::: \{\mt{Type}\} \to \mt{useInner} ::: \{\mt{Type}\} \to \mt{bindOuter} ::: \{\mt{Type}\} \to \mt{bindInner} ::: \{\mt{Type}\} \\
adamc@547 1250 \hspace{.1in} \to \lambda [\mt{attrsGiven} \sim \mt{attrsAbsent}] \; [\mt{useOuter} \sim \mt{useInner}] \; [\mt{bindOuter} \sim \mt{bindInner}] \Rightarrow \$\mt{attrsGiven} \\
adamc@547 1251 \hspace{.1in} \to \mt{tag} \; (\mt{attrsGiven} \rc \mt{attrsAbsent}) \; \mt{ctxOuter} \; \mt{ctxInner} \; \mt{useOuter} \; \mt{bindOuter} \\
adamc@547 1252 \hspace{.1in} \to \mt{xml} \; \mt{ctxInner} \; \mt{useInner} \; \mt{bindInner} \to \mt{xml} \; \mt{ctxOuter} \; (\mt{useOuter} \rc \mt{useInner}) \; (\mt{bindOuter} \rc \mt{bindInner})
adamc@547 1253 \end{array}$$
adamc@547 1254
adamc@547 1255 Two XML fragments may be concatenated.
adamc@547 1256 $$\begin{array}{l}
adamc@547 1257 \mt{val} \; \mt{join} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{bind_1} ::: \{\mt{Type}\} \to \mt{bind_2} ::: \{\mt{Type}\} \\
adamc@547 1258 \hspace{.1in} \to \lambda [\mt{use_1} \sim \mt{bind_1}] \; [\mt{bind_1} \sim \mt{bind_2}] \\
adamc@547 1259 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind_1} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{bind_1}) \; \mt{bind_2} \to \mt{xml} \; \mt{ctx} \; \mt{use_1} \; (\mt{bind_1} \rc \mt{bind_2})
adamc@547 1260 \end{array}$$
adamc@547 1261
adamc@547 1262 Finally, any XML fragment may be updated to ``claim'' to use more form fields than it does.
adamc@547 1263 $$\begin{array}{l}
adamc@547 1264 \mt{val} \; \mt{useMore} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{use_2} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \lambda [\mt{use_1} \sim \mt{use_2}] \\
adamc@547 1265 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{use_2}) \; \mt{bind}
adamc@547 1266 \end{array}$$
adamc@547 1267
adamc@547 1268 We will not list here the different HTML tags and related functions from the standard library. They should be easy enough to understand from the code in \texttt{basis.urs}. The set of tags in the library is not yet claimed to be complete for HTML standards.
adamc@547 1269
adamc@547 1270 One last useful function is for aborting any page generation, returning some XML as an error message. This function takes the place of some uses of a general exception mechanism.
adamc@547 1271 $$\begin{array}{l}
adamc@547 1272 \mt{val} \; \mt{error} : \mt{t} ::: \mt{Type} \to \mt{xml} \; [\mt{Body}] \; [] \; [] \to \mt{t}
adamc@547 1273 \end{array}$$
adamc@547 1274
adamc@549 1275
adamc@549 1276 \section{Ur/Web Syntax Extensions}
adamc@549 1277
adamc@549 1278 Ur/Web features some syntactic shorthands for building values using the functions from the last section. This section sketches the grammar of those extensions. We write spans of syntax inside brackets to indicate that they are optional.
adamc@549 1279
adamc@549 1280 \subsection{SQL}
adamc@549 1281
adamc@549 1282 \subsubsection{Queries}
adamc@549 1283
adamc@550 1284 Queries $Q$ are added to the rules for expressions $e$.
adamc@550 1285
adamc@549 1286 $$\begin{array}{rrcll}
adamc@550 1287 \textrm{Queries} & Q &::=& (q \; [\mt{ORDER} \; \mt{BY} \; (E \; [o],)^+] \; [\mt{LIMIT} \; N] \; [\mt{OFFSET} \; N]) \\
adamc@549 1288 \textrm{Pre-queries} & q &::=& \mt{SELECT} \; P \; \mt{FROM} \; T,^+ \; [\mt{WHERE} \; E] \; [\mt{GROUP} \; \mt{BY} \; p,^+] \; [\mt{HAVING} \; E] \\
adamc@549 1289 &&& \mid q \; R \; q \\
adamc@549 1290 \textrm{Relational operators} & R &::=& \mt{UNION} \mid \mt{INTERSECT} \mid \mt{EXCEPT}
adamc@549 1291 \end{array}$$
adamc@549 1292
adamc@549 1293 $$\begin{array}{rrcll}
adamc@549 1294 \textrm{Projections} & P &::=& \ast & \textrm{all columns} \\
adamc@549 1295 &&& p,^+ & \textrm{particular columns} \\
adamc@549 1296 \textrm{Pre-projections} & p &::=& t.f & \textrm{one column from a table} \\
adamc@549 1297 &&& t.\{\{c\}\} & \textrm{a record of colums from a table (of kind $\{\mt{Type}\}$)} \\
adamc@549 1298 \textrm{Table names} & t &::=& x & \textrm{constant table name (automatically capitalized)} \\
adamc@549 1299 &&& X & \textrm{constant table name} \\
adamc@549 1300 &&& \{\{c\}\} & \textrm{computed table name (of kind $\mt{Name}$)} \\
adamc@549 1301 \textrm{Column names} & f &::=& X & \textrm{constant column name} \\
adamc@549 1302 &&& \{c\} & \textrm{computed column name (of kind $\mt{Name}$)} \\
adamc@549 1303 \textrm{Tables} & T &::=& x & \textrm{table variable, named locally by its own capitalization} \\
adamc@549 1304 &&& x \; \mt{AS} \; t & \textrm{table variable, with local name} \\
adamc@549 1305 &&& \{\{e\}\} \; \mt{AS} \; t & \textrm{computed table expression, with local name} \\
adamc@549 1306 \textrm{SQL expressions} & E &::=& p & \textrm{column references} \\
adamc@549 1307 &&& X & \textrm{named expression references} \\
adamc@549 1308 &&& \{\{e\}\} & \textrm{injected native Ur expressions} \\
adamc@549 1309 &&& \{e\} & \textrm{computed expressions, probably using $\mt{sql\_exp}$ directly} \\
adamc@549 1310 &&& \mt{TRUE} \mid \mt{FALSE} & \textrm{boolean constants} \\
adamc@549 1311 &&& \ell & \textrm{primitive type literals} \\
adamc@549 1312 &&& \mt{NULL} & \textrm{null value (injection of $\mt{None}$)} \\
adamc@549 1313 &&& E \; \mt{IS} \; \mt{NULL} & \textrm{nullness test} \\
adamc@549 1314 &&& n & \textrm{nullary operators} \\
adamc@549 1315 &&& u \; E & \textrm{unary operators} \\
adamc@549 1316 &&& E \; b \; E & \textrm{binary operators} \\
adamc@549 1317 &&& \mt{COUNT}(\ast) & \textrm{count number of rows} \\
adamc@549 1318 &&& a(E) & \textrm{other aggregate function} \\
adamc@549 1319 &&& (E) & \textrm{explicit precedence} \\
adamc@549 1320 \textrm{Nullary operators} & n &::=& \mt{CURRENT\_TIMESTAMP} \\
adamc@549 1321 \textrm{Unary operators} & u &::=& \mt{NOT} \\
adamc@549 1322 \textrm{Binary operators} & b &::=& \mt{AND} \mid \mt{OR} \mid \neq \mid < \mid \leq \mid > \mid \geq \\
adamc@549 1323 \textrm{Aggregate functions} & a &::=& \mt{AVG} \mid \mt{SUM} \mid \mt{MIN} \mid \mt{MAX} \\
adamc@550 1324 \textrm{Directions} & o &::=& \mt{ASC} \mid \mt{DESC} \\
adamc@549 1325 \textrm{SQL integer} & N &::=& n \mid \{e\} \\
adamc@549 1326 \end{array}$$
adamc@549 1327
adamc@549 1328 Additionally, an SQL expression may be inserted into normal Ur code with the syntax $(\mt{SQL} \; E)$ or $(\mt{WHERE} \; E)$.
adamc@549 1329
adamc@550 1330 \subsubsection{DML}
adamc@550 1331
adamc@550 1332 DML commands $D$ are added to the rules for expressions $e$.
adamc@550 1333
adamc@550 1334 $$\begin{array}{rrcll}
adamc@550 1335 \textrm{Commands} & D &::=& (\mt{INSERT} \; \mt{INTO} \; T^E \; (f,^+) \; \mt{VALUES} \; (E,^+)) \\
adamc@550 1336 &&& (\mt{UPDATE} \; T^E \; \mt{SET} \; (f = E,)^+ \; \mt{WHERE} \; E) \\
adamc@550 1337 &&& (\mt{DELETE} \; \mt{FROM} \; T^E \; \mt{WHERE} \; E) \\
adamc@550 1338 \textrm{Table expressions} & T^E &::=& x \mid \{\{e\}\}
adamc@550 1339 \end{array}$$
adamc@550 1340
adamc@550 1341 Inside $\mt{UPDATE}$ and $\mt{DELETE}$ commands, lone variables $X$ are interpreted as references to columns of the implicit table $\mt{T}$, rather than to named expressions.
adamc@549 1342
adamc@551 1343 \subsection{XML}
adamc@551 1344
adamc@551 1345 XML fragments $L$ are added to the rules for expressions $e$.
adamc@551 1346
adamc@551 1347 $$\begin{array}{rrcll}
adamc@551 1348 \textrm{XML fragments} & L &::=& \texttt{<xml/>} \mid \texttt{<xml>}l^*\texttt{</xml>} \\
adamc@551 1349 \textrm{XML pieces} & l &::=& \textrm{text} & \textrm{cdata} \\
adamc@551 1350 &&& \texttt{<}g\texttt{/>} & \textrm{tag with no children} \\
adamc@551 1351 &&& \texttt{<}g\texttt{>}l^*\texttt{</}x\texttt{>} & \textrm{tag with children} \\
adamc@551 1352 \textrm{Tag} & g &::=& h \; (x = v)^* \\
adamc@551 1353 \textrm{Tag head} & h &::=& x & \textrm{tag name} \\
adamc@551 1354 &&& h\{c\} & \textrm{constructor parameter} \\
adamc@551 1355 \textrm{Attribute value} & v &::=& \ell & \textrm{literal value} \\
adamc@551 1356 &&& \{e\} & \textrm{computed value} \\
adamc@551 1357 \end{array}$$
adamc@551 1358
adamc@552 1359
adamc@553 1360 \section{The Structure of Web Applications}
adamc@553 1361
adamc@553 1362 A web application is built from a series of modules, with one module, the last one appearing in the \texttt{.urp} file, designated as the main module. The signature of the main module determines the URL entry points to the application. Such an entry point should have type $\mt{unit} \to \mt{transaction} \; \mt{page}$, where $\mt{page}$ is a type synonym for top-level HTML pages, defined in $\mt{Basis}$. If such a function is at the top level of main module $M$, it will be accessible at URI \texttt{/M/f}, and so on for more deeply-nested functions, as described in Section \ref{tag} below.
adamc@553 1363
adamc@553 1364 When the standalone web server receives a request for a known page, it calls the function for that page, ``running'' the resulting transaction to produce the page to return to the client. Pages link to other pages with the \texttt{link} attribute of the \texttt{a} HTML tag. A link has type $\mt{transaction} \; \mt{page}$, and the semantics of a link are that this transaction should be run to compute the result page, when the link is followed. Link targets are assigned URL names in the same way as top-level entry points.
adamc@553 1365
adamc@553 1366 HTML forms are handled in a similar way. The $\mt{action}$ attribute of a $\mt{submit}$ form tag takes a value of type $\$\mt{use} \to \mt{transaction} \; \mt{page}$, where $\mt{use}$ is a kind-$\{\mt{Type}\}$ record of the form fields used by this action handler. Action handlers are assigned URL patterns in the same way as above.
adamc@553 1367
adamc@553 1368 For both links and actions, direct arguments and local variables mentioned implicitly via closures are automatically included in serialized form in URLs, in the order in which they appeared in the source code.
adamc@553 1369
adamc@553 1370
adamc@552 1371 \section{Compiler Phases}
adamc@552 1372
adamc@552 1373 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
adamc@552 1374
adamc@552 1375 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.
adamc@552 1376
adamc@552 1377 \subsection{Parse}
adamc@552 1378
adamc@552 1379 The compiler reads a \texttt{.urp} file, figures out which \texttt{.urs} and \texttt{.ur} files it references, and combines them all into what is conceptually a single sequence of declarations in the core language of Section \ref{core}.
adamc@552 1380
adamc@552 1381 \subsection{Elaborate}
adamc@552 1382
adamc@552 1383 This is where type inference takes place, translating programs into an explicit form with no more wildcards. This phase is the most likely source of compiler error messages.
adamc@552 1384
adamc@552 1385 \subsection{Unnest}
adamc@552 1386
adamc@552 1387 Named local function definitions are moved to the top level, to avoid the need to generate closures.
adamc@552 1388
adamc@552 1389 \subsection{Corify}
adamc@552 1390
adamc@552 1391 Module system features are compiled away, through inlining of functor definitions at application sites. Afterward, most abstraction boundaries are broken, facilitating optimization.
adamc@552 1392
adamc@552 1393 \subsection{Especialize}
adamc@552 1394
adamc@552 1395 Functions are specialized to particular argument patterns. This is an important trick for avoiding the need to maintain any closures at runtime.
adamc@552 1396
adamc@552 1397 \subsection{Untangle}
adamc@552 1398
adamc@552 1399 Remove unnecessary mutual recursion, splitting recursive groups into strongly-connected components.
adamc@552 1400
adamc@552 1401 \subsection{Shake}
adamc@552 1402
adamc@552 1403 Remove all definitions not needed to run the page handlers that are visible in the signature of the last module listed in the \texttt{.urp} file.
adamc@552 1404
adamc@553 1405 \subsection{\label{tag}Tag}
adamc@552 1406
adamc@552 1407 Assign a URL name to each link and form action. It is important that these links and actions are written as applications of named functions, because such names are used to generate URL patterns. A URL pattern has a name built from the full module path of the named function, followed by the function name, with all pieces separated by slashes. The path of a functor application is based on the name given to the result, rather than the path of the functor itself.
adamc@552 1408
adamc@552 1409 \subsection{Reduce}
adamc@552 1410
adamc@552 1411 Apply definitional equality rules to simplify the program as much as possible. This effectively includes inlining of every non-recursive definition.
adamc@552 1412
adamc@552 1413 \subsection{Unpoly}
adamc@552 1414
adamc@552 1415 This phase specializes polymorphic functions to the specific arguments passed to them in the program. If the program contains real polymorphic recursion, Unpoly will be insufficient to avoid later error messages about too much polymorphism.
adamc@552 1416
adamc@552 1417 \subsection{Specialize}
adamc@552 1418
adamc@552 1419 Replace uses of parametrized datatypes with versions specialized to specific parameters. As for Unpoly, this phase will not be effective enough in the presence of polymorphic recursion or other fancy uses of impredicative polymorphism.
adamc@552 1420
adamc@552 1421 \subsection{Shake}
adamc@552 1422
adamc@552 1423 Here the compiler repeats the earlier shake phase.
adamc@552 1424
adamc@552 1425 \subsection{Monoize}
adamc@552 1426
adamc@552 1427 Programs are translated to a new intermediate language without polymorphism or non-$\mt{Type}$ constructors. Error messages may pop up here if earlier phases failed to remove such features.
adamc@552 1428
adamc@552 1429 This is the stage at which concrete names are generated for cookies, tables, and sequences. They are named following the same convention as for links and actions, based on module path information saved from earlier stages. Table and sequence names separate path elements with underscores instead of slashes, and they are prefixed by \texttt{uw\_}.
adamc@552 1430 \subsection{MonoOpt}
adamc@552 1431
adamc@552 1432 Simple algebraic laws are applied to simplify the program, focusing especially on efficient imperative generation of HTML pages.
adamc@552 1433
adamc@552 1434 \subsection{MonoUntangle}
adamc@552 1435
adamc@552 1436 Unnecessary mutual recursion is broken up again.
adamc@552 1437
adamc@552 1438 \subsection{MonoReduce}
adamc@552 1439
adamc@552 1440 Equivalents of the definitional equality rules are applied to simplify programs, with inlining again playing a major role.
adamc@552 1441
adamc@552 1442 \subsection{MonoShake, MonoOpt}
adamc@552 1443
adamc@552 1444 Unneeded declarations are removed, and basic optimizations are repeated.
adamc@552 1445
adamc@552 1446 \subsection{Fuse}
adamc@552 1447
adamc@552 1448 The compiler tries to simplify calls to recursive functions whose results are immediately written as page output. The write action is pushed inside the function definitions to avoid allocation of intermediate results.
adamc@552 1449
adamc@552 1450 \subsection{MonoUntangle, MonoShake}
adamc@552 1451
adamc@552 1452 Fuse often creates more opportunities to remove spurious mutual recursion.
adamc@552 1453
adamc@552 1454 \subsection{Pathcheck}
adamc@552 1455
adamc@552 1456 The compiler checks that no link or action name has been used more than once.
adamc@552 1457
adamc@552 1458 \subsection{Cjrize}
adamc@552 1459
adamc@552 1460 The program is translated to what is more or less a subset of C. If any use of functions as data remains at this point, the compiler will complain.
adamc@552 1461
adamc@552 1462 \subsection{C Compilation and Linking}
adamc@552 1463
adamc@552 1464 The output of the last phase is pretty-printed as C source code and passed to GCC.
adamc@552 1465
adamc@552 1466
adamc@524 1467 \end{document}