annotate doc/manual.tex @ 623:588b9d16b00a

Start of kind polymorphism, up to the point where demo/hello elaborates with updated Basis/Top
author Adam Chlipala <adamc@hcoop.net>
date Sun, 22 Feb 2009 16:10:25 -0500
parents 33500a15b872
children 9db6880184d0
rev   line source
adamc@524 1 \documentclass{article}
adamc@554 2 \usepackage{fullpage,amsmath,amssymb,proof,url}
adamc@524 3
adamc@524 4 \newcommand{\cd}[1]{\texttt{#1}}
adamc@524 5 \newcommand{\mt}[1]{\mathsf{#1}}
adamc@524 6
adamc@524 7 \newcommand{\rc}{+ \hspace{-.075in} + \;}
adamc@527 8 \newcommand{\rcut}{\; \texttt{--} \;}
adamc@527 9 \newcommand{\rcutM}{\; \texttt{---} \;}
adamc@524 10
adamc@524 11 \begin{document}
adamc@524 12
adamc@524 13 \title{The Ur/Web Manual}
adamc@524 14 \author{Adam Chlipala}
adamc@524 15
adamc@524 16 \maketitle
adamc@524 17
adamc@540 18 \tableofcontents
adamc@540 19
adamc@554 20
adamc@554 21 \section{Introduction}
adamc@554 22
adamc@554 23 \emph{Ur} is a programming language designed to introduce richer type system features into functional programming in the tradition of ML and Haskell. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of \emph{metaprogramming} based on \emph{row types}.
adamc@554 24
adamc@554 25 \emph{Ur/Web} is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs ``don't go wrong'' in a very broad sense. Not only do they not crash during particular page generations, but they also may not:
adamc@554 26
adamc@554 27 \begin{itemize}
adamc@554 28 \item Suffer from any kinds of code-injection attacks
adamc@554 29 \item Return invalid HTML
adamc@554 30 \item Contain dead intra-application links
adamc@554 31 \item Have mismatches between HTML forms and the fields expected by their handlers
adamc@554 32 \item Attempt invalid SQL queries
adamc@554 33 \item Use improper marshaling or unmarshaling in communication with SQL databases
adamc@554 34 \end{itemize}
adamc@554 35
adamc@554 36 This type safety is just the foundation of the Ur/Web methodology. It is also possible to use metaprogramming to build significant application pieces by analysis of type structure. For instance, the demo includes an ML-style functor for building an admin interface for an arbitrary SQL table. The type system guarantees that the admin interface sub-application that comes out will always be free of the above-listed bugs, no matter which well-typed table description is given as input.
adamc@554 37
adamc@554 38 The Ur/Web compiler also produces very efficient object code that does not use garbage collection. These compiled programs will often be even more efficient than what most programmers would bother to write in C.
adamc@554 39
adamc@554 40 \medskip
adamc@554 41
adamc@554 42 The official web site for Ur is:
adamc@554 43 \begin{center}
adamc@554 44 \url{http://www.impredicative.com/ur/}
adamc@554 45 \end{center}
adamc@554 46
adamc@555 47
adamc@555 48 \section{Installation}
adamc@555 49
adamc@555 50 If you are lucky, then the following standard command sequence will suffice for installation, in a directory to which you have unpacked the latest distribution tarball.
adamc@555 51
adamc@555 52 \begin{verbatim}
adamc@555 53 ./configure
adamc@555 54 make
adamc@555 55 sudo make install
adamc@555 56 \end{verbatim}
adamc@555 57
adamc@555 58 Some other packages must be installed for the above to work. At a minimum, you need a standard UNIX shell, with standard UNIX tools like sed and GCC in your execution path; and MLton, the whole-program optimizing compiler for Standard ML. To build programs that access SQL databases, you also need libpq, the PostgreSQL client library. As of this writing, in the ``testing'' version of Debian Linux, this command will install the more uncommon of these dependencies:
adamc@555 59
adamc@555 60 \begin{verbatim}
adamc@555 61 apt-get install mlton libpq-dev
adamc@555 62 \end{verbatim}
adamc@555 63
adamc@555 64 It is also possible to access the modules of the Ur/Web compiler interactively, within Standard ML of New Jersey. To install the prerequisites in Debian testing:
adamc@555 65
adamc@555 66 \begin{verbatim}
adamc@555 67 apt-get install smlnj libsmlnj-smlnj ml-yacc ml-lpt
adamc@555 68 \end{verbatim}
adamc@555 69
adamc@555 70 To begin an interactive session with the Ur compiler modules, run \texttt{make smlnj}, and then, from within an \texttt{sml} session, run \texttt{CM.make "src/urweb.cm";}. The \texttt{Compiler} module is the main entry point.
adamc@555 71
adamc@555 72 To run an SQL-backed application, you will probably want to install the PostgreSQL server. Version 8.3 or higher is required.
adamc@555 73
adamc@555 74 \begin{verbatim}
adamc@555 75 apt-get install postgresql-8.3
adamc@555 76 \end{verbatim}
adamc@555 77
adamc@555 78 To use the Emacs mode, you must have a modern Emacs installed. We assume that you already know how to do this, if you're in the business of looking for an Emacs mode. The demo generation facility of the compiler will also call out to Emacs to syntax-highlight code, and that process depends on the \texttt{htmlize} module, which can be installed in Debian testing via:
adamc@555 79
adamc@555 80 \begin{verbatim}
adamc@555 81 apt-get install emacs-goodies-el
adamc@555 82 \end{verbatim}
adamc@555 83
adamc@555 84 Even with the right packages installed, configuration and building might fail to work. After you run \texttt{./configure}, you will see the values of some named environment variables printed. You may need to adjust these values to get proper installation for your system. To change a value, store your preferred alternative in the corresponding UNIX environment variable, before running \texttt{./configure}. For instance, here is how to change the list of extra arguments that the Ur/Web compiler will pass to GCC on every invocation.
adamc@555 85
adamc@555 86 \begin{verbatim}
adamc@555 87 GCCARGS=-fnested-functions ./configure
adamc@555 88 \end{verbatim}
adamc@555 89
adamc@555 90 Some OSX users have reported needing to use this particular GCCARGS value.
adamc@555 91
adamc@555 92 The Emacs mode can be set to autoload by adding the following to your \texttt{.emacs} file.
adamc@555 93
adamc@555 94 \begin{verbatim}
adamc@555 95 (add-to-list 'load-path "/usr/local/share/emacs/site-lisp/urweb-mode")
adamc@555 96 (load "urweb-mode-startup")
adamc@555 97 \end{verbatim}
adamc@555 98
adamc@555 99 Change the path in the first line if you chose a different Emacs installation path during configuration.
adamc@555 100
adamc@555 101
adamc@556 102 \section{Command-Line Compiler}
adamc@556 103
adamc@556 104 \subsection{Project Files}
adamc@556 105
adamc@556 106 The basic inputs to the \texttt{urweb} compiler are project files, which have the extension \texttt{.urp}. Here is a sample \texttt{.urp} file.
adamc@556 107
adamc@556 108 \begin{verbatim}
adamc@556 109 database dbname=test
adamc@556 110 sql crud1.sql
adamc@556 111
adamc@556 112 crud
adamc@556 113 crud1
adamc@556 114 \end{verbatim}
adamc@556 115
adamc@556 116 The \texttt{database} line gives the database information string to pass to libpq. In this case, the string only says to connect to a local database named \texttt{test}.
adamc@556 117
adamc@556 118 The \texttt{sql} line asks for an SQL source file to be generated, giving the commands to run to create the tables and sequences that this application expects to find. After building this \texttt{.urp} file, the following commands could be used to initialize the database, assuming that the current UNIX user exists as a Postgres user with database creation privileges:
adamc@556 119
adamc@556 120 \begin{verbatim}
adamc@556 121 createdb test
adamc@556 122 psql -f crud1.sql test
adamc@556 123 \end{verbatim}
adamc@556 124
adamc@556 125 A blank line always separates the named directives from a list of modules to include in the project; if there are no named directives, a blank line must begin the file.
adamc@556 126
adamc@556 127 For each entry \texttt{M} in the module list, the file \texttt{M.urs} is included in the project if it exists, and the file \texttt{M.ur} must exist and is always included.
adamc@556 128
adamc@556 129 A few other named directives are supported. \texttt{prefix PREFIX} sets the prefix included before every URI within the generated application; the default is \texttt{/}. \texttt{exe FILENAME} sets the filename to which to write the output executable; the default for file \texttt{P.urp} is \texttt{P.exe}. \texttt{debug} saves some intermediate C files, which is mostly useful to help in debugging the compiler itself. \texttt{profile} generates an executable that may be used with gprof.
adamc@556 130
adamc@557 131 \subsection{Building an Application}
adamc@557 132
adamc@557 133 To compile project \texttt{P.urp}, simply run
adamc@557 134 \begin{verbatim}
adamc@557 135 urweb P
adamc@557 136 \end{verbatim}
adamc@558 137 The output executable is a standalone web server. Run it with the command-line argument \texttt{-h} to see which options it takes. If the project file lists a database, the web server will attempt to connect to that database on startup.
adamc@557 138
adamc@557 139 To time how long the different compiler phases run, without generating an executable, run
adamc@557 140 \begin{verbatim}
adamc@557 141 urweb -timing P
adamc@557 142 \end{verbatim}
adamc@557 143
adamc@556 144
adamc@529 145 \section{Ur Syntax}
adamc@529 146
adamc@529 147 In this section, we describe the syntax of Ur, deferring to a later section discussion of most of the syntax specific to SQL and XML. The sole exceptions are the declaration forms for tables, sequences, and cookies.
adamc@524 148
adamc@524 149 \subsection{Lexical Conventions}
adamc@524 150
adamc@524 151 We give the Ur language definition in \LaTeX $\;$ math mode, since that is prettier than monospaced ASCII. The corresponding ASCII syntax can be read off directly. Here is the key for mapping math symbols to ASCII character sequences.
adamc@524 152
adamc@524 153 \begin{center}
adamc@524 154 \begin{tabular}{rl}
adamc@524 155 \textbf{\LaTeX} & \textbf{ASCII} \\
adamc@524 156 $\to$ & \cd{->} \\
adamc@524 157 $\times$ & \cd{*} \\
adamc@524 158 $\lambda$ & \cd{fn} \\
adamc@524 159 $\Rightarrow$ & \cd{=>} \\
adamc@529 160 $\neq$ & \cd{<>} \\
adamc@529 161 $\leq$ & \cd{<=} \\
adamc@529 162 $\geq$ & \cd{>=} \\
adamc@524 163 \\
adamc@524 164 $x$ & Normal textual identifier, not beginning with an uppercase letter \\
adamc@525 165 $X$ & Normal textual identifier, beginning with an uppercase letter \\
adamc@524 166 \end{tabular}
adamc@524 167 \end{center}
adamc@524 168
adamc@525 169 We often write syntax like $e^*$ to indicate zero or more copies of $e$, $e^+$ to indicate one or more copies, and $e,^*$ and $e,^+$ to indicate multiple copies separated by commas. Another separator may be used in place of a comma. The $e$ term may be surrounded by parentheses to indicate grouping; those parentheses should not be included in the actual ASCII.
adamc@524 170
adamc@526 171 We write $\ell$ for literals of the primitive types, for the most part following C conventions. There are $\mt{int}$, $\mt{float}$, and $\mt{string}$ literals.
adamc@526 172
adamc@527 173 This version of the manual doesn't include operator precedences; see \texttt{src/urweb.grm} for that.
adamc@527 174
adamc@552 175 \subsection{\label{core}Core Syntax}
adamc@524 176
adamc@524 177 \emph{Kinds} classify types and other compile-time-only entities. Each kind in the grammar is listed with a description of the sort of data it classifies.
adamc@524 178 $$\begin{array}{rrcll}
adamc@524 179 \textrm{Kinds} & \kappa &::=& \mt{Type} & \textrm{proper types} \\
adamc@525 180 &&& \mt{Unit} & \textrm{the trivial constructor} \\
adamc@525 181 &&& \mt{Name} & \textrm{field names} \\
adamc@525 182 &&& \kappa \to \kappa & \textrm{type-level functions} \\
adamc@525 183 &&& \{\kappa\} & \textrm{type-level records} \\
adamc@525 184 &&& (\kappa\times^+) & \textrm{type-level tuples} \\
adamc@529 185 &&& \_\_ & \textrm{wildcard} \\
adamc@525 186 &&& (\kappa) & \textrm{explicit precedence} \\
adamc@524 187 \end{array}$$
adamc@524 188
adamc@524 189 Ur supports several different notions of functions that take types as arguments. These arguments can be either implicit, causing them to be inferred at use sites; or explicit, forcing them to be specified manually at use sites. There is a common explicitness annotation convention applied at the definitions of and in the types of such functions.
adamc@524 190 $$\begin{array}{rrcll}
adamc@524 191 \textrm{Explicitness} & ? &::=& :: & \textrm{explicit} \\
adamc@558 192 &&& ::: & \textrm{implicit}
adamc@524 193 \end{array}$$
adamc@524 194
adamc@524 195 \emph{Constructors} are the main class of compile-time-only data. They include proper types and are classified by kinds.
adamc@524 196 $$\begin{array}{rrcll}
adamc@524 197 \textrm{Constructors} & c, \tau &::=& (c) :: \kappa & \textrm{kind annotation} \\
adamc@530 198 &&& \hat{x} & \textrm{constructor variable} \\
adamc@524 199 \\
adamc@525 200 &&& \tau \to \tau & \textrm{function type} \\
adamc@525 201 &&& x \; ? \; \kappa \to \tau & \textrm{polymorphic function type} \\
adamc@525 202 &&& \$ c & \textrm{record type} \\
adamc@524 203 \\
adamc@525 204 &&& c \; c & \textrm{type-level function application} \\
adamc@530 205 &&& \lambda x \; :: \; \kappa \Rightarrow c & \textrm{type-level function abstraction} \\
adamc@524 206 \\
adamc@525 207 &&& () & \textrm{type-level unit} \\
adamc@525 208 &&& \#X & \textrm{field name} \\
adamc@524 209 \\
adamc@525 210 &&& [(c = c)^*] & \textrm{known-length type-level record} \\
adamc@525 211 &&& c \rc c & \textrm{type-level record concatenation} \\
adamc@525 212 &&& \mt{fold} & \textrm{type-level record fold} \\
adamc@524 213 \\
adamc@558 214 &&& (c,^+) & \textrm{type-level tuple} \\
adamc@525 215 &&& c.n & \textrm{type-level tuple projection ($n \in \mathbb N^+$)} \\
adamc@524 216 \\
adamc@525 217 &&& \lambda [c \sim c] \Rightarrow c & \textrm{guarded constructor} \\
adamc@524 218 \\
adamc@529 219 &&& \_ :: \kappa & \textrm{wildcard} \\
adamc@525 220 &&& (c) & \textrm{explicit precedence} \\
adamc@530 221 \\
adamc@530 222 \textrm{Qualified uncapitalized variables} & \hat{x} &::=& x & \textrm{not from a module} \\
adamc@530 223 &&& M.x & \textrm{projection from a module} \\
adamc@525 224 \end{array}$$
adamc@525 225
adamc@525 226 Modules of the module system are described by \emph{signatures}.
adamc@525 227 $$\begin{array}{rrcll}
adamc@525 228 \textrm{Signatures} & S &::=& \mt{sig} \; s^* \; \mt{end} & \textrm{constant} \\
adamc@525 229 &&& X & \textrm{variable} \\
adamc@525 230 &&& \mt{functor}(X : S) : S & \textrm{functor} \\
adamc@529 231 &&& S \; \mt{where} \; \mt{con} \; x = c & \textrm{concretizing an abstract constructor} \\
adamc@525 232 &&& M.X & \textrm{projection from a module} \\
adamc@525 233 \\
adamc@525 234 \textrm{Signature items} & s &::=& \mt{con} \; x :: \kappa & \textrm{abstract constructor} \\
adamc@525 235 &&& \mt{con} \; x :: \kappa = c & \textrm{concrete constructor} \\
adamc@528 236 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529 237 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@525 238 &&& \mt{val} \; x : \tau & \textrm{value} \\
adamc@525 239 &&& \mt{structure} \; X : S & \textrm{sub-module} \\
adamc@525 240 &&& \mt{signature} \; X = S & \textrm{sub-signature} \\
adamc@525 241 &&& \mt{include} \; S & \textrm{signature inclusion} \\
adamc@525 242 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@525 243 &&& \mt{class} \; x & \textrm{abstract type class} \\
adamc@525 244 &&& \mt{class} \; x = c & \textrm{concrete type class} \\
adamc@525 245 \\
adamc@525 246 \textrm{Datatype constructors} & dc &::=& X & \textrm{nullary constructor} \\
adamc@525 247 &&& X \; \mt{of} \; \tau & \textrm{unary constructor} \\
adamc@524 248 \end{array}$$
adamc@524 249
adamc@526 250 \emph{Patterns} are used to describe structural conditions on expressions, such that expressions may be tested against patterns, generating assignments to pattern variables if successful.
adamc@526 251 $$\begin{array}{rrcll}
adamc@526 252 \textrm{Patterns} & p &::=& \_ & \textrm{wildcard} \\
adamc@526 253 &&& x & \textrm{variable} \\
adamc@526 254 &&& \ell & \textrm{constant} \\
adamc@526 255 &&& \hat{X} & \textrm{nullary constructor} \\
adamc@526 256 &&& \hat{X} \; p & \textrm{unary constructor} \\
adamc@526 257 &&& \{(x = p,)^*\} & \textrm{rigid record pattern} \\
adamc@526 258 &&& \{(x = p,)^+, \ldots\} & \textrm{flexible record pattern} \\
adamc@527 259 &&& (p) & \textrm{explicit precedence} \\
adamc@526 260 \\
adamc@529 261 \textrm{Qualified capitalized variables} & \hat{X} &::=& X & \textrm{not from a module} \\
adamc@526 262 &&& M.X & \textrm{projection from a module} \\
adamc@526 263 \end{array}$$
adamc@526 264
adamc@527 265 \emph{Expressions} are the main run-time entities, corresponding to both ``expressions'' and ``statements'' in mainstream imperative languages.
adamc@527 266 $$\begin{array}{rrcll}
adamc@527 267 \textrm{Expressions} & e &::=& e : \tau & \textrm{type annotation} \\
adamc@529 268 &&& \hat{x} & \textrm{variable} \\
adamc@529 269 &&& \hat{X} & \textrm{datatype constructor} \\
adamc@527 270 &&& \ell & \textrm{constant} \\
adamc@527 271 \\
adamc@527 272 &&& e \; e & \textrm{function application} \\
adamc@527 273 &&& \lambda x : \tau \Rightarrow e & \textrm{function abstraction} \\
adamc@527 274 &&& e [c] & \textrm{polymorphic function application} \\
adamc@527 275 &&& \lambda x \; ? \; \kappa \Rightarrow e & \textrm{polymorphic function abstraction} \\
adamc@527 276 \\
adamc@527 277 &&& \{(c = e,)^*\} & \textrm{known-length record} \\
adamc@527 278 &&& e.c & \textrm{record field projection} \\
adamc@527 279 &&& e \rc e & \textrm{record concatenation} \\
adamc@527 280 &&& e \rcut c & \textrm{removal of a single record field} \\
adamc@527 281 &&& e \rcutM c & \textrm{removal of multiple record fields} \\
adamc@527 282 &&& \mt{fold} & \textrm{fold over fields of a type-level record} \\
adamc@527 283 \\
adamc@527 284 &&& \mt{let} \; ed^* \; \mt{in} \; e \; \mt{end} & \textrm{local definitions} \\
adamc@527 285 \\
adamc@527 286 &&& \mt{case} \; e \; \mt{of} \; (p \Rightarrow e|)^+ & \textrm{pattern matching} \\
adamc@527 287 \\
adamc@527 288 &&& \lambda [c \sim c] \Rightarrow e & \textrm{guarded expression} \\
adamc@527 289 \\
adamc@527 290 &&& \_ & \textrm{wildcard} \\
adamc@527 291 &&& (e) & \textrm{explicit precedence} \\
adamc@527 292 \\
adamc@527 293 \textrm{Local declarations} & ed &::=& \cd{val} \; x : \tau = e & \textrm{non-recursive value} \\
adamc@527 294 &&& \cd{val} \; \cd{rec} \; (x : \tau = e \; \cd{and})^+ & \textrm{mutually-recursive values} \\
adamc@527 295 \end{array}$$
adamc@527 296
adamc@528 297 \emph{Declarations} primarily bring new symbols into context.
adamc@528 298 $$\begin{array}{rrcll}
adamc@528 299 \textrm{Declarations} & d &::=& \mt{con} \; x :: \kappa = c & \textrm{constructor synonym} \\
adamc@528 300 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529 301 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@528 302 &&& \mt{val} \; x : \tau = e & \textrm{value} \\
adamc@528 303 &&& \mt{val} \; \cd{rec} \; (x : \tau = e \; \mt{and})^+ & \textrm{mutually-recursive values} \\
adamc@528 304 &&& \mt{structure} \; X : S = M & \textrm{module definition} \\
adamc@528 305 &&& \mt{signature} \; X = S & \textrm{signature definition} \\
adamc@528 306 &&& \mt{open} \; M & \textrm{module inclusion} \\
adamc@528 307 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@528 308 &&& \mt{open} \; \mt{constraints} \; M & \textrm{inclusion of just the constraints from a module} \\
adamc@528 309 &&& \mt{table} \; x : c & \textrm{SQL table} \\
adamc@528 310 &&& \mt{sequence} \; x & \textrm{SQL sequence} \\
adamc@535 311 &&& \mt{cookie} \; x : \tau & \textrm{HTTP cookie} \\
adamc@528 312 &&& \mt{class} \; x = c & \textrm{concrete type class} \\
adamc@528 313 \\
adamc@529 314 \textrm{Modules} & M &::=& \mt{struct} \; d^* \; \mt{end} & \textrm{constant} \\
adamc@529 315 &&& X & \textrm{variable} \\
adamc@529 316 &&& M.X & \textrm{projection} \\
adamc@529 317 &&& M(M) & \textrm{functor application} \\
adamc@529 318 &&& \mt{functor}(X : S) : S = M & \textrm{functor abstraction} \\
adamc@528 319 \end{array}$$
adamc@528 320
adamc@528 321 There are two kinds of Ur files. A file named $M\texttt{.ur}$ is an \emph{implementation file}, and it should contain a sequence of declarations $d^*$. A file named $M\texttt{.urs}$ is an \emph{interface file}; it must always have a matching $M\texttt{.ur}$ and should contain a sequence of signature items $s^*$. When both files are present, the overall effect is the same as a monolithic declaration $\mt{structure} \; M : \mt{sig} \; s^* \; \mt{end} = \mt{struct} \; d^* \; \mt{end}$. When no interface file is included, the overall effect is similar, with a signature for module $M$ being inferred rather than just checked against an interface.
adamc@527 322
adamc@529 323 \subsection{Shorthands}
adamc@529 324
adamc@529 325 There are a variety of derived syntactic forms that elaborate into the core syntax from the last subsection. We will present the additional forms roughly following the order in which we presented the constructs that they elaborate into.
adamc@529 326
adamc@529 327 In many contexts where record fields are expected, like in a projection $e.c$, a constant field may be written as simply $X$, rather than $\#X$.
adamc@529 328
adamc@529 329 A record type may be written $\{(c = c,)^*\}$, which elaborates to $\$[(c = c,)^*]$.
adamc@529 330
adamc@533 331 The notation $[c_1, \ldots, c_n]$ is shorthand for $[c_1 = (), \ldots, c_n = ()]$.
adamc@533 332
adamc@529 333 A tuple type $(\tau_1, \ldots, \tau_n)$ expands to a record type $\{1 = \tau_1, \ldots, n = \tau_n\}$, with natural numbers as field names. A tuple pattern $(p_1, \ldots, p_n)$ expands to a rigid record pattern $\{1 = p_1, \ldots, n = p_n\}$. Positive natural numbers may be used in most places where field names would be allowed.
adamc@529 334
adamc@529 335 In general, several adjacent $\lambda$ forms may be combined into one, and kind and type annotations may be omitted, in which case they are implicitly included as wildcards. More formally, for constructor-level abstractions, we can define a new non-terminal $b ::= x \mid (x :: \kappa) \mid [c \sim c]$ and allow composite abstractions of the form $\lambda b^+ \Rightarrow c$, elaborating into the obvious sequence of one core $\lambda$ per element of $b^+$.
adamc@529 336
adamc@529 337 For any signature item or declaration that defines some entity to be equal to $A$ with classification annotation $B$ (e.g., $\mt{val} \; x : B = A$), $B$ and the preceding colon (or similar punctuation) may be omitted, in which case it is filled in as a wildcard.
adamc@529 338
adamc@529 339 A signature item or declaration $\mt{type} \; x$ or $\mt{type} \; x = \tau$ is elaborated into $\mt{con} \; x :: \mt{Type}$ or $\mt{con} \; x :: \mt{Type} = \tau$, respectively.
adamc@529 340
adamc@529 341 A signature item or declaration $\mt{class} \; x = \lambda y :: \mt{Type} \Rightarrow c$ may be abbreviated $\mt{class} \; x \; y = c$.
adamc@529 342
adamc@529 343 Handling of implicit and explicit constructor arguments may be tweaked with some prefixes to variable references. An expression $@x$ is a version of $x$ where all implicit constructor arguments have been made explicit. An expression $@@x$ achieves the same effect, additionally halting automatic resolution of type class instances. The same syntax works for variables projected out of modules and for capitalized variables (datatype constructors).
adamc@529 344
adamc@529 345 At the expression level, an analogue is available of the composite $\lambda$ form for constructors. We define the language of binders as $b ::= x \mid (x : \tau) \mid (x \; ? \; \kappa) \mid [c \sim c]$. A lone variable $x$ as a binder stands for an expression variable of unspecified type.
adamc@529 346
adamc@529 347 A $\mt{val}$ or $\mt{val} \; \mt{rec}$ declaration may include expression binders before the equal sign, following the binder grammar from the last paragraph. Such declarations are elaborated into versions that add additional $\lambda$s to the fronts of the righthand sides, as appropriate. The keyword $\mt{fun}$ is a synonym for $\mt{val} \; \mt{rec}$.
adamc@529 348
adamc@529 349 A signature item $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2$. A declaration $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2 = M$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2 = \mt{functor}(X_2 : S_1) : S_2 = M$.
adamc@529 350
adamc@529 351 A declaration $\mt{table} \; x : \{(c = c,)^*\}$ is elaborated into $\mt{table} \; x : [(c = c,)^*]$
adamc@529 352
adamc@529 353 The syntax $\mt{where} \; \mt{type}$ is an alternate form of $\mt{where} \; \mt{con}$.
adamc@529 354
adamc@529 355 The syntax $\mt{if} \; e \; \mt{then} \; e_1 \; \mt{else} \; e_2$ expands to $\mt{case} \; e \; \mt{of} \; \mt{Basis}.\mt{True} \Rightarrow e_1 \mid \mt{Basis}.\mt{False} \Rightarrow e_2$.
adamc@529 356
adamc@529 357 There are infix operator syntaxes for a number of functions defined in the $\mt{Basis}$ module. There is $=$ for $\mt{eq}$, $\neq$ for $\mt{neq}$, $-$ for $\mt{neg}$ (as a prefix operator) and $\mt{minus}$, $+$ for $\mt{plus}$, $\times$ for $\mt{times}$, $/$ for $\mt{div}$, $\%$ for $\mt{mod}$, $<$ for $\mt{lt}$, $\leq$ for $\mt{le}$, $>$ for $\mt{gt}$, and $\geq$ for $\mt{ge}$.
adamc@529 358
adamc@529 359 A signature item $\mt{table} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_table} \; c$. $\mt{sequence} \; x$ is short for $\mt{val} \; x : \mt{Basis}.\mt{sql\_sequence}$, and $\mt{cookie} \; x : \tau$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{http\_cookie} \; \tau$.
adamc@529 360
adamc@530 361
adamc@530 362 \section{Static Semantics}
adamc@530 363
adamc@530 364 In this section, we give a declarative presentation of Ur's typing rules and related judgments. Inference is the subject of the next section; here, we assume that an oracle has filled in all wildcards with concrete values.
adamc@530 365
adamc@530 366 Since there is significant mutual recursion among the judgments, we introduce them all before beginning to give rules. We use the same variety of contexts throughout this section, implicitly introducing new sorts of context entries as needed.
adamc@530 367 \begin{itemize}
adamc@530 368 \item $\Gamma \vdash c :: \kappa$ assigns a kind to a constructor in a context.
adamc@530 369 \item $\Gamma \vdash c \sim c$ proves the disjointness of two record constructors; that is, that they share no field names. We overload the judgment to apply to pairs of field names as well.
adamc@531 370 \item $\Gamma \vdash c \hookrightarrow C$ proves that record constructor $c$ decomposes into set $C$ of field names and record constructors.
adamc@530 371 \item $\Gamma \vdash c \equiv c$ proves the computational equivalence of two constructors. This is often called a \emph{definitional equality} in the world of type theory.
adamc@530 372 \item $\Gamma \vdash e : \tau$ is a standard typing judgment.
adamc@534 373 \item $\Gamma \vdash p \leadsto \Gamma; \tau$ combines typing of patterns with calculation of which new variables they bind.
adamc@537 374 \item $\Gamma \vdash d \leadsto \Gamma$ expresses how a declaration modifies a context. We overload this judgment to apply to sequences of declarations, as well as to signature items and sequences of signature items.
adamc@537 375 \item $\Gamma \vdash S \equiv S$ is the signature equivalence judgment.
adamc@536 376 \item $\Gamma \vdash S \leq S$ is the signature compatibility judgment. We write $\Gamma \vdash S$ as shorthand for $\Gamma \vdash S \leq S$.
adamc@530 377 \item $\Gamma \vdash M : S$ is the module signature checking judgment.
adamc@537 378 \item $\mt{proj}(M, \overline{s}, V)$ is a partial function for projecting a signature item from $\overline{s}$, given the module $M$ that we project from. $V$ may be $\mt{con} \; x$, $\mt{datatype} \; x$, $\mt{val} \; x$, $\mt{signature} \; X$, or $\mt{structure} \; X$. The parameter $M$ is needed because the projected signature item may refer to other items from $\overline{s}$.
adamc@539 379 \item $\mt{selfify}(M, \overline{s})$ adds information to signature items $\overline{s}$ to reflect the fact that we are concerned with the particular module $M$. This function is overloaded to work over individual signature items as well.
adamc@530 380 \end{itemize}
adamc@530 381
adamc@530 382 \subsection{Kinding}
adamc@530 383
adamc@530 384 $$\infer{\Gamma \vdash (c) :: \kappa :: \kappa}{
adamc@530 385 \Gamma \vdash c :: \kappa
adamc@530 386 }
adamc@530 387 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530 388 x :: \kappa \in \Gamma
adamc@530 389 }
adamc@530 390 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530 391 x :: \kappa = c \in \Gamma
adamc@530 392 }$$
adamc@530 393
adamc@530 394 $$\infer{\Gamma \vdash M.x :: \kappa}{
adamc@537 395 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 396 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = \kappa
adamc@530 397 }
adamc@530 398 \quad \infer{\Gamma \vdash M.x :: \kappa}{
adamc@537 399 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 400 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@530 401 }$$
adamc@530 402
adamc@530 403 $$\infer{\Gamma \vdash \tau_1 \to \tau_2 :: \mt{Type}}{
adamc@530 404 \Gamma \vdash \tau_1 :: \mt{Type}
adamc@530 405 & \Gamma \vdash \tau_2 :: \mt{Type}
adamc@530 406 }
adamc@530 407 \quad \infer{\Gamma \vdash x \; ? \: \kappa \to \tau :: \mt{Type}}{
adamc@530 408 \Gamma, x :: \kappa \vdash \tau :: \mt{Type}
adamc@530 409 }
adamc@530 410 \quad \infer{\Gamma \vdash \$c :: \mt{Type}}{
adamc@530 411 \Gamma \vdash c :: \{\mt{Type}\}
adamc@530 412 }$$
adamc@530 413
adamc@530 414 $$\infer{\Gamma \vdash c_1 \; c_2 :: \kappa_2}{
adamc@530 415 \Gamma \vdash c_1 :: \kappa_1 \to \kappa_2
adamc@530 416 & \Gamma \vdash c_2 :: \kappa_1
adamc@530 417 }
adamc@530 418 \quad \infer{\Gamma \vdash \lambda x \; :: \; \kappa_1 \Rightarrow c :: \kappa_1 \to \kappa_2}{
adamc@530 419 \Gamma, x :: \kappa_1 \vdash c :: \kappa_2
adamc@530 420 }$$
adamc@530 421
adamc@530 422 $$\infer{\Gamma \vdash () :: \mt{Unit}}{}
adamc@530 423 \quad \infer{\Gamma \vdash \#X :: \mt{Name}}{}$$
adamc@530 424
adamc@530 425 $$\infer{\Gamma \vdash [\overline{c_i = c'_i}] :: \{\kappa\}}{
adamc@530 426 \forall i: \Gamma \vdash c_i : \mt{Name}
adamc@530 427 & \Gamma \vdash c'_i :: \kappa
adamc@530 428 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@530 429 }
adamc@530 430 \quad \infer{\Gamma \vdash c_1 \rc c_2 :: \{\kappa\}}{
adamc@530 431 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530 432 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@530 433 & \Gamma \vdash c_1 \sim c_2
adamc@530 434 }$$
adamc@530 435
adamc@530 436 $$\infer{\Gamma \vdash \mt{fold} :: ((\mt{Name} \to \kappa_1 \to \kappa_2 \to \kappa_2) \to \kappa_2 \to \{\kappa_1\} \to \kappa_2}{}$$
adamc@530 437
adamc@573 438 $$\infer{\Gamma \vdash (\overline c) :: (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@573 439 \forall i: \Gamma \vdash c_i :: \kappa_i
adamc@530 440 }
adamc@573 441 \quad \infer{\Gamma \vdash c.i :: \kappa_i}{
adamc@573 442 \Gamma \vdash c :: (\kappa_1 \times \ldots \times \kappa_n)
adamc@530 443 }$$
adamc@530 444
adamc@530 445 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow c :: \kappa}{
adamc@530 446 \Gamma \vdash c_1 :: \{\kappa'\}
adamc@530 447 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@530 448 & \Gamma, c_1 \sim c_2 \vdash c :: \kappa
adamc@530 449 }$$
adamc@530 450
adamc@531 451 \subsection{Record Disjointness}
adamc@531 452
adamc@531 453 We will use a keyword $\mt{map}$ as a shorthand, such that, for $f$ of kind $\kappa \to \kappa'$, $\mt{map} \; f$ stands for $\mt{fold} \; (\lambda (x_1 :: \mt{Name}) (x_2 :: \kappa) (x_3 :: \{\kappa'\}) \Rightarrow [x_1 = f \; x_2] \rc x_3) \; []$.
adamc@531 454
adamc@531 455 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@558 456 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@558 457 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@558 458 & \forall c'_1 \in C_1, c'_2 \in C_2: \Gamma \vdash c'_1 \sim c'_2
adamc@531 459 }
adamc@531 460 \quad \infer{\Gamma \vdash X \sim X'}{
adamc@531 461 X \neq X'
adamc@531 462 }$$
adamc@531 463
adamc@531 464 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531 465 c'_1 \sim c'_2 \in \Gamma
adamc@558 466 & \Gamma \vdash c'_1 \hookrightarrow C_1
adamc@558 467 & \Gamma \vdash c'_2 \hookrightarrow C_2
adamc@558 468 & c_1 \in C_1
adamc@558 469 & c_2 \in C_2
adamc@531 470 }$$
adamc@531 471
adamc@531 472 $$\infer{\Gamma \vdash c \hookrightarrow \{c\}}{}
adamc@531 473 \quad \infer{\Gamma \vdash [\overline{c = c'}] \hookrightarrow \{\overline{c}\}}{}
adamc@531 474 \quad \infer{\Gamma \vdash c_1 \rc c_2 \hookrightarrow C_1 \cup C_2}{
adamc@531 475 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@531 476 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@531 477 }
adamc@531 478 \quad \infer{\Gamma \vdash c \hookrightarrow C}{
adamc@531 479 \Gamma \vdash c \equiv c'
adamc@531 480 & \Gamma \vdash c' \hookrightarrow C
adamc@531 481 }
adamc@531 482 \quad \infer{\Gamma \vdash \mt{map} \; f \; c \hookrightarrow C}{
adamc@531 483 \Gamma \vdash c \hookrightarrow C
adamc@531 484 }$$
adamc@531 485
adamc@541 486 \subsection{\label{definitional}Definitional Equality}
adamc@532 487
adamc@532 488 We use $\mathcal C$ to stand for a one-hole context that, when filled, yields a constructor. The notation $\mathcal C[c]$ plugs $c$ into $\mathcal C$. We omit the standard definition of one-hole contexts. We write $[x \mapsto c_1]c_2$ for capture-avoiding substitution of $c_1$ for $x$ in $c_2$.
adamc@532 489
adamc@532 490 $$\infer{\Gamma \vdash c \equiv c}{}
adamc@532 491 \quad \infer{\Gamma \vdash c_1 \equiv c_2}{
adamc@532 492 \Gamma \vdash c_2 \equiv c_1
adamc@532 493 }
adamc@532 494 \quad \infer{\Gamma \vdash c_1 \equiv c_3}{
adamc@532 495 \Gamma \vdash c_1 \equiv c_2
adamc@532 496 & \Gamma \vdash c_2 \equiv c_3
adamc@532 497 }
adamc@532 498 \quad \infer{\Gamma \vdash \mathcal C[c_1] \equiv \mathcal C[c_2]}{
adamc@532 499 \Gamma \vdash c_1 \equiv c_2
adamc@532 500 }$$
adamc@532 501
adamc@532 502 $$\infer{\Gamma \vdash x \equiv c}{
adamc@532 503 x :: \kappa = c \in \Gamma
adamc@532 504 }
adamc@532 505 \quad \infer{\Gamma \vdash M.x \equiv c}{
adamc@537 506 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 507 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@532 508 }
adamc@532 509 \quad \infer{\Gamma \vdash (\overline c).i \equiv c_i}{}$$
adamc@532 510
adamc@532 511 $$\infer{\Gamma \vdash (\lambda x :: \kappa \Rightarrow c) \; c' \equiv [x \mapsto c'] c}{}
adamc@532 512 \quad \infer{\Gamma \vdash c_1 \rc c_2 \equiv c_2 \rc c_1}{}
adamc@532 513 \quad \infer{\Gamma \vdash c_1 \rc (c_2 \rc c_3) \equiv (c_1 \rc c_2) \rc c_3}{}$$
adamc@532 514
adamc@532 515 $$\infer{\Gamma \vdash [] \rc c \equiv c}{}
adamc@532 516 \quad \infer{\Gamma \vdash [\overline{c_1 = c'_1}] \rc [\overline{c_2 = c'_2}] \equiv [\overline{c_1 = c'_1}, \overline{c_2 = c'_2}]}{}$$
adamc@532 517
adamc@532 518 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow c \equiv c}{
adamc@532 519 \Gamma \vdash c_1 \sim c_2
adamc@532 520 }
adamc@532 521 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; [] \equiv i}{}
adamc@532 522 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; ([c_1 = c_2] \rc c) \equiv f \; c_1 \; c_2 \; (\mt{fold} \; f \; i \; c)}{}$$
adamc@532 523
adamc@532 524 $$\infer{\Gamma \vdash \mt{map} \; (\lambda x \Rightarrow x) \; c \equiv c}{}
adamc@532 525 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; (\mt{map} \; f' \; c)
adamc@532 526 \equiv \mt{fold} \; (\lambda (x_1 :: \mt{Name}) (x_2 :: \kappa) \Rightarrow f \; x_1 \; (f' \; x_2)) \; i \; c}{}$$
adamc@532 527
adamc@532 528 $$\infer{\Gamma \vdash \mt{map} \; f \; (c_1 \rc c_2) \equiv \mt{map} \; f \; c_1 \rc \mt{map} \; f \; c_2}{}$$
adamc@531 529
adamc@534 530 \subsection{Expression Typing}
adamc@533 531
adamc@533 532 We assume the existence of a function $T$ assigning types to literal constants. It maps integer constants to $\mt{Basis}.\mt{int}$, float constants to $\mt{Basis}.\mt{float}$, and string constants to $\mt{Basis}.\mt{string}$.
adamc@533 533
adamc@533 534 We also refer to a function $\mathcal I$, such that $\mathcal I(\tau)$ ``uses an oracle'' to instantiate all constructor function arguments at the beginning of $\tau$ that are marked implicit; i.e., replace $x_1 ::: \kappa_1 \to \ldots \to x_n ::: \kappa_n \to \tau$ with $[x_1 \mapsto c_1]\ldots[x_n \mapsto c_n]\tau$, where the $c_i$s are inferred and $\tau$ does not start like $x ::: \kappa \to \tau'$.
adamc@533 535
adamc@533 536 $$\infer{\Gamma \vdash e : \tau : \tau}{
adamc@533 537 \Gamma \vdash e : \tau
adamc@533 538 }
adamc@533 539 \quad \infer{\Gamma \vdash e : \tau}{
adamc@533 540 \Gamma \vdash e : \tau'
adamc@533 541 & \Gamma \vdash \tau' \equiv \tau
adamc@533 542 }
adamc@533 543 \quad \infer{\Gamma \vdash \ell : T(\ell)}{}$$
adamc@533 544
adamc@533 545 $$\infer{\Gamma \vdash x : \mathcal I(\tau)}{
adamc@533 546 x : \tau \in \Gamma
adamc@533 547 }
adamc@533 548 \quad \infer{\Gamma \vdash M.x : \mathcal I(\tau)}{
adamc@537 549 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 550 & \mt{proj}(M, \overline{s}, \mt{val} \; x) = \tau
adamc@533 551 }
adamc@533 552 \quad \infer{\Gamma \vdash X : \mathcal I(\tau)}{
adamc@533 553 X : \tau \in \Gamma
adamc@533 554 }
adamc@533 555 \quad \infer{\Gamma \vdash M.X : \mathcal I(\tau)}{
adamc@537 556 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 557 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \tau
adamc@533 558 }$$
adamc@533 559
adamc@533 560 $$\infer{\Gamma \vdash e_1 \; e_2 : \tau_2}{
adamc@533 561 \Gamma \vdash e_1 : \tau_1 \to \tau_2
adamc@533 562 & \Gamma \vdash e_2 : \tau_1
adamc@533 563 }
adamc@533 564 \quad \infer{\Gamma \vdash \lambda x : \tau_1 \Rightarrow e : \tau_1 \to \tau_2}{
adamc@533 565 \Gamma, x : \tau_1 \vdash e : \tau_2
adamc@533 566 }$$
adamc@533 567
adamc@533 568 $$\infer{\Gamma \vdash e [c] : [x \mapsto c]\tau}{
adamc@533 569 \Gamma \vdash e : x :: \kappa \to \tau
adamc@533 570 & \Gamma \vdash c :: \kappa
adamc@533 571 }
adamc@533 572 \quad \infer{\Gamma \vdash \lambda x \; ? \; \kappa \Rightarrow e : x \; ? \; \kappa \to \tau}{
adamc@533 573 \Gamma, x :: \kappa \vdash e : \tau
adamc@533 574 }$$
adamc@533 575
adamc@533 576 $$\infer{\Gamma \vdash \{\overline{c = e}\} : \{\overline{c : \tau}\}}{
adamc@533 577 \forall i: \Gamma \vdash c_i :: \mt{Name}
adamc@533 578 & \Gamma \vdash e_i : \tau_i
adamc@533 579 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@533 580 }
adamc@533 581 \quad \infer{\Gamma \vdash e.c : \tau}{
adamc@533 582 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533 583 }
adamc@533 584 \quad \infer{\Gamma \vdash e_1 \rc e_2 : \$(c_1 \rc c_2)}{
adamc@533 585 \Gamma \vdash e_1 : \$c_1
adamc@533 586 & \Gamma \vdash e_2 : \$c_2
adamc@573 587 & \Gamma \vdash c_1 \sim c_2
adamc@533 588 }$$
adamc@533 589
adamc@533 590 $$\infer{\Gamma \vdash e \rcut c : \$c'}{
adamc@533 591 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533 592 }
adamc@533 593 \quad \infer{\Gamma \vdash e \rcutM c : \$c'}{
adamc@533 594 \Gamma \vdash e : \$(c \rc c')
adamc@533 595 }$$
adamc@533 596
adamc@533 597 $$\infer{\Gamma \vdash \mt{fold} : \begin{array}{c}
adamc@533 598 x_1 :: (\{\kappa\} \to \tau)
adamc@533 599 \to (x_2 :: \mt{Name} \to x_3 :: \kappa \to x_4 :: \{\kappa\} \to \lambda [[x_2] \sim x_4]
adamc@533 600 \Rightarrow x_1 \; x_4 \to x_1 \; ([x_2 = x_3] \rc x_4)) \\
adamc@533 601 \to x_1 \; [] \to x_5 :: \{\kappa\} \to x_1 \; x_5
adamc@533 602 \end{array}}{}$$
adamc@533 603
adamc@533 604 $$\infer{\Gamma \vdash \mt{let} \; \overline{ed} \; \mt{in} \; e \; \mt{end} : \tau}{
adamc@533 605 \Gamma \vdash \overline{ed} \leadsto \Gamma'
adamc@533 606 & \Gamma' \vdash e : \tau
adamc@533 607 }
adamc@533 608 \quad \infer{\Gamma \vdash \mt{case} \; e \; \mt{of} \; \overline{p \Rightarrow e} : \tau}{
adamc@533 609 \forall i: \Gamma \vdash p_i \leadsto \Gamma_i, \tau'
adamc@533 610 & \Gamma_i \vdash e_i : \tau
adamc@533 611 }$$
adamc@533 612
adamc@573 613 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow e : \lambda [c_1 \sim c_2] \Rightarrow \tau}{
adamc@533 614 \Gamma \vdash c_1 :: \{\kappa\}
adamc@533 615 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@533 616 & \Gamma, c_1 \sim c_2 \vdash e : \tau
adamc@533 617 }$$
adamc@533 618
adamc@534 619 \subsection{Pattern Typing}
adamc@534 620
adamc@534 621 $$\infer{\Gamma \vdash \_ \leadsto \Gamma; \tau}{}
adamc@534 622 \quad \infer{\Gamma \vdash x \leadsto \Gamma, x : \tau; \tau}{}
adamc@534 623 \quad \infer{\Gamma \vdash \ell \leadsto \Gamma; T(\ell)}{}$$
adamc@534 624
adamc@534 625 $$\infer{\Gamma \vdash X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534 626 X : \overline{x ::: \mt{Type}} \to \tau \in \Gamma
adamc@534 627 & \textrm{$\tau$ not a function type}
adamc@534 628 }
adamc@534 629 \quad \infer{\Gamma \vdash X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534 630 X : \overline{x ::: \mt{Type}} \to \tau'' \to \tau \in \Gamma
adamc@534 631 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534 632 }$$
adamc@534 633
adamc@534 634 $$\infer{\Gamma \vdash M.X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537 635 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 636 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau
adamc@534 637 & \textrm{$\tau$ not a function type}
adamc@534 638 }$$
adamc@534 639
adamc@534 640 $$\infer{\Gamma \vdash M.X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537 641 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 642 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau'' \to \tau
adamc@534 643 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534 644 }$$
adamc@534 645
adamc@534 646 $$\infer{\Gamma \vdash \{\overline{x = p}\} \leadsto \Gamma_n; \{\overline{x = \tau}\}}{
adamc@534 647 \Gamma_0 = \Gamma
adamc@534 648 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534 649 }
adamc@534 650 \quad \infer{\Gamma \vdash \{\overline{x = p}, \ldots\} \leadsto \Gamma_n; \$([\overline{x = \tau}] \rc c)}{
adamc@534 651 \Gamma_0 = \Gamma
adamc@534 652 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534 653 }$$
adamc@534 654
adamc@535 655 \subsection{Declaration Typing}
adamc@535 656
adamc@535 657 We use an auxiliary judgment $\overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'$, expressing the enrichment of $\Gamma$ with the types of the datatype constructors $\overline{dc}$, when they are known to belong to datatype $x$ with type parameters $\overline{y}$.
adamc@535 658
adamc@541 659 This is the first judgment where we deal with type classes, for the $\mt{class}$ declaration form. We will omit their special handling in this formal specification. Section \ref{typeclasses} gives an informal description of how type classes influence type inference.
adamc@535 660
adamc@558 661 We presuppose the existence of a function $\mathcal O$, where $\mathcal O(M, \overline{s})$ implements the $\mt{open}$ declaration by producing a context with the appropriate entry for each available component of module $M$ with signature items $\overline{s}$. Where possible, $\mathcal O$ uses ``transparent'' entries (e.g., an abstract type $M.x$ is mapped to $x :: \mt{Type} = M.x$), so that the relationship with $M$ is maintained. A related function $\mathcal O_c$ builds a context containing the disjointness constraints found in $\overline s$.
adamc@537 662 We write $\kappa_1^n \to \kappa$ as a shorthand, where $\kappa_1^0 \to \kappa = \kappa$ and $\kappa_1^{n+1} \to \kappa_2 = \kappa_1 \to (\kappa_1^n \to \kappa_2)$. We write $\mt{len}(\overline{y})$ for the length of vector $\overline{y}$ of variables.
adamc@535 663
adamc@535 664 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535 665 \quad \infer{\Gamma \vdash d, \overline{d} \leadsto \Gamma''}{
adamc@535 666 \Gamma \vdash d \leadsto \Gamma'
adamc@535 667 & \Gamma' \vdash \overline{d} \leadsto \Gamma''
adamc@535 668 }$$
adamc@535 669
adamc@535 670 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@535 671 \Gamma \vdash c :: \kappa
adamc@535 672 }
adamc@535 673 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@535 674 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@535 675 }$$
adamc@535 676
adamc@535 677 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537 678 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 679 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@535 680 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@535 681 }$$
adamc@535 682
adamc@535 683 $$\infer{\Gamma \vdash \mt{val} \; x : \tau = e \leadsto \Gamma, x : \tau}{
adamc@535 684 \Gamma \vdash e : \tau
adamc@535 685 }$$
adamc@535 686
adamc@535 687 $$\infer{\Gamma \vdash \mt{val} \; \mt{rec} \; \overline{x : \tau = e} \leadsto \Gamma, \overline{x : \tau}}{
adamc@535 688 \forall i: \Gamma, \overline{x : \tau} \vdash e_i : \tau_i
adamc@535 689 & \textrm{$e_i$ starts with an expression $\lambda$, optionally preceded by constructor and disjointness $\lambda$s}
adamc@535 690 }$$
adamc@535 691
adamc@535 692 $$\infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : S}{
adamc@535 693 \Gamma \vdash M : S
adamc@558 694 & \textrm{ $M$ not a constant or application}
adamc@535 695 }
adamc@558 696 \quad \infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : \mt{selfify}(X, \overline{s})}{
adamc@558 697 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@539 698 }$$
adamc@539 699
adamc@539 700 $$\infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@535 701 \Gamma \vdash S
adamc@535 702 }$$
adamc@535 703
adamc@537 704 $$\infer{\Gamma \vdash \mt{open} \; M \leadsto \Gamma, \mathcal O(M, \overline{s})}{
adamc@537 705 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535 706 }$$
adamc@535 707
adamc@535 708 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma}{
adamc@535 709 \Gamma \vdash c_1 :: \{\kappa\}
adamc@535 710 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@535 711 & \Gamma \vdash c_1 \sim c_2
adamc@535 712 }
adamc@537 713 \quad \infer{\Gamma \vdash \mt{open} \; \mt{constraints} \; M \leadsto \Gamma, \mathcal O_c(M, \overline{s})}{
adamc@537 714 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535 715 }$$
adamc@535 716
adamc@535 717 $$\infer{\Gamma \vdash \mt{table} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_table} \; c}{
adamc@535 718 \Gamma \vdash c :: \{\mt{Type}\}
adamc@535 719 }
adamc@535 720 \quad \infer{\Gamma \vdash \mt{sequence} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_sequence}}{}$$
adamc@535 721
adamc@535 722 $$\infer{\Gamma \vdash \mt{cookie} \; x : \tau \leadsto \Gamma, x : \mt{Basis}.\mt{http\_cookie} \; \tau}{
adamc@535 723 \Gamma \vdash \tau :: \mt{Type}
adamc@535 724 }$$
adamc@535 725
adamc@535 726 $$\infer{\Gamma \vdash \mt{class} \; x = c \leadsto \Gamma, x :: \mt{Type} \to \mt{Type} = c}{
adamc@535 727 \Gamma \vdash c :: \mt{Type} \to \mt{Type}
adamc@535 728 }$$
adamc@535 729
adamc@535 730 $$\infer{\overline{y}; x; \Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535 731 \quad \infer{\overline{y}; x; \Gamma \vdash X \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to x \; \overline{y}}{
adamc@535 732 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535 733 }
adamc@535 734 \quad \infer{\overline{y}; x; \Gamma \vdash X \; \mt{of} \; \tau \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to \tau \to x \; \overline{y}}{
adamc@535 735 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535 736 }$$
adamc@535 737
adamc@537 738 \subsection{Signature Item Typing}
adamc@537 739
adamc@537 740 We appeal to a signature item analogue of the $\mathcal O$ function from the last subsection.
adamc@537 741
adamc@537 742 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@537 743 \quad \infer{\Gamma \vdash s, \overline{s} \leadsto \Gamma''}{
adamc@537 744 \Gamma \vdash s \leadsto \Gamma'
adamc@537 745 & \Gamma' \vdash \overline{s} \leadsto \Gamma''
adamc@537 746 }$$
adamc@537 747
adamc@537 748 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}
adamc@537 749 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@537 750 \Gamma \vdash c :: \kappa
adamc@537 751 }
adamc@537 752 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@537 753 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@537 754 }$$
adamc@537 755
adamc@537 756 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537 757 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 758 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 759 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@537 760 }$$
adamc@537 761
adamc@537 762 $$\infer{\Gamma \vdash \mt{val} \; x : \tau \leadsto \Gamma, x : \tau}{
adamc@537 763 \Gamma \vdash \tau :: \mt{Type}
adamc@537 764 }$$
adamc@537 765
adamc@537 766 $$\infer{\Gamma \vdash \mt{structure} \; X : S \leadsto \Gamma, X : S}{
adamc@537 767 \Gamma \vdash S
adamc@537 768 }
adamc@537 769 \quad \infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@537 770 \Gamma \vdash S
adamc@537 771 }$$
adamc@537 772
adamc@537 773 $$\infer{\Gamma \vdash \mt{include} \; S \leadsto \Gamma, \mathcal O(\overline{s})}{
adamc@537 774 \Gamma \vdash S
adamc@537 775 & \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 776 }$$
adamc@537 777
adamc@537 778 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma, c_1 \sim c_2}{
adamc@537 779 \Gamma \vdash c_1 :: \{\kappa\}
adamc@537 780 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@537 781 }$$
adamc@537 782
adamc@537 783 $$\infer{\Gamma \vdash \mt{class} \; x = c \leadsto \Gamma, x :: \mt{Type} \to \mt{Type} = c}{
adamc@537 784 \Gamma \vdash c :: \mt{Type} \to \mt{Type}
adamc@537 785 }
adamc@537 786 \quad \infer{\Gamma \vdash \mt{class} \; x \leadsto \Gamma, x :: \mt{Type} \to \mt{Type}}{}$$
adamc@537 787
adamc@536 788 \subsection{Signature Compatibility}
adamc@536 789
adamc@558 790 To simplify the judgments in this section, we assume that all signatures are alpha-varied as necessary to avoid including multiple bindings for the same identifier. This is in addition to the usual alpha-variation of locally-bound variables.
adamc@537 791
adamc@537 792 We rely on a judgment $\Gamma \vdash \overline{s} \leq s'$, which expresses the occurrence in signature items $\overline{s}$ of an item compatible with $s'$. We also use a judgment $\Gamma \vdash \overline{dc} \leq \overline{dc}$, which expresses compatibility of datatype definitions.
adamc@537 793
adamc@536 794 $$\infer{\Gamma \vdash S \equiv S}{}
adamc@536 795 \quad \infer{\Gamma \vdash S_1 \equiv S_2}{
adamc@536 796 \Gamma \vdash S_2 \equiv S_1
adamc@536 797 }
adamc@536 798 \quad \infer{\Gamma \vdash X \equiv S}{
adamc@536 799 X = S \in \Gamma
adamc@536 800 }
adamc@536 801 \quad \infer{\Gamma \vdash M.X \equiv S}{
adamc@537 802 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 803 & \mt{proj}(M, \overline{s}, \mt{signature} \; X) = S
adamc@536 804 }$$
adamc@536 805
adamc@536 806 $$\infer{\Gamma \vdash S \; \mt{where} \; \mt{con} \; x = c \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa = c \; \overline{s_2} \; \mt{end}}{
adamc@536 807 \Gamma \vdash S \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa \; \overline{s_2} \; \mt{end}
adamc@536 808 & \Gamma \vdash c :: \kappa
adamc@537 809 }
adamc@537 810 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s^1} \; \mt{include} \; S \; \overline{s^2} \; \mt{end} \equiv \mt{sig} \; \overline{s^1} \; \overline{s} \; \overline{s^2} \; \mt{end}}{
adamc@537 811 \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@536 812 }$$
adamc@536 813
adamc@536 814 $$\infer{\Gamma \vdash S_1 \leq S_2}{
adamc@536 815 \Gamma \vdash S_1 \equiv S_2
adamc@536 816 }
adamc@536 817 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \mt{end}}{}
adamc@537 818 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; s' \; \overline{s'} \; \mt{end}}{
adamc@537 819 \Gamma \vdash \overline{s} \leq s'
adamc@537 820 & \Gamma \vdash s' \leadsto \Gamma'
adamc@537 821 & \Gamma' \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \overline{s'} \; \mt{end}
adamc@537 822 }$$
adamc@537 823
adamc@537 824 $$\infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537 825 \Gamma \vdash s \leq s'
adamc@537 826 }
adamc@537 827 \quad \infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537 828 \Gamma \vdash s \leadsto \Gamma'
adamc@537 829 & \Gamma' \vdash \overline{s} \leq s'
adamc@536 830 }$$
adamc@536 831
adamc@536 832 $$\infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 \leq \mt{functor} (X : S'_1) : S'_2}{
adamc@536 833 \Gamma \vdash S'_1 \leq S_1
adamc@536 834 & \Gamma, X : S'_1 \vdash S_2 \leq S'_2
adamc@536 835 }$$
adamc@536 836
adamc@537 837 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@537 838 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}
adamc@558 839 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type}}{}$$
adamc@537 840
adamc@537 841 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(y)} \to \mt{Type}}{
adamc@537 842 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 843 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 844 }$$
adamc@537 845
adamc@537 846 $$\infer{\Gamma \vdash \mt{class} \; x \leq \mt{con} \; x :: \mt{Type} \to \mt{Type}}{}
adamc@537 847 \quad \infer{\Gamma \vdash \mt{class} \; x = c \leq \mt{con} \; x :: \mt{Type} \to \mt{Type}}{}$$
adamc@537 848
adamc@537 849 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \mt{\kappa} = c_2}{
adamc@537 850 \Gamma \vdash c_1 \equiv c_2
adamc@537 851 }
adamc@537 852 \quad \infer{\Gamma \vdash \mt{class} \; x = c_1 \leq \mt{con} \; x :: \mt{Type} \to \mt{Type} = c_2}{
adamc@537 853 \Gamma \vdash c_1 \equiv c_2
adamc@537 854 }$$
adamc@537 855
adamc@537 856 $$\infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537 857 \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537 858 }$$
adamc@537 859
adamc@537 860 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537 861 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 862 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 863 & \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537 864 }$$
adamc@537 865
adamc@537 866 $$\infer{\Gamma \vdash \cdot \leq \cdot}{}
adamc@537 867 \quad \infer{\Gamma \vdash X; \overline{dc} \leq X; \overline{dc'}}{
adamc@537 868 \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537 869 }
adamc@537 870 \quad \infer{\Gamma \vdash X \; \mt{of} \; \tau_1; \overline{dc} \leq X \; \mt{of} \; \tau_2; \overline{dc'}}{
adamc@537 871 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537 872 & \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537 873 }$$
adamc@537 874
adamc@537 875 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x = \mt{datatype} \; M'.z'}{
adamc@537 876 \Gamma \vdash M.z \equiv M'.z'
adamc@537 877 }$$
adamc@537 878
adamc@537 879 $$\infer{\Gamma \vdash \mt{val} \; x : \tau_1 \leq \mt{val} \; x : \tau_2}{
adamc@537 880 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537 881 }
adamc@537 882 \quad \infer{\Gamma \vdash \mt{structure} \; X : S_1 \leq \mt{structure} \; X : S_2}{
adamc@537 883 \Gamma \vdash S_1 \leq S_2
adamc@537 884 }
adamc@537 885 \quad \infer{\Gamma \vdash \mt{signature} \; X = S_1 \leq \mt{signature} \; X = S_2}{
adamc@537 886 \Gamma \vdash S_1 \leq S_2
adamc@537 887 & \Gamma \vdash S_2 \leq S_1
adamc@537 888 }$$
adamc@537 889
adamc@537 890 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leq \mt{constraint} \; c'_1 \sim c'_2}{
adamc@537 891 \Gamma \vdash c_1 \equiv c'_1
adamc@537 892 & \Gamma \vdash c_2 \equiv c'_2
adamc@537 893 }$$
adamc@537 894
adamc@537 895 $$\infer{\Gamma \vdash \mt{class} \; x \leq \mt{class} \; x}{}
adamc@537 896 \quad \infer{\Gamma \vdash \mt{class} \; x = c \leq \mt{class} \; x}{}
adamc@537 897 \quad \infer{\Gamma \vdash \mt{class} \; x = c_1 \leq \mt{class} \; x = c_2}{
adamc@537 898 \Gamma \vdash c_1 \equiv c_2
adamc@537 899 }$$
adamc@537 900
adamc@538 901 \subsection{Module Typing}
adamc@538 902
adamc@538 903 We use a helper function $\mt{sigOf}$, which converts declarations and sequences of declarations into their principal signature items and sequences of signature items, respectively.
adamc@538 904
adamc@538 905 $$\infer{\Gamma \vdash M : S}{
adamc@538 906 \Gamma \vdash M : S'
adamc@538 907 & \Gamma \vdash S' \leq S
adamc@538 908 }
adamc@538 909 \quad \infer{\Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \mt{sigOf}(\overline{d}) \; \mt{end}}{
adamc@538 910 \Gamma \vdash \overline{d} \leadsto \Gamma'
adamc@538 911 }
adamc@538 912 \quad \infer{\Gamma \vdash X : S}{
adamc@538 913 X : S \in \Gamma
adamc@538 914 }$$
adamc@538 915
adamc@538 916 $$\infer{\Gamma \vdash M.X : S}{
adamc@538 917 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@538 918 & \mt{proj}(M, \overline{s}, \mt{structure} \; X) = S
adamc@538 919 }$$
adamc@538 920
adamc@538 921 $$\infer{\Gamma \vdash M_1(M_2) : [X \mapsto M_2]S_2}{
adamc@538 922 \Gamma \vdash M_1 : \mt{functor}(X : S_1) : S_2
adamc@538 923 & \Gamma \vdash M_2 : S_1
adamc@538 924 }
adamc@538 925 \quad \infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 = M : \mt{functor} (X : S_1) : S_2}{
adamc@538 926 \Gamma \vdash S_1
adamc@538 927 & \Gamma, X : S_1 \vdash S_2
adamc@538 928 & \Gamma, X : S_1 \vdash M : S_2
adamc@538 929 }$$
adamc@538 930
adamc@538 931 \begin{eqnarray*}
adamc@538 932 \mt{sigOf}(\cdot) &=& \cdot \\
adamc@538 933 \mt{sigOf}(s \; \overline{s'}) &=& \mt{sigOf}(s) \; \mt{sigOf}(\overline{s'}) \\
adamc@538 934 \\
adamc@538 935 \mt{sigOf}(\mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@538 936 \mt{sigOf}(\mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \overline{dc} \\
adamc@538 937 \mt{sigOf}(\mt{datatype} \; x = \mt{datatype} \; M.z) &=& \mt{datatype} \; x = \mt{datatype} \; M.z \\
adamc@538 938 \mt{sigOf}(\mt{val} \; x : \tau = e) &=& \mt{val} \; x : \tau \\
adamc@538 939 \mt{sigOf}(\mt{val} \; \mt{rec} \; \overline{x : \tau = e}) &=& \overline{\mt{val} \; x : \tau} \\
adamc@538 940 \mt{sigOf}(\mt{structure} \; X : S = M) &=& \mt{structure} \; X : S \\
adamc@538 941 \mt{sigOf}(\mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@538 942 \mt{sigOf}(\mt{open} \; M) &=& \mt{include} \; S \textrm{ (where $\Gamma \vdash M : S$)} \\
adamc@538 943 \mt{sigOf}(\mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@538 944 \mt{sigOf}(\mt{open} \; \mt{constraints} \; M) &=& \cdot \\
adamc@538 945 \mt{sigOf}(\mt{table} \; x : c) &=& \mt{table} \; x : c \\
adamc@538 946 \mt{sigOf}(\mt{sequence} \; x) &=& \mt{sequence} \; x \\
adamc@538 947 \mt{sigOf}(\mt{cookie} \; x : \tau) &=& \mt{cookie} \; x : \tau \\
adamc@538 948 \mt{sigOf}(\mt{class} \; x = c) &=& \mt{class} \; x = c \\
adamc@538 949 \end{eqnarray*}
adamc@539 950 \begin{eqnarray*}
adamc@539 951 \mt{selfify}(M, \cdot) &=& \cdot \\
adamc@558 952 \mt{selfify}(M, s \; \overline{s'}) &=& \mt{selfify}(M, s) \; \mt{selfify}(M, \overline{s'}) \\
adamc@539 953 \\
adamc@539 954 \mt{selfify}(M, \mt{con} \; x :: \kappa) &=& \mt{con} \; x :: \kappa = M.x \\
adamc@539 955 \mt{selfify}(M, \mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@539 956 \mt{selfify}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \mt{datatype} \; M.x \\
adamc@539 957 \mt{selfify}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z) &=& \mt{datatype} \; x = \mt{datatype} \; M'.z \\
adamc@539 958 \mt{selfify}(M, \mt{val} \; x : \tau) &=& \mt{val} \; x : \tau \\
adamc@539 959 \mt{selfify}(M, \mt{structure} \; X : S) &=& \mt{structure} \; X : \mt{selfify}(M.X, \overline{s}) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}$)} \\
adamc@539 960 \mt{selfify}(M, \mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@539 961 \mt{selfify}(M, \mt{include} \; S) &=& \mt{include} \; S \\
adamc@539 962 \mt{selfify}(M, \mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@539 963 \mt{selfify}(M, \mt{class} \; x) &=& \mt{class} \; x = M.x \\
adamc@539 964 \mt{selfify}(M, \mt{class} \; x = c) &=& \mt{class} \; x = c \\
adamc@539 965 \end{eqnarray*}
adamc@539 966
adamc@540 967 \subsection{Module Projection}
adamc@540 968
adamc@540 969 \begin{eqnarray*}
adamc@540 970 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \\
adamc@540 971 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa, c) \\
adamc@540 972 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{con} \; x) &=& \mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type} \\
adamc@540 973 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& (\mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type}, M'.z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 974 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})$)} \\
adamc@540 975 \mt{proj}(M, \mt{class} \; x \; \overline{s}, \mt{con} \; x) &=& \mt{Type} \to \mt{Type} \\
adamc@540 976 \mt{proj}(M, \mt{class} \; x = c \; \overline{s}, \mt{con} \; x) &=& (\mt{Type} \to \mt{Type}, c) \\
adamc@540 977 \\
adamc@540 978 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{datatype} \; x) &=& (\overline{y}, \overline{dc}) \\
adamc@540 979 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& \mt{proj}(M', \overline{s'}, \mt{datatype} \; z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540 980 \\
adamc@540 981 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, \mt{val} \; x) &=& \tau \\
adamc@540 982 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $X \in \overline{dc}$)} \\
adamc@540 983 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540 984 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 985 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \in \overline{dc}$)} \\
adamc@540 986 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@558 987 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540 988 \\
adamc@540 989 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, \mt{structure} \; X) &=& S \\
adamc@540 990 \\
adamc@540 991 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, \mt{signature} \; X) &=& S \\
adamc@540 992 \\
adamc@540 993 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 994 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 995 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 996 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 997 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540 998 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540 999 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540 1000 \mt{proj}(M, \mt{include} \; S \; \overline{s}, V) &=& \mt{proj}(M, \overline{s'} \; \overline{s}, V) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540 1001 \mt{proj}(M, \mt{constraint} \; c_1 \sim c_2 \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540 1002 \mt{proj}(M, \mt{class} \; x \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 1003 \mt{proj}(M, \mt{class} \; x = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 1004 \end{eqnarray*}
adamc@540 1005
adamc@541 1006
adamc@541 1007 \section{Type Inference}
adamc@541 1008
adamc@541 1009 The Ur/Web compiler uses \emph{heuristic type inference}, with no claims of completeness with respect to the declarative specification of the last section. The rules in use seem to work well in practice. This section summarizes those rules, to help Ur programmers predict what will work and what won't.
adamc@541 1010
adamc@541 1011 \subsection{Basic Unification}
adamc@541 1012
adamc@560 1013 Type-checkers for languages based on the Hindley-Milner type discipline, like ML and Haskell, take advantage of \emph{principal typing} properties, making complete type inference relatively straightforward. Inference algorithms are traditionally implemented using type unification variables, at various points asserting equalities between types, in the process discovering the values of type variables. The Ur/Web compiler uses the same basic strategy, but the complexity of the type system rules out easy completeness.
adamc@541 1014
adamc@560 1015 Type-checking can require evaluating recursive functional programs, thanks to the type-level $\mt{fold}$ operator. When a unification variable appears in such a type, the next step of computation can be undetermined. The value of that variable might be determined later, but this would be ``too late'' for the unification problems generated at the first occurrence. This is the essential source of incompleteness.
adamc@541 1016
adamc@541 1017 Nonetheless, the unification engine tends to do reasonably well. Unlike in ML, polymorphism is never inferred in definitions; it must be indicated explicitly by writing out constructor-level parameters. By writing these and other annotations, the programmer can generally get the type inference engine to do most of the type reconstruction work.
adamc@541 1018
adamc@541 1019 \subsection{Unifying Record Types}
adamc@541 1020
adamc@570 1021 The type inference engine tries to take advantage of the algebraic rules governing type-level records, as shown in Section \ref{definitional}. When two constructors of record kind are unified, they are reduced to normal forms, with like terms crossed off from each normal form until, hopefully, nothing remains. This cannot be complete, with the inclusion of unification variables. The type-checker can help you understand what goes wrong when the process fails, as it outputs the unmatched remainders of the two normal forms.
adamc@541 1022
adamc@541 1023 \subsection{\label{typeclasses}Type Classes}
adamc@541 1024
adamc@541 1025 Ur includes a type class facility inspired by Haskell's. The current version is very rudimentary, only supporting instances for particular types built up from abstract types and datatypes and type-level application.
adamc@541 1026
adamc@541 1027 Type classes are integrated with the module system. A type class is just a constructor of kind $\mt{Type} \to \mt{Type}$. By marking such a constructor $c$ as a type class, the programmer instructs the type inference engine to, in each scope, record all values of types $c \; \tau$ as \emph{instances}. Any function argument whose type is of such a form is treated as implicit, to be determined by examining the current instance database.
adamc@541 1028
adamc@541 1029 The ``dictionary encoding'' often used in Haskell implementations is made explicit in Ur. Type class instances are just properly-typed values, and they can also be considered as ``proofs'' of membership in the class. In some cases, it is useful to pass these proofs around explicitly. An underscore written where a proof is expected will also be inferred, if possible, from the current instance database.
adamc@541 1030
adamc@541 1031 Just as for types, type classes may be exported from modules, and they may be exported as concrete or abstract. Concrete type classes have their ``real'' definitions exposed, so that client code may add new instances freely. Abstract type classes are useful as ``predicates'' that can be used to enforce invariants, as we will see in some definitions of SQL syntax in the Ur/Web standard library.
adamc@541 1032
adamc@541 1033 \subsection{Reverse-Engineering Record Types}
adamc@541 1034
adamc@541 1035 It's useful to write Ur functions and functors that take record constructors as inputs, but these constructors can grow quite long, even though their values are often implied by other arguments. The compiler uses a simple heuristic to infer the values of unification variables that are folded over, yielding known results. Often, as in the case of $\mt{map}$-like folds, the base and recursive cases of a fold produce constructors with different top-level structure. Thus, if the result of the fold is known, examining its top-level structure reveals whether the record being folded over is empty or not. If it's empty, we're done; if it's not empty, we replace a single unification variable with a new constructor formed from three new unification variables, as in $[\alpha = \beta] \rc \gamma$. This process can often be repeated to determine a unification variable fully.
adamc@541 1036
adamc@541 1037 \subsection{Implicit Arguments in Functor Applications}
adamc@541 1038
adamc@541 1039 Constructor, constraint, and type class witness members of structures may be omitted, when those structures are used in contexts where their assigned signatures imply how to fill in those missing members. This feature combines well with reverse-engineering to allow for uses of complicated meta-programming functors with little more code than would be necessary to invoke an untyped, ad-hoc code generator.
adamc@541 1040
adamc@541 1041
adamc@542 1042 \section{The Ur Standard Library}
adamc@542 1043
adamc@542 1044 The built-in parts of the Ur/Web standard library are described by the signature in \texttt{lib/basis.urs} in the distribution. A module $\mt{Basis}$ ascribing to that signature is available in the initial environment, and every program is implicitly prefixed by $\mt{open} \; \mt{Basis}$.
adamc@542 1045
adamc@542 1046 Additionally, other common functions that are definable within Ur are included in \texttt{lib/top.urs} and \texttt{lib/top.ur}. This $\mt{Top}$ module is also opened implicitly.
adamc@542 1047
adamc@542 1048 The idea behind Ur is to serve as the ideal host for embedded domain-specific languages. For now, however, the ``generic'' functionality is intermixed with Ur/Web-specific functionality, including in these two library modules. We hope that these generic library components have types that speak for themselves. The next section introduces the Ur/Web-specific elements. Here, we only give the type declarations from the beginning of $\mt{Basis}$.
adamc@542 1049 $$\begin{array}{l}
adamc@542 1050 \mt{type} \; \mt{int} \\
adamc@542 1051 \mt{type} \; \mt{float} \\
adamc@542 1052 \mt{type} \; \mt{string} \\
adamc@542 1053 \mt{type} \; \mt{time} \\
adamc@542 1054 \\
adamc@542 1055 \mt{type} \; \mt{unit} = \{\} \\
adamc@542 1056 \\
adamc@542 1057 \mt{datatype} \; \mt{bool} = \mt{False} \mid \mt{True} \\
adamc@542 1058 \\
adamc@542 1059 \mt{datatype} \; \mt{option} \; \mt{t} = \mt{None} \mid \mt{Some} \; \mt{of} \; \mt{t}
adamc@542 1060 \end{array}$$
adamc@542 1061
adamc@542 1062
adamc@542 1063 \section{The Ur/Web Standard Library}
adamc@542 1064
adamc@542 1065 \subsection{Transactions}
adamc@542 1066
adamc@542 1067 Ur is a pure language; we use Haskell's trick to support controlled side effects. The standard library defines a monad $\mt{transaction}$, meant to stand for actions that may be undone cleanly. By design, no other kinds of actions are supported.
adamc@542 1068 $$\begin{array}{l}
adamc@542 1069 \mt{con} \; \mt{transaction} :: \mt{Type} \to \mt{Type} \\
adamc@542 1070 \\
adamc@542 1071 \mt{val} \; \mt{return} : \mt{t} ::: \mt{Type} \to \mt{t} \to \mt{transaction} \; \mt{t} \\
adamc@542 1072 \mt{val} \; \mt{bind} : \mt{t_1} ::: \mt{Type} \to \mt{t_2} ::: \mt{Type} \to \mt{transaction} \; \mt{t_1} \to (\mt{t_1} \to \mt{transaction} \; \mt{t_2}) \to \mt{transaction} \; \mt{t_2}
adamc@542 1073 \end{array}$$
adamc@542 1074
adamc@542 1075 \subsection{HTTP}
adamc@542 1076
adamc@542 1077 There are transactions for reading an HTTP header by name and for getting and setting strongly-typed cookies. Cookies may only be created by the $\mt{cookie}$ declaration form, ensuring that they be named consistently based on module structure.
adamc@542 1078 $$\begin{array}{l}
adamc@542 1079 \mt{val} \; \mt{requestHeader} : \mt{string} \to \mt{transaction} \; (\mt{option} \; \mt{string}) \\
adamc@542 1080 \\
adamc@542 1081 \mt{con} \; \mt{http\_cookie} :: \mt{Type} \to \mt{Type} \\
adamc@542 1082 \mt{val} \; \mt{getCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{transaction} \; (\mt{option} \; \mt{t}) \\
adamc@542 1083 \mt{val} \; \mt{setCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit}
adamc@542 1084 \end{array}$$
adamc@542 1085
adamc@543 1086 \subsection{SQL}
adamc@543 1087
adamc@543 1088 The fundamental unit of interest in the embedding of SQL is tables, described by a type family and creatable only via the $\mt{table}$ declaration form.
adamc@543 1089 $$\begin{array}{l}
adamc@543 1090 \mt{con} \; \mt{sql\_table} :: \{\mt{Type}\} \to \mt{Type}
adamc@543 1091 \end{array}$$
adamc@543 1092
adamc@543 1093 \subsubsection{Queries}
adamc@543 1094
adamc@543 1095 A final query is constructed via the $\mt{sql\_query}$ function. Constructor arguments respectively specify the table fields we select (as records mapping tables to the subsets of their fields that we choose) and the (always named) extra expressions that we select.
adamc@543 1096 $$\begin{array}{l}
adamc@543 1097 \mt{con} \; \mt{sql\_query} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543 1098 \mt{val} \; \mt{sql\_query} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543 1099 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1100 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1101 \hspace{.1in} \to \{\mt{Rows} : \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}, \\
adamc@543 1102 \hspace{.2in} \mt{OrderBy} : \mt{sql\_order\_by} \; \mt{tables} \; \mt{selectedExps}, \\
adamc@543 1103 \hspace{.2in} \mt{Limit} : \mt{sql\_limit}, \\
adamc@543 1104 \hspace{.2in} \mt{Offset} : \mt{sql\_offset}\} \\
adamc@543 1105 \hspace{.1in} \to \mt{sql\_query} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1106 \end{array}$$
adamc@543 1107
adamc@545 1108 Queries are used by folding over their results inside transactions.
adamc@545 1109 $$\begin{array}{l}
adamc@545 1110 \mt{val} \; \mt{query} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \lambda [\mt{tables} \sim \mt{exps}] \Rightarrow \mt{state} ::: \mt{Type} \to \mt{sql\_query} \; \mt{tables} \; \mt{exps} \\
adamc@545 1111 \hspace{.1in} \to (\$(\mt{exps} \rc \mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: \{\mt{Type}\}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \$\mt{fields}] \rc \mt{acc}) \; [] \; \mt{tables}) \\
adamc@545 1112 \hspace{.2in} \to \mt{state} \to \mt{transaction} \; \mt{state}) \\
adamc@545 1113 \hspace{.1in} \to \mt{state} \to \mt{transaction} \; \mt{state}
adamc@545 1114 \end{array}$$
adamc@545 1115
adamc@543 1116 Most of the complexity of the query encoding is in the type $\mt{sql\_query1}$, which includes simple queries and derived queries based on relational operators. Constructor arguments respectively specify the tables we select from, the subset of fields that we keep from each table for the result rows, and the extra expressions that we select.
adamc@543 1117 $$\begin{array}{l}
adamc@543 1118 \mt{con} \; \mt{sql\_query1} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543 1119 \\
adamc@543 1120 \mt{type} \; \mt{sql\_relop} \\
adamc@543 1121 \mt{val} \; \mt{sql\_union} : \mt{sql\_relop} \\
adamc@543 1122 \mt{val} \; \mt{sql\_intersect} : \mt{sql\_relop} \\
adamc@543 1123 \mt{val} \; \mt{sql\_except} : \mt{sql\_relop} \\
adamc@543 1124 \mt{val} \; \mt{sql\_relop} : \mt{tables1} ::: \{\{\mt{Type}\}\} \\
adamc@543 1125 \hspace{.1in} \to \mt{tables2} ::: \{\{\mt{Type}\}\} \\
adamc@543 1126 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1127 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1128 \hspace{.1in} \to \mt{sql\_relop} \\
adamc@543 1129 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables1} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543 1130 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables2} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543 1131 \hspace{.1in} \to \mt{sql\_query1} \; \mt{selectedFields} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1132 \end{array}$$
adamc@543 1133
adamc@543 1134 $$\begin{array}{l}
adamc@543 1135 \mt{val} \; \mt{sql\_query1} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543 1136 \hspace{.1in} \to \mt{grouped} ::: \{\{\mt{Type}\}\} \\
adamc@543 1137 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1138 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1139 \hspace{.1in} \to \{\mt{From} : \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: \{\mt{Type}\}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_table} \; \mt{fields}] \rc \mt{acc}) \; [] \; \mt{tables}), \\
adamc@543 1140 \hspace{.2in} \mt{Where} : \mt{sql\_exp} \; \mt{tables} \; [] \; [] \; \mt{bool}, \\
adamc@543 1141 \hspace{.2in} \mt{GroupBy} : \mt{sql\_subset} \; \mt{tables} \; \mt{grouped}, \\
adamc@543 1142 \hspace{.2in} \mt{Having} : \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{bool}, \\
adamc@543 1143 \hspace{.2in} \mt{SelectFields} : \mt{sql\_subset} \; \mt{grouped} \; \mt{selectedFields}, \\
adamc@543 1144 \hspace{.2in} \mt {SelectExps} : \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{selectedExps}) \} \\
adamc@543 1145 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1146 \end{array}$$
adamc@543 1147
adamc@543 1148 To encode projection of subsets of fields in $\mt{SELECT}$ clauses, and to encode $\mt{GROUP} \; \mt{BY}$ clauses, we rely on a type family $\mt{sql\_subset}$, capturing what it means for one record of table fields to be a subset of another. The main constructor $\mt{sql\_subset}$ ``proves subset facts'' by requiring a split of a record into kept and dropped parts. The extra constructor $\mt{sql\_subset\_all}$ is a convenience for keeping all fields of a record.
adamc@543 1149 $$\begin{array}{l}
adamc@543 1150 \mt{con} \; \mt{sql\_subset} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@543 1151 \mt{val} \; \mt{sql\_subset} : \mt{keep\_drop} :: \{(\{\mt{Type}\} \times \{\mt{Type}\})\} \\
adamc@543 1152 \hspace{.1in} \to \mt{sql\_subset} \\
adamc@543 1153 \hspace{.2in} (\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\})) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \; [\mt{fields}.1 \sim \mt{fields}.2] \Rightarrow \\
adamc@543 1154 \hspace{.3in} [\mt{nm} = \mt{fields}.1 \rc \mt{fields}.2] \rc \mt{acc}) \; [] \; \mt{keep\_drop}) \\
adamc@543 1155 \hspace{.2in} (\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\})) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{fields}.1] \rc \mt{acc}) \; [] \; \mt{keep\_drop}) \\
adamc@543 1156 \mt{val} \; \mt{sql\_subset\_all} : \mt{tables} :: \{\{\mt{Type}\}\} \to \mt{sql\_subset} \; \mt{tables} \; \mt{tables}
adamc@543 1157 \end{array}$$
adamc@543 1158
adamc@560 1159 SQL expressions are used in several places, including $\mt{SELECT}$, $\mt{WHERE}$, $\mt{HAVING}$, and $\mt{ORDER} \; \mt{BY}$ clauses. They reify a fragment of the standard SQL expression language, while making it possible to inject ``native'' Ur values in some places. The arguments to the $\mt{sql\_exp}$ type family respectively give the unrestricted-availability table fields, the table fields that may only be used in arguments to aggregate functions, the available selected expressions, and the type of the expression.
adamc@543 1160 $$\begin{array}{l}
adamc@543 1161 \mt{con} \; \mt{sql\_exp} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \to \mt{Type}
adamc@543 1162 \end{array}$$
adamc@543 1163
adamc@543 1164 Any field in scope may be converted to an expression.
adamc@543 1165 $$\begin{array}{l}
adamc@543 1166 \mt{val} \; \mt{sql\_field} : \mt{otherTabs} ::: \{\{\mt{Type}\}\} \to \mt{otherFields} ::: \{\mt{Type}\} \\
adamc@543 1167 \hspace{.1in} \to \mt{fieldType} ::: \mt{Type} \to \mt{agg} ::: \{\{\mt{Type}\}\} \\
adamc@543 1168 \hspace{.1in} \to \mt{exps} ::: \{\mt{Type}\} \\
adamc@543 1169 \hspace{.1in} \to \mt{tab} :: \mt{Name} \to \mt{field} :: \mt{Name} \\
adamc@543 1170 \hspace{.1in} \to \mt{sql\_exp} \; ([\mt{tab} = [\mt{field} = \mt{fieldType}] \rc \mt{otherFields}] \rc \mt{otherTabs}) \; \mt{agg} \; \mt{exps} \; \mt{fieldType}
adamc@543 1171 \end{array}$$
adamc@543 1172
adamc@544 1173 There is an analogous function for referencing named expressions.
adamc@544 1174 $$\begin{array}{l}
adamc@544 1175 \mt{val} \; \mt{sql\_exp} : \mt{tabs} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{t} ::: \mt{Type} \to \mt{rest} ::: \{\mt{Type}\} \to \mt{nm} :: \mt{Name} \\
adamc@544 1176 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tabs} \; \mt{agg} \; ([\mt{nm} = \mt{t}] \rc \mt{rest}) \; \mt{t}
adamc@544 1177 \end{array}$$
adamc@544 1178
adamc@544 1179 Ur values of appropriate types may be injected into SQL expressions.
adamc@544 1180 $$\begin{array}{l}
adamc@544 1181 \mt{class} \; \mt{sql\_injectable} \\
adamc@544 1182 \mt{val} \; \mt{sql\_bool} : \mt{sql\_injectable} \; \mt{bool} \\
adamc@544 1183 \mt{val} \; \mt{sql\_int} : \mt{sql\_injectable} \; \mt{int} \\
adamc@544 1184 \mt{val} \; \mt{sql\_float} : \mt{sql\_injectable} \; \mt{float} \\
adamc@544 1185 \mt{val} \; \mt{sql\_string} : \mt{sql\_injectable} \; \mt{string} \\
adamc@544 1186 \mt{val} \; \mt{sql\_time} : \mt{sql\_injectable} \; \mt{time} \\
adamc@544 1187 \mt{val} \; \mt{sql\_option\_bool} : \mt{sql\_injectable} \; (\mt{option} \; \mt{bool}) \\
adamc@544 1188 \mt{val} \; \mt{sql\_option\_int} : \mt{sql\_injectable} \; (\mt{option} \; \mt{int}) \\
adamc@544 1189 \mt{val} \; \mt{sql\_option\_float} : \mt{sql\_injectable} \; (\mt{option} \; \mt{float}) \\
adamc@544 1190 \mt{val} \; \mt{sql\_option\_string} : \mt{sql\_injectable} \; (\mt{option} \; \mt{string}) \\
adamc@544 1191 \mt{val} \; \mt{sql\_option\_time} : \mt{sql\_injectable} \; (\mt{option} \; \mt{time}) \\
adamc@544 1192 \mt{val} \; \mt{sql\_inject} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@544 1193 \hspace{.1in} \to \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544 1194 \end{array}$$
adamc@544 1195
adamc@544 1196 We have the SQL nullness test, which is necessary because of the strange SQL semantics of equality in the presence of null values.
adamc@544 1197 $$\begin{array}{l}
adamc@544 1198 \mt{val} \; \mt{sql\_is\_null} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1199 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; (\mt{option} \; \mt{t}) \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544 1200 \end{array}$$
adamc@544 1201
adamc@559 1202 We have generic nullary, unary, and binary operators.
adamc@544 1203 $$\begin{array}{l}
adamc@544 1204 \mt{con} \; \mt{sql\_nfunc} :: \mt{Type} \to \mt{Type} \\
adamc@544 1205 \mt{val} \; \mt{sql\_current\_timestamp} : \mt{sql\_nfunc} \; \mt{time} \\
adamc@544 1206 \mt{val} \; \mt{sql\_nfunc} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1207 \hspace{.1in} \to \mt{sql\_nfunc} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \\\end{array}$$
adamc@544 1208
adamc@544 1209 $$\begin{array}{l}
adamc@544 1210 \mt{con} \; \mt{sql\_unary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544 1211 \mt{val} \; \mt{sql\_not} : \mt{sql\_unary} \; \mt{bool} \; \mt{bool} \\
adamc@544 1212 \mt{val} \; \mt{sql\_unary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544 1213 \hspace{.1in} \to \mt{sql\_unary} \; \mt{arg} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res} \\
adamc@544 1214 \end{array}$$
adamc@544 1215
adamc@544 1216 $$\begin{array}{l}
adamc@544 1217 \mt{con} \; \mt{sql\_binary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544 1218 \mt{val} \; \mt{sql\_and} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544 1219 \mt{val} \; \mt{sql\_or} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544 1220 \mt{val} \; \mt{sql\_binary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg_1} ::: \mt{Type} \to \mt{arg_2} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544 1221 \hspace{.1in} \to \mt{sql\_binary} \; \mt{arg_1} \; \mt{arg_2} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_1} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_2} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res}
adamc@544 1222 \end{array}$$
adamc@544 1223
adamc@544 1224 $$\begin{array}{l}
adamc@559 1225 \mt{class} \; \mt{sql\_arith} \\
adamc@559 1226 \mt{val} \; \mt{sql\_int\_arith} : \mt{sql\_arith} \; \mt{int} \\
adamc@559 1227 \mt{val} \; \mt{sql\_float\_arith} : \mt{sql\_arith} \; \mt{float} \\
adamc@559 1228 \mt{val} \; \mt{sql\_neg} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_unary} \; \mt{t} \; \mt{t} \\
adamc@559 1229 \mt{val} \; \mt{sql\_plus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559 1230 \mt{val} \; \mt{sql\_minus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559 1231 \mt{val} \; \mt{sql\_times} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559 1232 \mt{val} \; \mt{sql\_div} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559 1233 \mt{val} \; \mt{sql\_mod} : \mt{sql\_binary} \; \mt{int} \; \mt{int} \; \mt{int}
adamc@559 1234 \end{array}$$
adamc@544 1235
adamc@544 1236 Finally, we have aggregate functions. The $\mt{COUNT(\ast)}$ syntax is handled specially, since it takes no real argument. The other aggregate functions are placed into a general type family, using type classes to restrict usage to properly-typed arguments. The key aspect of the $\mt{sql\_aggregate}$ function's type is the shift of aggregate-function-only fields into unrestricted fields.
adamc@544 1237
adamc@544 1238 $$\begin{array}{l}
adamc@544 1239 \mt{val} \; \mt{sql\_count} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{int}
adamc@544 1240 \end{array}$$
adamc@544 1241
adamc@544 1242 $$\begin{array}{l}
adamc@544 1243 \mt{con} \; \mt{sql\_aggregate} :: \mt{Type} \to \mt{Type} \\
adamc@544 1244 \mt{val} \; \mt{sql\_aggregate} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1245 \hspace{.1in} \to \mt{sql\_aggregate} \; \mt{t} \to \mt{sql\_exp} \; \mt{agg} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544 1246 \end{array}$$
adamc@544 1247
adamc@544 1248 $$\begin{array}{l}
adamc@544 1249 \mt{class} \; \mt{sql\_summable} \\
adamc@544 1250 \mt{val} \; \mt{sql\_summable\_int} : \mt{sql\_summable} \; \mt{int} \\
adamc@544 1251 \mt{val} \; \mt{sql\_summable\_float} : \mt{sql\_summable} \; \mt{float} \\
adamc@544 1252 \mt{val} \; \mt{sql\_avg} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544 1253 \mt{val} \; \mt{sql\_sum} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544 1254 \end{array}$$
adamc@544 1255
adamc@544 1256 $$\begin{array}{l}
adamc@544 1257 \mt{class} \; \mt{sql\_maxable} \\
adamc@544 1258 \mt{val} \; \mt{sql\_maxable\_int} : \mt{sql\_maxable} \; \mt{int} \\
adamc@544 1259 \mt{val} \; \mt{sql\_maxable\_float} : \mt{sql\_maxable} \; \mt{float} \\
adamc@544 1260 \mt{val} \; \mt{sql\_maxable\_string} : \mt{sql\_maxable} \; \mt{string} \\
adamc@544 1261 \mt{val} \; \mt{sql\_maxable\_time} : \mt{sql\_maxable} \; \mt{time} \\
adamc@544 1262 \mt{val} \; \mt{sql\_max} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544 1263 \mt{val} \; \mt{sql\_min} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544 1264 \end{array}$$
adamc@544 1265
adamc@544 1266 We wrap up the definition of query syntax with the types used in representing $\mt{ORDER} \; \mt{BY}$, $\mt{LIMIT}$, and $\mt{OFFSET}$ clauses.
adamc@544 1267 $$\begin{array}{l}
adamc@544 1268 \mt{type} \; \mt{sql\_direction} \\
adamc@544 1269 \mt{val} \; \mt{sql\_asc} : \mt{sql\_direction} \\
adamc@544 1270 \mt{val} \; \mt{sql\_desc} : \mt{sql\_direction} \\
adamc@544 1271 \\
adamc@544 1272 \mt{con} \; \mt{sql\_order\_by} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@544 1273 \mt{val} \; \mt{sql\_order\_by\_Nil} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} :: \{\mt{Type}\} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544 1274 \mt{val} \; \mt{sql\_order\_by\_Cons} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1275 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; [] \; \mt{exps} \; \mt{t} \to \mt{sql\_direction} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544 1276 \\
adamc@544 1277 \mt{type} \; \mt{sql\_limit} \\
adamc@544 1278 \mt{val} \; \mt{sql\_no\_limit} : \mt{sql\_limit} \\
adamc@544 1279 \mt{val} \; \mt{sql\_limit} : \mt{int} \to \mt{sql\_limit} \\
adamc@544 1280 \\
adamc@544 1281 \mt{type} \; \mt{sql\_offset} \\
adamc@544 1282 \mt{val} \; \mt{sql\_no\_offset} : \mt{sql\_offset} \\
adamc@544 1283 \mt{val} \; \mt{sql\_offset} : \mt{int} \to \mt{sql\_offset}
adamc@544 1284 \end{array}$$
adamc@544 1285
adamc@545 1286
adamc@545 1287 \subsubsection{DML}
adamc@545 1288
adamc@545 1289 The Ur/Web library also includes an embedding of a fragment of SQL's DML, the Data Manipulation Language, for modifying database tables. Any piece of DML may be executed in a transaction.
adamc@545 1290
adamc@545 1291 $$\begin{array}{l}
adamc@545 1292 \mt{type} \; \mt{dml} \\
adamc@545 1293 \mt{val} \; \mt{dml} : \mt{dml} \to \mt{transaction} \; \mt{unit}
adamc@545 1294 \end{array}$$
adamc@545 1295
adamc@545 1296 Properly-typed records may be used to form $\mt{INSERT}$ commands.
adamc@545 1297 $$\begin{array}{l}
adamc@545 1298 \mt{val} \; \mt{insert} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \\
adamc@545 1299 \hspace{.1in} \to \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; [] \; [] \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{fields}) \to \mt{dml}
adamc@545 1300 \end{array}$$
adamc@545 1301
adamc@545 1302 An $\mt{UPDATE}$ command is formed from a choice of which table fields to leave alone and which to change, along with an expression to use to compute the new value of each changed field and a $\mt{WHERE}$ clause.
adamc@545 1303 $$\begin{array}{l}
adamc@545 1304 \mt{val} \; \mt{update} : \mt{unchanged} ::: \{\mt{Type}\} \to \mt{changed} :: \{\mt{Type}\} \to \lambda [\mt{changed} \sim \mt{unchanged}] \\
adamc@545 1305 \hspace{.1in} \Rightarrow \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{changed}) \\
adamc@545 1306 \hspace{.1in} \to \mt{sql\_table} \; (\mt{changed} \rc \mt{unchanged}) \to \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545 1307 \end{array}$$
adamc@545 1308
adamc@545 1309 A $\mt{DELETE}$ command is formed from a table and a $\mt{WHERE}$ clause.
adamc@545 1310 $$\begin{array}{l}
adamc@545 1311 \mt{val} \; \mt{delete} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \to \mt{sql\_exp} \; [\mt{T} = \mt{fields}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545 1312 \end{array}$$
adamc@545 1313
adamc@546 1314 \subsubsection{Sequences}
adamc@546 1315
adamc@546 1316 SQL sequences are counters with concurrency control, often used to assign unique IDs. Ur/Web supports them via a simple interface. The only way to create a sequence is with the $\mt{sequence}$ declaration form.
adamc@546 1317
adamc@546 1318 $$\begin{array}{l}
adamc@546 1319 \mt{type} \; \mt{sql\_sequence} \\
adamc@546 1320 \mt{val} \; \mt{nextval} : \mt{sql\_sequence} \to \mt{transaction} \; \mt{int}
adamc@546 1321 \end{array}$$
adamc@546 1322
adamc@546 1323
adamc@547 1324 \subsection{XML}
adamc@547 1325
adamc@547 1326 Ur/Web's library contains an encoding of XML syntax and semantic constraints. We make no effort to follow the standards governing XML schemas. Rather, XML fragments are viewed more as values of ML datatypes, and we only track which tags are allowed inside which other tags.
adamc@547 1327
adamc@547 1328 The basic XML type family has arguments respectively indicating the \emph{context} of a fragment, the fields that the fragment expects to be bound on entry (and their types), and the fields that the fragment will bind (and their types). Contexts are a record-based ``poor man's subtyping'' encoding, with each possible set of valid tags corresponding to a different context record. The arguments dealing with field binding are only relevant to HTML forms.
adamc@547 1329 $$\begin{array}{l}
adamc@547 1330 \mt{con} \; \mt{xml} :: \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547 1331 \end{array}$$
adamc@547 1332
adamc@547 1333 We also have a type family of XML tags, indexed respectively by the record of optional attributes accepted by the tag, the context in which the tag may be placed, the context required of children of the tag, which form fields the tag uses, and which fields the tag defines.
adamc@547 1334 $$\begin{array}{l}
adamc@547 1335 \mt{con} \; \mt{tag} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547 1336 \end{array}$$
adamc@547 1337
adamc@547 1338 Literal text may be injected into XML as ``CDATA.''
adamc@547 1339 $$\begin{array}{l}
adamc@547 1340 \mt{val} \; \mt{cdata} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use} ::: \{\mt{Type}\} \to \mt{string} \to \mt{xml} \; \mt{ctx} \; \mt{use} \; []
adamc@547 1341 \end{array}$$
adamc@547 1342
adamc@547 1343 There is a function for producing an XML tree with a particular tag at its root.
adamc@547 1344 $$\begin{array}{l}
adamc@547 1345 \mt{val} \; \mt{tag} : \mt{attrsGiven} ::: \{\mt{Type}\} \to \mt{attrsAbsent} ::: \{\mt{Type}\} \to \mt{ctxOuter} ::: \{\mt{Unit}\} \to \mt{ctxInner} ::: \{\mt{Unit}\} \\
adamc@547 1346 \hspace{.1in} \to \mt{useOuter} ::: \{\mt{Type}\} \to \mt{useInner} ::: \{\mt{Type}\} \to \mt{bindOuter} ::: \{\mt{Type}\} \to \mt{bindInner} ::: \{\mt{Type}\} \\
adamc@547 1347 \hspace{.1in} \to \lambda [\mt{attrsGiven} \sim \mt{attrsAbsent}] \; [\mt{useOuter} \sim \mt{useInner}] \; [\mt{bindOuter} \sim \mt{bindInner}] \Rightarrow \$\mt{attrsGiven} \\
adamc@547 1348 \hspace{.1in} \to \mt{tag} \; (\mt{attrsGiven} \rc \mt{attrsAbsent}) \; \mt{ctxOuter} \; \mt{ctxInner} \; \mt{useOuter} \; \mt{bindOuter} \\
adamc@547 1349 \hspace{.1in} \to \mt{xml} \; \mt{ctxInner} \; \mt{useInner} \; \mt{bindInner} \to \mt{xml} \; \mt{ctxOuter} \; (\mt{useOuter} \rc \mt{useInner}) \; (\mt{bindOuter} \rc \mt{bindInner})
adamc@547 1350 \end{array}$$
adamc@547 1351
adamc@547 1352 Two XML fragments may be concatenated.
adamc@547 1353 $$\begin{array}{l}
adamc@547 1354 \mt{val} \; \mt{join} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{bind_1} ::: \{\mt{Type}\} \to \mt{bind_2} ::: \{\mt{Type}\} \\
adamc@547 1355 \hspace{.1in} \to \lambda [\mt{use_1} \sim \mt{bind_1}] \; [\mt{bind_1} \sim \mt{bind_2}] \\
adamc@547 1356 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind_1} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{bind_1}) \; \mt{bind_2} \to \mt{xml} \; \mt{ctx} \; \mt{use_1} \; (\mt{bind_1} \rc \mt{bind_2})
adamc@547 1357 \end{array}$$
adamc@547 1358
adamc@547 1359 Finally, any XML fragment may be updated to ``claim'' to use more form fields than it does.
adamc@547 1360 $$\begin{array}{l}
adamc@547 1361 \mt{val} \; \mt{useMore} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{use_2} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \lambda [\mt{use_1} \sim \mt{use_2}] \\
adamc@547 1362 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{use_2}) \; \mt{bind}
adamc@547 1363 \end{array}$$
adamc@547 1364
adamc@547 1365 We will not list here the different HTML tags and related functions from the standard library. They should be easy enough to understand from the code in \texttt{basis.urs}. The set of tags in the library is not yet claimed to be complete for HTML standards.
adamc@547 1366
adamc@547 1367 One last useful function is for aborting any page generation, returning some XML as an error message. This function takes the place of some uses of a general exception mechanism.
adamc@547 1368 $$\begin{array}{l}
adamc@547 1369 \mt{val} \; \mt{error} : \mt{t} ::: \mt{Type} \to \mt{xml} \; [\mt{Body}] \; [] \; [] \to \mt{t}
adamc@547 1370 \end{array}$$
adamc@547 1371
adamc@549 1372
adamc@549 1373 \section{Ur/Web Syntax Extensions}
adamc@549 1374
adamc@549 1375 Ur/Web features some syntactic shorthands for building values using the functions from the last section. This section sketches the grammar of those extensions. We write spans of syntax inside brackets to indicate that they are optional.
adamc@549 1376
adamc@549 1377 \subsection{SQL}
adamc@549 1378
adamc@549 1379 \subsubsection{Queries}
adamc@549 1380
adamc@550 1381 Queries $Q$ are added to the rules for expressions $e$.
adamc@550 1382
adamc@549 1383 $$\begin{array}{rrcll}
adamc@550 1384 \textrm{Queries} & Q &::=& (q \; [\mt{ORDER} \; \mt{BY} \; (E \; [o],)^+] \; [\mt{LIMIT} \; N] \; [\mt{OFFSET} \; N]) \\
adamc@549 1385 \textrm{Pre-queries} & q &::=& \mt{SELECT} \; P \; \mt{FROM} \; T,^+ \; [\mt{WHERE} \; E] \; [\mt{GROUP} \; \mt{BY} \; p,^+] \; [\mt{HAVING} \; E] \\
adamc@549 1386 &&& \mid q \; R \; q \\
adamc@549 1387 \textrm{Relational operators} & R &::=& \mt{UNION} \mid \mt{INTERSECT} \mid \mt{EXCEPT}
adamc@549 1388 \end{array}$$
adamc@549 1389
adamc@549 1390 $$\begin{array}{rrcll}
adamc@549 1391 \textrm{Projections} & P &::=& \ast & \textrm{all columns} \\
adamc@549 1392 &&& p,^+ & \textrm{particular columns} \\
adamc@549 1393 \textrm{Pre-projections} & p &::=& t.f & \textrm{one column from a table} \\
adamc@558 1394 &&& t.\{\{c\}\} & \textrm{a record of columns from a table (of kind $\{\mt{Type}\}$)} \\
adamc@549 1395 \textrm{Table names} & t &::=& x & \textrm{constant table name (automatically capitalized)} \\
adamc@549 1396 &&& X & \textrm{constant table name} \\
adamc@549 1397 &&& \{\{c\}\} & \textrm{computed table name (of kind $\mt{Name}$)} \\
adamc@549 1398 \textrm{Column names} & f &::=& X & \textrm{constant column name} \\
adamc@549 1399 &&& \{c\} & \textrm{computed column name (of kind $\mt{Name}$)} \\
adamc@549 1400 \textrm{Tables} & T &::=& x & \textrm{table variable, named locally by its own capitalization} \\
adamc@549 1401 &&& x \; \mt{AS} \; t & \textrm{table variable, with local name} \\
adamc@549 1402 &&& \{\{e\}\} \; \mt{AS} \; t & \textrm{computed table expression, with local name} \\
adamc@549 1403 \textrm{SQL expressions} & E &::=& p & \textrm{column references} \\
adamc@549 1404 &&& X & \textrm{named expression references} \\
adamc@549 1405 &&& \{\{e\}\} & \textrm{injected native Ur expressions} \\
adamc@549 1406 &&& \{e\} & \textrm{computed expressions, probably using $\mt{sql\_exp}$ directly} \\
adamc@549 1407 &&& \mt{TRUE} \mid \mt{FALSE} & \textrm{boolean constants} \\
adamc@549 1408 &&& \ell & \textrm{primitive type literals} \\
adamc@549 1409 &&& \mt{NULL} & \textrm{null value (injection of $\mt{None}$)} \\
adamc@549 1410 &&& E \; \mt{IS} \; \mt{NULL} & \textrm{nullness test} \\
adamc@549 1411 &&& n & \textrm{nullary operators} \\
adamc@549 1412 &&& u \; E & \textrm{unary operators} \\
adamc@549 1413 &&& E \; b \; E & \textrm{binary operators} \\
adamc@549 1414 &&& \mt{COUNT}(\ast) & \textrm{count number of rows} \\
adamc@549 1415 &&& a(E) & \textrm{other aggregate function} \\
adamc@549 1416 &&& (E) & \textrm{explicit precedence} \\
adamc@549 1417 \textrm{Nullary operators} & n &::=& \mt{CURRENT\_TIMESTAMP} \\
adamc@549 1418 \textrm{Unary operators} & u &::=& \mt{NOT} \\
adamc@549 1419 \textrm{Binary operators} & b &::=& \mt{AND} \mid \mt{OR} \mid \neq \mid < \mid \leq \mid > \mid \geq \\
adamc@549 1420 \textrm{Aggregate functions} & a &::=& \mt{AVG} \mid \mt{SUM} \mid \mt{MIN} \mid \mt{MAX} \\
adamc@550 1421 \textrm{Directions} & o &::=& \mt{ASC} \mid \mt{DESC} \\
adamc@549 1422 \textrm{SQL integer} & N &::=& n \mid \{e\} \\
adamc@549 1423 \end{array}$$
adamc@549 1424
adamc@549 1425 Additionally, an SQL expression may be inserted into normal Ur code with the syntax $(\mt{SQL} \; E)$ or $(\mt{WHERE} \; E)$.
adamc@549 1426
adamc@550 1427 \subsubsection{DML}
adamc@550 1428
adamc@550 1429 DML commands $D$ are added to the rules for expressions $e$.
adamc@550 1430
adamc@550 1431 $$\begin{array}{rrcll}
adamc@550 1432 \textrm{Commands} & D &::=& (\mt{INSERT} \; \mt{INTO} \; T^E \; (f,^+) \; \mt{VALUES} \; (E,^+)) \\
adamc@550 1433 &&& (\mt{UPDATE} \; T^E \; \mt{SET} \; (f = E,)^+ \; \mt{WHERE} \; E) \\
adamc@550 1434 &&& (\mt{DELETE} \; \mt{FROM} \; T^E \; \mt{WHERE} \; E) \\
adamc@550 1435 \textrm{Table expressions} & T^E &::=& x \mid \{\{e\}\}
adamc@550 1436 \end{array}$$
adamc@550 1437
adamc@550 1438 Inside $\mt{UPDATE}$ and $\mt{DELETE}$ commands, lone variables $X$ are interpreted as references to columns of the implicit table $\mt{T}$, rather than to named expressions.
adamc@549 1439
adamc@551 1440 \subsection{XML}
adamc@551 1441
adamc@551 1442 XML fragments $L$ are added to the rules for expressions $e$.
adamc@551 1443
adamc@551 1444 $$\begin{array}{rrcll}
adamc@551 1445 \textrm{XML fragments} & L &::=& \texttt{<xml/>} \mid \texttt{<xml>}l^*\texttt{</xml>} \\
adamc@551 1446 \textrm{XML pieces} & l &::=& \textrm{text} & \textrm{cdata} \\
adamc@551 1447 &&& \texttt{<}g\texttt{/>} & \textrm{tag with no children} \\
adamc@551 1448 &&& \texttt{<}g\texttt{>}l^*\texttt{</}x\texttt{>} & \textrm{tag with children} \\
adamc@559 1449 &&& \{e\} & \textrm{computed XML fragment} \\
adamc@559 1450 &&& \{[e]\} & \textrm{injection of an Ur expression, via the $\mt{Top}.\mt{txt}$ function} \\
adamc@551 1451 \textrm{Tag} & g &::=& h \; (x = v)^* \\
adamc@551 1452 \textrm{Tag head} & h &::=& x & \textrm{tag name} \\
adamc@551 1453 &&& h\{c\} & \textrm{constructor parameter} \\
adamc@551 1454 \textrm{Attribute value} & v &::=& \ell & \textrm{literal value} \\
adamc@551 1455 &&& \{e\} & \textrm{computed value} \\
adamc@551 1456 \end{array}$$
adamc@551 1457
adamc@552 1458
adamc@553 1459 \section{The Structure of Web Applications}
adamc@553 1460
adamc@553 1461 A web application is built from a series of modules, with one module, the last one appearing in the \texttt{.urp} file, designated as the main module. The signature of the main module determines the URL entry points to the application. Such an entry point should have type $\mt{unit} \to \mt{transaction} \; \mt{page}$, where $\mt{page}$ is a type synonym for top-level HTML pages, defined in $\mt{Basis}$. If such a function is at the top level of main module $M$, it will be accessible at URI \texttt{/M/f}, and so on for more deeply-nested functions, as described in Section \ref{tag} below.
adamc@553 1462
adamc@553 1463 When the standalone web server receives a request for a known page, it calls the function for that page, ``running'' the resulting transaction to produce the page to return to the client. Pages link to other pages with the \texttt{link} attribute of the \texttt{a} HTML tag. A link has type $\mt{transaction} \; \mt{page}$, and the semantics of a link are that this transaction should be run to compute the result page, when the link is followed. Link targets are assigned URL names in the same way as top-level entry points.
adamc@553 1464
adamc@553 1465 HTML forms are handled in a similar way. The $\mt{action}$ attribute of a $\mt{submit}$ form tag takes a value of type $\$\mt{use} \to \mt{transaction} \; \mt{page}$, where $\mt{use}$ is a kind-$\{\mt{Type}\}$ record of the form fields used by this action handler. Action handlers are assigned URL patterns in the same way as above.
adamc@553 1466
adamc@558 1467 For both links and actions, direct arguments and local variables mentioned implicitly via closures are automatically included in serialized form in URLs, in the order in which they appear in the source code.
adamc@553 1468
adamc@553 1469
adamc@552 1470 \section{Compiler Phases}
adamc@552 1471
adamc@552 1472 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
adamc@552 1473
adamc@552 1474 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.
adamc@552 1475
adamc@552 1476 \subsection{Parse}
adamc@552 1477
adamc@552 1478 The compiler reads a \texttt{.urp} file, figures out which \texttt{.urs} and \texttt{.ur} files it references, and combines them all into what is conceptually a single sequence of declarations in the core language of Section \ref{core}.
adamc@552 1479
adamc@552 1480 \subsection{Elaborate}
adamc@552 1481
adamc@552 1482 This is where type inference takes place, translating programs into an explicit form with no more wildcards. This phase is the most likely source of compiler error messages.
adamc@552 1483
adamc@552 1484 \subsection{Unnest}
adamc@552 1485
adamc@552 1486 Named local function definitions are moved to the top level, to avoid the need to generate closures.
adamc@552 1487
adamc@552 1488 \subsection{Corify}
adamc@552 1489
adamc@552 1490 Module system features are compiled away, through inlining of functor definitions at application sites. Afterward, most abstraction boundaries are broken, facilitating optimization.
adamc@552 1491
adamc@552 1492 \subsection{Especialize}
adamc@552 1493
adamc@552 1494 Functions are specialized to particular argument patterns. This is an important trick for avoiding the need to maintain any closures at runtime.
adamc@552 1495
adamc@552 1496 \subsection{Untangle}
adamc@552 1497
adamc@552 1498 Remove unnecessary mutual recursion, splitting recursive groups into strongly-connected components.
adamc@552 1499
adamc@552 1500 \subsection{Shake}
adamc@552 1501
adamc@552 1502 Remove all definitions not needed to run the page handlers that are visible in the signature of the last module listed in the \texttt{.urp} file.
adamc@552 1503
adamc@553 1504 \subsection{\label{tag}Tag}
adamc@552 1505
adamc@552 1506 Assign a URL name to each link and form action. It is important that these links and actions are written as applications of named functions, because such names are used to generate URL patterns. A URL pattern has a name built from the full module path of the named function, followed by the function name, with all pieces separated by slashes. The path of a functor application is based on the name given to the result, rather than the path of the functor itself.
adamc@552 1507
adamc@552 1508 \subsection{Reduce}
adamc@552 1509
adamc@552 1510 Apply definitional equality rules to simplify the program as much as possible. This effectively includes inlining of every non-recursive definition.
adamc@552 1511
adamc@552 1512 \subsection{Unpoly}
adamc@552 1513
adamc@552 1514 This phase specializes polymorphic functions to the specific arguments passed to them in the program. If the program contains real polymorphic recursion, Unpoly will be insufficient to avoid later error messages about too much polymorphism.
adamc@552 1515
adamc@552 1516 \subsection{Specialize}
adamc@552 1517
adamc@558 1518 Replace uses of parameterized datatypes with versions specialized to specific parameters. As for Unpoly, this phase will not be effective enough in the presence of polymorphic recursion or other fancy uses of impredicative polymorphism.
adamc@552 1519
adamc@552 1520 \subsection{Shake}
adamc@552 1521
adamc@558 1522 Here the compiler repeats the earlier Shake phase.
adamc@552 1523
adamc@552 1524 \subsection{Monoize}
adamc@552 1525
adamc@552 1526 Programs are translated to a new intermediate language without polymorphism or non-$\mt{Type}$ constructors. Error messages may pop up here if earlier phases failed to remove such features.
adamc@552 1527
adamc@552 1528 This is the stage at which concrete names are generated for cookies, tables, and sequences. They are named following the same convention as for links and actions, based on module path information saved from earlier stages. Table and sequence names separate path elements with underscores instead of slashes, and they are prefixed by \texttt{uw\_}.
adamc@552 1529 \subsection{MonoOpt}
adamc@552 1530
adamc@552 1531 Simple algebraic laws are applied to simplify the program, focusing especially on efficient imperative generation of HTML pages.
adamc@552 1532
adamc@552 1533 \subsection{MonoUntangle}
adamc@552 1534
adamc@552 1535 Unnecessary mutual recursion is broken up again.
adamc@552 1536
adamc@552 1537 \subsection{MonoReduce}
adamc@552 1538
adamc@552 1539 Equivalents of the definitional equality rules are applied to simplify programs, with inlining again playing a major role.
adamc@552 1540
adamc@552 1541 \subsection{MonoShake, MonoOpt}
adamc@552 1542
adamc@552 1543 Unneeded declarations are removed, and basic optimizations are repeated.
adamc@552 1544
adamc@552 1545 \subsection{Fuse}
adamc@552 1546
adamc@552 1547 The compiler tries to simplify calls to recursive functions whose results are immediately written as page output. The write action is pushed inside the function definitions to avoid allocation of intermediate results.
adamc@552 1548
adamc@552 1549 \subsection{MonoUntangle, MonoShake}
adamc@552 1550
adamc@552 1551 Fuse often creates more opportunities to remove spurious mutual recursion.
adamc@552 1552
adamc@552 1553 \subsection{Pathcheck}
adamc@552 1554
adamc@552 1555 The compiler checks that no link or action name has been used more than once.
adamc@552 1556
adamc@552 1557 \subsection{Cjrize}
adamc@552 1558
adamc@552 1559 The program is translated to what is more or less a subset of C. If any use of functions as data remains at this point, the compiler will complain.
adamc@552 1560
adamc@552 1561 \subsection{C Compilation and Linking}
adamc@552 1562
adamc@552 1563 The output of the last phase is pretty-printed as C source code and passed to GCC.
adamc@552 1564
adamc@552 1565
adamc@524 1566 \end{document}