annotate doc/manual.tex @ 556:5703a2ad5221

.urp files
author Adam Chlipala <adamc@hcoop.net>
date Tue, 09 Dec 2008 11:52:56 -0500
parents 0b2cf25a5eba
children 0d3db8d6a586
rev   line source
adamc@524 1 \documentclass{article}
adamc@554 2 \usepackage{fullpage,amsmath,amssymb,proof,url}
adamc@524 3
adamc@524 4 \newcommand{\cd}[1]{\texttt{#1}}
adamc@524 5 \newcommand{\mt}[1]{\mathsf{#1}}
adamc@524 6
adamc@524 7 \newcommand{\rc}{+ \hspace{-.075in} + \;}
adamc@527 8 \newcommand{\rcut}{\; \texttt{--} \;}
adamc@527 9 \newcommand{\rcutM}{\; \texttt{---} \;}
adamc@524 10
adamc@524 11 \begin{document}
adamc@524 12
adamc@524 13 \title{The Ur/Web Manual}
adamc@524 14 \author{Adam Chlipala}
adamc@524 15
adamc@524 16 \maketitle
adamc@524 17
adamc@540 18 \tableofcontents
adamc@540 19
adamc@554 20
adamc@554 21 \section{Introduction}
adamc@554 22
adamc@554 23 \emph{Ur} is a programming language designed to introduce richer type system features into functional programming in the tradition of ML and Haskell. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of \emph{metaprogramming} based on \emph{row types}.
adamc@554 24
adamc@554 25 \emph{Ur/Web} is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs ``don't go wrong'' in a very broad sense. Not only do they not crash during particular page generations, but they also may not:
adamc@554 26
adamc@554 27 \begin{itemize}
adamc@554 28 \item Suffer from any kinds of code-injection attacks
adamc@554 29 \item Return invalid HTML
adamc@554 30 \item Contain dead intra-application links
adamc@554 31 \item Have mismatches between HTML forms and the fields expected by their handlers
adamc@554 32 \item Attempt invalid SQL queries
adamc@554 33 \item Use improper marshaling or unmarshaling in communication with SQL databases
adamc@554 34 \end{itemize}
adamc@554 35
adamc@554 36 This type safety is just the foundation of the Ur/Web methodology. It is also possible to use metaprogramming to build significant application pieces by analysis of type structure. For instance, the demo includes an ML-style functor for building an admin interface for an arbitrary SQL table. The type system guarantees that the admin interface sub-application that comes out will always be free of the above-listed bugs, no matter which well-typed table description is given as input.
adamc@554 37
adamc@554 38 The Ur/Web compiler also produces very efficient object code that does not use garbage collection. These compiled programs will often be even more efficient than what most programmers would bother to write in C.
adamc@554 39
adamc@554 40 \medskip
adamc@554 41
adamc@554 42 The official web site for Ur is:
adamc@554 43 \begin{center}
adamc@554 44 \url{http://www.impredicative.com/ur/}
adamc@554 45 \end{center}
adamc@554 46
adamc@555 47
adamc@555 48 \section{Installation}
adamc@555 49
adamc@555 50 If you are lucky, then the following standard command sequence will suffice for installation, in a directory to which you have unpacked the latest distribution tarball.
adamc@555 51
adamc@555 52 \begin{verbatim}
adamc@555 53 ./configure
adamc@555 54 make
adamc@555 55 sudo make install
adamc@555 56 \end{verbatim}
adamc@555 57
adamc@555 58 Some other packages must be installed for the above to work. At a minimum, you need a standard UNIX shell, with standard UNIX tools like sed and GCC in your execution path; and MLton, the whole-program optimizing compiler for Standard ML. To build programs that access SQL databases, you also need libpq, the PostgreSQL client library. As of this writing, in the ``testing'' version of Debian Linux, this command will install the more uncommon of these dependencies:
adamc@555 59
adamc@555 60 \begin{verbatim}
adamc@555 61 apt-get install mlton libpq-dev
adamc@555 62 \end{verbatim}
adamc@555 63
adamc@555 64 It is also possible to access the modules of the Ur/Web compiler interactively, within Standard ML of New Jersey. To install the prerequisites in Debian testing:
adamc@555 65
adamc@555 66 \begin{verbatim}
adamc@555 67 apt-get install smlnj libsmlnj-smlnj ml-yacc ml-lpt
adamc@555 68 \end{verbatim}
adamc@555 69
adamc@555 70 To begin an interactive session with the Ur compiler modules, run \texttt{make smlnj}, and then, from within an \texttt{sml} session, run \texttt{CM.make "src/urweb.cm";}. The \texttt{Compiler} module is the main entry point.
adamc@555 71
adamc@555 72 To run an SQL-backed application, you will probably want to install the PostgreSQL server. Version 8.3 or higher is required.
adamc@555 73
adamc@555 74 \begin{verbatim}
adamc@555 75 apt-get install postgresql-8.3
adamc@555 76 \end{verbatim}
adamc@555 77
adamc@555 78 To use the Emacs mode, you must have a modern Emacs installed. We assume that you already know how to do this, if you're in the business of looking for an Emacs mode. The demo generation facility of the compiler will also call out to Emacs to syntax-highlight code, and that process depends on the \texttt{htmlize} module, which can be installed in Debian testing via:
adamc@555 79
adamc@555 80 \begin{verbatim}
adamc@555 81 apt-get install emacs-goodies-el
adamc@555 82 \end{verbatim}
adamc@555 83
adamc@555 84 Even with the right packages installed, configuration and building might fail to work. After you run \texttt{./configure}, you will see the values of some named environment variables printed. You may need to adjust these values to get proper installation for your system. To change a value, store your preferred alternative in the corresponding UNIX environment variable, before running \texttt{./configure}. For instance, here is how to change the list of extra arguments that the Ur/Web compiler will pass to GCC on every invocation.
adamc@555 85
adamc@555 86 \begin{verbatim}
adamc@555 87 GCCARGS=-fnested-functions ./configure
adamc@555 88 \end{verbatim}
adamc@555 89
adamc@555 90 Some OSX users have reported needing to use this particular GCCARGS value.
adamc@555 91
adamc@555 92 The Emacs mode can be set to autoload by adding the following to your \texttt{.emacs} file.
adamc@555 93
adamc@555 94 \begin{verbatim}
adamc@555 95 (add-to-list 'load-path "/usr/local/share/emacs/site-lisp/urweb-mode")
adamc@555 96 (load "urweb-mode-startup")
adamc@555 97 \end{verbatim}
adamc@555 98
adamc@555 99 Change the path in the first line if you chose a different Emacs installation path during configuration.
adamc@555 100
adamc@555 101
adamc@556 102 \section{Command-Line Compiler}
adamc@556 103
adamc@556 104 \subsection{Project Files}
adamc@556 105
adamc@556 106 The basic inputs to the \texttt{urweb} compiler are project files, which have the extension \texttt{.urp}. Here is a sample \texttt{.urp} file.
adamc@556 107
adamc@556 108 \begin{verbatim}
adamc@556 109 database dbname=test
adamc@556 110 sql crud1.sql
adamc@556 111
adamc@556 112 crud
adamc@556 113 crud1
adamc@556 114 \end{verbatim}
adamc@556 115
adamc@556 116 The \texttt{database} line gives the database information string to pass to libpq. In this case, the string only says to connect to a local database named \texttt{test}.
adamc@556 117
adamc@556 118 The \texttt{sql} line asks for an SQL source file to be generated, giving the commands to run to create the tables and sequences that this application expects to find. After building this \texttt{.urp} file, the following commands could be used to initialize the database, assuming that the current UNIX user exists as a Postgres user with database creation privileges:
adamc@556 119
adamc@556 120 \begin{verbatim}
adamc@556 121 createdb test
adamc@556 122 psql -f crud1.sql test
adamc@556 123 \end{verbatim}
adamc@556 124
adamc@556 125 A blank line always separates the named directives from a list of modules to include in the project; if there are no named directives, a blank line must begin the file.
adamc@556 126
adamc@556 127 For each entry \texttt{M} in the module list, the file \texttt{M.urs} is included in the project if it exists, and the file \texttt{M.ur} must exist and is always included.
adamc@556 128
adamc@556 129 A few other named directives are supported. \texttt{prefix PREFIX} sets the prefix included before every URI within the generated application; the default is \texttt{/}. \texttt{exe FILENAME} sets the filename to which to write the output executable; the default for file \texttt{P.urp} is \texttt{P.exe}. \texttt{debug} saves some intermediate C files, which is mostly useful to help in debugging the compiler itself. \texttt{profile} generates an executable that may be used with gprof.
adamc@556 130
adamc@556 131
adamc@529 132 \section{Ur Syntax}
adamc@529 133
adamc@529 134 In this section, we describe the syntax of Ur, deferring to a later section discussion of most of the syntax specific to SQL and XML. The sole exceptions are the declaration forms for tables, sequences, and cookies.
adamc@524 135
adamc@524 136 \subsection{Lexical Conventions}
adamc@524 137
adamc@524 138 We give the Ur language definition in \LaTeX $\;$ math mode, since that is prettier than monospaced ASCII. The corresponding ASCII syntax can be read off directly. Here is the key for mapping math symbols to ASCII character sequences.
adamc@524 139
adamc@524 140 \begin{center}
adamc@524 141 \begin{tabular}{rl}
adamc@524 142 \textbf{\LaTeX} & \textbf{ASCII} \\
adamc@524 143 $\to$ & \cd{->} \\
adamc@524 144 $\times$ & \cd{*} \\
adamc@524 145 $\lambda$ & \cd{fn} \\
adamc@524 146 $\Rightarrow$ & \cd{=>} \\
adamc@529 147 $\neq$ & \cd{<>} \\
adamc@529 148 $\leq$ & \cd{<=} \\
adamc@529 149 $\geq$ & \cd{>=} \\
adamc@524 150 \\
adamc@524 151 $x$ & Normal textual identifier, not beginning with an uppercase letter \\
adamc@525 152 $X$ & Normal textual identifier, beginning with an uppercase letter \\
adamc@524 153 \end{tabular}
adamc@524 154 \end{center}
adamc@524 155
adamc@525 156 We often write syntax like $e^*$ to indicate zero or more copies of $e$, $e^+$ to indicate one or more copies, and $e,^*$ and $e,^+$ to indicate multiple copies separated by commas. Another separator may be used in place of a comma. The $e$ term may be surrounded by parentheses to indicate grouping; those parentheses should not be included in the actual ASCII.
adamc@524 157
adamc@526 158 We write $\ell$ for literals of the primitive types, for the most part following C conventions. There are $\mt{int}$, $\mt{float}$, and $\mt{string}$ literals.
adamc@526 159
adamc@527 160 This version of the manual doesn't include operator precedences; see \texttt{src/urweb.grm} for that.
adamc@527 161
adamc@552 162 \subsection{\label{core}Core Syntax}
adamc@524 163
adamc@524 164 \emph{Kinds} classify types and other compile-time-only entities. Each kind in the grammar is listed with a description of the sort of data it classifies.
adamc@524 165 $$\begin{array}{rrcll}
adamc@524 166 \textrm{Kinds} & \kappa &::=& \mt{Type} & \textrm{proper types} \\
adamc@525 167 &&& \mt{Unit} & \textrm{the trivial constructor} \\
adamc@525 168 &&& \mt{Name} & \textrm{field names} \\
adamc@525 169 &&& \kappa \to \kappa & \textrm{type-level functions} \\
adamc@525 170 &&& \{\kappa\} & \textrm{type-level records} \\
adamc@525 171 &&& (\kappa\times^+) & \textrm{type-level tuples} \\
adamc@529 172 &&& \_\_ & \textrm{wildcard} \\
adamc@525 173 &&& (\kappa) & \textrm{explicit precedence} \\
adamc@524 174 \end{array}$$
adamc@524 175
adamc@524 176 Ur supports several different notions of functions that take types as arguments. These arguments can be either implicit, causing them to be inferred at use sites; or explicit, forcing them to be specified manually at use sites. There is a common explicitness annotation convention applied at the definitions of and in the types of such functions.
adamc@524 177 $$\begin{array}{rrcll}
adamc@524 178 \textrm{Explicitness} & ? &::=& :: & \textrm{explicit} \\
adamc@525 179 &&& \; ::: & \textrm{implicit}
adamc@524 180 \end{array}$$
adamc@524 181
adamc@524 182 \emph{Constructors} are the main class of compile-time-only data. They include proper types and are classified by kinds.
adamc@524 183 $$\begin{array}{rrcll}
adamc@524 184 \textrm{Constructors} & c, \tau &::=& (c) :: \kappa & \textrm{kind annotation} \\
adamc@530 185 &&& \hat{x} & \textrm{constructor variable} \\
adamc@524 186 \\
adamc@525 187 &&& \tau \to \tau & \textrm{function type} \\
adamc@525 188 &&& x \; ? \; \kappa \to \tau & \textrm{polymorphic function type} \\
adamc@525 189 &&& \$ c & \textrm{record type} \\
adamc@524 190 \\
adamc@525 191 &&& c \; c & \textrm{type-level function application} \\
adamc@530 192 &&& \lambda x \; :: \; \kappa \Rightarrow c & \textrm{type-level function abstraction} \\
adamc@524 193 \\
adamc@525 194 &&& () & \textrm{type-level unit} \\
adamc@525 195 &&& \#X & \textrm{field name} \\
adamc@524 196 \\
adamc@525 197 &&& [(c = c)^*] & \textrm{known-length type-level record} \\
adamc@525 198 &&& c \rc c & \textrm{type-level record concatenation} \\
adamc@525 199 &&& \mt{fold} & \textrm{type-level record fold} \\
adamc@524 200 \\
adamc@525 201 &&& (c^+) & \textrm{type-level tuple} \\
adamc@525 202 &&& c.n & \textrm{type-level tuple projection ($n \in \mathbb N^+$)} \\
adamc@524 203 \\
adamc@525 204 &&& \lambda [c \sim c] \Rightarrow c & \textrm{guarded constructor} \\
adamc@524 205 \\
adamc@529 206 &&& \_ :: \kappa & \textrm{wildcard} \\
adamc@525 207 &&& (c) & \textrm{explicit precedence} \\
adamc@530 208 \\
adamc@530 209 \textrm{Qualified uncapitalized variables} & \hat{x} &::=& x & \textrm{not from a module} \\
adamc@530 210 &&& M.x & \textrm{projection from a module} \\
adamc@525 211 \end{array}$$
adamc@525 212
adamc@525 213 Modules of the module system are described by \emph{signatures}.
adamc@525 214 $$\begin{array}{rrcll}
adamc@525 215 \textrm{Signatures} & S &::=& \mt{sig} \; s^* \; \mt{end} & \textrm{constant} \\
adamc@525 216 &&& X & \textrm{variable} \\
adamc@525 217 &&& \mt{functor}(X : S) : S & \textrm{functor} \\
adamc@529 218 &&& S \; \mt{where} \; \mt{con} \; x = c & \textrm{concretizing an abstract constructor} \\
adamc@525 219 &&& M.X & \textrm{projection from a module} \\
adamc@525 220 \\
adamc@525 221 \textrm{Signature items} & s &::=& \mt{con} \; x :: \kappa & \textrm{abstract constructor} \\
adamc@525 222 &&& \mt{con} \; x :: \kappa = c & \textrm{concrete constructor} \\
adamc@528 223 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529 224 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@525 225 &&& \mt{val} \; x : \tau & \textrm{value} \\
adamc@525 226 &&& \mt{structure} \; X : S & \textrm{sub-module} \\
adamc@525 227 &&& \mt{signature} \; X = S & \textrm{sub-signature} \\
adamc@525 228 &&& \mt{include} \; S & \textrm{signature inclusion} \\
adamc@525 229 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@525 230 &&& \mt{class} \; x & \textrm{abstract type class} \\
adamc@525 231 &&& \mt{class} \; x = c & \textrm{concrete type class} \\
adamc@525 232 \\
adamc@525 233 \textrm{Datatype constructors} & dc &::=& X & \textrm{nullary constructor} \\
adamc@525 234 &&& X \; \mt{of} \; \tau & \textrm{unary constructor} \\
adamc@524 235 \end{array}$$
adamc@524 236
adamc@526 237 \emph{Patterns} are used to describe structural conditions on expressions, such that expressions may be tested against patterns, generating assignments to pattern variables if successful.
adamc@526 238 $$\begin{array}{rrcll}
adamc@526 239 \textrm{Patterns} & p &::=& \_ & \textrm{wildcard} \\
adamc@526 240 &&& x & \textrm{variable} \\
adamc@526 241 &&& \ell & \textrm{constant} \\
adamc@526 242 &&& \hat{X} & \textrm{nullary constructor} \\
adamc@526 243 &&& \hat{X} \; p & \textrm{unary constructor} \\
adamc@526 244 &&& \{(x = p,)^*\} & \textrm{rigid record pattern} \\
adamc@526 245 &&& \{(x = p,)^+, \ldots\} & \textrm{flexible record pattern} \\
adamc@527 246 &&& (p) & \textrm{explicit precedence} \\
adamc@526 247 \\
adamc@529 248 \textrm{Qualified capitalized variables} & \hat{X} &::=& X & \textrm{not from a module} \\
adamc@526 249 &&& M.X & \textrm{projection from a module} \\
adamc@526 250 \end{array}$$
adamc@526 251
adamc@527 252 \emph{Expressions} are the main run-time entities, corresponding to both ``expressions'' and ``statements'' in mainstream imperative languages.
adamc@527 253 $$\begin{array}{rrcll}
adamc@527 254 \textrm{Expressions} & e &::=& e : \tau & \textrm{type annotation} \\
adamc@529 255 &&& \hat{x} & \textrm{variable} \\
adamc@529 256 &&& \hat{X} & \textrm{datatype constructor} \\
adamc@527 257 &&& \ell & \textrm{constant} \\
adamc@527 258 \\
adamc@527 259 &&& e \; e & \textrm{function application} \\
adamc@527 260 &&& \lambda x : \tau \Rightarrow e & \textrm{function abstraction} \\
adamc@527 261 &&& e [c] & \textrm{polymorphic function application} \\
adamc@527 262 &&& \lambda x \; ? \; \kappa \Rightarrow e & \textrm{polymorphic function abstraction} \\
adamc@527 263 \\
adamc@527 264 &&& \{(c = e,)^*\} & \textrm{known-length record} \\
adamc@527 265 &&& e.c & \textrm{record field projection} \\
adamc@527 266 &&& e \rc e & \textrm{record concatenation} \\
adamc@527 267 &&& e \rcut c & \textrm{removal of a single record field} \\
adamc@527 268 &&& e \rcutM c & \textrm{removal of multiple record fields} \\
adamc@527 269 &&& \mt{fold} & \textrm{fold over fields of a type-level record} \\
adamc@527 270 \\
adamc@527 271 &&& \mt{let} \; ed^* \; \mt{in} \; e \; \mt{end} & \textrm{local definitions} \\
adamc@527 272 \\
adamc@527 273 &&& \mt{case} \; e \; \mt{of} \; (p \Rightarrow e|)^+ & \textrm{pattern matching} \\
adamc@527 274 \\
adamc@527 275 &&& \lambda [c \sim c] \Rightarrow e & \textrm{guarded expression} \\
adamc@527 276 \\
adamc@527 277 &&& \_ & \textrm{wildcard} \\
adamc@527 278 &&& (e) & \textrm{explicit precedence} \\
adamc@527 279 \\
adamc@527 280 \textrm{Local declarations} & ed &::=& \cd{val} \; x : \tau = e & \textrm{non-recursive value} \\
adamc@527 281 &&& \cd{val} \; \cd{rec} \; (x : \tau = e \; \cd{and})^+ & \textrm{mutually-recursive values} \\
adamc@527 282 \end{array}$$
adamc@527 283
adamc@528 284 \emph{Declarations} primarily bring new symbols into context.
adamc@528 285 $$\begin{array}{rrcll}
adamc@528 286 \textrm{Declarations} & d &::=& \mt{con} \; x :: \kappa = c & \textrm{constructor synonym} \\
adamc@528 287 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529 288 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@528 289 &&& \mt{val} \; x : \tau = e & \textrm{value} \\
adamc@528 290 &&& \mt{val} \; \cd{rec} \; (x : \tau = e \; \mt{and})^+ & \textrm{mutually-recursive values} \\
adamc@528 291 &&& \mt{structure} \; X : S = M & \textrm{module definition} \\
adamc@528 292 &&& \mt{signature} \; X = S & \textrm{signature definition} \\
adamc@528 293 &&& \mt{open} \; M & \textrm{module inclusion} \\
adamc@528 294 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@528 295 &&& \mt{open} \; \mt{constraints} \; M & \textrm{inclusion of just the constraints from a module} \\
adamc@528 296 &&& \mt{table} \; x : c & \textrm{SQL table} \\
adamc@528 297 &&& \mt{sequence} \; x & \textrm{SQL sequence} \\
adamc@535 298 &&& \mt{cookie} \; x : \tau & \textrm{HTTP cookie} \\
adamc@528 299 &&& \mt{class} \; x = c & \textrm{concrete type class} \\
adamc@528 300 \\
adamc@529 301 \textrm{Modules} & M &::=& \mt{struct} \; d^* \; \mt{end} & \textrm{constant} \\
adamc@529 302 &&& X & \textrm{variable} \\
adamc@529 303 &&& M.X & \textrm{projection} \\
adamc@529 304 &&& M(M) & \textrm{functor application} \\
adamc@529 305 &&& \mt{functor}(X : S) : S = M & \textrm{functor abstraction} \\
adamc@528 306 \end{array}$$
adamc@528 307
adamc@528 308 There are two kinds of Ur files. A file named $M\texttt{.ur}$ is an \emph{implementation file}, and it should contain a sequence of declarations $d^*$. A file named $M\texttt{.urs}$ is an \emph{interface file}; it must always have a matching $M\texttt{.ur}$ and should contain a sequence of signature items $s^*$. When both files are present, the overall effect is the same as a monolithic declaration $\mt{structure} \; M : \mt{sig} \; s^* \; \mt{end} = \mt{struct} \; d^* \; \mt{end}$. When no interface file is included, the overall effect is similar, with a signature for module $M$ being inferred rather than just checked against an interface.
adamc@527 309
adamc@529 310 \subsection{Shorthands}
adamc@529 311
adamc@529 312 There are a variety of derived syntactic forms that elaborate into the core syntax from the last subsection. We will present the additional forms roughly following the order in which we presented the constructs that they elaborate into.
adamc@529 313
adamc@529 314 In many contexts where record fields are expected, like in a projection $e.c$, a constant field may be written as simply $X$, rather than $\#X$.
adamc@529 315
adamc@529 316 A record type may be written $\{(c = c,)^*\}$, which elaborates to $\$[(c = c,)^*]$.
adamc@529 317
adamc@533 318 The notation $[c_1, \ldots, c_n]$ is shorthand for $[c_1 = (), \ldots, c_n = ()]$.
adamc@533 319
adamc@529 320 A tuple type $(\tau_1, \ldots, \tau_n)$ expands to a record type $\{1 = \tau_1, \ldots, n = \tau_n\}$, with natural numbers as field names. A tuple pattern $(p_1, \ldots, p_n)$ expands to a rigid record pattern $\{1 = p_1, \ldots, n = p_n\}$. Positive natural numbers may be used in most places where field names would be allowed.
adamc@529 321
adamc@529 322 In general, several adjacent $\lambda$ forms may be combined into one, and kind and type annotations may be omitted, in which case they are implicitly included as wildcards. More formally, for constructor-level abstractions, we can define a new non-terminal $b ::= x \mid (x :: \kappa) \mid [c \sim c]$ and allow composite abstractions of the form $\lambda b^+ \Rightarrow c$, elaborating into the obvious sequence of one core $\lambda$ per element of $b^+$.
adamc@529 323
adamc@529 324 For any signature item or declaration that defines some entity to be equal to $A$ with classification annotation $B$ (e.g., $\mt{val} \; x : B = A$), $B$ and the preceding colon (or similar punctuation) may be omitted, in which case it is filled in as a wildcard.
adamc@529 325
adamc@529 326 A signature item or declaration $\mt{type} \; x$ or $\mt{type} \; x = \tau$ is elaborated into $\mt{con} \; x :: \mt{Type}$ or $\mt{con} \; x :: \mt{Type} = \tau$, respectively.
adamc@529 327
adamc@529 328 A signature item or declaration $\mt{class} \; x = \lambda y :: \mt{Type} \Rightarrow c$ may be abbreviated $\mt{class} \; x \; y = c$.
adamc@529 329
adamc@529 330 Handling of implicit and explicit constructor arguments may be tweaked with some prefixes to variable references. An expression $@x$ is a version of $x$ where all implicit constructor arguments have been made explicit. An expression $@@x$ achieves the same effect, additionally halting automatic resolution of type class instances. The same syntax works for variables projected out of modules and for capitalized variables (datatype constructors).
adamc@529 331
adamc@529 332 At the expression level, an analogue is available of the composite $\lambda$ form for constructors. We define the language of binders as $b ::= x \mid (x : \tau) \mid (x \; ? \; \kappa) \mid [c \sim c]$. A lone variable $x$ as a binder stands for an expression variable of unspecified type.
adamc@529 333
adamc@529 334 A $\mt{val}$ or $\mt{val} \; \mt{rec}$ declaration may include expression binders before the equal sign, following the binder grammar from the last paragraph. Such declarations are elaborated into versions that add additional $\lambda$s to the fronts of the righthand sides, as appropriate. The keyword $\mt{fun}$ is a synonym for $\mt{val} \; \mt{rec}$.
adamc@529 335
adamc@529 336 A signature item $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2$. A declaration $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2 = M$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2 = \mt{functor}(X_2 : S_1) : S_2 = M$.
adamc@529 337
adamc@529 338 A declaration $\mt{table} \; x : \{(c = c,)^*\}$ is elaborated into $\mt{table} \; x : [(c = c,)^*]$
adamc@529 339
adamc@529 340 The syntax $\mt{where} \; \mt{type}$ is an alternate form of $\mt{where} \; \mt{con}$.
adamc@529 341
adamc@529 342 The syntax $\mt{if} \; e \; \mt{then} \; e_1 \; \mt{else} \; e_2$ expands to $\mt{case} \; e \; \mt{of} \; \mt{Basis}.\mt{True} \Rightarrow e_1 \mid \mt{Basis}.\mt{False} \Rightarrow e_2$.
adamc@529 343
adamc@529 344 There are infix operator syntaxes for a number of functions defined in the $\mt{Basis}$ module. There is $=$ for $\mt{eq}$, $\neq$ for $\mt{neq}$, $-$ for $\mt{neg}$ (as a prefix operator) and $\mt{minus}$, $+$ for $\mt{plus}$, $\times$ for $\mt{times}$, $/$ for $\mt{div}$, $\%$ for $\mt{mod}$, $<$ for $\mt{lt}$, $\leq$ for $\mt{le}$, $>$ for $\mt{gt}$, and $\geq$ for $\mt{ge}$.
adamc@529 345
adamc@529 346 A signature item $\mt{table} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_table} \; c$. $\mt{sequence} \; x$ is short for $\mt{val} \; x : \mt{Basis}.\mt{sql\_sequence}$, and $\mt{cookie} \; x : \tau$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{http\_cookie} \; \tau$.
adamc@529 347
adamc@530 348
adamc@530 349 \section{Static Semantics}
adamc@530 350
adamc@530 351 In this section, we give a declarative presentation of Ur's typing rules and related judgments. Inference is the subject of the next section; here, we assume that an oracle has filled in all wildcards with concrete values.
adamc@530 352
adamc@530 353 Since there is significant mutual recursion among the judgments, we introduce them all before beginning to give rules. We use the same variety of contexts throughout this section, implicitly introducing new sorts of context entries as needed.
adamc@530 354 \begin{itemize}
adamc@530 355 \item $\Gamma \vdash c :: \kappa$ assigns a kind to a constructor in a context.
adamc@530 356 \item $\Gamma \vdash c \sim c$ proves the disjointness of two record constructors; that is, that they share no field names. We overload the judgment to apply to pairs of field names as well.
adamc@531 357 \item $\Gamma \vdash c \hookrightarrow C$ proves that record constructor $c$ decomposes into set $C$ of field names and record constructors.
adamc@530 358 \item $\Gamma \vdash c \equiv c$ proves the computational equivalence of two constructors. This is often called a \emph{definitional equality} in the world of type theory.
adamc@530 359 \item $\Gamma \vdash e : \tau$ is a standard typing judgment.
adamc@534 360 \item $\Gamma \vdash p \leadsto \Gamma; \tau$ combines typing of patterns with calculation of which new variables they bind.
adamc@537 361 \item $\Gamma \vdash d \leadsto \Gamma$ expresses how a declaration modifies a context. We overload this judgment to apply to sequences of declarations, as well as to signature items and sequences of signature items.
adamc@537 362 \item $\Gamma \vdash S \equiv S$ is the signature equivalence judgment.
adamc@536 363 \item $\Gamma \vdash S \leq S$ is the signature compatibility judgment. We write $\Gamma \vdash S$ as shorthand for $\Gamma \vdash S \leq S$.
adamc@530 364 \item $\Gamma \vdash M : S$ is the module signature checking judgment.
adamc@537 365 \item $\mt{proj}(M, \overline{s}, V)$ is a partial function for projecting a signature item from $\overline{s}$, given the module $M$ that we project from. $V$ may be $\mt{con} \; x$, $\mt{datatype} \; x$, $\mt{val} \; x$, $\mt{signature} \; X$, or $\mt{structure} \; X$. The parameter $M$ is needed because the projected signature item may refer to other items from $\overline{s}$.
adamc@539 366 \item $\mt{selfify}(M, \overline{s})$ adds information to signature items $\overline{s}$ to reflect the fact that we are concerned with the particular module $M$. This function is overloaded to work over individual signature items as well.
adamc@530 367 \end{itemize}
adamc@530 368
adamc@530 369 \subsection{Kinding}
adamc@530 370
adamc@530 371 $$\infer{\Gamma \vdash (c) :: \kappa :: \kappa}{
adamc@530 372 \Gamma \vdash c :: \kappa
adamc@530 373 }
adamc@530 374 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530 375 x :: \kappa \in \Gamma
adamc@530 376 }
adamc@530 377 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530 378 x :: \kappa = c \in \Gamma
adamc@530 379 }$$
adamc@530 380
adamc@530 381 $$\infer{\Gamma \vdash M.x :: \kappa}{
adamc@537 382 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 383 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = \kappa
adamc@530 384 }
adamc@530 385 \quad \infer{\Gamma \vdash M.x :: \kappa}{
adamc@537 386 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 387 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@530 388 }$$
adamc@530 389
adamc@530 390 $$\infer{\Gamma \vdash \tau_1 \to \tau_2 :: \mt{Type}}{
adamc@530 391 \Gamma \vdash \tau_1 :: \mt{Type}
adamc@530 392 & \Gamma \vdash \tau_2 :: \mt{Type}
adamc@530 393 }
adamc@530 394 \quad \infer{\Gamma \vdash x \; ? \: \kappa \to \tau :: \mt{Type}}{
adamc@530 395 \Gamma, x :: \kappa \vdash \tau :: \mt{Type}
adamc@530 396 }
adamc@530 397 \quad \infer{\Gamma \vdash \$c :: \mt{Type}}{
adamc@530 398 \Gamma \vdash c :: \{\mt{Type}\}
adamc@530 399 }$$
adamc@530 400
adamc@530 401 $$\infer{\Gamma \vdash c_1 \; c_2 :: \kappa_2}{
adamc@530 402 \Gamma \vdash c_1 :: \kappa_1 \to \kappa_2
adamc@530 403 & \Gamma \vdash c_2 :: \kappa_1
adamc@530 404 }
adamc@530 405 \quad \infer{\Gamma \vdash \lambda x \; :: \; \kappa_1 \Rightarrow c :: \kappa_1 \to \kappa_2}{
adamc@530 406 \Gamma, x :: \kappa_1 \vdash c :: \kappa_2
adamc@530 407 }$$
adamc@530 408
adamc@530 409 $$\infer{\Gamma \vdash () :: \mt{Unit}}{}
adamc@530 410 \quad \infer{\Gamma \vdash \#X :: \mt{Name}}{}$$
adamc@530 411
adamc@530 412 $$\infer{\Gamma \vdash [\overline{c_i = c'_i}] :: \{\kappa\}}{
adamc@530 413 \forall i: \Gamma \vdash c_i : \mt{Name}
adamc@530 414 & \Gamma \vdash c'_i :: \kappa
adamc@530 415 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@530 416 }
adamc@530 417 \quad \infer{\Gamma \vdash c_1 \rc c_2 :: \{\kappa\}}{
adamc@530 418 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530 419 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@530 420 & \Gamma \vdash c_1 \sim c_2
adamc@530 421 }$$
adamc@530 422
adamc@530 423 $$\infer{\Gamma \vdash \mt{fold} :: ((\mt{Name} \to \kappa_1 \to \kappa_2 \to \kappa_2) \to \kappa_2 \to \{\kappa_1\} \to \kappa_2}{}$$
adamc@530 424
adamc@530 425 $$\infer{\Gamma \vdash (\overline c) :: (k_1 \times \ldots \times k_n)}{
adamc@530 426 \forall i: \Gamma \vdash c_i :: k_i
adamc@530 427 }
adamc@530 428 \quad \infer{\Gamma \vdash c.i :: k_i}{
adamc@530 429 \Gamma \vdash c :: (k_1 \times \ldots \times k_n)
adamc@530 430 }$$
adamc@530 431
adamc@530 432 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow c :: \kappa}{
adamc@530 433 \Gamma \vdash c_1 :: \{\kappa'\}
adamc@530 434 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@530 435 & \Gamma, c_1 \sim c_2 \vdash c :: \kappa
adamc@530 436 }$$
adamc@530 437
adamc@531 438 \subsection{Record Disjointness}
adamc@531 439
adamc@531 440 We will use a keyword $\mt{map}$ as a shorthand, such that, for $f$ of kind $\kappa \to \kappa'$, $\mt{map} \; f$ stands for $\mt{fold} \; (\lambda (x_1 :: \mt{Name}) (x_2 :: \kappa) (x_3 :: \{\kappa'\}) \Rightarrow [x_1 = f \; x_2] \rc x_3) \; []$.
adamc@531 441
adamc@531 442 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531 443 \Gamma \vdash c_1 \hookrightarrow c'_1
adamc@531 444 & \Gamma \vdash c_2 \hookrightarrow c'_2
adamc@531 445 & \forall c''_1 \in c'_1, c''_2 \in c'_2: \Gamma \vdash c''_1 \sim c''_2
adamc@531 446 }
adamc@531 447 \quad \infer{\Gamma \vdash X \sim X'}{
adamc@531 448 X \neq X'
adamc@531 449 }$$
adamc@531 450
adamc@531 451 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531 452 c'_1 \sim c'_2 \in \Gamma
adamc@531 453 & \Gamma \vdash c'_1 \hookrightarrow c''_1
adamc@531 454 & \Gamma \vdash c'_2 \hookrightarrow c''_2
adamc@531 455 & c_1 \in c''_1
adamc@531 456 & c_2 \in c''_2
adamc@531 457 }$$
adamc@531 458
adamc@531 459 $$\infer{\Gamma \vdash c \hookrightarrow \{c\}}{}
adamc@531 460 \quad \infer{\Gamma \vdash [\overline{c = c'}] \hookrightarrow \{\overline{c}\}}{}
adamc@531 461 \quad \infer{\Gamma \vdash c_1 \rc c_2 \hookrightarrow C_1 \cup C_2}{
adamc@531 462 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@531 463 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@531 464 }
adamc@531 465 \quad \infer{\Gamma \vdash c \hookrightarrow C}{
adamc@531 466 \Gamma \vdash c \equiv c'
adamc@531 467 & \Gamma \vdash c' \hookrightarrow C
adamc@531 468 }
adamc@531 469 \quad \infer{\Gamma \vdash \mt{map} \; f \; c \hookrightarrow C}{
adamc@531 470 \Gamma \vdash c \hookrightarrow C
adamc@531 471 }$$
adamc@531 472
adamc@541 473 \subsection{\label{definitional}Definitional Equality}
adamc@532 474
adamc@532 475 We use $\mathcal C$ to stand for a one-hole context that, when filled, yields a constructor. The notation $\mathcal C[c]$ plugs $c$ into $\mathcal C$. We omit the standard definition of one-hole contexts. We write $[x \mapsto c_1]c_2$ for capture-avoiding substitution of $c_1$ for $x$ in $c_2$.
adamc@532 476
adamc@532 477 $$\infer{\Gamma \vdash c \equiv c}{}
adamc@532 478 \quad \infer{\Gamma \vdash c_1 \equiv c_2}{
adamc@532 479 \Gamma \vdash c_2 \equiv c_1
adamc@532 480 }
adamc@532 481 \quad \infer{\Gamma \vdash c_1 \equiv c_3}{
adamc@532 482 \Gamma \vdash c_1 \equiv c_2
adamc@532 483 & \Gamma \vdash c_2 \equiv c_3
adamc@532 484 }
adamc@532 485 \quad \infer{\Gamma \vdash \mathcal C[c_1] \equiv \mathcal C[c_2]}{
adamc@532 486 \Gamma \vdash c_1 \equiv c_2
adamc@532 487 }$$
adamc@532 488
adamc@532 489 $$\infer{\Gamma \vdash x \equiv c}{
adamc@532 490 x :: \kappa = c \in \Gamma
adamc@532 491 }
adamc@532 492 \quad \infer{\Gamma \vdash M.x \equiv c}{
adamc@537 493 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 494 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@532 495 }
adamc@532 496 \quad \infer{\Gamma \vdash (\overline c).i \equiv c_i}{}$$
adamc@532 497
adamc@532 498 $$\infer{\Gamma \vdash (\lambda x :: \kappa \Rightarrow c) \; c' \equiv [x \mapsto c'] c}{}
adamc@532 499 \quad \infer{\Gamma \vdash c_1 \rc c_2 \equiv c_2 \rc c_1}{}
adamc@532 500 \quad \infer{\Gamma \vdash c_1 \rc (c_2 \rc c_3) \equiv (c_1 \rc c_2) \rc c_3}{}$$
adamc@532 501
adamc@532 502 $$\infer{\Gamma \vdash [] \rc c \equiv c}{}
adamc@532 503 \quad \infer{\Gamma \vdash [\overline{c_1 = c'_1}] \rc [\overline{c_2 = c'_2}] \equiv [\overline{c_1 = c'_1}, \overline{c_2 = c'_2}]}{}$$
adamc@532 504
adamc@532 505 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow c \equiv c}{
adamc@532 506 \Gamma \vdash c_1 \sim c_2
adamc@532 507 }
adamc@532 508 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; [] \equiv i}{}
adamc@532 509 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; ([c_1 = c_2] \rc c) \equiv f \; c_1 \; c_2 \; (\mt{fold} \; f \; i \; c)}{}$$
adamc@532 510
adamc@532 511 $$\infer{\Gamma \vdash \mt{map} \; (\lambda x \Rightarrow x) \; c \equiv c}{}
adamc@532 512 \quad \infer{\Gamma \vdash \mt{fold} \; f \; i \; (\mt{map} \; f' \; c)
adamc@532 513 \equiv \mt{fold} \; (\lambda (x_1 :: \mt{Name}) (x_2 :: \kappa) \Rightarrow f \; x_1 \; (f' \; x_2)) \; i \; c}{}$$
adamc@532 514
adamc@532 515 $$\infer{\Gamma \vdash \mt{map} \; f \; (c_1 \rc c_2) \equiv \mt{map} \; f \; c_1 \rc \mt{map} \; f \; c_2}{}$$
adamc@531 516
adamc@534 517 \subsection{Expression Typing}
adamc@533 518
adamc@533 519 We assume the existence of a function $T$ assigning types to literal constants. It maps integer constants to $\mt{Basis}.\mt{int}$, float constants to $\mt{Basis}.\mt{float}$, and string constants to $\mt{Basis}.\mt{string}$.
adamc@533 520
adamc@533 521 We also refer to a function $\mathcal I$, such that $\mathcal I(\tau)$ ``uses an oracle'' to instantiate all constructor function arguments at the beginning of $\tau$ that are marked implicit; i.e., replace $x_1 ::: \kappa_1 \to \ldots \to x_n ::: \kappa_n \to \tau$ with $[x_1 \mapsto c_1]\ldots[x_n \mapsto c_n]\tau$, where the $c_i$s are inferred and $\tau$ does not start like $x ::: \kappa \to \tau'$.
adamc@533 522
adamc@533 523 $$\infer{\Gamma \vdash e : \tau : \tau}{
adamc@533 524 \Gamma \vdash e : \tau
adamc@533 525 }
adamc@533 526 \quad \infer{\Gamma \vdash e : \tau}{
adamc@533 527 \Gamma \vdash e : \tau'
adamc@533 528 & \Gamma \vdash \tau' \equiv \tau
adamc@533 529 }
adamc@533 530 \quad \infer{\Gamma \vdash \ell : T(\ell)}{}$$
adamc@533 531
adamc@533 532 $$\infer{\Gamma \vdash x : \mathcal I(\tau)}{
adamc@533 533 x : \tau \in \Gamma
adamc@533 534 }
adamc@533 535 \quad \infer{\Gamma \vdash M.x : \mathcal I(\tau)}{
adamc@537 536 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 537 & \mt{proj}(M, \overline{s}, \mt{val} \; x) = \tau
adamc@533 538 }
adamc@533 539 \quad \infer{\Gamma \vdash X : \mathcal I(\tau)}{
adamc@533 540 X : \tau \in \Gamma
adamc@533 541 }
adamc@533 542 \quad \infer{\Gamma \vdash M.X : \mathcal I(\tau)}{
adamc@537 543 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 544 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \tau
adamc@533 545 }$$
adamc@533 546
adamc@533 547 $$\infer{\Gamma \vdash e_1 \; e_2 : \tau_2}{
adamc@533 548 \Gamma \vdash e_1 : \tau_1 \to \tau_2
adamc@533 549 & \Gamma \vdash e_2 : \tau_1
adamc@533 550 }
adamc@533 551 \quad \infer{\Gamma \vdash \lambda x : \tau_1 \Rightarrow e : \tau_1 \to \tau_2}{
adamc@533 552 \Gamma, x : \tau_1 \vdash e : \tau_2
adamc@533 553 }$$
adamc@533 554
adamc@533 555 $$\infer{\Gamma \vdash e [c] : [x \mapsto c]\tau}{
adamc@533 556 \Gamma \vdash e : x :: \kappa \to \tau
adamc@533 557 & \Gamma \vdash c :: \kappa
adamc@533 558 }
adamc@533 559 \quad \infer{\Gamma \vdash \lambda x \; ? \; \kappa \Rightarrow e : x \; ? \; \kappa \to \tau}{
adamc@533 560 \Gamma, x :: \kappa \vdash e : \tau
adamc@533 561 }$$
adamc@533 562
adamc@533 563 $$\infer{\Gamma \vdash \{\overline{c = e}\} : \{\overline{c : \tau}\}}{
adamc@533 564 \forall i: \Gamma \vdash c_i :: \mt{Name}
adamc@533 565 & \Gamma \vdash e_i : \tau_i
adamc@533 566 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@533 567 }
adamc@533 568 \quad \infer{\Gamma \vdash e.c : \tau}{
adamc@533 569 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533 570 }
adamc@533 571 \quad \infer{\Gamma \vdash e_1 \rc e_2 : \$(c_1 \rc c_2)}{
adamc@533 572 \Gamma \vdash e_1 : \$c_1
adamc@533 573 & \Gamma \vdash e_2 : \$c_2
adamc@533 574 }$$
adamc@533 575
adamc@533 576 $$\infer{\Gamma \vdash e \rcut c : \$c'}{
adamc@533 577 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533 578 }
adamc@533 579 \quad \infer{\Gamma \vdash e \rcutM c : \$c'}{
adamc@533 580 \Gamma \vdash e : \$(c \rc c')
adamc@533 581 }$$
adamc@533 582
adamc@533 583 $$\infer{\Gamma \vdash \mt{fold} : \begin{array}{c}
adamc@533 584 x_1 :: (\{\kappa\} \to \tau)
adamc@533 585 \to (x_2 :: \mt{Name} \to x_3 :: \kappa \to x_4 :: \{\kappa\} \to \lambda [[x_2] \sim x_4]
adamc@533 586 \Rightarrow x_1 \; x_4 \to x_1 \; ([x_2 = x_3] \rc x_4)) \\
adamc@533 587 \to x_1 \; [] \to x_5 :: \{\kappa\} \to x_1 \; x_5
adamc@533 588 \end{array}}{}$$
adamc@533 589
adamc@533 590 $$\infer{\Gamma \vdash \mt{let} \; \overline{ed} \; \mt{in} \; e \; \mt{end} : \tau}{
adamc@533 591 \Gamma \vdash \overline{ed} \leadsto \Gamma'
adamc@533 592 & \Gamma' \vdash e : \tau
adamc@533 593 }
adamc@533 594 \quad \infer{\Gamma \vdash \mt{case} \; e \; \mt{of} \; \overline{p \Rightarrow e} : \tau}{
adamc@533 595 \forall i: \Gamma \vdash p_i \leadsto \Gamma_i, \tau'
adamc@533 596 & \Gamma_i \vdash e_i : \tau
adamc@533 597 }$$
adamc@533 598
adamc@533 599 $$\infer{\Gamma \vdash [c_1 \sim c_2] \Rightarrow e : [c_1 \sim c_2] \Rightarrow \tau}{
adamc@533 600 \Gamma \vdash c_1 :: \{\kappa\}
adamc@533 601 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@533 602 & \Gamma, c_1 \sim c_2 \vdash e : \tau
adamc@533 603 }$$
adamc@533 604
adamc@534 605 \subsection{Pattern Typing}
adamc@534 606
adamc@534 607 $$\infer{\Gamma \vdash \_ \leadsto \Gamma; \tau}{}
adamc@534 608 \quad \infer{\Gamma \vdash x \leadsto \Gamma, x : \tau; \tau}{}
adamc@534 609 \quad \infer{\Gamma \vdash \ell \leadsto \Gamma; T(\ell)}{}$$
adamc@534 610
adamc@534 611 $$\infer{\Gamma \vdash X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534 612 X : \overline{x ::: \mt{Type}} \to \tau \in \Gamma
adamc@534 613 & \textrm{$\tau$ not a function type}
adamc@534 614 }
adamc@534 615 \quad \infer{\Gamma \vdash X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534 616 X : \overline{x ::: \mt{Type}} \to \tau'' \to \tau \in \Gamma
adamc@534 617 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534 618 }$$
adamc@534 619
adamc@534 620 $$\infer{\Gamma \vdash M.X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537 621 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 622 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau
adamc@534 623 & \textrm{$\tau$ not a function type}
adamc@534 624 }$$
adamc@534 625
adamc@534 626 $$\infer{\Gamma \vdash M.X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537 627 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 628 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau'' \to \tau
adamc@534 629 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534 630 }$$
adamc@534 631
adamc@534 632 $$\infer{\Gamma \vdash \{\overline{x = p}\} \leadsto \Gamma_n; \{\overline{x = \tau}\}}{
adamc@534 633 \Gamma_0 = \Gamma
adamc@534 634 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534 635 }
adamc@534 636 \quad \infer{\Gamma \vdash \{\overline{x = p}, \ldots\} \leadsto \Gamma_n; \$([\overline{x = \tau}] \rc c)}{
adamc@534 637 \Gamma_0 = \Gamma
adamc@534 638 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534 639 }$$
adamc@534 640
adamc@535 641 \subsection{Declaration Typing}
adamc@535 642
adamc@535 643 We use an auxiliary judgment $\overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'$, expressing the enrichment of $\Gamma$ with the types of the datatype constructors $\overline{dc}$, when they are known to belong to datatype $x$ with type parameters $\overline{y}$.
adamc@535 644
adamc@541 645 This is the first judgment where we deal with type classes, for the $\mt{class}$ declaration form. We will omit their special handling in this formal specification. Section \ref{typeclasses} gives an informal description of how type classes influence type inference.
adamc@535 646
adamc@537 647 We presuppose the existence of a function $\mathcal O$, where $\mathcal(M, \overline{s})$ implements the $\mt{open}$ declaration by producing a context with the appropriate entry for each available component of module $M$ with signature items $\overline{s}$. Where possible, $\mathcal O$ uses ``transparent'' entries (e.g., an abstract type $M.x$ is mapped to $x :: \mt{Type} = M.x$), so that the relationship with $M$ is maintained. A related function $\mathcal O_c$ builds a context containing the disjointness constraints found in $S$.
adamc@537 648
adamc@537 649 We write $\kappa_1^n \to \kappa$ as a shorthand, where $\kappa_1^0 \to \kappa = \kappa$ and $\kappa_1^{n+1} \to \kappa_2 = \kappa_1 \to (\kappa_1^n \to \kappa_2)$. We write $\mt{len}(\overline{y})$ for the length of vector $\overline{y}$ of variables.
adamc@535 650
adamc@535 651 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535 652 \quad \infer{\Gamma \vdash d, \overline{d} \leadsto \Gamma''}{
adamc@535 653 \Gamma \vdash d \leadsto \Gamma'
adamc@535 654 & \Gamma' \vdash \overline{d} \leadsto \Gamma''
adamc@535 655 }$$
adamc@535 656
adamc@535 657 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@535 658 \Gamma \vdash c :: \kappa
adamc@535 659 }
adamc@535 660 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@535 661 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@535 662 }$$
adamc@535 663
adamc@535 664 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537 665 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 666 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@535 667 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@535 668 }$$
adamc@535 669
adamc@535 670 $$\infer{\Gamma \vdash \mt{val} \; x : \tau = e \leadsto \Gamma, x : \tau}{
adamc@535 671 \Gamma \vdash e : \tau
adamc@535 672 }$$
adamc@535 673
adamc@535 674 $$\infer{\Gamma \vdash \mt{val} \; \mt{rec} \; \overline{x : \tau = e} \leadsto \Gamma, \overline{x : \tau}}{
adamc@535 675 \forall i: \Gamma, \overline{x : \tau} \vdash e_i : \tau_i
adamc@535 676 & \textrm{$e_i$ starts with an expression $\lambda$, optionally preceded by constructor and disjointness $\lambda$s}
adamc@535 677 }$$
adamc@535 678
adamc@535 679 $$\infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : S}{
adamc@535 680 \Gamma \vdash M : S
adamc@539 681 & \textrm{ ($M$ not a $\mt{struct} \; \ldots \; \mt{end}$)}
adamc@535 682 }
adamc@539 683 \quad \infer{\Gamma \vdash \mt{structure} \; X : S = \mt{struct} \; \overline{d} \; \mt{end} \leadsto \Gamma, X : \mt{selfify}(X, \overline{s})}{
adamc@539 684 \Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \overline{s} \; \mt{end}
adamc@539 685 }$$
adamc@539 686
adamc@539 687 $$\infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@535 688 \Gamma \vdash S
adamc@535 689 }$$
adamc@535 690
adamc@537 691 $$\infer{\Gamma \vdash \mt{open} \; M \leadsto \Gamma, \mathcal O(M, \overline{s})}{
adamc@537 692 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535 693 }$$
adamc@535 694
adamc@535 695 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma}{
adamc@535 696 \Gamma \vdash c_1 :: \{\kappa\}
adamc@535 697 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@535 698 & \Gamma \vdash c_1 \sim c_2
adamc@535 699 }
adamc@537 700 \quad \infer{\Gamma \vdash \mt{open} \; \mt{constraints} \; M \leadsto \Gamma, \mathcal O_c(M, \overline{s})}{
adamc@537 701 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535 702 }$$
adamc@535 703
adamc@535 704 $$\infer{\Gamma \vdash \mt{table} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_table} \; c}{
adamc@535 705 \Gamma \vdash c :: \{\mt{Type}\}
adamc@535 706 }
adamc@535 707 \quad \infer{\Gamma \vdash \mt{sequence} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_sequence}}{}$$
adamc@535 708
adamc@535 709 $$\infer{\Gamma \vdash \mt{cookie} \; x : \tau \leadsto \Gamma, x : \mt{Basis}.\mt{http\_cookie} \; \tau}{
adamc@535 710 \Gamma \vdash \tau :: \mt{Type}
adamc@535 711 }$$
adamc@535 712
adamc@535 713 $$\infer{\Gamma \vdash \mt{class} \; x = c \leadsto \Gamma, x :: \mt{Type} \to \mt{Type} = c}{
adamc@535 714 \Gamma \vdash c :: \mt{Type} \to \mt{Type}
adamc@535 715 }$$
adamc@535 716
adamc@535 717 $$\infer{\overline{y}; x; \Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535 718 \quad \infer{\overline{y}; x; \Gamma \vdash X \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to x \; \overline{y}}{
adamc@535 719 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535 720 }
adamc@535 721 \quad \infer{\overline{y}; x; \Gamma \vdash X \; \mt{of} \; \tau \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to \tau \to x \; \overline{y}}{
adamc@535 722 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535 723 }$$
adamc@535 724
adamc@537 725 \subsection{Signature Item Typing}
adamc@537 726
adamc@537 727 We appeal to a signature item analogue of the $\mathcal O$ function from the last subsection.
adamc@537 728
adamc@537 729 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@537 730 \quad \infer{\Gamma \vdash s, \overline{s} \leadsto \Gamma''}{
adamc@537 731 \Gamma \vdash s \leadsto \Gamma'
adamc@537 732 & \Gamma' \vdash \overline{s} \leadsto \Gamma''
adamc@537 733 }$$
adamc@537 734
adamc@537 735 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}
adamc@537 736 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@537 737 \Gamma \vdash c :: \kappa
adamc@537 738 }
adamc@537 739 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@537 740 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@537 741 }$$
adamc@537 742
adamc@537 743 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537 744 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 745 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 746 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@537 747 }$$
adamc@537 748
adamc@537 749 $$\infer{\Gamma \vdash \mt{val} \; x : \tau \leadsto \Gamma, x : \tau}{
adamc@537 750 \Gamma \vdash \tau :: \mt{Type}
adamc@537 751 }$$
adamc@537 752
adamc@537 753 $$\infer{\Gamma \vdash \mt{structure} \; X : S \leadsto \Gamma, X : S}{
adamc@537 754 \Gamma \vdash S
adamc@537 755 }
adamc@537 756 \quad \infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@537 757 \Gamma \vdash S
adamc@537 758 }$$
adamc@537 759
adamc@537 760 $$\infer{\Gamma \vdash \mt{include} \; S \leadsto \Gamma, \mathcal O(\overline{s})}{
adamc@537 761 \Gamma \vdash S
adamc@537 762 & \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 763 }$$
adamc@537 764
adamc@537 765 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma, c_1 \sim c_2}{
adamc@537 766 \Gamma \vdash c_1 :: \{\kappa\}
adamc@537 767 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@537 768 }$$
adamc@537 769
adamc@537 770 $$\infer{\Gamma \vdash \mt{class} \; x = c \leadsto \Gamma, x :: \mt{Type} \to \mt{Type} = c}{
adamc@537 771 \Gamma \vdash c :: \mt{Type} \to \mt{Type}
adamc@537 772 }
adamc@537 773 \quad \infer{\Gamma \vdash \mt{class} \; x \leadsto \Gamma, x :: \mt{Type} \to \mt{Type}}{}$$
adamc@537 774
adamc@536 775 \subsection{Signature Compatibility}
adamc@536 776
adamc@537 777 To simplify the judgments in this section, we assume that all signatures are alpha-varied as necessary to avoid including mmultiple bindings for the same identifier. This is in addition to the usual alpha-variation of locally-bound variables.
adamc@537 778
adamc@537 779 We rely on a judgment $\Gamma \vdash \overline{s} \leq s'$, which expresses the occurrence in signature items $\overline{s}$ of an item compatible with $s'$. We also use a judgment $\Gamma \vdash \overline{dc} \leq \overline{dc}$, which expresses compatibility of datatype definitions.
adamc@537 780
adamc@536 781 $$\infer{\Gamma \vdash S \equiv S}{}
adamc@536 782 \quad \infer{\Gamma \vdash S_1 \equiv S_2}{
adamc@536 783 \Gamma \vdash S_2 \equiv S_1
adamc@536 784 }
adamc@536 785 \quad \infer{\Gamma \vdash X \equiv S}{
adamc@536 786 X = S \in \Gamma
adamc@536 787 }
adamc@536 788 \quad \infer{\Gamma \vdash M.X \equiv S}{
adamc@537 789 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 790 & \mt{proj}(M, \overline{s}, \mt{signature} \; X) = S
adamc@536 791 }$$
adamc@536 792
adamc@536 793 $$\infer{\Gamma \vdash S \; \mt{where} \; \mt{con} \; x = c \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa = c \; \overline{s_2} \; \mt{end}}{
adamc@536 794 \Gamma \vdash S \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa \; \overline{s_2} \; \mt{end}
adamc@536 795 & \Gamma \vdash c :: \kappa
adamc@537 796 }
adamc@537 797 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s^1} \; \mt{include} \; S \; \overline{s^2} \; \mt{end} \equiv \mt{sig} \; \overline{s^1} \; \overline{s} \; \overline{s^2} \; \mt{end}}{
adamc@537 798 \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@536 799 }$$
adamc@536 800
adamc@536 801 $$\infer{\Gamma \vdash S_1 \leq S_2}{
adamc@536 802 \Gamma \vdash S_1 \equiv S_2
adamc@536 803 }
adamc@536 804 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \mt{end}}{}
adamc@537 805 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; s' \; \overline{s'} \; \mt{end}}{
adamc@537 806 \Gamma \vdash \overline{s} \leq s'
adamc@537 807 & \Gamma \vdash s' \leadsto \Gamma'
adamc@537 808 & \Gamma' \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \overline{s'} \; \mt{end}
adamc@537 809 }$$
adamc@537 810
adamc@537 811 $$\infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537 812 \Gamma \vdash s \leq s'
adamc@537 813 }
adamc@537 814 \quad \infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537 815 \Gamma \vdash s \leadsto \Gamma'
adamc@537 816 & \Gamma' \vdash \overline{s} \leq s'
adamc@536 817 }$$
adamc@536 818
adamc@536 819 $$\infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 \leq \mt{functor} (X : S'_1) : S'_2}{
adamc@536 820 \Gamma \vdash S'_1 \leq S_1
adamc@536 821 & \Gamma, X : S'_1 \vdash S_2 \leq S'_2
adamc@536 822 }$$
adamc@536 823
adamc@537 824 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@537 825 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}
adamc@537 826 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{con} \; x :: \mt{Type}}{}$$
adamc@537 827
adamc@537 828 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(y)} \to \mt{Type}}{
adamc@537 829 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 830 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 831 }$$
adamc@537 832
adamc@537 833 $$\infer{\Gamma \vdash \mt{class} \; x \leq \mt{con} \; x :: \mt{Type} \to \mt{Type}}{}
adamc@537 834 \quad \infer{\Gamma \vdash \mt{class} \; x = c \leq \mt{con} \; x :: \mt{Type} \to \mt{Type}}{}$$
adamc@537 835
adamc@537 836 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \mt{\kappa} = c_2}{
adamc@537 837 \Gamma \vdash c_1 \equiv c_2
adamc@537 838 }
adamc@537 839 \quad \infer{\Gamma \vdash \mt{class} \; x = c_1 \leq \mt{con} \; x :: \mt{Type} \to \mt{Type} = c_2}{
adamc@537 840 \Gamma \vdash c_1 \equiv c_2
adamc@537 841 }$$
adamc@537 842
adamc@537 843 $$\infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537 844 \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537 845 }$$
adamc@537 846
adamc@537 847 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537 848 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537 849 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537 850 & \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537 851 }$$
adamc@537 852
adamc@537 853 $$\infer{\Gamma \vdash \cdot \leq \cdot}{}
adamc@537 854 \quad \infer{\Gamma \vdash X; \overline{dc} \leq X; \overline{dc'}}{
adamc@537 855 \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537 856 }
adamc@537 857 \quad \infer{\Gamma \vdash X \; \mt{of} \; \tau_1; \overline{dc} \leq X \; \mt{of} \; \tau_2; \overline{dc'}}{
adamc@537 858 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537 859 & \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537 860 }$$
adamc@537 861
adamc@537 862 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x = \mt{datatype} \; M'.z'}{
adamc@537 863 \Gamma \vdash M.z \equiv M'.z'
adamc@537 864 }$$
adamc@537 865
adamc@537 866 $$\infer{\Gamma \vdash \mt{val} \; x : \tau_1 \leq \mt{val} \; x : \tau_2}{
adamc@537 867 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537 868 }
adamc@537 869 \quad \infer{\Gamma \vdash \mt{structure} \; X : S_1 \leq \mt{structure} \; X : S_2}{
adamc@537 870 \Gamma \vdash S_1 \leq S_2
adamc@537 871 }
adamc@537 872 \quad \infer{\Gamma \vdash \mt{signature} \; X = S_1 \leq \mt{signature} \; X = S_2}{
adamc@537 873 \Gamma \vdash S_1 \leq S_2
adamc@537 874 & \Gamma \vdash S_2 \leq S_1
adamc@537 875 }$$
adamc@537 876
adamc@537 877 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leq \mt{constraint} \; c'_1 \sim c'_2}{
adamc@537 878 \Gamma \vdash c_1 \equiv c'_1
adamc@537 879 & \Gamma \vdash c_2 \equiv c'_2
adamc@537 880 }$$
adamc@537 881
adamc@537 882 $$\infer{\Gamma \vdash \mt{class} \; x \leq \mt{class} \; x}{}
adamc@537 883 \quad \infer{\Gamma \vdash \mt{class} \; x = c \leq \mt{class} \; x}{}
adamc@537 884 \quad \infer{\Gamma \vdash \mt{class} \; x = c_1 \leq \mt{class} \; x = c_2}{
adamc@537 885 \Gamma \vdash c_1 \equiv c_2
adamc@537 886 }$$
adamc@537 887
adamc@538 888 \subsection{Module Typing}
adamc@538 889
adamc@538 890 We use a helper function $\mt{sigOf}$, which converts declarations and sequences of declarations into their principal signature items and sequences of signature items, respectively.
adamc@538 891
adamc@538 892 $$\infer{\Gamma \vdash M : S}{
adamc@538 893 \Gamma \vdash M : S'
adamc@538 894 & \Gamma \vdash S' \leq S
adamc@538 895 }
adamc@538 896 \quad \infer{\Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \mt{sigOf}(\overline{d}) \; \mt{end}}{
adamc@538 897 \Gamma \vdash \overline{d} \leadsto \Gamma'
adamc@538 898 }
adamc@538 899 \quad \infer{\Gamma \vdash X : S}{
adamc@538 900 X : S \in \Gamma
adamc@538 901 }$$
adamc@538 902
adamc@538 903 $$\infer{\Gamma \vdash M.X : S}{
adamc@538 904 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@538 905 & \mt{proj}(M, \overline{s}, \mt{structure} \; X) = S
adamc@538 906 }$$
adamc@538 907
adamc@538 908 $$\infer{\Gamma \vdash M_1(M_2) : [X \mapsto M_2]S_2}{
adamc@538 909 \Gamma \vdash M_1 : \mt{functor}(X : S_1) : S_2
adamc@538 910 & \Gamma \vdash M_2 : S_1
adamc@538 911 }
adamc@538 912 \quad \infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 = M : \mt{functor} (X : S_1) : S_2}{
adamc@538 913 \Gamma \vdash S_1
adamc@538 914 & \Gamma, X : S_1 \vdash S_2
adamc@538 915 & \Gamma, X : S_1 \vdash M : S_2
adamc@538 916 }$$
adamc@538 917
adamc@538 918 \begin{eqnarray*}
adamc@538 919 \mt{sigOf}(\cdot) &=& \cdot \\
adamc@538 920 \mt{sigOf}(s \; \overline{s'}) &=& \mt{sigOf}(s) \; \mt{sigOf}(\overline{s'}) \\
adamc@538 921 \\
adamc@538 922 \mt{sigOf}(\mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@538 923 \mt{sigOf}(\mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \overline{dc} \\
adamc@538 924 \mt{sigOf}(\mt{datatype} \; x = \mt{datatype} \; M.z) &=& \mt{datatype} \; x = \mt{datatype} \; M.z \\
adamc@538 925 \mt{sigOf}(\mt{val} \; x : \tau = e) &=& \mt{val} \; x : \tau \\
adamc@538 926 \mt{sigOf}(\mt{val} \; \mt{rec} \; \overline{x : \tau = e}) &=& \overline{\mt{val} \; x : \tau} \\
adamc@538 927 \mt{sigOf}(\mt{structure} \; X : S = M) &=& \mt{structure} \; X : S \\
adamc@538 928 \mt{sigOf}(\mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@538 929 \mt{sigOf}(\mt{open} \; M) &=& \mt{include} \; S \textrm{ (where $\Gamma \vdash M : S$)} \\
adamc@538 930 \mt{sigOf}(\mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@538 931 \mt{sigOf}(\mt{open} \; \mt{constraints} \; M) &=& \cdot \\
adamc@538 932 \mt{sigOf}(\mt{table} \; x : c) &=& \mt{table} \; x : c \\
adamc@538 933 \mt{sigOf}(\mt{sequence} \; x) &=& \mt{sequence} \; x \\
adamc@538 934 \mt{sigOf}(\mt{cookie} \; x : \tau) &=& \mt{cookie} \; x : \tau \\
adamc@538 935 \mt{sigOf}(\mt{class} \; x = c) &=& \mt{class} \; x = c \\
adamc@538 936 \end{eqnarray*}
adamc@538 937
adamc@539 938 \begin{eqnarray*}
adamc@539 939 \mt{selfify}(M, \cdot) &=& \cdot \\
adamc@539 940 \mt{selfify}(M, s \; \overline{s'}) &=& \mt{selfify}(M, \sigma, s) \; \mt{selfify}(M, \overline{s'}) \\
adamc@539 941 \\
adamc@539 942 \mt{selfify}(M, \mt{con} \; x :: \kappa) &=& \mt{con} \; x :: \kappa = M.x \\
adamc@539 943 \mt{selfify}(M, \mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@539 944 \mt{selfify}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \mt{datatype} \; M.x \\
adamc@539 945 \mt{selfify}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z) &=& \mt{datatype} \; x = \mt{datatype} \; M'.z \\
adamc@539 946 \mt{selfify}(M, \mt{val} \; x : \tau) &=& \mt{val} \; x : \tau \\
adamc@539 947 \mt{selfify}(M, \mt{structure} \; X : S) &=& \mt{structure} \; X : \mt{selfify}(M.X, \overline{s}) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}$)} \\
adamc@539 948 \mt{selfify}(M, \mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@539 949 \mt{selfify}(M, \mt{include} \; S) &=& \mt{include} \; S \\
adamc@539 950 \mt{selfify}(M, \mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@539 951 \mt{selfify}(M, \mt{class} \; x) &=& \mt{class} \; x = M.x \\
adamc@539 952 \mt{selfify}(M, \mt{class} \; x = c) &=& \mt{class} \; x = c \\
adamc@539 953 \end{eqnarray*}
adamc@539 954
adamc@540 955 \subsection{Module Projection}
adamc@540 956
adamc@540 957 \begin{eqnarray*}
adamc@540 958 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \\
adamc@540 959 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa, c) \\
adamc@540 960 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{con} \; x) &=& \mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type} \\
adamc@540 961 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& (\mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type}, M'.z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 962 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})$)} \\
adamc@540 963 \mt{proj}(M, \mt{class} \; x \; \overline{s}, \mt{con} \; x) &=& \mt{Type} \to \mt{Type} \\
adamc@540 964 \mt{proj}(M, \mt{class} \; x = c \; \overline{s}, \mt{con} \; x) &=& (\mt{Type} \to \mt{Type}, c) \\
adamc@540 965 \\
adamc@540 966 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{datatype} \; x) &=& (\overline{y}, \overline{dc}) \\
adamc@540 967 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& \mt{proj}(M', \overline{s'}, \mt{datatype} \; z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540 968 \\
adamc@540 969 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, \mt{val} \; x) &=& \tau \\
adamc@540 970 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $X \in \overline{dc}$)} \\
adamc@540 971 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540 972 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 973 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \in \overline{dc}$)} \\
adamc@540 974 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540 975 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X : \tau \in \overline{dc}$)} \\
adamc@540 976 \\
adamc@540 977 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, \mt{structure} \; X) &=& S \\
adamc@540 978 \\
adamc@540 979 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, \mt{signature} \; X) &=& S \\
adamc@540 980 \\
adamc@540 981 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 982 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 983 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 984 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 985 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540 986 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540 987 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540 988 \mt{proj}(M, \mt{include} \; S \; \overline{s}, V) &=& \mt{proj}(M, \overline{s'} \; \overline{s}, V) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540 989 \mt{proj}(M, \mt{constraint} \; c_1 \sim c_2 \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540 990 \mt{proj}(M, \mt{class} \; x \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 991 \mt{proj}(M, \mt{class} \; x = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540 992 \end{eqnarray*}
adamc@540 993
adamc@541 994
adamc@541 995 \section{Type Inference}
adamc@541 996
adamc@541 997 The Ur/Web compiler uses \emph{heuristic type inference}, with no claims of completeness with respect to the declarative specification of the last section. The rules in use seem to work well in practice. This section summarizes those rules, to help Ur programmers predict what will work and what won't.
adamc@541 998
adamc@541 999 \subsection{Basic Unification}
adamc@541 1000
adamc@541 1001 Type-checkers for languages based on the Hindly-Milner type discipline, like ML and Haskell, take advantage of \emph{principal typing} properties, making complete type inference relatively straightforward. Inference algorithms are traditionally implemented using type unification variables, at various points asserting equalities between types, in the process discovering the values of type variables. The Ur/Web compiler uses the same basic strategy, but the complexity of the type system rules out easy completeness.
adamc@541 1002
adamc@541 1003 Type-checking can require evaluating recursive functional programs, thanks to the type-level $\mt{fold}$ operator. When a unification variable appears in such a type, the next step of computation can be undetermined. The value of that variable might be determined later, but this would be ``too late'' for the unification problems generated at the first occurrence. This is the essential source of incompletness.
adamc@541 1004
adamc@541 1005 Nonetheless, the unification engine tends to do reasonably well. Unlike in ML, polymorphism is never inferred in definitions; it must be indicated explicitly by writing out constructor-level parameters. By writing these and other annotations, the programmer can generally get the type inference engine to do most of the type reconstruction work.
adamc@541 1006
adamc@541 1007 \subsection{Unifying Record Types}
adamc@541 1008
adamc@541 1009 The type inference engine tries to take advantage of the algebraic rules governing type-level records, as shown in Section \ref{definitional}. When two constructors of record kind are unified, they are reduce to normal forms, with like terms crossed off from each normal form until, hopefully, nothing remains. This cannot be complete, with the inclusion of unification variables. The type-checker can help you understand what goes wrong when the process fails, as it outputs the unmatched remainders of the two normal forms.
adamc@541 1010
adamc@541 1011 \subsection{\label{typeclasses}Type Classes}
adamc@541 1012
adamc@541 1013 Ur includes a type class facility inspired by Haskell's. The current version is very rudimentary, only supporting instances for particular types built up from abstract types and datatypes and type-level application.
adamc@541 1014
adamc@541 1015 Type classes are integrated with the module system. A type class is just a constructor of kind $\mt{Type} \to \mt{Type}$. By marking such a constructor $c$ as a type class, the programmer instructs the type inference engine to, in each scope, record all values of types $c \; \tau$ as \emph{instances}. Any function argument whose type is of such a form is treated as implicit, to be determined by examining the current instance database.
adamc@541 1016
adamc@541 1017 The ``dictionary encoding'' often used in Haskell implementations is made explicit in Ur. Type class instances are just properly-typed values, and they can also be considered as ``proofs'' of membership in the class. In some cases, it is useful to pass these proofs around explicitly. An underscore written where a proof is expected will also be inferred, if possible, from the current instance database.
adamc@541 1018
adamc@541 1019 Just as for types, type classes may be exported from modules, and they may be exported as concrete or abstract. Concrete type classes have their ``real'' definitions exposed, so that client code may add new instances freely. Abstract type classes are useful as ``predicates'' that can be used to enforce invariants, as we will see in some definitions of SQL syntax in the Ur/Web standard library.
adamc@541 1020
adamc@541 1021 \subsection{Reverse-Engineering Record Types}
adamc@541 1022
adamc@541 1023 It's useful to write Ur functions and functors that take record constructors as inputs, but these constructors can grow quite long, even though their values are often implied by other arguments. The compiler uses a simple heuristic to infer the values of unification variables that are folded over, yielding known results. Often, as in the case of $\mt{map}$-like folds, the base and recursive cases of a fold produce constructors with different top-level structure. Thus, if the result of the fold is known, examining its top-level structure reveals whether the record being folded over is empty or not. If it's empty, we're done; if it's not empty, we replace a single unification variable with a new constructor formed from three new unification variables, as in $[\alpha = \beta] \rc \gamma$. This process can often be repeated to determine a unification variable fully.
adamc@541 1024
adamc@541 1025 \subsection{Implicit Arguments in Functor Applications}
adamc@541 1026
adamc@541 1027 Constructor, constraint, and type class witness members of structures may be omitted, when those structures are used in contexts where their assigned signatures imply how to fill in those missing members. This feature combines well with reverse-engineering to allow for uses of complicated meta-programming functors with little more code than would be necessary to invoke an untyped, ad-hoc code generator.
adamc@541 1028
adamc@541 1029
adamc@542 1030 \section{The Ur Standard Library}
adamc@542 1031
adamc@542 1032 The built-in parts of the Ur/Web standard library are described by the signature in \texttt{lib/basis.urs} in the distribution. A module $\mt{Basis}$ ascribing to that signature is available in the initial environment, and every program is implicitly prefixed by $\mt{open} \; \mt{Basis}$.
adamc@542 1033
adamc@542 1034 Additionally, other common functions that are definable within Ur are included in \texttt{lib/top.urs} and \texttt{lib/top.ur}. This $\mt{Top}$ module is also opened implicitly.
adamc@542 1035
adamc@542 1036 The idea behind Ur is to serve as the ideal host for embedded domain-specific languages. For now, however, the ``generic'' functionality is intermixed with Ur/Web-specific functionality, including in these two library modules. We hope that these generic library components have types that speak for themselves. The next section introduces the Ur/Web-specific elements. Here, we only give the type declarations from the beginning of $\mt{Basis}$.
adamc@542 1037 $$\begin{array}{l}
adamc@542 1038 \mt{type} \; \mt{int} \\
adamc@542 1039 \mt{type} \; \mt{float} \\
adamc@542 1040 \mt{type} \; \mt{string} \\
adamc@542 1041 \mt{type} \; \mt{time} \\
adamc@542 1042 \\
adamc@542 1043 \mt{type} \; \mt{unit} = \{\} \\
adamc@542 1044 \\
adamc@542 1045 \mt{datatype} \; \mt{bool} = \mt{False} \mid \mt{True} \\
adamc@542 1046 \\
adamc@542 1047 \mt{datatype} \; \mt{option} \; \mt{t} = \mt{None} \mid \mt{Some} \; \mt{of} \; \mt{t}
adamc@542 1048 \end{array}$$
adamc@542 1049
adamc@542 1050
adamc@542 1051 \section{The Ur/Web Standard Library}
adamc@542 1052
adamc@542 1053 \subsection{Transactions}
adamc@542 1054
adamc@542 1055 Ur is a pure language; we use Haskell's trick to support controlled side effects. The standard library defines a monad $\mt{transaction}$, meant to stand for actions that may be undone cleanly. By design, no other kinds of actions are supported.
adamc@542 1056 $$\begin{array}{l}
adamc@542 1057 \mt{con} \; \mt{transaction} :: \mt{Type} \to \mt{Type} \\
adamc@542 1058 \\
adamc@542 1059 \mt{val} \; \mt{return} : \mt{t} ::: \mt{Type} \to \mt{t} \to \mt{transaction} \; \mt{t} \\
adamc@542 1060 \mt{val} \; \mt{bind} : \mt{t_1} ::: \mt{Type} \to \mt{t_2} ::: \mt{Type} \to \mt{transaction} \; \mt{t_1} \to (\mt{t_1} \to \mt{transaction} \; \mt{t_2}) \to \mt{transaction} \; \mt{t_2}
adamc@542 1061 \end{array}$$
adamc@542 1062
adamc@542 1063 \subsection{HTTP}
adamc@542 1064
adamc@542 1065 There are transactions for reading an HTTP header by name and for getting and setting strongly-typed cookies. Cookies may only be created by the $\mt{cookie}$ declaration form, ensuring that they be named consistently based on module structure.
adamc@542 1066 $$\begin{array}{l}
adamc@542 1067 \mt{val} \; \mt{requestHeader} : \mt{string} \to \mt{transaction} \; (\mt{option} \; \mt{string}) \\
adamc@542 1068 \\
adamc@542 1069 \mt{con} \; \mt{http\_cookie} :: \mt{Type} \to \mt{Type} \\
adamc@542 1070 \mt{val} \; \mt{getCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{transaction} \; (\mt{option} \; \mt{t}) \\
adamc@542 1071 \mt{val} \; \mt{setCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit}
adamc@542 1072 \end{array}$$
adamc@542 1073
adamc@543 1074 \subsection{SQL}
adamc@543 1075
adamc@543 1076 The fundamental unit of interest in the embedding of SQL is tables, described by a type family and creatable only via the $\mt{table}$ declaration form.
adamc@543 1077 $$\begin{array}{l}
adamc@543 1078 \mt{con} \; \mt{sql\_table} :: \{\mt{Type}\} \to \mt{Type}
adamc@543 1079 \end{array}$$
adamc@543 1080
adamc@543 1081 \subsubsection{Queries}
adamc@543 1082
adamc@543 1083 A final query is constructed via the $\mt{sql\_query}$ function. Constructor arguments respectively specify the table fields we select (as records mapping tables to the subsets of their fields that we choose) and the (always named) extra expressions that we select.
adamc@543 1084 $$\begin{array}{l}
adamc@543 1085 \mt{con} \; \mt{sql\_query} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543 1086 \mt{val} \; \mt{sql\_query} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543 1087 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1088 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1089 \hspace{.1in} \to \{\mt{Rows} : \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}, \\
adamc@543 1090 \hspace{.2in} \mt{OrderBy} : \mt{sql\_order\_by} \; \mt{tables} \; \mt{selectedExps}, \\
adamc@543 1091 \hspace{.2in} \mt{Limit} : \mt{sql\_limit}, \\
adamc@543 1092 \hspace{.2in} \mt{Offset} : \mt{sql\_offset}\} \\
adamc@543 1093 \hspace{.1in} \to \mt{sql\_query} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1094 \end{array}$$
adamc@543 1095
adamc@545 1096 Queries are used by folding over their results inside transactions.
adamc@545 1097 $$\begin{array}{l}
adamc@545 1098 \mt{val} \; \mt{query} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \lambda [\mt{tables} \sim \mt{exps}] \Rightarrow \mt{state} ::: \mt{Type} \to \mt{sql\_query} \; \mt{tables} \; \mt{exps} \\
adamc@545 1099 \hspace{.1in} \to (\$(\mt{exps} \rc \mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: \{\mt{Type}\}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \$\mt{fields}] \rc \mt{acc}) \; [] \; \mt{tables}) \\
adamc@545 1100 \hspace{.2in} \to \mt{state} \to \mt{transaction} \; \mt{state}) \\
adamc@545 1101 \hspace{.1in} \to \mt{state} \to \mt{transaction} \; \mt{state}
adamc@545 1102 \end{array}$$
adamc@545 1103
adamc@543 1104 Most of the complexity of the query encoding is in the type $\mt{sql\_query1}$, which includes simple queries and derived queries based on relational operators. Constructor arguments respectively specify the tables we select from, the subset of fields that we keep from each table for the result rows, and the extra expressions that we select.
adamc@543 1105 $$\begin{array}{l}
adamc@543 1106 \mt{con} \; \mt{sql\_query1} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543 1107 \\
adamc@543 1108 \mt{type} \; \mt{sql\_relop} \\
adamc@543 1109 \mt{val} \; \mt{sql\_union} : \mt{sql\_relop} \\
adamc@543 1110 \mt{val} \; \mt{sql\_intersect} : \mt{sql\_relop} \\
adamc@543 1111 \mt{val} \; \mt{sql\_except} : \mt{sql\_relop} \\
adamc@543 1112 \mt{val} \; \mt{sql\_relop} : \mt{tables1} ::: \{\{\mt{Type}\}\} \\
adamc@543 1113 \hspace{.1in} \to \mt{tables2} ::: \{\{\mt{Type}\}\} \\
adamc@543 1114 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1115 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1116 \hspace{.1in} \to \mt{sql\_relop} \\
adamc@543 1117 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables1} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543 1118 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables2} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543 1119 \hspace{.1in} \to \mt{sql\_query1} \; \mt{selectedFields} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1120 \end{array}$$
adamc@543 1121
adamc@543 1122 $$\begin{array}{l}
adamc@543 1123 \mt{val} \; \mt{sql\_query1} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543 1124 \hspace{.1in} \to \mt{grouped} ::: \{\{\mt{Type}\}\} \\
adamc@543 1125 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543 1126 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543 1127 \hspace{.1in} \to \{\mt{From} : \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: \{\mt{Type}\}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_table} \; \mt{fields}] \rc \mt{acc}) \; [] \; \mt{tables}), \\
adamc@543 1128 \hspace{.2in} \mt{Where} : \mt{sql\_exp} \; \mt{tables} \; [] \; [] \; \mt{bool}, \\
adamc@543 1129 \hspace{.2in} \mt{GroupBy} : \mt{sql\_subset} \; \mt{tables} \; \mt{grouped}, \\
adamc@543 1130 \hspace{.2in} \mt{Having} : \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{bool}, \\
adamc@543 1131 \hspace{.2in} \mt{SelectFields} : \mt{sql\_subset} \; \mt{grouped} \; \mt{selectedFields}, \\
adamc@543 1132 \hspace{.2in} \mt {SelectExps} : \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{selectedExps}) \} \\
adamc@543 1133 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543 1134 \end{array}$$
adamc@543 1135
adamc@543 1136 To encode projection of subsets of fields in $\mt{SELECT}$ clauses, and to encode $\mt{GROUP} \; \mt{BY}$ clauses, we rely on a type family $\mt{sql\_subset}$, capturing what it means for one record of table fields to be a subset of another. The main constructor $\mt{sql\_subset}$ ``proves subset facts'' by requiring a split of a record into kept and dropped parts. The extra constructor $\mt{sql\_subset\_all}$ is a convenience for keeping all fields of a record.
adamc@543 1137 $$\begin{array}{l}
adamc@543 1138 \mt{con} \; \mt{sql\_subset} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@543 1139 \mt{val} \; \mt{sql\_subset} : \mt{keep\_drop} :: \{(\{\mt{Type}\} \times \{\mt{Type}\})\} \\
adamc@543 1140 \hspace{.1in} \to \mt{sql\_subset} \\
adamc@543 1141 \hspace{.2in} (\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\})) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \; [\mt{fields}.1 \sim \mt{fields}.2] \Rightarrow \\
adamc@543 1142 \hspace{.3in} [\mt{nm} = \mt{fields}.1 \rc \mt{fields}.2] \rc \mt{acc}) \; [] \; \mt{keep\_drop}) \\
adamc@543 1143 \hspace{.2in} (\mt{fold} \; (\lambda \mt{nm} \; (\mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\})) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{fields}.1] \rc \mt{acc}) \; [] \; \mt{keep\_drop}) \\
adamc@543 1144 \mt{val} \; \mt{sql\_subset\_all} : \mt{tables} :: \{\{\mt{Type}\}\} \to \mt{sql\_subset} \; \mt{tables} \; \mt{tables}
adamc@543 1145 \end{array}$$
adamc@543 1146
adamc@543 1147 SQL expressions are used in several places, including $\mt{SELECT}$, $\mt{WHERE}$, $\mt{HAVING}$, and $\mt{ORDER} \; \mt{BY}$ clauses. They reify a fragment of the standard SQL expression language, while making it possible to inject ``native'' Ur values in some places. The arguments to the $\mt{sql\_exp}$ type family respectively give the unrestricted-availablity table fields, the table fields that may only be used in arguments to aggregate functions, the available selected expressions, and the type of the expression.
adamc@543 1148 $$\begin{array}{l}
adamc@543 1149 \mt{con} \; \mt{sql\_exp} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \to \mt{Type}
adamc@543 1150 \end{array}$$
adamc@543 1151
adamc@543 1152 Any field in scope may be converted to an expression.
adamc@543 1153 $$\begin{array}{l}
adamc@543 1154 \mt{val} \; \mt{sql\_field} : \mt{otherTabs} ::: \{\{\mt{Type}\}\} \to \mt{otherFields} ::: \{\mt{Type}\} \\
adamc@543 1155 \hspace{.1in} \to \mt{fieldType} ::: \mt{Type} \to \mt{agg} ::: \{\{\mt{Type}\}\} \\
adamc@543 1156 \hspace{.1in} \to \mt{exps} ::: \{\mt{Type}\} \\
adamc@543 1157 \hspace{.1in} \to \mt{tab} :: \mt{Name} \to \mt{field} :: \mt{Name} \\
adamc@543 1158 \hspace{.1in} \to \mt{sql\_exp} \; ([\mt{tab} = [\mt{field} = \mt{fieldType}] \rc \mt{otherFields}] \rc \mt{otherTabs}) \; \mt{agg} \; \mt{exps} \; \mt{fieldType}
adamc@543 1159 \end{array}$$
adamc@543 1160
adamc@544 1161 There is an analogous function for referencing named expressions.
adamc@544 1162 $$\begin{array}{l}
adamc@544 1163 \mt{val} \; \mt{sql\_exp} : \mt{tabs} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{t} ::: \mt{Type} \to \mt{rest} ::: \{\mt{Type}\} \to \mt{nm} :: \mt{Name} \\
adamc@544 1164 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tabs} \; \mt{agg} \; ([\mt{nm} = \mt{t}] \rc \mt{rest}) \; \mt{t}
adamc@544 1165 \end{array}$$
adamc@544 1166
adamc@544 1167 Ur values of appropriate types may be injected into SQL expressions.
adamc@544 1168 $$\begin{array}{l}
adamc@544 1169 \mt{class} \; \mt{sql\_injectable} \\
adamc@544 1170 \mt{val} \; \mt{sql\_bool} : \mt{sql\_injectable} \; \mt{bool} \\
adamc@544 1171 \mt{val} \; \mt{sql\_int} : \mt{sql\_injectable} \; \mt{int} \\
adamc@544 1172 \mt{val} \; \mt{sql\_float} : \mt{sql\_injectable} \; \mt{float} \\
adamc@544 1173 \mt{val} \; \mt{sql\_string} : \mt{sql\_injectable} \; \mt{string} \\
adamc@544 1174 \mt{val} \; \mt{sql\_time} : \mt{sql\_injectable} \; \mt{time} \\
adamc@544 1175 \mt{val} \; \mt{sql\_option\_bool} : \mt{sql\_injectable} \; (\mt{option} \; \mt{bool}) \\
adamc@544 1176 \mt{val} \; \mt{sql\_option\_int} : \mt{sql\_injectable} \; (\mt{option} \; \mt{int}) \\
adamc@544 1177 \mt{val} \; \mt{sql\_option\_float} : \mt{sql\_injectable} \; (\mt{option} \; \mt{float}) \\
adamc@544 1178 \mt{val} \; \mt{sql\_option\_string} : \mt{sql\_injectable} \; (\mt{option} \; \mt{string}) \\
adamc@544 1179 \mt{val} \; \mt{sql\_option\_time} : \mt{sql\_injectable} \; (\mt{option} \; \mt{time}) \\
adamc@544 1180 \mt{val} \; \mt{sql\_inject} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@544 1181 \hspace{.1in} \to \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544 1182 \end{array}$$
adamc@544 1183
adamc@544 1184 We have the SQL nullness test, which is necessary because of the strange SQL semantics of equality in the presence of null values.
adamc@544 1185 $$\begin{array}{l}
adamc@544 1186 \mt{val} \; \mt{sql\_is\_null} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1187 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; (\mt{option} \; \mt{t}) \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544 1188 \end{array}$$
adamc@544 1189
adamc@544 1190 We have generic nullary, unary, and binary operators, as well as comparison operators.
adamc@544 1191 $$\begin{array}{l}
adamc@544 1192 \mt{con} \; \mt{sql\_nfunc} :: \mt{Type} \to \mt{Type} \\
adamc@544 1193 \mt{val} \; \mt{sql\_current\_timestamp} : \mt{sql\_nfunc} \; \mt{time} \\
adamc@544 1194 \mt{val} \; \mt{sql\_nfunc} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1195 \hspace{.1in} \to \mt{sql\_nfunc} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \\\end{array}$$
adamc@544 1196
adamc@544 1197 $$\begin{array}{l}
adamc@544 1198 \mt{con} \; \mt{sql\_unary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544 1199 \mt{val} \; \mt{sql\_not} : \mt{sql\_unary} \; \mt{bool} \; \mt{bool} \\
adamc@544 1200 \mt{val} \; \mt{sql\_unary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544 1201 \hspace{.1in} \to \mt{sql\_unary} \; \mt{arg} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res} \\
adamc@544 1202 \end{array}$$
adamc@544 1203
adamc@544 1204 $$\begin{array}{l}
adamc@544 1205 \mt{con} \; \mt{sql\_binary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544 1206 \mt{val} \; \mt{sql\_and} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544 1207 \mt{val} \; \mt{sql\_or} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544 1208 \mt{val} \; \mt{sql\_binary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg_1} ::: \mt{Type} \to \mt{arg_2} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544 1209 \hspace{.1in} \to \mt{sql\_binary} \; \mt{arg_1} \; \mt{arg_2} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_1} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_2} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res}
adamc@544 1210 \end{array}$$
adamc@544 1211
adamc@544 1212 $$\begin{array}{l}
adamc@544 1213 \mt{type} \; \mt{sql\_comparison} \\
adamc@544 1214 \mt{val} \; \mt{sql\_eq} : \mt{sql\_comparison} \\
adamc@544 1215 \mt{val} \; \mt{sql\_ne} : \mt{sql\_comparison} \\
adamc@544 1216 \mt{val} \; \mt{sql\_lt} : \mt{sql\_comparison} \\
adamc@544 1217 \mt{val} \; \mt{sql\_le} : \mt{sql\_comparison} \\
adamc@544 1218 \mt{val} \; \mt{sql\_gt} : \mt{sql\_comparison} \\
adamc@544 1219 \mt{val} \; \mt{sql\_ge} : \mt{sql\_comparison} \\
adamc@544 1220 \mt{val} \; \mt{sql\_comparison} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1221 \hspace{.1in} \to \mt{sql\_comparison} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544 1222 \end{array}$$
adamc@544 1223
adamc@544 1224 Finally, we have aggregate functions. The $\mt{COUNT(\ast)}$ syntax is handled specially, since it takes no real argument. The other aggregate functions are placed into a general type family, using type classes to restrict usage to properly-typed arguments. The key aspect of the $\mt{sql\_aggregate}$ function's type is the shift of aggregate-function-only fields into unrestricted fields.
adamc@544 1225
adamc@544 1226 $$\begin{array}{l}
adamc@544 1227 \mt{val} \; \mt{sql\_count} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{int}
adamc@544 1228 \end{array}$$
adamc@544 1229
adamc@544 1230 $$\begin{array}{l}
adamc@544 1231 \mt{con} \; \mt{sql\_aggregate} :: \mt{Type} \to \mt{Type} \\
adamc@544 1232 \mt{val} \; \mt{sql\_aggregate} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1233 \hspace{.1in} \to \mt{sql\_aggregate} \; \mt{t} \to \mt{sql\_exp} \; \mt{agg} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544 1234 \end{array}$$
adamc@544 1235
adamc@544 1236 $$\begin{array}{l}
adamc@544 1237 \mt{class} \; \mt{sql\_summable} \\
adamc@544 1238 \mt{val} \; \mt{sql\_summable\_int} : \mt{sql\_summable} \; \mt{int} \\
adamc@544 1239 \mt{val} \; \mt{sql\_summable\_float} : \mt{sql\_summable} \; \mt{float} \\
adamc@544 1240 \mt{val} \; \mt{sql\_avg} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544 1241 \mt{val} \; \mt{sql\_sum} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544 1242 \end{array}$$
adamc@544 1243
adamc@544 1244 $$\begin{array}{l}
adamc@544 1245 \mt{class} \; \mt{sql\_maxable} \\
adamc@544 1246 \mt{val} \; \mt{sql\_maxable\_int} : \mt{sql\_maxable} \; \mt{int} \\
adamc@544 1247 \mt{val} \; \mt{sql\_maxable\_float} : \mt{sql\_maxable} \; \mt{float} \\
adamc@544 1248 \mt{val} \; \mt{sql\_maxable\_string} : \mt{sql\_maxable} \; \mt{string} \\
adamc@544 1249 \mt{val} \; \mt{sql\_maxable\_time} : \mt{sql\_maxable} \; \mt{time} \\
adamc@544 1250 \mt{val} \; \mt{sql\_max} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544 1251 \mt{val} \; \mt{sql\_min} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544 1252 \end{array}$$
adamc@544 1253
adamc@544 1254 We wrap up the definition of query syntax with the types used in representing $\mt{ORDER} \; \mt{BY}$, $\mt{LIMIT}$, and $\mt{OFFSET}$ clauses.
adamc@544 1255 $$\begin{array}{l}
adamc@544 1256 \mt{type} \; \mt{sql\_direction} \\
adamc@544 1257 \mt{val} \; \mt{sql\_asc} : \mt{sql\_direction} \\
adamc@544 1258 \mt{val} \; \mt{sql\_desc} : \mt{sql\_direction} \\
adamc@544 1259 \\
adamc@544 1260 \mt{con} \; \mt{sql\_order\_by} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@544 1261 \mt{val} \; \mt{sql\_order\_by\_Nil} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} :: \{\mt{Type}\} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544 1262 \mt{val} \; \mt{sql\_order\_by\_Cons} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544 1263 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; [] \; \mt{exps} \; \mt{t} \to \mt{sql\_direction} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544 1264 \\
adamc@544 1265 \mt{type} \; \mt{sql\_limit} \\
adamc@544 1266 \mt{val} \; \mt{sql\_no\_limit} : \mt{sql\_limit} \\
adamc@544 1267 \mt{val} \; \mt{sql\_limit} : \mt{int} \to \mt{sql\_limit} \\
adamc@544 1268 \\
adamc@544 1269 \mt{type} \; \mt{sql\_offset} \\
adamc@544 1270 \mt{val} \; \mt{sql\_no\_offset} : \mt{sql\_offset} \\
adamc@544 1271 \mt{val} \; \mt{sql\_offset} : \mt{int} \to \mt{sql\_offset}
adamc@544 1272 \end{array}$$
adamc@544 1273
adamc@545 1274
adamc@545 1275 \subsubsection{DML}
adamc@545 1276
adamc@545 1277 The Ur/Web library also includes an embedding of a fragment of SQL's DML, the Data Manipulation Language, for modifying database tables. Any piece of DML may be executed in a transaction.
adamc@545 1278
adamc@545 1279 $$\begin{array}{l}
adamc@545 1280 \mt{type} \; \mt{dml} \\
adamc@545 1281 \mt{val} \; \mt{dml} : \mt{dml} \to \mt{transaction} \; \mt{unit}
adamc@545 1282 \end{array}$$
adamc@545 1283
adamc@545 1284 Properly-typed records may be used to form $\mt{INSERT}$ commands.
adamc@545 1285 $$\begin{array}{l}
adamc@545 1286 \mt{val} \; \mt{insert} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \\
adamc@545 1287 \hspace{.1in} \to \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; [] \; [] \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{fields}) \to \mt{dml}
adamc@545 1288 \end{array}$$
adamc@545 1289
adamc@545 1290 An $\mt{UPDATE}$ command is formed from a choice of which table fields to leave alone and which to change, along with an expression to use to compute the new value of each changed field and a $\mt{WHERE}$ clause.
adamc@545 1291 $$\begin{array}{l}
adamc@545 1292 \mt{val} \; \mt{update} : \mt{unchanged} ::: \{\mt{Type}\} \to \mt{changed} :: \{\mt{Type}\} \to \lambda [\mt{changed} \sim \mt{unchanged}] \\
adamc@545 1293 \hspace{.1in} \Rightarrow \$(\mt{fold} \; (\lambda \mt{nm} \; (\mt{t} :: \mt{Type}) \; \mt{acc} \; [[\mt{nm}] \sim \mt{acc}] \Rightarrow [\mt{nm} = \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{t}] \rc \mt{acc}) \; [] \; \mt{changed}) \\
adamc@545 1294 \hspace{.1in} \to \mt{sql\_table} \; (\mt{changed} \rc \mt{unchanged}) \to \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545 1295 \end{array}$$
adamc@545 1296
adamc@545 1297 A $\mt{DELETE}$ command is formed from a table and a $\mt{WHERE}$ clause.
adamc@545 1298 $$\begin{array}{l}
adamc@545 1299 \mt{val} \; \mt{delete} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \to \mt{sql\_exp} \; [\mt{T} = \mt{fields}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545 1300 \end{array}$$
adamc@545 1301
adamc@546 1302 \subsubsection{Sequences}
adamc@546 1303
adamc@546 1304 SQL sequences are counters with concurrency control, often used to assign unique IDs. Ur/Web supports them via a simple interface. The only way to create a sequence is with the $\mt{sequence}$ declaration form.
adamc@546 1305
adamc@546 1306 $$\begin{array}{l}
adamc@546 1307 \mt{type} \; \mt{sql\_sequence} \\
adamc@546 1308 \mt{val} \; \mt{nextval} : \mt{sql\_sequence} \to \mt{transaction} \; \mt{int}
adamc@546 1309 \end{array}$$
adamc@546 1310
adamc@546 1311
adamc@547 1312 \subsection{XML}
adamc@547 1313
adamc@547 1314 Ur/Web's library contains an encoding of XML syntax and semantic constraints. We make no effort to follow the standards governing XML schemas. Rather, XML fragments are viewed more as values of ML datatypes, and we only track which tags are allowed inside which other tags.
adamc@547 1315
adamc@547 1316 The basic XML type family has arguments respectively indicating the \emph{context} of a fragment, the fields that the fragment expects to be bound on entry (and their types), and the fields that the fragment will bind (and their types). Contexts are a record-based ``poor man's subtyping'' encoding, with each possible set of valid tags corresponding to a different context record. The arguments dealing with field binding are only relevant to HTML forms.
adamc@547 1317 $$\begin{array}{l}
adamc@547 1318 \mt{con} \; \mt{xml} :: \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547 1319 \end{array}$$
adamc@547 1320
adamc@547 1321 We also have a type family of XML tags, indexed respectively by the record of optional attributes accepted by the tag, the context in which the tag may be placed, the context required of children of the tag, which form fields the tag uses, and which fields the tag defines.
adamc@547 1322 $$\begin{array}{l}
adamc@547 1323 \mt{con} \; \mt{tag} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547 1324 \end{array}$$
adamc@547 1325
adamc@547 1326 Literal text may be injected into XML as ``CDATA.''
adamc@547 1327 $$\begin{array}{l}
adamc@547 1328 \mt{val} \; \mt{cdata} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use} ::: \{\mt{Type}\} \to \mt{string} \to \mt{xml} \; \mt{ctx} \; \mt{use} \; []
adamc@547 1329 \end{array}$$
adamc@547 1330
adamc@547 1331 There is a function for producing an XML tree with a particular tag at its root.
adamc@547 1332 $$\begin{array}{l}
adamc@547 1333 \mt{val} \; \mt{tag} : \mt{attrsGiven} ::: \{\mt{Type}\} \to \mt{attrsAbsent} ::: \{\mt{Type}\} \to \mt{ctxOuter} ::: \{\mt{Unit}\} \to \mt{ctxInner} ::: \{\mt{Unit}\} \\
adamc@547 1334 \hspace{.1in} \to \mt{useOuter} ::: \{\mt{Type}\} \to \mt{useInner} ::: \{\mt{Type}\} \to \mt{bindOuter} ::: \{\mt{Type}\} \to \mt{bindInner} ::: \{\mt{Type}\} \\
adamc@547 1335 \hspace{.1in} \to \lambda [\mt{attrsGiven} \sim \mt{attrsAbsent}] \; [\mt{useOuter} \sim \mt{useInner}] \; [\mt{bindOuter} \sim \mt{bindInner}] \Rightarrow \$\mt{attrsGiven} \\
adamc@547 1336 \hspace{.1in} \to \mt{tag} \; (\mt{attrsGiven} \rc \mt{attrsAbsent}) \; \mt{ctxOuter} \; \mt{ctxInner} \; \mt{useOuter} \; \mt{bindOuter} \\
adamc@547 1337 \hspace{.1in} \to \mt{xml} \; \mt{ctxInner} \; \mt{useInner} \; \mt{bindInner} \to \mt{xml} \; \mt{ctxOuter} \; (\mt{useOuter} \rc \mt{useInner}) \; (\mt{bindOuter} \rc \mt{bindInner})
adamc@547 1338 \end{array}$$
adamc@547 1339
adamc@547 1340 Two XML fragments may be concatenated.
adamc@547 1341 $$\begin{array}{l}
adamc@547 1342 \mt{val} \; \mt{join} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{bind_1} ::: \{\mt{Type}\} \to \mt{bind_2} ::: \{\mt{Type}\} \\
adamc@547 1343 \hspace{.1in} \to \lambda [\mt{use_1} \sim \mt{bind_1}] \; [\mt{bind_1} \sim \mt{bind_2}] \\
adamc@547 1344 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind_1} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{bind_1}) \; \mt{bind_2} \to \mt{xml} \; \mt{ctx} \; \mt{use_1} \; (\mt{bind_1} \rc \mt{bind_2})
adamc@547 1345 \end{array}$$
adamc@547 1346
adamc@547 1347 Finally, any XML fragment may be updated to ``claim'' to use more form fields than it does.
adamc@547 1348 $$\begin{array}{l}
adamc@547 1349 \mt{val} \; \mt{useMore} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{use_2} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \lambda [\mt{use_1} \sim \mt{use_2}] \\
adamc@547 1350 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{use_2}) \; \mt{bind}
adamc@547 1351 \end{array}$$
adamc@547 1352
adamc@547 1353 We will not list here the different HTML tags and related functions from the standard library. They should be easy enough to understand from the code in \texttt{basis.urs}. The set of tags in the library is not yet claimed to be complete for HTML standards.
adamc@547 1354
adamc@547 1355 One last useful function is for aborting any page generation, returning some XML as an error message. This function takes the place of some uses of a general exception mechanism.
adamc@547 1356 $$\begin{array}{l}
adamc@547 1357 \mt{val} \; \mt{error} : \mt{t} ::: \mt{Type} \to \mt{xml} \; [\mt{Body}] \; [] \; [] \to \mt{t}
adamc@547 1358 \end{array}$$
adamc@547 1359
adamc@549 1360
adamc@549 1361 \section{Ur/Web Syntax Extensions}
adamc@549 1362
adamc@549 1363 Ur/Web features some syntactic shorthands for building values using the functions from the last section. This section sketches the grammar of those extensions. We write spans of syntax inside brackets to indicate that they are optional.
adamc@549 1364
adamc@549 1365 \subsection{SQL}
adamc@549 1366
adamc@549 1367 \subsubsection{Queries}
adamc@549 1368
adamc@550 1369 Queries $Q$ are added to the rules for expressions $e$.
adamc@550 1370
adamc@549 1371 $$\begin{array}{rrcll}
adamc@550 1372 \textrm{Queries} & Q &::=& (q \; [\mt{ORDER} \; \mt{BY} \; (E \; [o],)^+] \; [\mt{LIMIT} \; N] \; [\mt{OFFSET} \; N]) \\
adamc@549 1373 \textrm{Pre-queries} & q &::=& \mt{SELECT} \; P \; \mt{FROM} \; T,^+ \; [\mt{WHERE} \; E] \; [\mt{GROUP} \; \mt{BY} \; p,^+] \; [\mt{HAVING} \; E] \\
adamc@549 1374 &&& \mid q \; R \; q \\
adamc@549 1375 \textrm{Relational operators} & R &::=& \mt{UNION} \mid \mt{INTERSECT} \mid \mt{EXCEPT}
adamc@549 1376 \end{array}$$
adamc@549 1377
adamc@549 1378 $$\begin{array}{rrcll}
adamc@549 1379 \textrm{Projections} & P &::=& \ast & \textrm{all columns} \\
adamc@549 1380 &&& p,^+ & \textrm{particular columns} \\
adamc@549 1381 \textrm{Pre-projections} & p &::=& t.f & \textrm{one column from a table} \\
adamc@549 1382 &&& t.\{\{c\}\} & \textrm{a record of colums from a table (of kind $\{\mt{Type}\}$)} \\
adamc@549 1383 \textrm{Table names} & t &::=& x & \textrm{constant table name (automatically capitalized)} \\
adamc@549 1384 &&& X & \textrm{constant table name} \\
adamc@549 1385 &&& \{\{c\}\} & \textrm{computed table name (of kind $\mt{Name}$)} \\
adamc@549 1386 \textrm{Column names} & f &::=& X & \textrm{constant column name} \\
adamc@549 1387 &&& \{c\} & \textrm{computed column name (of kind $\mt{Name}$)} \\
adamc@549 1388 \textrm{Tables} & T &::=& x & \textrm{table variable, named locally by its own capitalization} \\
adamc@549 1389 &&& x \; \mt{AS} \; t & \textrm{table variable, with local name} \\
adamc@549 1390 &&& \{\{e\}\} \; \mt{AS} \; t & \textrm{computed table expression, with local name} \\
adamc@549 1391 \textrm{SQL expressions} & E &::=& p & \textrm{column references} \\
adamc@549 1392 &&& X & \textrm{named expression references} \\
adamc@549 1393 &&& \{\{e\}\} & \textrm{injected native Ur expressions} \\
adamc@549 1394 &&& \{e\} & \textrm{computed expressions, probably using $\mt{sql\_exp}$ directly} \\
adamc@549 1395 &&& \mt{TRUE} \mid \mt{FALSE} & \textrm{boolean constants} \\
adamc@549 1396 &&& \ell & \textrm{primitive type literals} \\
adamc@549 1397 &&& \mt{NULL} & \textrm{null value (injection of $\mt{None}$)} \\
adamc@549 1398 &&& E \; \mt{IS} \; \mt{NULL} & \textrm{nullness test} \\
adamc@549 1399 &&& n & \textrm{nullary operators} \\
adamc@549 1400 &&& u \; E & \textrm{unary operators} \\
adamc@549 1401 &&& E \; b \; E & \textrm{binary operators} \\
adamc@549 1402 &&& \mt{COUNT}(\ast) & \textrm{count number of rows} \\
adamc@549 1403 &&& a(E) & \textrm{other aggregate function} \\
adamc@549 1404 &&& (E) & \textrm{explicit precedence} \\
adamc@549 1405 \textrm{Nullary operators} & n &::=& \mt{CURRENT\_TIMESTAMP} \\
adamc@549 1406 \textrm{Unary operators} & u &::=& \mt{NOT} \\
adamc@549 1407 \textrm{Binary operators} & b &::=& \mt{AND} \mid \mt{OR} \mid \neq \mid < \mid \leq \mid > \mid \geq \\
adamc@549 1408 \textrm{Aggregate functions} & a &::=& \mt{AVG} \mid \mt{SUM} \mid \mt{MIN} \mid \mt{MAX} \\
adamc@550 1409 \textrm{Directions} & o &::=& \mt{ASC} \mid \mt{DESC} \\
adamc@549 1410 \textrm{SQL integer} & N &::=& n \mid \{e\} \\
adamc@549 1411 \end{array}$$
adamc@549 1412
adamc@549 1413 Additionally, an SQL expression may be inserted into normal Ur code with the syntax $(\mt{SQL} \; E)$ or $(\mt{WHERE} \; E)$.
adamc@549 1414
adamc@550 1415 \subsubsection{DML}
adamc@550 1416
adamc@550 1417 DML commands $D$ are added to the rules for expressions $e$.
adamc@550 1418
adamc@550 1419 $$\begin{array}{rrcll}
adamc@550 1420 \textrm{Commands} & D &::=& (\mt{INSERT} \; \mt{INTO} \; T^E \; (f,^+) \; \mt{VALUES} \; (E,^+)) \\
adamc@550 1421 &&& (\mt{UPDATE} \; T^E \; \mt{SET} \; (f = E,)^+ \; \mt{WHERE} \; E) \\
adamc@550 1422 &&& (\mt{DELETE} \; \mt{FROM} \; T^E \; \mt{WHERE} \; E) \\
adamc@550 1423 \textrm{Table expressions} & T^E &::=& x \mid \{\{e\}\}
adamc@550 1424 \end{array}$$
adamc@550 1425
adamc@550 1426 Inside $\mt{UPDATE}$ and $\mt{DELETE}$ commands, lone variables $X$ are interpreted as references to columns of the implicit table $\mt{T}$, rather than to named expressions.
adamc@549 1427
adamc@551 1428 \subsection{XML}
adamc@551 1429
adamc@551 1430 XML fragments $L$ are added to the rules for expressions $e$.
adamc@551 1431
adamc@551 1432 $$\begin{array}{rrcll}
adamc@551 1433 \textrm{XML fragments} & L &::=& \texttt{<xml/>} \mid \texttt{<xml>}l^*\texttt{</xml>} \\
adamc@551 1434 \textrm{XML pieces} & l &::=& \textrm{text} & \textrm{cdata} \\
adamc@551 1435 &&& \texttt{<}g\texttt{/>} & \textrm{tag with no children} \\
adamc@551 1436 &&& \texttt{<}g\texttt{>}l^*\texttt{</}x\texttt{>} & \textrm{tag with children} \\
adamc@551 1437 \textrm{Tag} & g &::=& h \; (x = v)^* \\
adamc@551 1438 \textrm{Tag head} & h &::=& x & \textrm{tag name} \\
adamc@551 1439 &&& h\{c\} & \textrm{constructor parameter} \\
adamc@551 1440 \textrm{Attribute value} & v &::=& \ell & \textrm{literal value} \\
adamc@551 1441 &&& \{e\} & \textrm{computed value} \\
adamc@551 1442 \end{array}$$
adamc@551 1443
adamc@552 1444
adamc@553 1445 \section{The Structure of Web Applications}
adamc@553 1446
adamc@553 1447 A web application is built from a series of modules, with one module, the last one appearing in the \texttt{.urp} file, designated as the main module. The signature of the main module determines the URL entry points to the application. Such an entry point should have type $\mt{unit} \to \mt{transaction} \; \mt{page}$, where $\mt{page}$ is a type synonym for top-level HTML pages, defined in $\mt{Basis}$. If such a function is at the top level of main module $M$, it will be accessible at URI \texttt{/M/f}, and so on for more deeply-nested functions, as described in Section \ref{tag} below.
adamc@553 1448
adamc@553 1449 When the standalone web server receives a request for a known page, it calls the function for that page, ``running'' the resulting transaction to produce the page to return to the client. Pages link to other pages with the \texttt{link} attribute of the \texttt{a} HTML tag. A link has type $\mt{transaction} \; \mt{page}$, and the semantics of a link are that this transaction should be run to compute the result page, when the link is followed. Link targets are assigned URL names in the same way as top-level entry points.
adamc@553 1450
adamc@553 1451 HTML forms are handled in a similar way. The $\mt{action}$ attribute of a $\mt{submit}$ form tag takes a value of type $\$\mt{use} \to \mt{transaction} \; \mt{page}$, where $\mt{use}$ is a kind-$\{\mt{Type}\}$ record of the form fields used by this action handler. Action handlers are assigned URL patterns in the same way as above.
adamc@553 1452
adamc@553 1453 For both links and actions, direct arguments and local variables mentioned implicitly via closures are automatically included in serialized form in URLs, in the order in which they appeared in the source code.
adamc@553 1454
adamc@553 1455
adamc@552 1456 \section{Compiler Phases}
adamc@552 1457
adamc@552 1458 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
adamc@552 1459
adamc@552 1460 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.
adamc@552 1461
adamc@552 1462 \subsection{Parse}
adamc@552 1463
adamc@552 1464 The compiler reads a \texttt{.urp} file, figures out which \texttt{.urs} and \texttt{.ur} files it references, and combines them all into what is conceptually a single sequence of declarations in the core language of Section \ref{core}.
adamc@552 1465
adamc@552 1466 \subsection{Elaborate}
adamc@552 1467
adamc@552 1468 This is where type inference takes place, translating programs into an explicit form with no more wildcards. This phase is the most likely source of compiler error messages.
adamc@552 1469
adamc@552 1470 \subsection{Unnest}
adamc@552 1471
adamc@552 1472 Named local function definitions are moved to the top level, to avoid the need to generate closures.
adamc@552 1473
adamc@552 1474 \subsection{Corify}
adamc@552 1475
adamc@552 1476 Module system features are compiled away, through inlining of functor definitions at application sites. Afterward, most abstraction boundaries are broken, facilitating optimization.
adamc@552 1477
adamc@552 1478 \subsection{Especialize}
adamc@552 1479
adamc@552 1480 Functions are specialized to particular argument patterns. This is an important trick for avoiding the need to maintain any closures at runtime.
adamc@552 1481
adamc@552 1482 \subsection{Untangle}
adamc@552 1483
adamc@552 1484 Remove unnecessary mutual recursion, splitting recursive groups into strongly-connected components.
adamc@552 1485
adamc@552 1486 \subsection{Shake}
adamc@552 1487
adamc@552 1488 Remove all definitions not needed to run the page handlers that are visible in the signature of the last module listed in the \texttt{.urp} file.
adamc@552 1489
adamc@553 1490 \subsection{\label{tag}Tag}
adamc@552 1491
adamc@552 1492 Assign a URL name to each link and form action. It is important that these links and actions are written as applications of named functions, because such names are used to generate URL patterns. A URL pattern has a name built from the full module path of the named function, followed by the function name, with all pieces separated by slashes. The path of a functor application is based on the name given to the result, rather than the path of the functor itself.
adamc@552 1493
adamc@552 1494 \subsection{Reduce}
adamc@552 1495
adamc@552 1496 Apply definitional equality rules to simplify the program as much as possible. This effectively includes inlining of every non-recursive definition.
adamc@552 1497
adamc@552 1498 \subsection{Unpoly}
adamc@552 1499
adamc@552 1500 This phase specializes polymorphic functions to the specific arguments passed to them in the program. If the program contains real polymorphic recursion, Unpoly will be insufficient to avoid later error messages about too much polymorphism.
adamc@552 1501
adamc@552 1502 \subsection{Specialize}
adamc@552 1503
adamc@552 1504 Replace uses of parametrized datatypes with versions specialized to specific parameters. As for Unpoly, this phase will not be effective enough in the presence of polymorphic recursion or other fancy uses of impredicative polymorphism.
adamc@552 1505
adamc@552 1506 \subsection{Shake}
adamc@552 1507
adamc@552 1508 Here the compiler repeats the earlier shake phase.
adamc@552 1509
adamc@552 1510 \subsection{Monoize}
adamc@552 1511
adamc@552 1512 Programs are translated to a new intermediate language without polymorphism or non-$\mt{Type}$ constructors. Error messages may pop up here if earlier phases failed to remove such features.
adamc@552 1513
adamc@552 1514 This is the stage at which concrete names are generated for cookies, tables, and sequences. They are named following the same convention as for links and actions, based on module path information saved from earlier stages. Table and sequence names separate path elements with underscores instead of slashes, and they are prefixed by \texttt{uw\_}.
adamc@552 1515 \subsection{MonoOpt}
adamc@552 1516
adamc@552 1517 Simple algebraic laws are applied to simplify the program, focusing especially on efficient imperative generation of HTML pages.
adamc@552 1518
adamc@552 1519 \subsection{MonoUntangle}
adamc@552 1520
adamc@552 1521 Unnecessary mutual recursion is broken up again.
adamc@552 1522
adamc@552 1523 \subsection{MonoReduce}
adamc@552 1524
adamc@552 1525 Equivalents of the definitional equality rules are applied to simplify programs, with inlining again playing a major role.
adamc@552 1526
adamc@552 1527 \subsection{MonoShake, MonoOpt}
adamc@552 1528
adamc@552 1529 Unneeded declarations are removed, and basic optimizations are repeated.
adamc@552 1530
adamc@552 1531 \subsection{Fuse}
adamc@552 1532
adamc@552 1533 The compiler tries to simplify calls to recursive functions whose results are immediately written as page output. The write action is pushed inside the function definitions to avoid allocation of intermediate results.
adamc@552 1534
adamc@552 1535 \subsection{MonoUntangle, MonoShake}
adamc@552 1536
adamc@552 1537 Fuse often creates more opportunities to remove spurious mutual recursion.
adamc@552 1538
adamc@552 1539 \subsection{Pathcheck}
adamc@552 1540
adamc@552 1541 The compiler checks that no link or action name has been used more than once.
adamc@552 1542
adamc@552 1543 \subsection{Cjrize}
adamc@552 1544
adamc@552 1545 The program is translated to what is more or less a subset of C. If any use of functions as data remains at this point, the compiler will complain.
adamc@552 1546
adamc@552 1547 \subsection{C Compilation and Linking}
adamc@552 1548
adamc@552 1549 The output of the last phase is pretty-printed as C source code and passed to GCC.
adamc@552 1550
adamc@552 1551
adamc@524 1552 \end{document}