Ur/Web: doc/manual.tex annotate

annotate doc/manual.tex @ 873:41971801b62d

MySQL query gets up to C linking

author	Adam Chlipala <adamc@hcoop.net>
date	Sun, 12 Jul 2009 13:16:05 -0400
parents	19fdeef40ada
children	0ae8894d5c97

rev	line source
adamc@524	1 \documentclass{article}
adamc@554	2 \usepackage{fullpage,amsmath,amssymb,proof,url}
adamc@524	3
adamc@524	4 \newcommand{\cd}[1]{\texttt{#1}}
adamc@524	5 \newcommand{\mt}[1]{\mathsf{#1}}
adamc@524	6
adamc@524	7 \newcommand{\rc}{+ \hspace{-.075in} + \;}
adamc@527	8 \newcommand{\rcut}{\; \texttt{--} \;}
adamc@527	9 \newcommand{\rcutM}{\; \texttt{---} \;}
adamc@524	10
adamc@524	11 \begin{document}
adamc@524	12
adamc@524	13 \title{The Ur/Web Manual}
adamc@524	14 \author{Adam Chlipala}
adamc@524	15
adamc@524	16 \maketitle
adamc@524	17
adamc@540	18 \tableofcontents
adamc@540	19
adamc@554	20
adamc@554	21 \section{Introduction}
adamc@554	22
adamc@554	23 \emph{Ur} is a programming language designed to introduce richer type system features into functional programming in the tradition of ML and Haskell. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of \emph{metaprogramming} based on \emph{row types}.
adamc@554	24
adamc@554	25 \emph{Ur/Web} is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs ``don't go wrong'' in a very broad sense. Not only do they not crash during particular page generations, but they also may not:
adamc@554	26
adamc@554	27 \begin{itemize}
adamc@554	28 \item Suffer from any kinds of code-injection attacks
adamc@554	29 \item Return invalid HTML
adamc@554	30 \item Contain dead intra-application links
adamc@554	31 \item Have mismatches between HTML forms and the fields expected by their handlers
adamc@652	32 \item Include client-side code that makes incorrect assumptions about the ``AJAX''-style services that the remote web server provides
adamc@554	33 \item Attempt invalid SQL queries
adamc@652	34 \item Use improper marshaling or unmarshaling in communication with SQL databases or between browsers and web servers
adamc@554	35 \end{itemize}
adamc@554	36
adamc@554	37 This type safety is just the foundation of the Ur/Web methodology. It is also possible to use metaprogramming to build significant application pieces by analysis of type structure. For instance, the demo includes an ML-style functor for building an admin interface for an arbitrary SQL table. The type system guarantees that the admin interface sub-application that comes out will always be free of the above-listed bugs, no matter which well-typed table description is given as input.
adamc@554	38
adamc@652	39 The Ur/Web compiler also produces very efficient object code that does not use garbage collection. These compiled programs will often be even more efficient than what most programmers would bother to write in C. The compiler also generates JavaScript versions of client-side code, with no need to write those parts of applications in a different language.
adamc@554	40
adamc@554	41 \medskip
adamc@554	42
adamc@554	43 The official web site for Ur is:
adamc@554	44 \begin{center}
adamc@554	45 \url{http://www.impredicative.com/ur/}
adamc@554	46 \end{center}
adamc@554	47
adamc@555	48
adamc@555	49 \section{Installation}
adamc@555	50
adamc@555	51 If you are lucky, then the following standard command sequence will suffice for installation, in a directory to which you have unpacked the latest distribution tarball.
adamc@555	52
adamc@555	53 \begin{verbatim}
adamc@555	54 ./configure
adamc@555	55 make
adamc@555	56 sudo make install
adamc@555	57 \end{verbatim}
adamc@555	58
adamc@783	59 Some other packages must be installed for the above to work. At a minimum, you need a standard UNIX shell, with standard UNIX tools like sed and GCC in your execution path; MLton, the whole-program optimizing compiler for Standard ML; and the mhash C library. To build programs that access SQL databases, you also need libpq, the PostgreSQL client library. As of this writing, in the ``testing'' version of Debian Linux, this command will install the more uncommon of these dependencies:
adamc@555	60
adamc@555	61 \begin{verbatim}
adamc@783	62 apt-get install mlton libmhash-dev libpq-dev
adamc@555	63 \end{verbatim}
adamc@555	64
adamc@555	65 It is also possible to access the modules of the Ur/Web compiler interactively, within Standard ML of New Jersey. To install the prerequisites in Debian testing:
adamc@555	66
adamc@555	67 \begin{verbatim}
adamc@555	68 apt-get install smlnj libsmlnj-smlnj ml-yacc ml-lpt
adamc@555	69 \end{verbatim}
adamc@555	70
adamc@555	71 To begin an interactive session with the Ur compiler modules, run \texttt{make smlnj}, and then, from within an \texttt{sml} session, run \texttt{CM.make "src/urweb.cm";}. The \texttt{Compiler} module is the main entry point.
adamc@555	72
adamc@555	73 To run an SQL-backed application, you will probably want to install the PostgreSQL server. Version 8.3 or higher is required.
adamc@555	74
adamc@555	75 \begin{verbatim}
adamc@555	76 apt-get install postgresql-8.3
adamc@555	77 \end{verbatim}
adamc@555	78
adamc@555	79 To use the Emacs mode, you must have a modern Emacs installed. We assume that you already know how to do this, if you're in the business of looking for an Emacs mode. The demo generation facility of the compiler will also call out to Emacs to syntax-highlight code, and that process depends on the \texttt{htmlize} module, which can be installed in Debian testing via:
adamc@555	80
adamc@555	81 \begin{verbatim}
adamc@555	82 apt-get install emacs-goodies-el
adamc@555	83 \end{verbatim}
adamc@555	84
adamc@555	85 Even with the right packages installed, configuration and building might fail to work. After you run \texttt{./configure}, you will see the values of some named environment variables printed. You may need to adjust these values to get proper installation for your system. To change a value, store your preferred alternative in the corresponding UNIX environment variable, before running \texttt{./configure}. For instance, here is how to change the list of extra arguments that the Ur/Web compiler will pass to GCC on every invocation.
adamc@555	86
adamc@555	87 \begin{verbatim}
adamc@555	88 GCCARGS=-fnested-functions ./configure
adamc@555	89 \end{verbatim}
adamc@555	90
adamc@555	91 Some OSX users have reported needing to use this particular GCCARGS value.
adamc@555	92
adamc@555	93 The Emacs mode can be set to autoload by adding the following to your \texttt{.emacs} file.
adamc@555	94
adamc@555	95 \begin{verbatim}
adamc@555	96 (add-to-list 'load-path "/usr/local/share/emacs/site-lisp/urweb-mode")
adamc@555	97 (load "urweb-mode-startup")
adamc@555	98 \end{verbatim}
adamc@555	99
adamc@555	100 Change the path in the first line if you chose a different Emacs installation path during configuration.
adamc@555	101
adamc@555	102
adamc@556	103 \section{Command-Line Compiler}
adamc@556	104
adamc@556	105 \subsection{Project Files}
adamc@556	106
adamc@556	107 The basic inputs to the \texttt{urweb} compiler are project files, which have the extension \texttt{.urp}. Here is a sample \texttt{.urp} file.
adamc@556	108
adamc@556	109 \begin{verbatim}
adamc@556	110 database dbname=test
adamc@556	111 sql crud1.sql
adamc@556	112
adamc@556	113 crud
adamc@556	114 crud1
adamc@556	115 \end{verbatim}
adamc@556	116
adamc@556	117 The \texttt{database} line gives the database information string to pass to libpq. In this case, the string only says to connect to a local database named \texttt{test}.
adamc@556	118
adamc@556	119 The \texttt{sql} line asks for an SQL source file to be generated, giving the commands to run to create the tables and sequences that this application expects to find. After building this \texttt{.urp} file, the following commands could be used to initialize the database, assuming that the current UNIX user exists as a Postgres user with database creation privileges:
adamc@556	120
adamc@556	121 \begin{verbatim}
adamc@556	122 createdb test
adamc@556	123 psql -f crud1.sql test
adamc@556	124 \end{verbatim}
adamc@556	125
adamc@556	126 A blank line always separates the named directives from a list of modules to include in the project; if there are no named directives, a blank line must begin the file.
adamc@556	127
adamc@556	128 For each entry \texttt{M} in the module list, the file \texttt{M.urs} is included in the project if it exists, and the file \texttt{M.ur} must exist and is always included.
adamc@556	129
adamc@783	130 Here is the complete list of directive forms. ``FFI'' stands for ``foreign function interface,'' Ur's facility for interaction between Ur programs and C and JavaScript libraries.
adamc@783	131 \begin{itemize}
adamc@783	132 \item \texttt{[allow\|deny] [url\|mime] PATTERN} registers a rule governing which URLs or MIME types are allowed in this application. The first such rule to match a URL or MIME type determines the verdict. If \texttt{PATTERN} ends in \texttt{*}, it is interpreted as a prefix rule. Otherwise, a string must match it exactly.
adamc@783	133 \item \texttt{clientOnly Module.ident} registers an FFI function or transaction that may only be run in client browsers.
adamc@783	134 \item \texttt{clientToServer Module.ident} adds FFI type \texttt{Module.ident} to the list of types that are OK to marshal from clients to servers. Values like XML trees and SQL queries are hard to marshal without introducing expensive validity checks, so it's easier to ensure that the server never trusts clients to send such values. The file \texttt{include/urweb.h} shows examples of the C support functions that are required of any type that may be marshalled. These include \texttt{attrify}, \texttt{urlify}, and \texttt{unurlify} functions.
adamc@783	135 \item \texttt{database DBSTRING} sets the string to pass to libpq to open a database connection.
adamc@783	136 \item \texttt{debug} saves some intermediate C files, which is mostly useful to help in debugging the compiler itself.
adamc@783	137 \item \texttt{effectful Module.ident} registers an FFI function or transaction as having side effects. The optimizer avoids removing, moving, or duplicating calls to such functions. Every effectful FFI function must be registered, or the optimizer may make invalid transformations.
adamc@783	138 \item \texttt{exe FILENAME} sets the filename to which to write the output executable. The default for file \texttt{P.urp} is \texttt{P.exe}.
adamc@783	139 \item \texttt{ffi FILENAME} reads the file \texttt{FILENAME.urs} to determine the interface to a new FFI module. The name of the module is calculated from \texttt{FILENAME} in the same way as for normal source files. See the files \texttt{include/urweb.h} and \texttt{src/c/urweb.c} for examples of C headers and implementations for FFI modules. In general, every type or value \texttt{Module.ident} becomes \texttt{uw\_Module\_ident} in C.
adamc@873	140 \item \texttt{header FILENAME} adds \texttt{FILENAME} to the list of files to be \texttt{\#include}d in C sources. This is most useful for interfacing with new FFI modules.
adamc@783	141 \item \texttt{jsFunc Module.ident=name} gives the JavaScript name of an FFI value.
adamc@783	142 \item \texttt{library FILENAME} parses \texttt{FILENAME.urp} and merges its contents with the rest of the current file's contents.
adamc@783	143 \item \texttt{link FILENAME} adds \texttt{FILENAME} to the list of files to be passed to the GCC linker at the end of compilation. This is most useful for importing extra libraries needed by new FFI modules.
adamc@852	144 \item \texttt{path NAME=VALUE} creates a mapping from \texttt{NAME} to \texttt{VALUE}. This mapping may be used at the beginnings of filesystem paths given to various other configuration directives. A path like \texttt{\$NAME/rest} is expanded to \texttt{VALUE/rest}. There is an initial mapping from the empty name (for paths like \texttt{\$/list}) to the directory where the Ur/Web standard library is installed. If you accept the default \texttt{configure} options, this directory is \texttt{/usr/local/lib/urweb/ur}.
adamc@783	145 \item \texttt{prefix PREFIX} sets the prefix included before every URI within the generated application. The default is \texttt{/}.
adamc@783	146 \item \texttt{profile} generates an executable that may be used with gprof.
adamc@783	147 \item \texttt{rewrite KIND FROM TO} gives a rule for rewriting canonical module paths. For instance, the canonical path of a page may be \texttt{Mod1.Mod2.mypage}, while you would rather the page were accessed via a URL containing only \texttt{page}. The directive \texttt{rewrite url Mod1/Mod2/mypage page} would accomplish that. The possible values of \texttt{KIND} determine which kinds of objects are affected. The kind \texttt{all} matches any object, and \texttt{url} matches page URLs. The kinds \texttt{table}, \texttt{sequence}, and \texttt{view} match those sorts of SQL entities, and \texttt{relation} matches any of those three. \texttt{cookie} matches HTTP cookies, and \texttt{style} matches CSS class names. If \texttt{FROM} ends in \texttt{/*}, it is interpreted as a prefix matching rule, and rewriting occurs by replacing only the appropriate prefix of a path with \texttt{TO}. While the actual external names of relations and styles have parts separated by underscores instead of slashes, all rewrite rules must be written in terms of slashes.
adamc@783	148 \item \texttt{script URL} adds \texttt{URL} to the list of extra JavaScript files to be included at the beginning of any page that uses JavaScript. This is most useful for importing JavaScript versions of functions found in new FFI modules.
adamc@783	149 \item \texttt{serverOnly Module.ident} registers an FFI function or transaction that may only be run on the server.
adamc@783	150 \item \texttt{sql FILENAME} sets where to write an SQL file with the commands to create the expected database schema. The default is not to create such a file.
adamc@783	151 \item \texttt{timeout N} sets to \texttt{N} seconds the amount of time that the generated server will wait after the last contact from a client before determining that that client has exited the application. Clients that remain active will take the timeout setting into account in determining how often to ping the server, so it only makes sense to set a high timeout to cope with browser and network delays and failures. Higher timeouts can lead to more unnecessary client information taking up memory on the server. The timeout goes unused by any page that doesn't involve the \texttt{recv} function, since the server only needs to store per-client information for clients that receive asynchronous messages.
adamc@783	152 \end{itemize}
adamc@701	153
adamc@701	154
adamc@557	155 \subsection{Building an Application}
adamc@557	156
adamc@557	157 To compile project \texttt{P.urp}, simply run
adamc@557	158 \begin{verbatim}
adamc@557	159 urweb P
adamc@557	160 \end{verbatim}
adamc@558	161 The output executable is a standalone web server. Run it with the command-line argument \texttt{-h} to see which options it takes. If the project file lists a database, the web server will attempt to connect to that database on startup.
adamc@557	162
adamc@557	163 To time how long the different compiler phases run, without generating an executable, run
adamc@557	164 \begin{verbatim}
adamc@557	165 urweb -timing P
adamc@557	166 \end{verbatim}
adamc@557	167
adamc@556	168
adamc@529	169 \section{Ur Syntax}
adamc@529	170
adamc@784	171 In this section, we describe the syntax of Ur, deferring to a later section discussion of most of the syntax specific to SQL and XML. The sole exceptions are the declaration forms for relations, cookies, and styles.
adamc@524	172
adamc@524	173 \subsection{Lexical Conventions}
adamc@524	174
adamc@524	175 We give the Ur language definition in \LaTeX $\;$ math mode, since that is prettier than monospaced ASCII. The corresponding ASCII syntax can be read off directly. Here is the key for mapping math symbols to ASCII character sequences.
adamc@524	176
adamc@524	177 \begin{center}
adamc@524	178 \begin{tabular}{rl}
adamc@524	179 \textbf{\LaTeX} & \textbf{ASCII} \\
adamc@524	180 $\to$ & \cd{->} \\
adamc@652	181 $\longrightarrow$ & \cd{-->} \\
adamc@524	182 $\times$ & \cd{*} \\
adamc@524	183 $\lambda$ & \cd{fn} \\
adamc@524	184 $\Rightarrow$ & \cd{=>} \\
adamc@652	185 $\Longrightarrow$ & \cd{==>} \\
adamc@529	186 $\neq$ & \cd{<>} \\
adamc@529	187 $\leq$ & \cd{<=} \\
adamc@529	188 $\geq$ & \cd{>=} \\
adamc@524	189 \\
adamc@524	190 $x$ & Normal textual identifier, not beginning with an uppercase letter \\
adamc@525	191 $X$ & Normal textual identifier, beginning with an uppercase letter \\
adamc@524	192 \end{tabular}
adamc@524	193 \end{center}
adamc@524	194
adamc@525	195 We often write syntax like $e^$ to indicate zero or more copies of $e$, $e^+$ to indicate one or more copies, and $e,^$ and $e,^+$ to indicate multiple copies separated by commas. Another separator may be used in place of a comma. The $e$ term may be surrounded by parentheses to indicate grouping; those parentheses should not be included in the actual ASCII.
adamc@524	196
adamc@873	197 We write $\ell$ for literals of the primitive types, for the most part following C conventions. There are $\mt{int}$, $\mt{float}$, $\mt{char}$, and $\mt{string}$ literals. Character literals follow the SML convention instead of the C convention, written like \texttt{\#"a"} instead of \texttt{'a'}.
adamc@526	198
adamc@527	199 This version of the manual doesn't include operator precedences; see \texttt{src/urweb.grm} for that.
adamc@527	200
adamc@552	201 \subsection{\label{core}Core Syntax}
adamc@524	202
adamc@524	203 \emph{Kinds} classify types and other compile-time-only entities. Each kind in the grammar is listed with a description of the sort of data it classifies.
adamc@524	204 $$\begin{array}{rrcll}
adamc@524	205 \textrm{Kinds} & \kappa &::=& \mt{Type} & \textrm{proper types} \\
adamc@525	206 &&& \mt{Unit} & \textrm{the trivial constructor} \\
adamc@525	207 &&& \mt{Name} & \textrm{field names} \\
adamc@525	208 &&& \kappa \to \kappa & \textrm{type-level functions} \\
adamc@525	209 &&& \{\kappa\} & \textrm{type-level records} \\
adamc@525	210 &&& (\kappa\times^+) & \textrm{type-level tuples} \\
adamc@652	211 &&& X & \textrm{variable} \\
adamc@652	212 &&& X \longrightarrow k & \textrm{kind-polymorphic type-level function} \\
adamc@529	213 &&& \_\_ & \textrm{wildcard} \\
adamc@525	214 &&& (\kappa) & \textrm{explicit precedence} \\
adamc@524	215 \end{array}$$
adamc@524	216
adamc@524	217 Ur supports several different notions of functions that take types as arguments. These arguments can be either implicit, causing them to be inferred at use sites; or explicit, forcing them to be specified manually at use sites. There is a common explicitness annotation convention applied at the definitions of and in the types of such functions.
adamc@524	218 $$\begin{array}{rrcll}
adamc@524	219 \textrm{Explicitness} & ? &::=& :: & \textrm{explicit} \\
adamc@558	220 &&& ::: & \textrm{implicit}
adamc@524	221 \end{array}$$
adamc@524	222
adamc@524	223 \emph{Constructors} are the main class of compile-time-only data. They include proper types and are classified by kinds.
adamc@524	224 $$\begin{array}{rrcll}
adamc@524	225 \textrm{Constructors} & c, \tau &::=& (c) :: \kappa & \textrm{kind annotation} \\
adamc@530	226 &&& \hat{x} & \textrm{constructor variable} \\
adamc@524	227 \\
adamc@525	228 &&& \tau \to \tau & \textrm{function type} \\
adamc@525	229 &&& x \; ? \; \kappa \to \tau & \textrm{polymorphic function type} \\
adamc@652	230 &&& X \longrightarrow \tau & \textrm{kind-polymorphic function type} \\
adamc@525	231 &&& \$ c & \textrm{record type} \\
adamc@524	232 \\
adamc@525	233 &&& c \; c & \textrm{type-level function application} \\
adamc@530	234 &&& \lambda x \; :: \; \kappa \Rightarrow c & \textrm{type-level function abstraction} \\
adamc@524	235 \\
adamc@652	236 &&& X \Longrightarrow c & \textrm{type-level kind-polymorphic function abstraction} \\
adamc@655	237 &&& c [\kappa] & \textrm{type-level kind-polymorphic function application} \\
adamc@652	238 \\
adamc@525	239 &&& () & \textrm{type-level unit} \\
adamc@525	240 &&& \#X & \textrm{field name} \\
adamc@524	241 \\
adamc@525	242 &&& [(c = c)^*] & \textrm{known-length type-level record} \\
adamc@525	243 &&& c \rc c & \textrm{type-level record concatenation} \\
adamc@652	244 &&& \mt{map} & \textrm{type-level record map} \\
adamc@524	245 \\
adamc@558	246 &&& (c,^+) & \textrm{type-level tuple} \\
adamc@525	247 &&& c.n & \textrm{type-level tuple projection ($n \in \mathbb N^+$)} \\
adamc@524	248 \\
adamc@652	249 &&& [c \sim c] \Rightarrow \tau & \textrm{guarded type} \\
adamc@524	250 \\
adamc@529	251 &&& \_ :: \kappa & \textrm{wildcard} \\
adamc@525	252 &&& (c) & \textrm{explicit precedence} \\
adamc@530	253 \\
adamc@530	254 \textrm{Qualified uncapitalized variables} & \hat{x} &::=& x & \textrm{not from a module} \\
adamc@530	255 &&& M.x & \textrm{projection from a module} \\
adamc@525	256 \end{array}$$
adamc@525	257
adamc@655	258 We include both abstraction and application for kind polymorphism, but applications are only inferred internally; they may not be written explicitly in source programs.
adamc@655	259
adamc@525	260 Modules of the module system are described by \emph{signatures}.
adamc@525	261 $$\begin{array}{rrcll}
adamc@525	262 \textrm{Signatures} & S &::=& \mt{sig} \; s^* \; \mt{end} & \textrm{constant} \\
adamc@525	263 &&& X & \textrm{variable} \\
adamc@525	264 &&& \mt{functor}(X : S) : S & \textrm{functor} \\
adamc@529	265 &&& S \; \mt{where} \; \mt{con} \; x = c & \textrm{concretizing an abstract constructor} \\
adamc@525	266 &&& M.X & \textrm{projection from a module} \\
adamc@525	267 \\
adamc@525	268 \textrm{Signature items} & s &::=& \mt{con} \; x :: \kappa & \textrm{abstract constructor} \\
adamc@525	269 &&& \mt{con} \; x :: \kappa = c & \textrm{concrete constructor} \\
adamc@528	270 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529	271 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@525	272 &&& \mt{val} \; x : \tau & \textrm{value} \\
adamc@525	273 &&& \mt{structure} \; X : S & \textrm{sub-module} \\
adamc@525	274 &&& \mt{signature} \; X = S & \textrm{sub-signature} \\
adamc@525	275 &&& \mt{include} \; S & \textrm{signature inclusion} \\
adamc@525	276 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@654	277 &&& \mt{class} \; x :: \kappa & \textrm{abstract constructor class} \\
adamc@654	278 &&& \mt{class} \; x :: \kappa = c & \textrm{concrete constructor class} \\
adamc@525	279 \\
adamc@525	280 \textrm{Datatype constructors} & dc &::=& X & \textrm{nullary constructor} \\
adamc@525	281 &&& X \; \mt{of} \; \tau & \textrm{unary constructor} \\
adamc@524	282 \end{array}$$
adamc@524	283
adamc@526	284 \emph{Patterns} are used to describe structural conditions on expressions, such that expressions may be tested against patterns, generating assignments to pattern variables if successful.
adamc@526	285 $$\begin{array}{rrcll}
adamc@526	286 \textrm{Patterns} & p &::=& \_ & \textrm{wildcard} \\
adamc@526	287 &&& x & \textrm{variable} \\
adamc@526	288 &&& \ell & \textrm{constant} \\
adamc@526	289 &&& \hat{X} & \textrm{nullary constructor} \\
adamc@526	290 &&& \hat{X} \; p & \textrm{unary constructor} \\
adamc@526	291 &&& \{(x = p,)^*\} & \textrm{rigid record pattern} \\
adamc@526	292 &&& \{(x = p,)^+, \ldots\} & \textrm{flexible record pattern} \\
adamc@852	293 &&& p : \tau & \textrm{type annotation} \\
adamc@527	294 &&& (p) & \textrm{explicit precedence} \\
adamc@526	295 \\
adamc@529	296 \textrm{Qualified capitalized variables} & \hat{X} &::=& X & \textrm{not from a module} \\
adamc@526	297 &&& M.X & \textrm{projection from a module} \\
adamc@526	298 \end{array}$$
adamc@526	299
adamc@527	300 \emph{Expressions} are the main run-time entities, corresponding to both ``expressions'' and ``statements'' in mainstream imperative languages.
adamc@527	301 $$\begin{array}{rrcll}
adamc@527	302 \textrm{Expressions} & e &::=& e : \tau & \textrm{type annotation} \\
adamc@529	303 &&& \hat{x} & \textrm{variable} \\
adamc@529	304 &&& \hat{X} & \textrm{datatype constructor} \\
adamc@527	305 &&& \ell & \textrm{constant} \\
adamc@527	306 \\
adamc@527	307 &&& e \; e & \textrm{function application} \\
adamc@527	308 &&& \lambda x : \tau \Rightarrow e & \textrm{function abstraction} \\
adamc@527	309 &&& e [c] & \textrm{polymorphic function application} \\
adamc@852	310 &&& \lambda [x \; ? \; \kappa] \Rightarrow e & \textrm{polymorphic function abstraction} \\
adamc@655	311 &&& e [\kappa] & \textrm{kind-polymorphic function application} \\
adamc@652	312 &&& X \Longrightarrow e & \textrm{kind-polymorphic function abstraction} \\
adamc@527	313 \\
adamc@527	314 &&& \{(c = e,)^*\} & \textrm{known-length record} \\
adamc@527	315 &&& e.c & \textrm{record field projection} \\
adamc@527	316 &&& e \rc e & \textrm{record concatenation} \\
adamc@527	317 &&& e \rcut c & \textrm{removal of a single record field} \\
adamc@527	318 &&& e \rcutM c & \textrm{removal of multiple record fields} \\
adamc@527	319 \\
adamc@527	320 &&& \mt{let} \; ed^* \; \mt{in} \; e \; \mt{end} & \textrm{local definitions} \\
adamc@527	321 \\
adamc@527	322 &&& \mt{case} \; e \; \mt{of} \; (p \Rightarrow e\|)^+ & \textrm{pattern matching} \\
adamc@527	323 \\
adamc@654	324 &&& \lambda [c \sim c] \Rightarrow e & \textrm{guarded expression abstraction} \\
adamc@654	325 &&& e \; ! & \textrm{guarded expression application} \\
adamc@527	326 \\
adamc@527	327 &&& \_ & \textrm{wildcard} \\
adamc@527	328 &&& (e) & \textrm{explicit precedence} \\
adamc@527	329 \\
adamc@527	330 \textrm{Local declarations} & ed &::=& \cd{val} \; x : \tau = e & \textrm{non-recursive value} \\
adamc@527	331 &&& \cd{val} \; \cd{rec} \; (x : \tau = e \; \cd{and})^+ & \textrm{mutually-recursive values} \\
adamc@527	332 \end{array}$$
adamc@527	333
adamc@655	334 As with constructors, we include both abstraction and application for kind polymorphism, but applications are only inferred internally.
adamc@655	335
adamc@528	336 \emph{Declarations} primarily bring new symbols into context.
adamc@528	337 $$\begin{array}{rrcll}
adamc@528	338 \textrm{Declarations} & d &::=& \mt{con} \; x :: \kappa = c & \textrm{constructor synonym} \\
adamc@528	339 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529	340 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@528	341 &&& \mt{val} \; x : \tau = e & \textrm{value} \\
adamc@528	342 &&& \mt{val} \; \cd{rec} \; (x : \tau = e \; \mt{and})^+ & \textrm{mutually-recursive values} \\
adamc@528	343 &&& \mt{structure} \; X : S = M & \textrm{module definition} \\
adamc@528	344 &&& \mt{signature} \; X = S & \textrm{signature definition} \\
adamc@528	345 &&& \mt{open} \; M & \textrm{module inclusion} \\
adamc@528	346 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@528	347 &&& \mt{open} \; \mt{constraints} \; M & \textrm{inclusion of just the constraints from a module} \\
adamc@528	348 &&& \mt{table} \; x : c & \textrm{SQL table} \\
adamc@784	349 &&& \mt{view} \; x : c & \textrm{SQL view} \\
adamc@528	350 &&& \mt{sequence} \; x & \textrm{SQL sequence} \\
adamc@535	351 &&& \mt{cookie} \; x : \tau & \textrm{HTTP cookie} \\
adamc@784	352 &&& \mt{style} \; x : \tau & \textrm{CSS class} \\
adamc@654	353 &&& \mt{class} \; x :: \kappa = c & \textrm{concrete constructor class} \\
adamc@528	354 \\
adamc@529	355 \textrm{Modules} & M &::=& \mt{struct} \; d^* \; \mt{end} & \textrm{constant} \\
adamc@529	356 &&& X & \textrm{variable} \\
adamc@529	357 &&& M.X & \textrm{projection} \\
adamc@529	358 &&& M(M) & \textrm{functor application} \\
adamc@529	359 &&& \mt{functor}(X : S) : S = M & \textrm{functor abstraction} \\
adamc@528	360 \end{array}$$
adamc@528	361
adamc@528	362 There are two kinds of Ur files. A file named $M\texttt{.ur}$ is an \emph{implementation file}, and it should contain a sequence of declarations $d^$. A file named $M\texttt{.urs}$ is an \emph{interface file}; it must always have a matching $M\texttt{.ur}$ and should contain a sequence of signature items $s^$. When both files are present, the overall effect is the same as a monolithic declaration $\mt{structure} \; M : \mt{sig} \; s^* \; \mt{end} = \mt{struct} \; d^* \; \mt{end}$. When no interface file is included, the overall effect is similar, with a signature for module $M$ being inferred rather than just checked against an interface.
adamc@527	363
adamc@784	364 We omit some extra possibilities in $\mt{table}$ syntax, deferring them to Section \ref{tables}.
adamc@784	365
adamc@529	366 \subsection{Shorthands}
adamc@529	367
adamc@529	368 There are a variety of derived syntactic forms that elaborate into the core syntax from the last subsection. We will present the additional forms roughly following the order in which we presented the constructs that they elaborate into.
adamc@529	369
adamc@529	370 In many contexts where record fields are expected, like in a projection $e.c$, a constant field may be written as simply $X$, rather than $\#X$.
adamc@529	371
adamc@529	372 A record type may be written $\{(c = c,)^\}$, which elaborates to $\$[(c = c,)^]$.
adamc@529	373
adamc@533	374 The notation $[c_1, \ldots, c_n]$ is shorthand for $[c_1 = (), \ldots, c_n = ()]$.
adamc@533	375
adamc@529	376 A tuple type $(\tau_1, \ldots, \tau_n)$ expands to a record type $\{1 = \tau_1, \ldots, n = \tau_n\}$, with natural numbers as field names. A tuple pattern $(p_1, \ldots, p_n)$ expands to a rigid record pattern $\{1 = p_1, \ldots, n = p_n\}$. Positive natural numbers may be used in most places where field names would be allowed.
adamc@529	377
adamc@852	378 In general, several adjacent $\lambda$ forms may be combined into one, and kind and type annotations may be omitted, in which case they are implicitly included as wildcards. More formally, for constructor-level abstractions, we can define a new non-terminal $b ::= x \mid (x :: \kappa) \mid X$ and allow composite abstractions of the form $\lambda b^+ \Rightarrow c$, elaborating into the obvious sequence of one core $\lambda$ per element of $b^+$.
adamc@529	379
adamc@529	380 For any signature item or declaration that defines some entity to be equal to $A$ with classification annotation $B$ (e.g., $\mt{val} \; x : B = A$), $B$ and the preceding colon (or similar punctuation) may be omitted, in which case it is filled in as a wildcard.
adamc@529	381
adamc@529	382 A signature item or declaration $\mt{type} \; x$ or $\mt{type} \; x = \tau$ is elaborated into $\mt{con} \; x :: \mt{Type}$ or $\mt{con} \; x :: \mt{Type} = \tau$, respectively.
adamc@529	383
adamc@654	384 A signature item or declaration $\mt{class} \; x = \lambda y \Rightarrow c$ may be abbreviated $\mt{class} \; x \; y = c$.
adamc@529	385
adamc@654	386 Handling of implicit and explicit constructor arguments may be tweaked with some prefixes to variable references. An expression $@x$ is a version of $x$ where all implicit constructor arguments have been made explicit. An expression $@@x$ achieves the same effect, additionally halting automatic resolution of type class instances and automatic proving of disjointness constraints. The default is that any prefix of a variable's type consisting only of implicit polymorphism, type class instances, and disjointness obligations is resolved automatically, with the variable treated as having the type that starts after the last implicit element, with suitable unification variables substituted. The same syntax works for variables projected out of modules and for capitalized variables (datatype constructors).
adamc@529	387
adamc@852	388 At the expression level, an analogue is available of the composite $\lambda$ form for constructors. We define the language of binders as $b ::= p \mid [x] \mid [x \; ? \; \kappa] \mid X \mid [c \sim c]$. A lone variable $[x]$ stands for an implicit constructor variable of unspecified kind. The standard value-level function binder is recovered as the type-annotated pattern form $x : \tau$. It is a compile-time error to include a pattern $p$ that does not match every value of the appropriate type.
adamc@529	389
adamc@852	390 A local $\mt{val}$ declaration may bind a pattern instead of just a plain variable. As for function arguments, only irrefutable patterns are legal.
adamc@852	391
adamc@852	392 The keyword $\mt{fun}$ is a shorthand for $\mt{val} \; \mt{rec}$ that allows arguments to be specified before the equal sign in the definition of each mutually-recursive function, as in SML. Each curried argument must follow the grammar of the $b$ non-terminal introduced two paragraphs ago. A $\mt{fun}$ declaration is elaborated into a version that adds additional $\lambda$s to the fronts of the righthand sides, as appropriate.
adamc@529	393
adamc@529	394 A signature item $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2$. A declaration $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2 = M$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2 = \mt{functor}(X_2 : S_1) : S_2 = M$.
adamc@529	395
adamc@852	396 An $\mt{open} \; \mt{constraints}$ declaration is implicitly inserted for the argument of every functor at the beginning of the functor body. For every declaration of the form $\mt{structure} \; X : S = \mt{struct} \ldots \mt{end}$, an $\mt{open} \; \mt{constraints} \; X$ declaration is implicitly inserted immediately afterward.
adamc@852	397
adamc@853	398 A declaration $\mt{table} \; x : \{(c = c,)^\}$ is elaborated into $\mt{table} \; x : [(c = c,)^]$.
adamc@529	399
adamc@529	400 The syntax $\mt{where} \; \mt{type}$ is an alternate form of $\mt{where} \; \mt{con}$.
adamc@529	401
adamc@529	402 The syntax $\mt{if} \; e \; \mt{then} \; e_1 \; \mt{else} \; e_2$ expands to $\mt{case} \; e \; \mt{of} \; \mt{Basis}.\mt{True} \Rightarrow e_1 \mid \mt{Basis}.\mt{False} \Rightarrow e_2$.
adamc@529	403
adamc@529	404 There are infix operator syntaxes for a number of functions defined in the $\mt{Basis}$ module. There is $=$ for $\mt{eq}$, $\neq$ for $\mt{neq}$, $-$ for $\mt{neg}$ (as a prefix operator) and $\mt{minus}$, $+$ for $\mt{plus}$, $\times$ for $\mt{times}$, $/$ for $\mt{div}$, $\%$ for $\mt{mod}$, $<$ for $\mt{lt}$, $\leq$ for $\mt{le}$, $>$ for $\mt{gt}$, and $\geq$ for $\mt{ge}$.
adamc@529	405
adamc@784	406 A signature item $\mt{table} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_table} \; c \; []$. $\mt{view} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_view} \; c$, $\mt{sequence} \; x$ is short for $\mt{val} \; x : \mt{Basis}.\mt{sql\_sequence}$. $\mt{cookie} \; x : \tau$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{http\_cookie} \; \tau$, and $\mt{style} \; x$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{css\_class}$.
adamc@529	407
adamc@530	408
adamc@530	409 \section{Static Semantics}
adamc@530	410
adamc@530	411 In this section, we give a declarative presentation of Ur's typing rules and related judgments. Inference is the subject of the next section; here, we assume that an oracle has filled in all wildcards with concrete values.
adamc@530	412
adamc@530	413 Since there is significant mutual recursion among the judgments, we introduce them all before beginning to give rules. We use the same variety of contexts throughout this section, implicitly introducing new sorts of context entries as needed.
adamc@530	414 \begin{itemize}
adamc@655	415 \item $\Gamma \vdash \kappa$ expresses kind well-formedness.
adamc@530	416 \item $\Gamma \vdash c :: \kappa$ assigns a kind to a constructor in a context.
adamc@530	417 \item $\Gamma \vdash c \sim c$ proves the disjointness of two record constructors; that is, that they share no field names. We overload the judgment to apply to pairs of field names as well.
adamc@531	418 \item $\Gamma \vdash c \hookrightarrow C$ proves that record constructor $c$ decomposes into set $C$ of field names and record constructors.
adamc@530	419 \item $\Gamma \vdash c \equiv c$ proves the computational equivalence of two constructors. This is often called a \emph{definitional equality} in the world of type theory.
adamc@530	420 \item $\Gamma \vdash e : \tau$ is a standard typing judgment.
adamc@534	421 \item $\Gamma \vdash p \leadsto \Gamma; \tau$ combines typing of patterns with calculation of which new variables they bind.
adamc@537	422 \item $\Gamma \vdash d \leadsto \Gamma$ expresses how a declaration modifies a context. We overload this judgment to apply to sequences of declarations, as well as to signature items and sequences of signature items.
adamc@537	423 \item $\Gamma \vdash S \equiv S$ is the signature equivalence judgment.
adamc@536	424 \item $\Gamma \vdash S \leq S$ is the signature compatibility judgment. We write $\Gamma \vdash S$ as shorthand for $\Gamma \vdash S \leq S$.
adamc@530	425 \item $\Gamma \vdash M : S$ is the module signature checking judgment.
adamc@537	426 \item $\mt{proj}(M, \overline{s}, V)$ is a partial function for projecting a signature item from $\overline{s}$, given the module $M$ that we project from. $V$ may be $\mt{con} \; x$, $\mt{datatype} \; x$, $\mt{val} \; x$, $\mt{signature} \; X$, or $\mt{structure} \; X$. The parameter $M$ is needed because the projected signature item may refer to other items from $\overline{s}$.
adamc@539	427 \item $\mt{selfify}(M, \overline{s})$ adds information to signature items $\overline{s}$ to reflect the fact that we are concerned with the particular module $M$. This function is overloaded to work over individual signature items as well.
adamc@530	428 \end{itemize}
adamc@530	429
adamc@655	430
adamc@655	431 \subsection{Kind Well-Formedness}
adamc@655	432
adamc@655	433 $$\infer{\Gamma \vdash \mt{Type}}{}
adamc@655	434 \quad \infer{\Gamma \vdash \mt{Unit}}{}
adamc@655	435 \quad \infer{\Gamma \vdash \mt{Name}}{}
adamc@655	436 \quad \infer{\Gamma \vdash \kappa_1 \to \kappa_2}{
adamc@655	437 \Gamma \vdash \kappa_1
adamc@655	438 & \Gamma \vdash \kappa_2
adamc@655	439 }
adamc@655	440 \quad \infer{\Gamma \vdash \{\kappa\}}{
adamc@655	441 \Gamma \vdash \kappa
adamc@655	442 }
adamc@655	443 \quad \infer{\Gamma \vdash (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@655	444 \forall i: \Gamma \vdash \kappa_i
adamc@655	445 }$$
adamc@655	446
adamc@655	447 $$\infer{\Gamma \vdash X}{
adamc@655	448 X \in \Gamma
adamc@655	449 }
adamc@655	450 \quad \infer{\Gamma \vdash X \longrightarrow \kappa}{
adamc@655	451 \Gamma, X \vdash \kappa
adamc@655	452 }$$
adamc@655	453
adamc@530	454 \subsection{Kinding}
adamc@530	455
adamc@655	456 We write $[X \mapsto \kappa_1]\kappa_2$ for capture-avoiding substitution of $\kappa_1$ for $X$ in $\kappa_2$.
adamc@655	457
adamc@530	458 $$\infer{\Gamma \vdash (c) :: \kappa :: \kappa}{
adamc@530	459 \Gamma \vdash c :: \kappa
adamc@530	460 }
adamc@530	461 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530	462 x :: \kappa \in \Gamma
adamc@530	463 }
adamc@530	464 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530	465 x :: \kappa = c \in \Gamma
adamc@530	466 }$$
adamc@530	467
adamc@530	468 $$\infer{\Gamma \vdash M.x :: \kappa}{
adamc@537	469 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	470 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = \kappa
adamc@530	471 }
adamc@530	472 \quad \infer{\Gamma \vdash M.x :: \kappa}{
adamc@537	473 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	474 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@530	475 }$$
adamc@530	476
adamc@530	477 $$\infer{\Gamma \vdash \tau_1 \to \tau_2 :: \mt{Type}}{
adamc@530	478 \Gamma \vdash \tau_1 :: \mt{Type}
adamc@530	479 & \Gamma \vdash \tau_2 :: \mt{Type}
adamc@530	480 }
adamc@530	481 \quad \infer{\Gamma \vdash x \; ? \: \kappa \to \tau :: \mt{Type}}{
adamc@530	482 \Gamma, x :: \kappa \vdash \tau :: \mt{Type}
adamc@530	483 }
adamc@655	484 \quad \infer{\Gamma \vdash X \longrightarrow \tau :: \mt{Type}}{
adamc@655	485 \Gamma, X \vdash \tau :: \mt{Type}
adamc@655	486 }
adamc@530	487 \quad \infer{\Gamma \vdash \$c :: \mt{Type}}{
adamc@530	488 \Gamma \vdash c :: \{\mt{Type}\}
adamc@530	489 }$$
adamc@530	490
adamc@530	491 $$\infer{\Gamma \vdash c_1 \; c_2 :: \kappa_2}{
adamc@530	492 \Gamma \vdash c_1 :: \kappa_1 \to \kappa_2
adamc@530	493 & \Gamma \vdash c_2 :: \kappa_1
adamc@530	494 }
adamc@530	495 \quad \infer{\Gamma \vdash \lambda x \; :: \; \kappa_1 \Rightarrow c :: \kappa_1 \to \kappa_2}{
adamc@530	496 \Gamma, x :: \kappa_1 \vdash c :: \kappa_2
adamc@530	497 }$$
adamc@530	498
adamc@655	499 $$\infer{\Gamma \vdash c[\kappa'] :: [X \mapsto \kappa']\kappa}{
adamc@655	500 \Gamma \vdash c :: X \to \kappa
adamc@655	501 & \Gamma \vdash \kappa'
adamc@655	502 }
adamc@655	503 \quad \infer{\Gamma \vdash X \Longrightarrow c :: X \to \kappa}{
adamc@655	504 \Gamma, X \vdash c :: \kappa
adamc@655	505 }$$
adamc@655	506
adamc@530	507 $$\infer{\Gamma \vdash () :: \mt{Unit}}{}
adamc@530	508 \quad \infer{\Gamma \vdash \#X :: \mt{Name}}{}$$
adamc@530	509
adamc@530	510 $$\infer{\Gamma \vdash [\overline{c_i = c'_i}] :: \{\kappa\}}{
adamc@530	511 \forall i: \Gamma \vdash c_i : \mt{Name}
adamc@530	512 & \Gamma \vdash c'_i :: \kappa
adamc@530	513 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@530	514 }
adamc@530	515 \quad \infer{\Gamma \vdash c_1 \rc c_2 :: \{\kappa\}}{
adamc@530	516 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530	517 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@530	518 & \Gamma \vdash c_1 \sim c_2
adamc@530	519 }$$
adamc@530	520
adamc@655	521 $$\infer{\Gamma \vdash \mt{map} :: (\kappa_1 \to \kappa_2) \to \{\kappa_1\} \to \{\kappa_2\}}{}$$
adamc@530	522
adamc@573	523 $$\infer{\Gamma \vdash (\overline c) :: (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@573	524 \forall i: \Gamma \vdash c_i :: \kappa_i
adamc@530	525 }
adamc@573	526 \quad \infer{\Gamma \vdash c.i :: \kappa_i}{
adamc@573	527 \Gamma \vdash c :: (\kappa_1 \times \ldots \times \kappa_n)
adamc@530	528 }$$
adamc@530	529
adamc@655	530 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow \tau :: \mt{Type}}{
adamc@655	531 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530	532 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@655	533 & \Gamma, c_1 \sim c_2 \vdash \tau :: \mt{Type}
adamc@530	534 }$$
adamc@530	535
adamc@531	536 \subsection{Record Disjointness}
adamc@531	537
adamc@531	538 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@558	539 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@558	540 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@558	541 & \forall c'_1 \in C_1, c'_2 \in C_2: \Gamma \vdash c'_1 \sim c'_2
adamc@531	542 }
adamc@531	543 \quad \infer{\Gamma \vdash X \sim X'}{
adamc@531	544 X \neq X'
adamc@531	545 }$$
adamc@531	546
adamc@531	547 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531	548 c'_1 \sim c'_2 \in \Gamma
adamc@558	549 & \Gamma \vdash c'_1 \hookrightarrow C_1
adamc@558	550 & \Gamma \vdash c'_2 \hookrightarrow C_2
adamc@558	551 & c_1 \in C_1
adamc@558	552 & c_2 \in C_2
adamc@531	553 }$$
adamc@531	554
adamc@531	555 $$\infer{\Gamma \vdash c \hookrightarrow \{c\}}{}
adamc@531	556 \quad \infer{\Gamma \vdash [\overline{c = c'}] \hookrightarrow \{\overline{c}\}}{}
adamc@531	557 \quad \infer{\Gamma \vdash c_1 \rc c_2 \hookrightarrow C_1 \cup C_2}{
adamc@531	558 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@531	559 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@531	560 }
adamc@531	561 \quad \infer{\Gamma \vdash c \hookrightarrow C}{
adamc@531	562 \Gamma \vdash c \equiv c'
adamc@531	563 & \Gamma \vdash c' \hookrightarrow C
adamc@531	564 }
adamc@531	565 \quad \infer{\Gamma \vdash \mt{map} \; f \; c \hookrightarrow C}{
adamc@531	566 \Gamma \vdash c \hookrightarrow C
adamc@531	567 }$$
adamc@531	568
adamc@541	569 \subsection{\label{definitional}Definitional Equality}
adamc@532	570
adamc@655	571 We use $\mathcal C$ to stand for a one-hole context that, when filled, yields a constructor. The notation $\mathcal C[c]$ plugs $c$ into $\mathcal C$. We omit the standard definition of one-hole contexts. We write $[x \mapsto c_1]c_2$ for capture-avoiding substitution of $c_1$ for $x$ in $c_2$, with analogous notation for substituting a kind in a constructor.
adamc@532	572
adamc@532	573 $$\infer{\Gamma \vdash c \equiv c}{}
adamc@532	574 \quad \infer{\Gamma \vdash c_1 \equiv c_2}{
adamc@532	575 \Gamma \vdash c_2 \equiv c_1
adamc@532	576 }
adamc@532	577 \quad \infer{\Gamma \vdash c_1 \equiv c_3}{
adamc@532	578 \Gamma \vdash c_1 \equiv c_2
adamc@532	579 & \Gamma \vdash c_2 \equiv c_3
adamc@532	580 }
adamc@532	581 \quad \infer{\Gamma \vdash \mathcal C[c_1] \equiv \mathcal C[c_2]}{
adamc@532	582 \Gamma \vdash c_1 \equiv c_2
adamc@532	583 }$$
adamc@532	584
adamc@532	585 $$\infer{\Gamma \vdash x \equiv c}{
adamc@532	586 x :: \kappa = c \in \Gamma
adamc@532	587 }
adamc@532	588 \quad \infer{\Gamma \vdash M.x \equiv c}{
adamc@537	589 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	590 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@532	591 }
adamc@532	592 \quad \infer{\Gamma \vdash (\overline c).i \equiv c_i}{}$$
adamc@532	593
adamc@532	594 $$\infer{\Gamma \vdash (\lambda x :: \kappa \Rightarrow c) \; c' \equiv [x \mapsto c'] c}{}
adamc@655	595 \quad \infer{\Gamma \vdash (X \Longrightarrow c) [\kappa] \equiv [X \mapsto \kappa] c}{}$$
adamc@655	596
adamc@655	597 $$\infer{\Gamma \vdash c_1 \rc c_2 \equiv c_2 \rc c_1}{}
adamc@532	598 \quad \infer{\Gamma \vdash c_1 \rc (c_2 \rc c_3) \equiv (c_1 \rc c_2) \rc c_3}{}$$
adamc@532	599
adamc@532	600 $$\infer{\Gamma \vdash [] \rc c \equiv c}{}
adamc@532	601 \quad \infer{\Gamma \vdash [\overline{c_1 = c'_1}] \rc [\overline{c_2 = c'_2}] \equiv [\overline{c_1 = c'_1}, \overline{c_2 = c'_2}]}{}$$
adamc@532	602
adamc@655	603 $$\infer{\Gamma \vdash \mt{map} \; f \; [] \equiv []}{}
adamc@655	604 \quad \infer{\Gamma \vdash \mt{map} \; f \; ([c_1 = c_2] \rc c) \equiv [c_1 = f \; c_2] \rc \mt{map} \; f \; c}{}$$
adamc@532	605
adamc@532	606 $$\infer{\Gamma \vdash \mt{map} \; (\lambda x \Rightarrow x) \; c \equiv c}{}
adamc@655	607 \quad \infer{\Gamma \vdash \mt{map} \; f \; (\mt{map} \; f' \; c)
adamc@655	608 \equiv \mt{map} \; (\lambda x \Rightarrow f \; (f' \; x)) \; c}{}$$
adamc@532	609
adamc@532	610 $$\infer{\Gamma \vdash \mt{map} \; f \; (c_1 \rc c_2) \equiv \mt{map} \; f \; c_1 \rc \mt{map} \; f \; c_2}{}$$
adamc@531	611
adamc@534	612 \subsection{Expression Typing}
adamc@533	613
adamc@873	614 We assume the existence of a function $T$ assigning types to literal constants. It maps integer constants to $\mt{Basis}.\mt{int}$, float constants to $\mt{Basis}.\mt{float}$, character constants to $\mt{Basis}.\mt{char}$, and string constants to $\mt{Basis}.\mt{string}$.
adamc@533	615
adamc@533	616 We also refer to a function $\mathcal I$, such that $\mathcal I(\tau)$ ``uses an oracle'' to instantiate all constructor function arguments at the beginning of $\tau$ that are marked implicit; i.e., replace $x_1 ::: \kappa_1 \to \ldots \to x_n ::: \kappa_n \to \tau$ with $[x_1 \mapsto c_1]\ldots[x_n \mapsto c_n]\tau$, where the $c_i$s are inferred and $\tau$ does not start like $x ::: \kappa \to \tau'$.
adamc@533	617
adamc@533	618 $$\infer{\Gamma \vdash e : \tau : \tau}{
adamc@533	619 \Gamma \vdash e : \tau
adamc@533	620 }
adamc@533	621 \quad \infer{\Gamma \vdash e : \tau}{
adamc@533	622 \Gamma \vdash e : \tau'
adamc@533	623 & \Gamma \vdash \tau' \equiv \tau
adamc@533	624 }
adamc@533	625 \quad \infer{\Gamma \vdash \ell : T(\ell)}{}$$
adamc@533	626
adamc@533	627 $$\infer{\Gamma \vdash x : \mathcal I(\tau)}{
adamc@533	628 x : \tau \in \Gamma
adamc@533	629 }
adamc@533	630 \quad \infer{\Gamma \vdash M.x : \mathcal I(\tau)}{
adamc@537	631 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	632 & \mt{proj}(M, \overline{s}, \mt{val} \; x) = \tau
adamc@533	633 }
adamc@533	634 \quad \infer{\Gamma \vdash X : \mathcal I(\tau)}{
adamc@533	635 X : \tau \in \Gamma
adamc@533	636 }
adamc@533	637 \quad \infer{\Gamma \vdash M.X : \mathcal I(\tau)}{
adamc@537	638 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	639 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \tau
adamc@533	640 }$$
adamc@533	641
adamc@533	642 $$\infer{\Gamma \vdash e_1 \; e_2 : \tau_2}{
adamc@533	643 \Gamma \vdash e_1 : \tau_1 \to \tau_2
adamc@533	644 & \Gamma \vdash e_2 : \tau_1
adamc@533	645 }
adamc@533	646 \quad \infer{\Gamma \vdash \lambda x : \tau_1 \Rightarrow e : \tau_1 \to \tau_2}{
adamc@533	647 \Gamma, x : \tau_1 \vdash e : \tau_2
adamc@533	648 }$$
adamc@533	649
adamc@533	650 $$\infer{\Gamma \vdash e [c] : [x \mapsto c]\tau}{
adamc@533	651 \Gamma \vdash e : x :: \kappa \to \tau
adamc@533	652 & \Gamma \vdash c :: \kappa
adamc@533	653 }
adamc@852	654 \quad \infer{\Gamma \vdash \lambda [x \; ? \; \kappa] \Rightarrow e : x \; ? \; \kappa \to \tau}{
adamc@533	655 \Gamma, x :: \kappa \vdash e : \tau
adamc@533	656 }$$
adamc@533	657
adamc@655	658 $$\infer{\Gamma \vdash e [\kappa] : [X \mapsto \kappa]\tau}{
adamc@655	659 \Gamma \vdash e : X \longrightarrow \tau
adamc@655	660 & \Gamma \vdash \kappa
adamc@655	661 }
adamc@655	662 \quad \infer{\Gamma \vdash X \Longrightarrow e : X \longrightarrow \tau}{
adamc@655	663 \Gamma, X \vdash e : \tau
adamc@655	664 }$$
adamc@655	665
adamc@533	666 $$\infer{\Gamma \vdash \{\overline{c = e}\} : \{\overline{c : \tau}\}}{
adamc@533	667 \forall i: \Gamma \vdash c_i :: \mt{Name}
adamc@533	668 & \Gamma \vdash e_i : \tau_i
adamc@533	669 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@533	670 }
adamc@533	671 \quad \infer{\Gamma \vdash e.c : \tau}{
adamc@533	672 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533	673 }
adamc@533	674 \quad \infer{\Gamma \vdash e_1 \rc e_2 : \$(c_1 \rc c_2)}{
adamc@533	675 \Gamma \vdash e_1 : \$c_1
adamc@533	676 & \Gamma \vdash e_2 : \$c_2
adamc@573	677 & \Gamma \vdash c_1 \sim c_2
adamc@533	678 }$$
adamc@533	679
adamc@533	680 $$\infer{\Gamma \vdash e \rcut c : \$c'}{
adamc@533	681 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533	682 }
adamc@533	683 \quad \infer{\Gamma \vdash e \rcutM c : \$c'}{
adamc@533	684 \Gamma \vdash e : \$(c \rc c')
adamc@533	685 }$$
adamc@533	686
adamc@533	687 $$\infer{\Gamma \vdash \mt{let} \; \overline{ed} \; \mt{in} \; e \; \mt{end} : \tau}{
adamc@533	688 \Gamma \vdash \overline{ed} \leadsto \Gamma'
adamc@533	689 & \Gamma' \vdash e : \tau
adamc@533	690 }
adamc@533	691 \quad \infer{\Gamma \vdash \mt{case} \; e \; \mt{of} \; \overline{p \Rightarrow e} : \tau}{
adamc@533	692 \forall i: \Gamma \vdash p_i \leadsto \Gamma_i, \tau'
adamc@533	693 & \Gamma_i \vdash e_i : \tau
adamc@533	694 }$$
adamc@533	695
adamc@573	696 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow e : \lambda [c_1 \sim c_2] \Rightarrow \tau}{
adamc@533	697 \Gamma \vdash c_1 :: \{\kappa\}
adamc@655	698 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@533	699 & \Gamma, c_1 \sim c_2 \vdash e : \tau
adamc@662	700 }
adamc@662	701 \quad \infer{\Gamma \vdash e \; ! : \tau}{
adamc@662	702 \Gamma \vdash e : [c_1 \sim c_2] \Rightarrow \tau
adamc@662	703 & \Gamma \vdash c_1 \sim c_2
adamc@533	704 }$$
adamc@533	705
adamc@534	706 \subsection{Pattern Typing}
adamc@534	707
adamc@534	708 $$\infer{\Gamma \vdash \_ \leadsto \Gamma; \tau}{}
adamc@534	709 \quad \infer{\Gamma \vdash x \leadsto \Gamma, x : \tau; \tau}{}
adamc@534	710 \quad \infer{\Gamma \vdash \ell \leadsto \Gamma; T(\ell)}{}$$
adamc@534	711
adamc@534	712 $$\infer{\Gamma \vdash X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534	713 X : \overline{x ::: \mt{Type}} \to \tau \in \Gamma
adamc@534	714 & \textrm{$\tau$ not a function type}
adamc@534	715 }
adamc@534	716 \quad \infer{\Gamma \vdash X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534	717 X : \overline{x ::: \mt{Type}} \to \tau'' \to \tau \in \Gamma
adamc@534	718 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534	719 }$$
adamc@534	720
adamc@534	721 $$\infer{\Gamma \vdash M.X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537	722 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	723 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau
adamc@534	724 & \textrm{$\tau$ not a function type}
adamc@534	725 }$$
adamc@534	726
adamc@534	727 $$\infer{\Gamma \vdash M.X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537	728 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	729 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau'' \to \tau
adamc@534	730 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534	731 }$$
adamc@534	732
adamc@534	733 $$\infer{\Gamma \vdash \{\overline{x = p}\} \leadsto \Gamma_n; \{\overline{x = \tau}\}}{
adamc@534	734 \Gamma_0 = \Gamma
adamc@534	735 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534	736 }
adamc@534	737 \quad \infer{\Gamma \vdash \{\overline{x = p}, \ldots\} \leadsto \Gamma_n; \$([\overline{x = \tau}] \rc c)}{
adamc@534	738 \Gamma_0 = \Gamma
adamc@534	739 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534	740 }$$
adamc@534	741
adamc@852	742 $$\infer{\Gamma \vdash p : \tau \leadsto \Gamma'; \tau}{
adamc@852	743 \Gamma \vdash p \leadsto \Gamma'; \tau'
adamc@852	744 & \Gamma \vdash \tau' \equiv \tau
adamc@852	745 }$$
adamc@852	746
adamc@535	747 \subsection{Declaration Typing}
adamc@535	748
adamc@535	749 We use an auxiliary judgment $\overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'$, expressing the enrichment of $\Gamma$ with the types of the datatype constructors $\overline{dc}$, when they are known to belong to datatype $x$ with type parameters $\overline{y}$.
adamc@535	750
adamc@655	751 This is the first judgment where we deal with constructor classes, for the $\mt{class}$ declaration form. We will omit their special handling in this formal specification. Section \ref{typeclasses} gives an informal description of how constructor classes influence type inference.
adamc@535	752
adamc@558	753 We presuppose the existence of a function $\mathcal O$, where $\mathcal O(M, \overline{s})$ implements the $\mt{open}$ declaration by producing a context with the appropriate entry for each available component of module $M$ with signature items $\overline{s}$. Where possible, $\mathcal O$ uses ``transparent'' entries (e.g., an abstract type $M.x$ is mapped to $x :: \mt{Type} = M.x$), so that the relationship with $M$ is maintained. A related function $\mathcal O_c$ builds a context containing the disjointness constraints found in $\overline s$.
adamc@537	754 We write $\kappa_1^n \to \kappa$ as a shorthand, where $\kappa_1^0 \to \kappa = \kappa$ and $\kappa_1^{n+1} \to \kappa_2 = \kappa_1 \to (\kappa_1^n \to \kappa_2)$. We write $\mt{len}(\overline{y})$ for the length of vector $\overline{y}$ of variables.
adamc@535	755
adamc@535	756 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535	757 \quad \infer{\Gamma \vdash d, \overline{d} \leadsto \Gamma''}{
adamc@535	758 \Gamma \vdash d \leadsto \Gamma'
adamc@535	759 & \Gamma' \vdash \overline{d} \leadsto \Gamma''
adamc@535	760 }$$
adamc@535	761
adamc@535	762 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@535	763 \Gamma \vdash c :: \kappa
adamc@535	764 }
adamc@535	765 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@535	766 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@535	767 }$$
adamc@535	768
adamc@535	769 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537	770 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	771 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@535	772 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@535	773 }$$
adamc@535	774
adamc@535	775 $$\infer{\Gamma \vdash \mt{val} \; x : \tau = e \leadsto \Gamma, x : \tau}{
adamc@535	776 \Gamma \vdash e : \tau
adamc@535	777 }$$
adamc@535	778
adamc@535	779 $$\infer{\Gamma \vdash \mt{val} \; \mt{rec} \; \overline{x : \tau = e} \leadsto \Gamma, \overline{x : \tau}}{
adamc@535	780 \forall i: \Gamma, \overline{x : \tau} \vdash e_i : \tau_i
adamc@535	781 & \textrm{$e_i$ starts with an expression $\lambda$, optionally preceded by constructor and disjointness $\lambda$s}
adamc@535	782 }$$
adamc@535	783
adamc@535	784 $$\infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : S}{
adamc@535	785 \Gamma \vdash M : S
adamc@558	786 & \textrm{ $M$ not a constant or application}
adamc@535	787 }
adamc@558	788 \quad \infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : \mt{selfify}(X, \overline{s})}{
adamc@558	789 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@539	790 }$$
adamc@539	791
adamc@539	792 $$\infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@535	793 \Gamma \vdash S
adamc@535	794 }$$
adamc@535	795
adamc@537	796 $$\infer{\Gamma \vdash \mt{open} \; M \leadsto \Gamma, \mathcal O(M, \overline{s})}{
adamc@537	797 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535	798 }$$
adamc@535	799
adamc@535	800 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma}{
adamc@535	801 \Gamma \vdash c_1 :: \{\kappa\}
adamc@535	802 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@535	803 & \Gamma \vdash c_1 \sim c_2
adamc@535	804 }
adamc@537	805 \quad \infer{\Gamma \vdash \mt{open} \; \mt{constraints} \; M \leadsto \Gamma, \mathcal O_c(M, \overline{s})}{
adamc@537	806 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535	807 }$$
adamc@535	808
adamc@784	809 $$\infer{\Gamma \vdash \mt{table} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_table} \; c \; []}{
adamc@535	810 \Gamma \vdash c :: \{\mt{Type}\}
adamc@535	811 }
adamc@784	812 \quad \infer{\Gamma \vdash \mt{view} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_view} \; c}{
adamc@784	813 \Gamma \vdash c :: \{\mt{Type}\}
adamc@784	814 }$$
adamc@784	815
adamc@784	816 $$\infer{\Gamma \vdash \mt{sequence} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_sequence}}{}$$
adamc@535	817
adamc@535	818 $$\infer{\Gamma \vdash \mt{cookie} \; x : \tau \leadsto \Gamma, x : \mt{Basis}.\mt{http\_cookie} \; \tau}{
adamc@535	819 \Gamma \vdash \tau :: \mt{Type}
adamc@784	820 }
adamc@784	821 \quad \infer{\Gamma \vdash \mt{style} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{css\_class}}{}$$
adamc@535	822
adamc@784	823 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@784	824 \Gamma \vdash c :: \kappa
adamc@535	825 }$$
adamc@535	826
adamc@535	827 $$\infer{\overline{y}; x; \Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535	828 \quad \infer{\overline{y}; x; \Gamma \vdash X \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to x \; \overline{y}}{
adamc@535	829 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535	830 }
adamc@535	831 \quad \infer{\overline{y}; x; \Gamma \vdash X \; \mt{of} \; \tau \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to \tau \to x \; \overline{y}}{
adamc@535	832 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535	833 }$$
adamc@535	834
adamc@537	835 \subsection{Signature Item Typing}
adamc@537	836
adamc@537	837 We appeal to a signature item analogue of the $\mathcal O$ function from the last subsection.
adamc@537	838
adamc@537	839 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@537	840 \quad \infer{\Gamma \vdash s, \overline{s} \leadsto \Gamma''}{
adamc@537	841 \Gamma \vdash s \leadsto \Gamma'
adamc@537	842 & \Gamma' \vdash \overline{s} \leadsto \Gamma''
adamc@537	843 }$$
adamc@537	844
adamc@537	845 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}
adamc@537	846 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@537	847 \Gamma \vdash c :: \kappa
adamc@537	848 }
adamc@537	849 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@537	850 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@537	851 }$$
adamc@537	852
adamc@537	853 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537	854 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	855 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	856 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@537	857 }$$
adamc@537	858
adamc@537	859 $$\infer{\Gamma \vdash \mt{val} \; x : \tau \leadsto \Gamma, x : \tau}{
adamc@537	860 \Gamma \vdash \tau :: \mt{Type}
adamc@537	861 }$$
adamc@537	862
adamc@537	863 $$\infer{\Gamma \vdash \mt{structure} \; X : S \leadsto \Gamma, X : S}{
adamc@537	864 \Gamma \vdash S
adamc@537	865 }
adamc@537	866 \quad \infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@537	867 \Gamma \vdash S
adamc@537	868 }$$
adamc@537	869
adamc@537	870 $$\infer{\Gamma \vdash \mt{include} \; S \leadsto \Gamma, \mathcal O(\overline{s})}{
adamc@537	871 \Gamma \vdash S
adamc@537	872 & \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	873 }$$
adamc@537	874
adamc@537	875 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma, c_1 \sim c_2}{
adamc@537	876 \Gamma \vdash c_1 :: \{\kappa\}
adamc@537	877 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@537	878 }$$
adamc@537	879
adamc@784	880 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@784	881 \Gamma \vdash c :: \kappa
adamc@537	882 }
adamc@784	883 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}$$
adamc@537	884
adamc@536	885 \subsection{Signature Compatibility}
adamc@536	886
adamc@558	887 To simplify the judgments in this section, we assume that all signatures are alpha-varied as necessary to avoid including multiple bindings for the same identifier. This is in addition to the usual alpha-variation of locally-bound variables.
adamc@537	888
adamc@537	889 We rely on a judgment $\Gamma \vdash \overline{s} \leq s'$, which expresses the occurrence in signature items $\overline{s}$ of an item compatible with $s'$. We also use a judgment $\Gamma \vdash \overline{dc} \leq \overline{dc}$, which expresses compatibility of datatype definitions.
adamc@537	890
adamc@536	891 $$\infer{\Gamma \vdash S \equiv S}{}
adamc@536	892 \quad \infer{\Gamma \vdash S_1 \equiv S_2}{
adamc@536	893 \Gamma \vdash S_2 \equiv S_1
adamc@536	894 }
adamc@536	895 \quad \infer{\Gamma \vdash X \equiv S}{
adamc@536	896 X = S \in \Gamma
adamc@536	897 }
adamc@536	898 \quad \infer{\Gamma \vdash M.X \equiv S}{
adamc@537	899 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	900 & \mt{proj}(M, \overline{s}, \mt{signature} \; X) = S
adamc@536	901 }$$
adamc@536	902
adamc@536	903 $$\infer{\Gamma \vdash S \; \mt{where} \; \mt{con} \; x = c \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa = c \; \overline{s_2} \; \mt{end}}{
adamc@536	904 \Gamma \vdash S \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa \; \overline{s_2} \; \mt{end}
adamc@536	905 & \Gamma \vdash c :: \kappa
adamc@537	906 }
adamc@537	907 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s^1} \; \mt{include} \; S \; \overline{s^2} \; \mt{end} \equiv \mt{sig} \; \overline{s^1} \; \overline{s} \; \overline{s^2} \; \mt{end}}{
adamc@537	908 \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@536	909 }$$
adamc@536	910
adamc@536	911 $$\infer{\Gamma \vdash S_1 \leq S_2}{
adamc@536	912 \Gamma \vdash S_1 \equiv S_2
adamc@536	913 }
adamc@536	914 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \mt{end}}{}
adamc@537	915 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; s' \; \overline{s'} \; \mt{end}}{
adamc@537	916 \Gamma \vdash \overline{s} \leq s'
adamc@537	917 & \Gamma \vdash s' \leadsto \Gamma'
adamc@537	918 & \Gamma' \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \overline{s'} \; \mt{end}
adamc@537	919 }$$
adamc@537	920
adamc@537	921 $$\infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537	922 \Gamma \vdash s \leq s'
adamc@537	923 }
adamc@537	924 \quad \infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537	925 \Gamma \vdash s \leadsto \Gamma'
adamc@537	926 & \Gamma' \vdash \overline{s} \leq s'
adamc@536	927 }$$
adamc@536	928
adamc@536	929 $$\infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 \leq \mt{functor} (X : S'_1) : S'_2}{
adamc@536	930 \Gamma \vdash S'_1 \leq S_1
adamc@536	931 & \Gamma, X : S'_1 \vdash S_2 \leq S'_2
adamc@536	932 }$$
adamc@536	933
adamc@537	934 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@537	935 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}
adamc@558	936 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type}}{}$$
adamc@537	937
adamc@537	938 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(y)} \to \mt{Type}}{
adamc@537	939 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	940 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	941 }$$
adamc@537	942
adamc@784	943 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@784	944 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}$$
adamc@537	945
adamc@537	946 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \mt{\kappa} = c_2}{
adamc@537	947 \Gamma \vdash c_1 \equiv c_2
adamc@537	948 }
adamc@784	949 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \kappa = c_2}{
adamc@537	950 \Gamma \vdash c_1 \equiv c_2
adamc@537	951 }$$
adamc@537	952
adamc@537	953 $$\infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537	954 \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537	955 }$$
adamc@537	956
adamc@537	957 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537	958 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	959 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	960 & \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537	961 }$$
adamc@537	962
adamc@537	963 $$\infer{\Gamma \vdash \cdot \leq \cdot}{}
adamc@537	964 \quad \infer{\Gamma \vdash X; \overline{dc} \leq X; \overline{dc'}}{
adamc@537	965 \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537	966 }
adamc@537	967 \quad \infer{\Gamma \vdash X \; \mt{of} \; \tau_1; \overline{dc} \leq X \; \mt{of} \; \tau_2; \overline{dc'}}{
adamc@537	968 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537	969 & \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537	970 }$$
adamc@537	971
adamc@537	972 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x = \mt{datatype} \; M'.z'}{
adamc@537	973 \Gamma \vdash M.z \equiv M'.z'
adamc@537	974 }$$
adamc@537	975
adamc@537	976 $$\infer{\Gamma \vdash \mt{val} \; x : \tau_1 \leq \mt{val} \; x : \tau_2}{
adamc@537	977 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537	978 }
adamc@537	979 \quad \infer{\Gamma \vdash \mt{structure} \; X : S_1 \leq \mt{structure} \; X : S_2}{
adamc@537	980 \Gamma \vdash S_1 \leq S_2
adamc@537	981 }
adamc@537	982 \quad \infer{\Gamma \vdash \mt{signature} \; X = S_1 \leq \mt{signature} \; X = S_2}{
adamc@537	983 \Gamma \vdash S_1 \leq S_2
adamc@537	984 & \Gamma \vdash S_2 \leq S_1
adamc@537	985 }$$
adamc@537	986
adamc@537	987 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leq \mt{constraint} \; c'_1 \sim c'_2}{
adamc@537	988 \Gamma \vdash c_1 \equiv c'_1
adamc@537	989 & \Gamma \vdash c_2 \equiv c'_2
adamc@537	990 }$$
adamc@537	991
adamc@655	992 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa \leq \mt{class} \; x :: \kappa}{}
adamc@655	993 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leq \mt{class} \; x :: \kappa}{}
adamc@655	994 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c_1 \leq \mt{class} \; x :: \kappa = c_2}{
adamc@537	995 \Gamma \vdash c_1 \equiv c_2
adamc@537	996 }$$
adamc@537	997
adamc@538	998 \subsection{Module Typing}
adamc@538	999
adamc@538	1000 We use a helper function $\mt{sigOf}$, which converts declarations and sequences of declarations into their principal signature items and sequences of signature items, respectively.
adamc@538	1001
adamc@538	1002 $$\infer{\Gamma \vdash M : S}{
adamc@538	1003 \Gamma \vdash M : S'
adamc@538	1004 & \Gamma \vdash S' \leq S
adamc@538	1005 }
adamc@538	1006 \quad \infer{\Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \mt{sigOf}(\overline{d}) \; \mt{end}}{
adamc@538	1007 \Gamma \vdash \overline{d} \leadsto \Gamma'
adamc@538	1008 }
adamc@538	1009 \quad \infer{\Gamma \vdash X : S}{
adamc@538	1010 X : S \in \Gamma
adamc@538	1011 }$$
adamc@538	1012
adamc@538	1013 $$\infer{\Gamma \vdash M.X : S}{
adamc@538	1014 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@538	1015 & \mt{proj}(M, \overline{s}, \mt{structure} \; X) = S
adamc@538	1016 }$$
adamc@538	1017
adamc@538	1018 $$\infer{\Gamma \vdash M_1(M_2) : [X \mapsto M_2]S_2}{
adamc@538	1019 \Gamma \vdash M_1 : \mt{functor}(X : S_1) : S_2
adamc@538	1020 & \Gamma \vdash M_2 : S_1
adamc@538	1021 }
adamc@538	1022 \quad \infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 = M : \mt{functor} (X : S_1) : S_2}{
adamc@538	1023 \Gamma \vdash S_1
adamc@538	1024 & \Gamma, X : S_1 \vdash S_2
adamc@538	1025 & \Gamma, X : S_1 \vdash M : S_2
adamc@538	1026 }$$
adamc@538	1027
adamc@538	1028 \begin{eqnarray*}
adamc@538	1029 \mt{sigOf}(\cdot) &=& \cdot \\
adamc@538	1030 \mt{sigOf}(s \; \overline{s'}) &=& \mt{sigOf}(s) \; \mt{sigOf}(\overline{s'}) \\
adamc@538	1031 \\
adamc@538	1032 \mt{sigOf}(\mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@538	1033 \mt{sigOf}(\mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \overline{dc} \\
adamc@538	1034 \mt{sigOf}(\mt{datatype} \; x = \mt{datatype} \; M.z) &=& \mt{datatype} \; x = \mt{datatype} \; M.z \\
adamc@538	1035 \mt{sigOf}(\mt{val} \; x : \tau = e) &=& \mt{val} \; x : \tau \\
adamc@538	1036 \mt{sigOf}(\mt{val} \; \mt{rec} \; \overline{x : \tau = e}) &=& \overline{\mt{val} \; x : \tau} \\
adamc@538	1037 \mt{sigOf}(\mt{structure} \; X : S = M) &=& \mt{structure} \; X : S \\
adamc@538	1038 \mt{sigOf}(\mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@538	1039 \mt{sigOf}(\mt{open} \; M) &=& \mt{include} \; S \textrm{ (where $\Gamma \vdash M : S$)} \\
adamc@538	1040 \mt{sigOf}(\mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@538	1041 \mt{sigOf}(\mt{open} \; \mt{constraints} \; M) &=& \cdot \\
adamc@538	1042 \mt{sigOf}(\mt{table} \; x : c) &=& \mt{table} \; x : c \\
adamc@784	1043 \mt{sigOf}(\mt{view} \; x : c) &=& \mt{view} \; x : c \\
adamc@538	1044 \mt{sigOf}(\mt{sequence} \; x) &=& \mt{sequence} \; x \\
adamc@538	1045 \mt{sigOf}(\mt{cookie} \; x : \tau) &=& \mt{cookie} \; x : \tau \\
adamc@784	1046 \mt{sigOf}(\mt{style} \; x) &=& \mt{style} \; x \\
adamc@655	1047 \mt{sigOf}(\mt{class} \; x :: \kappa = c) &=& \mt{class} \; x :: \kappa = c \\
adamc@538	1048 \end{eqnarray*}
adamc@539	1049 \begin{eqnarray*}
adamc@539	1050 \mt{selfify}(M, \cdot) &=& \cdot \\
adamc@558	1051 \mt{selfify}(M, s \; \overline{s'}) &=& \mt{selfify}(M, s) \; \mt{selfify}(M, \overline{s'}) \\
adamc@539	1052 \\
adamc@539	1053 \mt{selfify}(M, \mt{con} \; x :: \kappa) &=& \mt{con} \; x :: \kappa = M.x \\
adamc@539	1054 \mt{selfify}(M, \mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@539	1055 \mt{selfify}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \mt{datatype} \; M.x \\
adamc@539	1056 \mt{selfify}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z) &=& \mt{datatype} \; x = \mt{datatype} \; M'.z \\
adamc@539	1057 \mt{selfify}(M, \mt{val} \; x : \tau) &=& \mt{val} \; x : \tau \\
adamc@539	1058 \mt{selfify}(M, \mt{structure} \; X : S) &=& \mt{structure} \; X : \mt{selfify}(M.X, \overline{s}) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}$)} \\
adamc@539	1059 \mt{selfify}(M, \mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@539	1060 \mt{selfify}(M, \mt{include} \; S) &=& \mt{include} \; S \\
adamc@539	1061 \mt{selfify}(M, \mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@655	1062 \mt{selfify}(M, \mt{class} \; x :: \kappa) &=& \mt{class} \; x :: \kappa = M.x \\
adamc@655	1063 \mt{selfify}(M, \mt{class} \; x :: \kappa = c) &=& \mt{class} \; x :: \kappa = c \\
adamc@539	1064 \end{eqnarray*}
adamc@539	1065
adamc@540	1066 \subsection{Module Projection}
adamc@540	1067
adamc@540	1068 \begin{eqnarray*}
adamc@540	1069 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \\
adamc@540	1070 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa, c) \\
adamc@540	1071 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{con} \; x) &=& \mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type} \\
adamc@540	1072 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& (\mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type}, M'.z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540	1073 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})$)} \\
adamc@655	1074 \mt{proj}(M, \mt{class} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \to \mt{Type} \\
adamc@655	1075 \mt{proj}(M, \mt{class} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa \to \mt{Type}, c) \\
adamc@540	1076 \\
adamc@540	1077 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{datatype} \; x) &=& (\overline{y}, \overline{dc}) \\
adamc@540	1078 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& \mt{proj}(M', \overline{s'}, \mt{datatype} \; z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540	1079 \\
adamc@540	1080 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, \mt{val} \; x) &=& \tau \\
adamc@540	1081 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $X \in \overline{dc}$)} \\
adamc@540	1082 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540	1083 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540	1084 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \in \overline{dc}$)} \\
adamc@540	1085 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@558	1086 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540	1087 \\
adamc@540	1088 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, \mt{structure} \; X) &=& S \\
adamc@540	1089 \\
adamc@540	1090 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, \mt{signature} \; X) &=& S \\
adamc@540	1091 \\
adamc@540	1092 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1093 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1094 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1095 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1096 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540	1097 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1098 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1099 \mt{proj}(M, \mt{include} \; S \; \overline{s}, V) &=& \mt{proj}(M, \overline{s'} \; \overline{s}, V) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540	1100 \mt{proj}(M, \mt{constraint} \; c_1 \sim c_2 \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@655	1101 \mt{proj}(M, \mt{class} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@655	1102 \mt{proj}(M, \mt{class} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1103 \end{eqnarray*}
adamc@540	1104
adamc@541	1105
adamc@541	1106 \section{Type Inference}
adamc@541	1107
adamc@541	1108 The Ur/Web compiler uses \emph{heuristic type inference}, with no claims of completeness with respect to the declarative specification of the last section. The rules in use seem to work well in practice. This section summarizes those rules, to help Ur programmers predict what will work and what won't.
adamc@541	1109
adamc@541	1110 \subsection{Basic Unification}
adamc@541	1111
adamc@560	1112 Type-checkers for languages based on the Hindley-Milner type discipline, like ML and Haskell, take advantage of \emph{principal typing} properties, making complete type inference relatively straightforward. Inference algorithms are traditionally implemented using type unification variables, at various points asserting equalities between types, in the process discovering the values of type variables. The Ur/Web compiler uses the same basic strategy, but the complexity of the type system rules out easy completeness.
adamc@541	1113
adamc@656	1114 Type-checking can require evaluating recursive functional programs, thanks to the type-level $\mt{map}$ operator. When a unification variable appears in such a type, the next step of computation can be undetermined. The value of that variable might be determined later, but this would be ``too late'' for the unification problems generated at the first occurrence. This is the essential source of incompleteness.
adamc@541	1115
adamc@541	1116 Nonetheless, the unification engine tends to do reasonably well. Unlike in ML, polymorphism is never inferred in definitions; it must be indicated explicitly by writing out constructor-level parameters. By writing these and other annotations, the programmer can generally get the type inference engine to do most of the type reconstruction work.
adamc@541	1117
adamc@541	1118 \subsection{Unifying Record Types}
adamc@541	1119
adamc@570	1120 The type inference engine tries to take advantage of the algebraic rules governing type-level records, as shown in Section \ref{definitional}. When two constructors of record kind are unified, they are reduced to normal forms, with like terms crossed off from each normal form until, hopefully, nothing remains. This cannot be complete, with the inclusion of unification variables. The type-checker can help you understand what goes wrong when the process fails, as it outputs the unmatched remainders of the two normal forms.
adamc@541	1121
adamc@656	1122 \subsection{\label{typeclasses}Constructor Classes}
adamc@541	1123
adamc@784	1124 Ur includes a constructor class facility inspired by Haskell's. The current version is experimental, with very general Prolog-like facilities that can lead to compile-time non-termination.
adamc@541	1125
adamc@784	1126 Constructor classes are integrated with the module system. A constructor class of kind $\kappa$ is just a constructor of kind $\kappa$. By marking such a constructor $c$ as a constructor class, the programmer instructs the type inference engine to, in each scope, record all values of types $c \; c_1 \; \ldots \; c_n$ as \emph{instances}. Any function argument whose type is of such a form is treated as implicit, to be determined by examining the current instance database.
adamc@541	1127
adamc@656	1128 The ``dictionary encoding'' often used in Haskell implementations is made explicit in Ur. Constructor class instances are just properly-typed values, and they can also be considered as ``proofs'' of membership in the class. In some cases, it is useful to pass these proofs around explicitly. An underscore written where a proof is expected will also be inferred, if possible, from the current instance database.
adamc@541	1129
adamc@656	1130 Just as for constructors, constructors classes may be exported from modules, and they may be exported as concrete or abstract. Concrete constructor classes have their ``real'' definitions exposed, so that client code may add new instances freely. Abstract constructor classes are useful as ``predicates'' that can be used to enforce invariants, as we will see in some definitions of SQL syntax in the Ur/Web standard library.
adamc@541	1131
adamc@541	1132 \subsection{Reverse-Engineering Record Types}
adamc@541	1133
adamc@656	1134 It's useful to write Ur functions and functors that take record constructors as inputs, but these constructors can grow quite long, even though their values are often implied by other arguments. The compiler uses a simple heuristic to infer the values of unification variables that are mapped over, yielding known results. If the result is empty, we're done; if it's not empty, we replace a single unification variable with a new constructor formed from three new unification variables, as in $[\alpha = \beta] \rc \gamma$. This process can often be repeated to determine a unification variable fully.
adamc@541	1135
adamc@541	1136 \subsection{Implicit Arguments in Functor Applications}
adamc@541	1137
adamc@656	1138 Constructor, constraint, and constructor class witness members of structures may be omitted, when those structures are used in contexts where their assigned signatures imply how to fill in those missing members. This feature combines well with reverse-engineering to allow for uses of complicated meta-programming functors with little more code than would be necessary to invoke an untyped, ad-hoc code generator.
adamc@541	1139
adamc@541	1140
adamc@542	1141 \section{The Ur Standard Library}
adamc@542	1142
adamc@542	1143 The built-in parts of the Ur/Web standard library are described by the signature in \texttt{lib/basis.urs} in the distribution. A module $\mt{Basis}$ ascribing to that signature is available in the initial environment, and every program is implicitly prefixed by $\mt{open} \; \mt{Basis}$.
adamc@542	1144
adamc@542	1145 Additionally, other common functions that are definable within Ur are included in \texttt{lib/top.urs} and \texttt{lib/top.ur}. This $\mt{Top}$ module is also opened implicitly.
adamc@542	1146
adamc@542	1147 The idea behind Ur is to serve as the ideal host for embedded domain-specific languages. For now, however, the ``generic'' functionality is intermixed with Ur/Web-specific functionality, including in these two library modules. We hope that these generic library components have types that speak for themselves. The next section introduces the Ur/Web-specific elements. Here, we only give the type declarations from the beginning of $\mt{Basis}$.
adamc@542	1148 $$\begin{array}{l}
adamc@542	1149 \mt{type} \; \mt{int} \\
adamc@542	1150 \mt{type} \; \mt{float} \\
adamc@873	1151 \mt{type} \; \mt{char} \\
adamc@542	1152 \mt{type} \; \mt{string} \\
adamc@542	1153 \mt{type} \; \mt{time} \\
adamc@785	1154 \mt{type} \; \mt{blob} \\
adamc@542	1155 \\
adamc@542	1156 \mt{type} \; \mt{unit} = \{\} \\
adamc@542	1157 \\
adamc@542	1158 \mt{datatype} \; \mt{bool} = \mt{False} \mid \mt{True} \\
adamc@542	1159 \\
adamc@785	1160 \mt{datatype} \; \mt{option} \; \mt{t} = \mt{None} \mid \mt{Some} \; \mt{of} \; \mt{t} \\
adamc@785	1161 \\
adamc@785	1162 \mt{datatype} \; \mt{list} \; \mt{t} = \mt{Nil} \mid \mt{Cons} \; \mt{of} \; \mt{t} \times \mt{list} \; \mt{t}
adamc@542	1163 \end{array}$$
adamc@542	1164
adamc@785	1165 The only unusual element of this list is the $\mt{blob}$ type, which stands for binary sequences.
adamc@785	1166
adamc@657	1167 Another important generic Ur element comes at the beginning of \texttt{top.urs}.
adamc@657	1168
adamc@657	1169 $$\begin{array}{l}
adamc@657	1170 \mt{con} \; \mt{folder} :: \mt{K} \longrightarrow \{\mt{K}\} \to \mt{Type} \\
adamc@657	1171 \\
adamc@657	1172 \mt{val} \; \mt{fold} : \mt{K} \longrightarrow \mt{tf} :: (\{\mt{K}\} \to \mt{Type}) \\
adamc@657	1173 \hspace{.1in} \to (\mt{nm} :: \mt{Name} \to \mt{v} :: \mt{K} \to \mt{r} :: \{\mt{K}\} \to [[\mt{nm}] \sim \mt{r}] \Rightarrow \\
adamc@657	1174 \hspace{.2in} \mt{tf} \; \mt{r} \to \mt{tf} \; ([\mt{nm} = \mt{v}] \rc \mt{r})) \\
adamc@657	1175 \hspace{.1in} \to \mt{tf} \; [] \\
adamc@657	1176 \hspace{.1in} \to \mt{r} :: \{\mt{K}\} \to \mt{folder} \; \mt{r} \to \mt{tf} \; \mt{r}
adamc@657	1177 \end{array}$$
adamc@657	1178
adamc@657	1179 For a type-level record $\mt{r}$, a $\mt{folder} \; \mt{r}$ encodes a permutation of $\mt{r}$'s elements. The $\mt{fold}$ function can be called on a $\mt{folder}$ to iterate over the elements of $\mt{r}$ in that order. $\mt{fold}$ is parameterized on a type-level function to be used to calculate the type of each intermediate result of folding. After processing a subset $\mt{r'}$ of $\mt{r}$'s entries, the type of the accumulator should be $\mt{tf} \; \mt{r'}$. The next two expression arguments to $\mt{fold}$ are the usual step function and initial accumulator, familiar from fold functions over lists. The final two arguments are the record to fold over and a $\mt{folder}$ for it.
adamc@657	1180
adamc@664	1181 The Ur compiler treats $\mt{folder}$ like a constructor class, using built-in rules to infer $\mt{folder}$s for records with known structure. The order in which field names are mentioned in source code is used as a hint about the permutation that the programmer would like.
adamc@657	1182
adamc@542	1183
adamc@542	1184 \section{The Ur/Web Standard Library}
adamc@542	1185
adamc@658	1186 \subsection{Monads}
adamc@658	1187
adamc@658	1188 The Ur Basis defines the monad constructor class from Haskell.
adamc@658	1189
adamc@658	1190 $$\begin{array}{l}
adamc@658	1191 \mt{class} \; \mt{monad} :: \mt{Type} \to \mt{Type} \\
adamc@658	1192 \mt{val} \; \mt{return} : \mt{m} ::: (\mt{Type} \to \mt{Type}) \to \mt{t} ::: \mt{Type} \\
adamc@658	1193 \hspace{.1in} \to \mt{monad} \; \mt{m} \\
adamc@658	1194 \hspace{.1in} \to \mt{t} \to \mt{m} \; \mt{t} \\
adamc@658	1195 \mt{val} \; \mt{bind} : \mt{m} ::: (\mt{Type} \to \mt{Type}) \to \mt{t1} ::: \mt{Type} \to \mt{t2} ::: \mt{Type} \\
adamc@658	1196 \hspace{.1in} \to \mt{monad} \; \mt{m} \\
adamc@658	1197 \hspace{.1in} \to \mt{m} \; \mt{t1} \to (\mt{t1} \to \mt{m} \; \mt{t2}) \\
adamc@658	1198 \hspace{.1in} \to \mt{m} \; \mt{t2}
adamc@658	1199 \end{array}$$
adamc@658	1200
adamc@542	1201 \subsection{Transactions}
adamc@542	1202
adamc@542	1203 Ur is a pure language; we use Haskell's trick to support controlled side effects. The standard library defines a monad $\mt{transaction}$, meant to stand for actions that may be undone cleanly. By design, no other kinds of actions are supported.
adamc@542	1204 $$\begin{array}{l}
adamc@542	1205 \mt{con} \; \mt{transaction} :: \mt{Type} \to \mt{Type} \\
adamc@658	1206 \mt{val} \; \mt{transaction\_monad} : \mt{monad} \; \mt{transaction}
adamc@542	1207 \end{array}$$
adamc@542	1208
adamc@542	1209 \subsection{HTTP}
adamc@542	1210
adamc@542	1211 There are transactions for reading an HTTP header by name and for getting and setting strongly-typed cookies. Cookies may only be created by the $\mt{cookie}$ declaration form, ensuring that they be named consistently based on module structure.
adamc@542	1212 $$\begin{array}{l}
adamc@786	1213 \mt{val} \; \mt{requestHeader} : \mt{string} \to \mt{transaction} \; (\mt{option} \; \mt{string}) \\
adamc@786	1214 \\
adamc@786	1215 \mt{con} \; \mt{http\_cookie} :: \mt{Type} \to \mt{Type} \\
adamc@786	1216 \mt{val} \; \mt{getCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{transaction} \; (\mt{option} \; \mt{t}) \\
adamc@786	1217 \mt{val} \; \mt{setCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit}
adamc@786	1218 \end{array}$$
adamc@786	1219
adamc@786	1220 There are also an abstract $\mt{url}$ type and functions for converting to it, based on the policy defined by \texttt{[allow\|deny] url} directives in the project file.
adamc@786	1221 $$\begin{array}{l}
adamc@786	1222 \mt{type} \; \mt{url} \\
adamc@786	1223 \mt{val} \; \mt{bless} : \mt{string} \to \mt{url} \\
adamc@786	1224 \mt{val} \; \mt{checkUrl} : \mt{string} \to \mt{option} \; \mt{url}
adamc@786	1225 \end{array}$$
adamc@786	1226 $\mt{bless}$ raises a runtime error if the string passed to it fails the URL policy.
adamc@786	1227
adamc@786	1228 It's possible for pages to return files of arbitrary MIME types. A file can be input from the user using this data type, along with the $\mt{upload}$ form tag.
adamc@786	1229 $$\begin{array}{l}
adamc@786	1230 \mt{type} \; \mt{file} \\
adamc@786	1231 \mt{val} \; \mt{fileName} : \mt{file} \to \mt{option} \; \mt{string} \\
adamc@786	1232 \mt{val} \; \mt{fileMimeType} : \mt{file} \to \mt{string} \\
adamc@786	1233 \mt{val} \; \mt{fileData} : \mt{file} \to \mt{blob}
adamc@786	1234 \end{array}$$
adamc@786	1235
adamc@786	1236 A blob can be extracted from a file and returned as the page result. There are bless and check functions for MIME types analogous to those for URLs.
adamc@786	1237 $$\begin{array}{l}
adamc@786	1238 \mt{type} \; \mt{mimeType} \\
adamc@786	1239 \mt{val} \; \mt{blessMime} : \mt{string} \to \mt{mimeType} \\
adamc@786	1240 \mt{val} \; \mt{checkMime} : \mt{string} \to \mt{option} \; \mt{mimeType} \\
adamc@786	1241 \mt{val} \; \mt{returnBlob} : \mt{t} ::: \mt{Type} \to \mt{blob} \to \mt{mimeType} \to \mt{transaction} \; \mt{t}
adamc@542	1242 \end{array}$$
adamc@542	1243
adamc@543	1244 \subsection{SQL}
adamc@543	1245
adamc@543	1246 The fundamental unit of interest in the embedding of SQL is tables, described by a type family and creatable only via the $\mt{table}$ declaration form.
adamc@543	1247 $$\begin{array}{l}
adamc@785	1248 \mt{con} \; \mt{sql\_table} :: \{\mt{Type}\} \to \{\{\mt{Unit}\}\} \to \mt{Type}
adamc@785	1249 \end{array}$$
adamc@785	1250 The first argument to this constructor gives the names and types of a table's columns, and the second argument gives the set of valid keys. Keys are the only subsets of the columns that may be referenced as foreign keys. Each key has a name.
adamc@785	1251
adamc@785	1252 We also have the simpler type family of SQL views, which have no keys.
adamc@785	1253 $$\begin{array}{l}
adamc@785	1254 \mt{con} \; \mt{sql\_view} :: \{\mt{Type}\} \to \mt{Type}
adamc@543	1255 \end{array}$$
adamc@543	1256
adamc@785	1257 A multi-parameter type class is used to allow tables and views to be used interchangeably, with a way of extracting the set of columns from each.
adamc@785	1258 $$\begin{array}{l}
adamc@785	1259 \mt{class} \; \mt{fieldsOf} :: \mt{Type} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@785	1260 \mt{val} \; \mt{fieldsOf\_table} : \mt{fs} ::: \{\mt{Type}\} \to \mt{keys} ::: \{\{\mt{Unit}\}\} \to \mt{fieldsOf} \; (\mt{sql\_table} \; \mt{fs} \; \mt{keys}) \; \mt{fs} \\
adamc@785	1261 \mt{val} \; \mt{fieldsOf\_view} : \mt{fs} ::: \{\mt{Type}\} \to \mt{fieldsOf} \; (\mt{sql\_view} \; \mt{fs}) \; \mt{fs}
adamc@785	1262 \end{array}$$
adamc@785	1263
adamc@785	1264 \subsubsection{Table Constraints}
adamc@785	1265
adamc@785	1266 Tables may be declared with constraints, such that database modifications that violate the constraints are blocked. A table may have at most one \texttt{PRIMARY KEY} constraint, which gives the subset of columns that will most often be used to look up individual rows in the table.
adamc@785	1267
adamc@785	1268 $$\begin{array}{l}
adamc@785	1269 \mt{con} \; \mt{primary\_key} :: \{\mt{Type}\} \to \{\{\mt{Unit}\}\} \to \mt{Type} \\
adamc@785	1270 \mt{val} \; \mt{no\_primary\_key} : \mt{fs} ::: \{\mt{Type}\} \to \mt{primary\_key} \; \mt{fs} \; [] \\
adamc@785	1271 \mt{val} \; \mt{primary\_key} : \mt{rest} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{key1} :: \mt{Name} \to \mt{keys} :: \{\mt{Type}\} \\
adamc@785	1272 \hspace{.1in} \to [[\mt{key1}] \sim \mt{keys}] \Rightarrow [[\mt{key1} = \mt{t}] \rc \mt{keys} \sim \mt{rest}] \\
adamc@785	1273 \hspace{.1in} \Rightarrow \$([\mt{key1} = \mt{sql\_injectable\_prim} \; \mt{t}] \rc \mt{map} \; \mt{sql\_injectable\_prim} \; \mt{keys}) \\
adamc@785	1274 \hspace{.1in} \to \mt{primary\_key} \; ([\mt{key1} = \mt{t}] \rc \mt{keys} \rc \mt{rest}) \; [\mt{Pkey} = [\mt{key1}] \rc \mt{map} \; (\lambda \_ \Rightarrow ()) \; \mt{keys}]
adamc@785	1275 \end{array}$$
adamc@785	1276 The type class $\mt{sql\_injectable\_prim}$ characterizes which types are allowed in SQL and are not $\mt{option}$ types. In SQL, a \texttt{PRIMARY KEY} constraint enforces after-the-fact that a column may not contain \texttt{NULL}s, but Ur/Web forces that information to be included in table types from the beginning. Thus, the only effect of this kind of constraint in Ur/Web is to enforce uniqueness of the given key within the table.
adamc@785	1277
adamc@785	1278 A type family stands for sets of named constraints of the remaining varieties.
adamc@785	1279 $$\begin{array}{l}
adamc@785	1280 \mt{con} \; \mt{sql\_constraints} :: \{\mt{Type}\} \to \{\{\mt{Unit}\}\} \to \mt{Type}
adamc@785	1281 \end{array}$$
adamc@785	1282 The first argument gives the column types of the table being constrained, and the second argument maps constraint names to the keys that they define. Constraints that don't define keys are mapped to ``empty keys.''
adamc@785	1283
adamc@785	1284 There is a type family of individual, unnamed constraints.
adamc@785	1285 $$\begin{array}{l}
adamc@785	1286 \mt{con} \; \mt{sql\_constraint} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \mt{Type}
adamc@785	1287 \end{array}$$
adamc@785	1288 The first argument is the same as above, and the second argument gives the key columns for just this constraint.
adamc@785	1289
adamc@785	1290 We have operations for assembling constraints into constraint sets.
adamc@785	1291 $$\begin{array}{l}
adamc@785	1292 \mt{val} \; \mt{no\_constraint} : \mt{fs} ::: \{\mt{Type}\} \to \mt{sql\_constraints} \; \mt{fs} \; [] \\
adamc@785	1293 \mt{val} \; \mt{one\_constraint} : \mt{fs} ::: \{\mt{Type}\} \to \mt{unique} ::: \{\mt{Unit}\} \to \mt{name} :: \mt{Name} \\
adamc@785	1294 \hspace{.1in} \to \mt{sql\_constraint} \; \mt{fs} \; \mt{unique} \to \mt{sql\_constraints} \; \mt{fs} \; [\mt{name} = \mt{unique}] \\
adamc@785	1295 \mt{val} \; \mt{join\_constraints} : \mt{fs} ::: \{\mt{Type}\} \to \mt{uniques1} ::: \{\{\mt{Unit}\}\} \to \mt{uniques2} ::: \{\{\mt{Unit}\}\} \to [\mt{uniques1} \sim \mt{uniques2}] \\
adamc@785	1296 \hspace{.1in} \Rightarrow \mt{sql\_constraints} \; \mt{fs} \; \mt{uniques1} \to \mt{sql\_constraints} \; \mt{fs} \; \mt{uniques2} \to \mt{sql\_constraints} \; \mt{fs} \; (\mt{uniques1} \rc \mt{uniques2})
adamc@785	1297 \end{array}$$
adamc@785	1298
adamc@785	1299 A \texttt{UNIQUE} constraint forces a set of columns to be a key, which means that no combination of column values may occur more than once in the table. The $\mt{unique1}$ and $\mt{unique}$ arguments are separated out only to ensure that empty \texttt{UNIQUE} constraints are rejected.
adamc@785	1300 $$\begin{array}{l}
adamc@785	1301 \mt{val} \; \mt{unique} : \mt{rest} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{unique1} :: \mt{Name} \to \mt{unique} :: \{\mt{Type}\} \\
adamc@785	1302 \hspace{.1in} \to [[\mt{unique1}] \sim \mt{unique}] \Rightarrow [[\mt{unique1} = \mt{t}] \rc \mt{unique} \sim \mt{rest}] \\
adamc@785	1303 \hspace{.1in} \Rightarrow \mt{sql\_constraint} \; ([\mt{unique1} = \mt{t}] \rc \mt{unique} \rc \mt{rest}) \; ([\mt{unique1}] \rc \mt{map} \; (\lambda \_ \Rightarrow ()) \; \mt{unique})
adamc@785	1304 \end{array}$$
adamc@785	1305
adamc@785	1306 A \texttt{FOREIGN KEY} constraint connects a set of local columns to a local or remote key, enforcing that the local columns always reference an existent row of the foreign key's table. A local column of type $\mt{t}$ may be linked to a foreign column of type $\mt{option} \; \mt{t}$, and vice versa. We formalize that notion with a type class.
adamc@785	1307 $$\begin{array}{l}
adamc@785	1308 \mt{class} \; \mt{linkable} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@785	1309 \mt{val} \; \mt{linkable\_same} : \mt{t} ::: \mt{Type} \to \mt{linkable} \; \mt{t} \; \mt{t} \\
adamc@785	1310 \mt{val} \; \mt{linkable\_from\_nullable} : \mt{t} ::: \mt{Type} \to \mt{linkable} \; (\mt{option} \; \mt{t}) \; \mt{t} \\
adamc@785	1311 \mt{val} \; \mt{linkable\_to\_nullable} : \mt{t} ::: \mt{Type} \to \mt{linkable} \; \mt{t} \; (\mt{option} \; \mt{t})
adamc@785	1312 \end{array}$$
adamc@785	1313
adamc@785	1314 The $\mt{matching}$ type family uses $\mt{linkable}$ to define when two keys match up type-wise.
adamc@785	1315 $$\begin{array}{l}
adamc@785	1316 \mt{con} \; \mt{matching} :: \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@785	1317 \mt{val} \; \mt{mat\_nil} : \mt{matching} \; [] \; [] \\
adamc@785	1318 \mt{val} \; \mt{mat\_cons} : \mt{t1} ::: \mt{Type} \to \mt{rest1} ::: \{\mt{Type}\} \to \mt{t2} ::: \mt{Type} \to \mt{rest2} ::: \{\mt{Type}\} \to \mt{nm1} :: \mt{Name} \to \mt{nm2} :: \mt{Name} \\
adamc@785	1319 \hspace{.1in} \to [[\mt{nm1}] \sim \mt{rest1}] \Rightarrow [[\mt{nm2}] \sim \mt{rest2}] \Rightarrow \mt{linkable} \; \mt{t1} \; \mt{t2} \to \mt{matching} \; \mt{rest1} \; \mt{rest2} \\
adamc@785	1320 \hspace{.1in} \to \mt{matching} \; ([\mt{nm1} = \mt{t1}] \rc \mt{rest1}) \; ([\mt{nm2} = \mt{t2}] \rc \mt{rest2})
adamc@785	1321 \end{array}$$
adamc@785	1322
adamc@785	1323 SQL provides a number of different propagation modes for \texttt{FOREIGN KEY} constraints, governing what happens when a row containing a still-referenced foreign key value is deleted or modified to have a different key value. The argument of a propagation mode's type gives the local key type.
adamc@785	1324 $$\begin{array}{l}
adamc@785	1325 \mt{con} \; \mt{propagation\_mode} :: \{\mt{Type}\} \to \mt{Type} \\
adamc@785	1326 \mt{val} \; \mt{restrict} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; \mt{fs} \\
adamc@785	1327 \mt{val} \; \mt{cascade} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; \mt{fs} \\
adamc@785	1328 \mt{val} \; \mt{no\_action} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; \mt{fs} \\
adamc@785	1329 \mt{val} \; \mt{set\_null} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; (\mt{map} \; \mt{option} \; \mt{fs})
adamc@785	1330 \end{array}$$
adamc@785	1331
adamc@785	1332 Finally, we put these ingredient together to define the \texttt{FOREIGN KEY} constraint function.
adamc@785	1333 $$\begin{array}{l}
adamc@785	1334 \mt{val} \; \mt{foreign\_key} : \mt{mine1} ::: \mt{Name} \to \mt{t} ::: \mt{Type} \to \mt{mine} ::: \{\mt{Type}\} \to \mt{munused} ::: \{\mt{Type}\} \to \mt{foreign} ::: \{\mt{Type}\} \\
adamc@785	1335 \hspace{.1in} \to \mt{funused} ::: \{\mt{Type}\} \to \mt{nm} ::: \mt{Name} \to \mt{uniques} ::: \{\{\mt{Unit}\}\} \\
adamc@785	1336 \hspace{.1in} \to [[\mt{mine1}] \sim \mt{mine}] \Rightarrow [[\mt{mine1} = \mt{t}] \rc \mt{mine} \sim \mt{munused}] \Rightarrow [\mt{foreign} \sim \mt{funused}] \Rightarrow [[\mt{nm}] \sim \mt{uniques}] \\
adamc@785	1337 \hspace{.1in} \Rightarrow \mt{matching} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine}) \; \mt{foreign} \\
adamc@785	1338 \hspace{.1in} \to \mt{sql\_table} \; (\mt{foreign} \rc \mt{funused}) \; ([\mt{nm} = \mt{map} \; (\lambda \_ \Rightarrow ()) \; \mt{foreign}] \rc \mt{uniques}) \\
adamc@785	1339 \hspace{.1in} \to \{\mt{OnDelete} : \mt{propagation\_mode} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine}), \\
adamc@785	1340 \hspace{.2in} \mt{OnUpdate} : \mt{propagation\_mode} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine})\} \\
adamc@785	1341 \hspace{.1in} \to \mt{sql\_constraint} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine} \rc \mt{munused}) \; []
adamc@785	1342 \end{array}$$
adamc@785	1343
adamc@785	1344 The last kind of constraint is a \texttt{CHECK} constraint, which attaches a boolean invariant over a row's contents. It is defined using the $\mt{sql\_exp}$ type family, which we discuss in more detail below.
adamc@785	1345 $$\begin{array}{l}
adamc@785	1346 \mt{val} \; \mt{check} : \mt{fs} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; [] \; [] \; \mt{fs} \; \mt{bool} \to \mt{sql\_constraint} \; \mt{fs} \; []
adamc@785	1347 \end{array}$$
adamc@785	1348
adamc@785	1349 Section \ref{tables} shows the expanded syntax of the $\mt{table}$ declaration and signature item that includes constraints. There is no other way to use constraints with SQL in Ur/Web.
adamc@785	1350
adamc@784	1351
adamc@543	1352 \subsubsection{Queries}
adamc@543	1353
adamc@543	1354 A final query is constructed via the $\mt{sql\_query}$ function. Constructor arguments respectively specify the table fields we select (as records mapping tables to the subsets of their fields that we choose) and the (always named) extra expressions that we select.
adamc@543	1355 $$\begin{array}{l}
adamc@543	1356 \mt{con} \; \mt{sql\_query} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543	1357 \mt{val} \; \mt{sql\_query} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543	1358 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1359 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543	1360 \hspace{.1in} \to \{\mt{Rows} : \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}, \\
adamc@543	1361 \hspace{.2in} \mt{OrderBy} : \mt{sql\_order\_by} \; \mt{tables} \; \mt{selectedExps}, \\
adamc@543	1362 \hspace{.2in} \mt{Limit} : \mt{sql\_limit}, \\
adamc@543	1363 \hspace{.2in} \mt{Offset} : \mt{sql\_offset}\} \\
adamc@543	1364 \hspace{.1in} \to \mt{sql\_query} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1365 \end{array}$$
adamc@543	1366
adamc@545	1367 Queries are used by folding over their results inside transactions.
adamc@545	1368 $$\begin{array}{l}
adamc@545	1369 \mt{val} \; \mt{query} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \lambda [\mt{tables} \sim \mt{exps}] \Rightarrow \mt{state} ::: \mt{Type} \to \mt{sql\_query} \; \mt{tables} \; \mt{exps} \\
adamc@658	1370 \hspace{.1in} \to (\$(\mt{exps} \rc \mt{map} \; (\lambda \mt{fields} :: \{\mt{Type}\} \Rightarrow \$\mt{fields}) \; \mt{tables}) \\
adamc@545	1371 \hspace{.2in} \to \mt{state} \to \mt{transaction} \; \mt{state}) \\
adamc@545	1372 \hspace{.1in} \to \mt{state} \to \mt{transaction} \; \mt{state}
adamc@545	1373 \end{array}$$
adamc@545	1374
adamc@543	1375 Most of the complexity of the query encoding is in the type $\mt{sql\_query1}$, which includes simple queries and derived queries based on relational operators. Constructor arguments respectively specify the tables we select from, the subset of fields that we keep from each table for the result rows, and the extra expressions that we select.
adamc@543	1376 $$\begin{array}{l}
adamc@543	1377 \mt{con} \; \mt{sql\_query1} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543	1378 \\
adamc@543	1379 \mt{type} \; \mt{sql\_relop} \\
adamc@543	1380 \mt{val} \; \mt{sql\_union} : \mt{sql\_relop} \\
adamc@543	1381 \mt{val} \; \mt{sql\_intersect} : \mt{sql\_relop} \\
adamc@543	1382 \mt{val} \; \mt{sql\_except} : \mt{sql\_relop} \\
adamc@543	1383 \mt{val} \; \mt{sql\_relop} : \mt{tables1} ::: \{\{\mt{Type}\}\} \\
adamc@543	1384 \hspace{.1in} \to \mt{tables2} ::: \{\{\mt{Type}\}\} \\
adamc@543	1385 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1386 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543	1387 \hspace{.1in} \to \mt{sql\_relop} \\
adamc@543	1388 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables1} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543	1389 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables2} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543	1390 \hspace{.1in} \to \mt{sql\_query1} \; \mt{selectedFields} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1391 \end{array}$$
adamc@543	1392
adamc@543	1393 $$\begin{array}{l}
adamc@543	1394 \mt{val} \; \mt{sql\_query1} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543	1395 \hspace{.1in} \to \mt{grouped} ::: \{\{\mt{Type}\}\} \\
adamc@543	1396 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1397 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@786	1398 \hspace{.1in} \to \{\mt{From} : \mt{sql\_from\_items} \; \mt{tables}, \\
adamc@543	1399 \hspace{.2in} \mt{Where} : \mt{sql\_exp} \; \mt{tables} \; [] \; [] \; \mt{bool}, \\
adamc@543	1400 \hspace{.2in} \mt{GroupBy} : \mt{sql\_subset} \; \mt{tables} \; \mt{grouped}, \\
adamc@543	1401 \hspace{.2in} \mt{Having} : \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{bool}, \\
adamc@543	1402 \hspace{.2in} \mt{SelectFields} : \mt{sql\_subset} \; \mt{grouped} \; \mt{selectedFields}, \\
adamc@658	1403 \hspace{.2in} \mt {SelectExps} : \$(\mt{map} \; (\mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; []) \; \mt{selectedExps}) \} \\
adamc@543	1404 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1405 \end{array}$$
adamc@543	1406
adamc@543	1407 To encode projection of subsets of fields in $\mt{SELECT}$ clauses, and to encode $\mt{GROUP} \; \mt{BY}$ clauses, we rely on a type family $\mt{sql\_subset}$, capturing what it means for one record of table fields to be a subset of another. The main constructor $\mt{sql\_subset}$ ``proves subset facts'' by requiring a split of a record into kept and dropped parts. The extra constructor $\mt{sql\_subset\_all}$ is a convenience for keeping all fields of a record.
adamc@543	1408 $$\begin{array}{l}
adamc@543	1409 \mt{con} \; \mt{sql\_subset} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@543	1410 \mt{val} \; \mt{sql\_subset} : \mt{keep\_drop} :: \{(\{\mt{Type}\} \times \{\mt{Type}\})\} \\
adamc@543	1411 \hspace{.1in} \to \mt{sql\_subset} \\
adamc@658	1412 \hspace{.2in} (\mt{map} \; (\lambda \mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\}) \Rightarrow \mt{fields}.1 \rc \mt{fields}.2)\; \mt{keep\_drop}) \\
adamc@658	1413 \hspace{.2in} (\mt{map} \; (\lambda \mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\}) \Rightarrow \mt{fields}.1) \; \mt{keep\_drop}) \\
adamc@543	1414 \mt{val} \; \mt{sql\_subset\_all} : \mt{tables} :: \{\{\mt{Type}\}\} \to \mt{sql\_subset} \; \mt{tables} \; \mt{tables}
adamc@543	1415 \end{array}$$
adamc@543	1416
adamc@560	1417 SQL expressions are used in several places, including $\mt{SELECT}$, $\mt{WHERE}$, $\mt{HAVING}$, and $\mt{ORDER} \; \mt{BY}$ clauses. They reify a fragment of the standard SQL expression language, while making it possible to inject ``native'' Ur values in some places. The arguments to the $\mt{sql\_exp}$ type family respectively give the unrestricted-availability table fields, the table fields that may only be used in arguments to aggregate functions, the available selected expressions, and the type of the expression.
adamc@543	1418 $$\begin{array}{l}
adamc@543	1419 \mt{con} \; \mt{sql\_exp} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \to \mt{Type}
adamc@543	1420 \end{array}$$
adamc@543	1421
adamc@543	1422 Any field in scope may be converted to an expression.
adamc@543	1423 $$\begin{array}{l}
adamc@543	1424 \mt{val} \; \mt{sql\_field} : \mt{otherTabs} ::: \{\{\mt{Type}\}\} \to \mt{otherFields} ::: \{\mt{Type}\} \\
adamc@543	1425 \hspace{.1in} \to \mt{fieldType} ::: \mt{Type} \to \mt{agg} ::: \{\{\mt{Type}\}\} \\
adamc@543	1426 \hspace{.1in} \to \mt{exps} ::: \{\mt{Type}\} \\
adamc@543	1427 \hspace{.1in} \to \mt{tab} :: \mt{Name} \to \mt{field} :: \mt{Name} \\
adamc@543	1428 \hspace{.1in} \to \mt{sql\_exp} \; ([\mt{tab} = [\mt{field} = \mt{fieldType}] \rc \mt{otherFields}] \rc \mt{otherTabs}) \; \mt{agg} \; \mt{exps} \; \mt{fieldType}
adamc@543	1429 \end{array}$$
adamc@543	1430
adamc@544	1431 There is an analogous function for referencing named expressions.
adamc@544	1432 $$\begin{array}{l}
adamc@544	1433 \mt{val} \; \mt{sql\_exp} : \mt{tabs} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{t} ::: \mt{Type} \to \mt{rest} ::: \{\mt{Type}\} \to \mt{nm} :: \mt{Name} \\
adamc@544	1434 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tabs} \; \mt{agg} \; ([\mt{nm} = \mt{t}] \rc \mt{rest}) \; \mt{t}
adamc@544	1435 \end{array}$$
adamc@544	1436
adamc@544	1437 Ur values of appropriate types may be injected into SQL expressions.
adamc@544	1438 $$\begin{array}{l}
adamc@786	1439 \mt{class} \; \mt{sql\_injectable\_prim} \\
adamc@786	1440 \mt{val} \; \mt{sql\_bool} : \mt{sql\_injectable\_prim} \; \mt{bool} \\
adamc@786	1441 \mt{val} \; \mt{sql\_int} : \mt{sql\_injectable\_prim} \; \mt{int} \\
adamc@786	1442 \mt{val} \; \mt{sql\_float} : \mt{sql\_injectable\_prim} \; \mt{float} \\
adamc@786	1443 \mt{val} \; \mt{sql\_string} : \mt{sql\_injectable\_prim} \; \mt{string} \\
adamc@786	1444 \mt{val} \; \mt{sql\_time} : \mt{sql\_injectable\_prim} \; \mt{time} \\
adamc@786	1445 \mt{val} \; \mt{sql\_blob} : \mt{sql\_injectable\_prim} \; \mt{blob} \\
adamc@786	1446 \mt{val} \; \mt{sql\_channel} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; (\mt{channel} \; \mt{t}) \\
adamc@786	1447 \mt{val} \; \mt{sql\_client} : \mt{sql\_injectable\_prim} \; \mt{client} \\
adamc@786	1448 \\
adamc@544	1449 \mt{class} \; \mt{sql\_injectable} \\
adamc@786	1450 \mt{val} \; \mt{sql\_prim} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; \mt{t} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@786	1451 \mt{val} \; \mt{sql\_option\_prim} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; \mt{t} \to \mt{sql\_injectable} \; (\mt{option} \; \mt{t}) \\
adamc@786	1452 \\
adamc@544	1453 \mt{val} \; \mt{sql\_inject} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@544	1454 \hspace{.1in} \to \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544	1455 \end{array}$$
adamc@544	1456
adamc@544	1457 We have the SQL nullness test, which is necessary because of the strange SQL semantics of equality in the presence of null values.
adamc@544	1458 $$\begin{array}{l}
adamc@544	1459 \mt{val} \; \mt{sql\_is\_null} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1460 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; (\mt{option} \; \mt{t}) \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544	1461 \end{array}$$
adamc@544	1462
adamc@559	1463 We have generic nullary, unary, and binary operators.
adamc@544	1464 $$\begin{array}{l}
adamc@544	1465 \mt{con} \; \mt{sql\_nfunc} :: \mt{Type} \to \mt{Type} \\
adamc@544	1466 \mt{val} \; \mt{sql\_current\_timestamp} : \mt{sql\_nfunc} \; \mt{time} \\
adamc@544	1467 \mt{val} \; \mt{sql\_nfunc} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1468 \hspace{.1in} \to \mt{sql\_nfunc} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \\\end{array}$$
adamc@544	1469
adamc@544	1470 $$\begin{array}{l}
adamc@544	1471 \mt{con} \; \mt{sql\_unary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544	1472 \mt{val} \; \mt{sql\_not} : \mt{sql\_unary} \; \mt{bool} \; \mt{bool} \\
adamc@544	1473 \mt{val} \; \mt{sql\_unary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544	1474 \hspace{.1in} \to \mt{sql\_unary} \; \mt{arg} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res} \\
adamc@544	1475 \end{array}$$
adamc@544	1476
adamc@544	1477 $$\begin{array}{l}
adamc@544	1478 \mt{con} \; \mt{sql\_binary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544	1479 \mt{val} \; \mt{sql\_and} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544	1480 \mt{val} \; \mt{sql\_or} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544	1481 \mt{val} \; \mt{sql\_binary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg_1} ::: \mt{Type} \to \mt{arg_2} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544	1482 \hspace{.1in} \to \mt{sql\_binary} \; \mt{arg_1} \; \mt{arg_2} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_1} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_2} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res}
adamc@544	1483 \end{array}$$
adamc@544	1484
adamc@544	1485 $$\begin{array}{l}
adamc@559	1486 \mt{class} \; \mt{sql\_arith} \\
adamc@559	1487 \mt{val} \; \mt{sql\_int\_arith} : \mt{sql\_arith} \; \mt{int} \\
adamc@559	1488 \mt{val} \; \mt{sql\_float\_arith} : \mt{sql\_arith} \; \mt{float} \\
adamc@559	1489 \mt{val} \; \mt{sql\_neg} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_unary} \; \mt{t} \; \mt{t} \\
adamc@559	1490 \mt{val} \; \mt{sql\_plus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1491 \mt{val} \; \mt{sql\_minus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1492 \mt{val} \; \mt{sql\_times} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1493 \mt{val} \; \mt{sql\_div} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1494 \mt{val} \; \mt{sql\_mod} : \mt{sql\_binary} \; \mt{int} \; \mt{int} \; \mt{int}
adamc@559	1495 \end{array}$$
adamc@544	1496
adamc@656	1497 Finally, we have aggregate functions. The $\mt{COUNT(\ast)}$ syntax is handled specially, since it takes no real argument. The other aggregate functions are placed into a general type family, using constructor classes to restrict usage to properly-typed arguments. The key aspect of the $\mt{sql\_aggregate}$ function's type is the shift of aggregate-function-only fields into unrestricted fields.
adamc@544	1498 $$\begin{array}{l}
adamc@544	1499 \mt{val} \; \mt{sql\_count} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{int}
adamc@544	1500 \end{array}$$
adamc@544	1501
adamc@544	1502 $$\begin{array}{l}
adamc@544	1503 \mt{con} \; \mt{sql\_aggregate} :: \mt{Type} \to \mt{Type} \\
adamc@544	1504 \mt{val} \; \mt{sql\_aggregate} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1505 \hspace{.1in} \to \mt{sql\_aggregate} \; \mt{t} \to \mt{sql\_exp} \; \mt{agg} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544	1506 \end{array}$$
adamc@544	1507
adamc@544	1508 $$\begin{array}{l}
adamc@544	1509 \mt{class} \; \mt{sql\_summable} \\
adamc@544	1510 \mt{val} \; \mt{sql\_summable\_int} : \mt{sql\_summable} \; \mt{int} \\
adamc@544	1511 \mt{val} \; \mt{sql\_summable\_float} : \mt{sql\_summable} \; \mt{float} \\
adamc@544	1512 \mt{val} \; \mt{sql\_avg} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544	1513 \mt{val} \; \mt{sql\_sum} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544	1514 \end{array}$$
adamc@544	1515
adamc@544	1516 $$\begin{array}{l}
adamc@544	1517 \mt{class} \; \mt{sql\_maxable} \\
adamc@544	1518 \mt{val} \; \mt{sql\_maxable\_int} : \mt{sql\_maxable} \; \mt{int} \\
adamc@544	1519 \mt{val} \; \mt{sql\_maxable\_float} : \mt{sql\_maxable} \; \mt{float} \\
adamc@544	1520 \mt{val} \; \mt{sql\_maxable\_string} : \mt{sql\_maxable} \; \mt{string} \\
adamc@544	1521 \mt{val} \; \mt{sql\_maxable\_time} : \mt{sql\_maxable} \; \mt{time} \\
adamc@544	1522 \mt{val} \; \mt{sql\_max} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544	1523 \mt{val} \; \mt{sql\_min} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544	1524 \end{array}$$
adamc@544	1525
adamc@786	1526 \texttt{FROM} clauses are specified using a type family.
adamc@786	1527 $$\begin{array}{l}
adamc@786	1528 \mt{con} \; \mt{sql\_from\_items} :: \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@786	1529 \mt{val} \; \mt{sql\_from\_table} : \mt{t} ::: \mt{Type} \to \mt{fs} ::: \{\mt{Type}\} \to \mt{fieldsOf} \; \mt{t} \; \mt{fs} \to \mt{name} :: \mt{Name} \to \mt{t} \to \mt{sql\_from\_items} \; [\mt{name} = \mt{fs}] \\
adamc@786	1530 \mt{val} \; \mt{sql\_from\_comma} : \mt{tabs1} ::: \{\{\mt{Type}\}\} \to \mt{tabs2} ::: \{\{\mt{Type}\}\} \to [\mt{tabs1} \sim \mt{tabs2}] \\
adamc@786	1531 \hspace{.1in} \Rightarrow \mt{sql\_from\_items} \; \mt{tabs1} \to \mt{sql\_from\_items} \; \mt{tabs2} \\
adamc@786	1532 \hspace{.1in} \to \mt{sql\_from\_items} \; (\mt{tabs1} \rc \mt{tabs2}) \\
adamc@786	1533 \mt{val} \; \mt{sql\_inner\_join} : \mt{tabs1} ::: \{\{\mt{Type}\}\} \to \mt{tabs2} ::: \{\{\mt{Type}\}\} \to [\mt{tabs1} \sim \mt{tabs2}] \\
adamc@786	1534 \hspace{.1in} \Rightarrow \mt{sql\_from\_items} \; \mt{tabs1} \to \mt{sql\_from\_items} \; \mt{tabs2} \\
adamc@786	1535 \hspace{.1in} \to \mt{sql\_exp} \; (\mt{tabs1} \rc \mt{tabs2}) \; [] \; [] \; \mt{bool} \\
adamc@786	1536 \hspace{.1in} \to \mt{sql\_from\_items} \; (\mt{tabs1} \rc \mt{tabs2})
adamc@786	1537 \end{array}$$
adamc@786	1538
adamc@786	1539 Besides these basic cases, outer joins are supported, which requires a type class for turning non-$\mt{option}$ columns into $\mt{option}$ columns.
adamc@786	1540 $$\begin{array}{l}
adamc@786	1541 \mt{class} \; \mt{nullify} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@786	1542 \mt{val} \; \mt{nullify\_option} : \mt{t} ::: \mt{Type} \to \mt{nullify} \; (\mt{option} \; \mt{t}) \; (\mt{option} \; \mt{t}) \\
adamc@786	1543 \mt{val} \; \mt{nullify\_prim} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; \mt{t} \to \mt{nullify} \; \mt{t} \; (\mt{option} \; \mt{t})
adamc@786	1544 \end{array}$$
adamc@786	1545
adamc@786	1546 Left, right, and full outer joins can now be expressed using functions that accept records of $\mt{nullify}$ instances. Here, we give only the type for a left join as an example.
adamc@786	1547
adamc@786	1548 $$\begin{array}{l}
adamc@786	1549 \mt{val} \; \mt{sql\_left\_join} : \mt{tabs1} ::: \{\{\mt{Type}\}\} \to \mt{tabs2} ::: \{\{(\mt{Type} \times \mt{Type})\}\} \to [\mt{tabs1} \sim \mt{tabs2}] \\
adamc@786	1550 \hspace{.1in} \Rightarrow \$(\mt{map} \; (\lambda \mt{r} \Rightarrow \$(\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{nullify} \; \mt{p}.1 \; \mt{p}.2) \; \mt{r})) \; \mt{tabs2}) \\
adamc@786	1551 \hspace{.1in} \to \mt{sql\_from\_items} \; \mt{tabs1} \to \mt{sql\_from\_items} \; (\mt{map} \; (\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{p}.1)) \; \mt{tabs2}) \\
adamc@786	1552 \hspace{.1in} \to \mt{sql\_exp} \; (\mt{tabs1} \rc \mt{map} \; (\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{p}.1)) \; \mt{tabs2}) \; [] \; [] \; \mt{bool} \\
adamc@786	1553 \hspace{.1in} \to \mt{sql\_from\_items} \; (\mt{tabs1} \rc \mt{map} \; (\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{p}.2)) \; \mt{tabs2})
adamc@786	1554 \end{array}$$
adamc@786	1555
adamc@544	1556 We wrap up the definition of query syntax with the types used in representing $\mt{ORDER} \; \mt{BY}$, $\mt{LIMIT}$, and $\mt{OFFSET}$ clauses.
adamc@544	1557 $$\begin{array}{l}
adamc@544	1558 \mt{type} \; \mt{sql\_direction} \\
adamc@544	1559 \mt{val} \; \mt{sql\_asc} : \mt{sql\_direction} \\
adamc@544	1560 \mt{val} \; \mt{sql\_desc} : \mt{sql\_direction} \\
adamc@544	1561 \\
adamc@544	1562 \mt{con} \; \mt{sql\_order\_by} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@544	1563 \mt{val} \; \mt{sql\_order\_by\_Nil} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} :: \{\mt{Type}\} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544	1564 \mt{val} \; \mt{sql\_order\_by\_Cons} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1565 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; [] \; \mt{exps} \; \mt{t} \to \mt{sql\_direction} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544	1566 \\
adamc@544	1567 \mt{type} \; \mt{sql\_limit} \\
adamc@544	1568 \mt{val} \; \mt{sql\_no\_limit} : \mt{sql\_limit} \\
adamc@544	1569 \mt{val} \; \mt{sql\_limit} : \mt{int} \to \mt{sql\_limit} \\
adamc@544	1570 \\
adamc@544	1571 \mt{type} \; \mt{sql\_offset} \\
adamc@544	1572 \mt{val} \; \mt{sql\_no\_offset} : \mt{sql\_offset} \\
adamc@544	1573 \mt{val} \; \mt{sql\_offset} : \mt{int} \to \mt{sql\_offset}
adamc@544	1574 \end{array}$$
adamc@544	1575
adamc@545	1576
adamc@545	1577 \subsubsection{DML}
adamc@545	1578
adamc@545	1579 The Ur/Web library also includes an embedding of a fragment of SQL's DML, the Data Manipulation Language, for modifying database tables. Any piece of DML may be executed in a transaction.
adamc@545	1580
adamc@545	1581 $$\begin{array}{l}
adamc@545	1582 \mt{type} \; \mt{dml} \\
adamc@545	1583 \mt{val} \; \mt{dml} : \mt{dml} \to \mt{transaction} \; \mt{unit}
adamc@545	1584 \end{array}$$
adamc@545	1585
adamc@545	1586 Properly-typed records may be used to form $\mt{INSERT}$ commands.
adamc@545	1587 $$\begin{array}{l}
adamc@545	1588 \mt{val} \; \mt{insert} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \\
adamc@658	1589 \hspace{.1in} \to \$(\mt{map} \; (\mt{sql\_exp} \; [] \; [] \; []) \; \mt{fields}) \to \mt{dml}
adamc@545	1590 \end{array}$$
adamc@545	1591
adamc@545	1592 An $\mt{UPDATE}$ command is formed from a choice of which table fields to leave alone and which to change, along with an expression to use to compute the new value of each changed field and a $\mt{WHERE}$ clause.
adamc@545	1593 $$\begin{array}{l}
adamc@545	1594 \mt{val} \; \mt{update} : \mt{unchanged} ::: \{\mt{Type}\} \to \mt{changed} :: \{\mt{Type}\} \to \lambda [\mt{changed} \sim \mt{unchanged}] \\
adamc@658	1595 \hspace{.1in} \Rightarrow \$(\mt{map} \; (\mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; []) \; \mt{changed}) \\
adamc@545	1596 \hspace{.1in} \to \mt{sql\_table} \; (\mt{changed} \rc \mt{unchanged}) \to \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545	1597 \end{array}$$
adamc@545	1598
adamc@545	1599 A $\mt{DELETE}$ command is formed from a table and a $\mt{WHERE}$ clause.
adamc@545	1600 $$\begin{array}{l}
adamc@545	1601 \mt{val} \; \mt{delete} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \to \mt{sql\_exp} \; [\mt{T} = \mt{fields}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545	1602 \end{array}$$
adamc@545	1603
adamc@546	1604 \subsubsection{Sequences}
adamc@546	1605
adamc@546	1606 SQL sequences are counters with concurrency control, often used to assign unique IDs. Ur/Web supports them via a simple interface. The only way to create a sequence is with the $\mt{sequence}$ declaration form.
adamc@546	1607
adamc@546	1608 $$\begin{array}{l}
adamc@546	1609 \mt{type} \; \mt{sql\_sequence} \\
adamc@546	1610 \mt{val} \; \mt{nextval} : \mt{sql\_sequence} \to \mt{transaction} \; \mt{int}
adamc@546	1611 \end{array}$$
adamc@546	1612
adamc@546	1613
adamc@547	1614 \subsection{XML}
adamc@547	1615
adamc@547	1616 Ur/Web's library contains an encoding of XML syntax and semantic constraints. We make no effort to follow the standards governing XML schemas. Rather, XML fragments are viewed more as values of ML datatypes, and we only track which tags are allowed inside which other tags.
adamc@547	1617
adamc@547	1618 The basic XML type family has arguments respectively indicating the \emph{context} of a fragment, the fields that the fragment expects to be bound on entry (and their types), and the fields that the fragment will bind (and their types). Contexts are a record-based ``poor man's subtyping'' encoding, with each possible set of valid tags corresponding to a different context record. The arguments dealing with field binding are only relevant to HTML forms.
adamc@547	1619 $$\begin{array}{l}
adamc@547	1620 \mt{con} \; \mt{xml} :: \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547	1621 \end{array}$$
adamc@547	1622
adamc@547	1623 We also have a type family of XML tags, indexed respectively by the record of optional attributes accepted by the tag, the context in which the tag may be placed, the context required of children of the tag, which form fields the tag uses, and which fields the tag defines.
adamc@547	1624 $$\begin{array}{l}
adamc@547	1625 \mt{con} \; \mt{tag} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547	1626 \end{array}$$
adamc@547	1627
adamc@547	1628 Literal text may be injected into XML as ``CDATA.''
adamc@547	1629 $$\begin{array}{l}
adamc@547	1630 \mt{val} \; \mt{cdata} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use} ::: \{\mt{Type}\} \to \mt{string} \to \mt{xml} \; \mt{ctx} \; \mt{use} \; []
adamc@547	1631 \end{array}$$
adamc@547	1632
adamc@547	1633 There is a function for producing an XML tree with a particular tag at its root.
adamc@547	1634 $$\begin{array}{l}
adamc@547	1635 \mt{val} \; \mt{tag} : \mt{attrsGiven} ::: \{\mt{Type}\} \to \mt{attrsAbsent} ::: \{\mt{Type}\} \to \mt{ctxOuter} ::: \{\mt{Unit}\} \to \mt{ctxInner} ::: \{\mt{Unit}\} \\
adamc@547	1636 \hspace{.1in} \to \mt{useOuter} ::: \{\mt{Type}\} \to \mt{useInner} ::: \{\mt{Type}\} \to \mt{bindOuter} ::: \{\mt{Type}\} \to \mt{bindInner} ::: \{\mt{Type}\} \\
adamc@787	1637 \hspace{.1in} \to \lambda [\mt{attrsGiven} \sim \mt{attrsAbsent}] \; [\mt{useOuter} \sim \mt{useInner}] \; [\mt{bindOuter} \sim \mt{bindInner}] \\
adamc@787	1638 \hspace{.1in} \Rightarrow \mt{option} \; \mt{css\_class} \\
adamc@787	1639 \hspace{.1in} \to \$\mt{attrsGiven} \\
adamc@547	1640 \hspace{.1in} \to \mt{tag} \; (\mt{attrsGiven} \rc \mt{attrsAbsent}) \; \mt{ctxOuter} \; \mt{ctxInner} \; \mt{useOuter} \; \mt{bindOuter} \\
adamc@547	1641 \hspace{.1in} \to \mt{xml} \; \mt{ctxInner} \; \mt{useInner} \; \mt{bindInner} \to \mt{xml} \; \mt{ctxOuter} \; (\mt{useOuter} \rc \mt{useInner}) \; (\mt{bindOuter} \rc \mt{bindInner})
adamc@547	1642 \end{array}$$
adamc@787	1643 Note that any tag may be assigned a CSS class. This is the sole way of making use of the values produced by $\mt{style}$ declarations. Ur/Web itself doesn't deal with the syntax or semantics of style sheets; they can be linked via URLs with \texttt{link} tags. However, Ur/Web does make it easy to calculate upper bounds on usage of CSS classes through program analysis.
adamc@547	1644
adamc@547	1645 Two XML fragments may be concatenated.
adamc@547	1646 $$\begin{array}{l}
adamc@547	1647 \mt{val} \; \mt{join} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{bind_1} ::: \{\mt{Type}\} \to \mt{bind_2} ::: \{\mt{Type}\} \\
adamc@547	1648 \hspace{.1in} \to \lambda [\mt{use_1} \sim \mt{bind_1}] \; [\mt{bind_1} \sim \mt{bind_2}] \\
adamc@547	1649 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind_1} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{bind_1}) \; \mt{bind_2} \to \mt{xml} \; \mt{ctx} \; \mt{use_1} \; (\mt{bind_1} \rc \mt{bind_2})
adamc@547	1650 \end{array}$$
adamc@547	1651
adamc@547	1652 Finally, any XML fragment may be updated to ``claim'' to use more form fields than it does.
adamc@547	1653 $$\begin{array}{l}
adamc@547	1654 \mt{val} \; \mt{useMore} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{use_2} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \lambda [\mt{use_1} \sim \mt{use_2}] \\
adamc@547	1655 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{use_2}) \; \mt{bind}
adamc@547	1656 \end{array}$$
adamc@547	1657
adamc@547	1658 We will not list here the different HTML tags and related functions from the standard library. They should be easy enough to understand from the code in \texttt{basis.urs}. The set of tags in the library is not yet claimed to be complete for HTML standards.
adamc@547	1659
adamc@547	1660 One last useful function is for aborting any page generation, returning some XML as an error message. This function takes the place of some uses of a general exception mechanism.
adamc@547	1661 $$\begin{array}{l}
adamc@547	1662 \mt{val} \; \mt{error} : \mt{t} ::: \mt{Type} \to \mt{xml} \; [\mt{Body}] \; [] \; [] \to \mt{t}
adamc@547	1663 \end{array}$$
adamc@547	1664
adamc@549	1665
adamc@701	1666 \subsection{Client-Side Programming}
adamc@659	1667
adamc@701	1668 Ur/Web supports running code on web browsers, via automatic compilation to JavaScript.
adamc@701	1669
adamc@701	1670 \subsubsection{The Basics}
adamc@701	1671
adamc@701	1672 Clients can open alert dialog boxes, in the usual annoying JavaScript way.
adamc@701	1673 $$\begin{array}{l}
adamc@701	1674 \mt{val} \; \mt{alert} : \mt{string} \to \mt{transaction} \; \mt{unit}
adamc@701	1675 \end{array}$$
adamc@701	1676
adamc@701	1677 Any transaction may be run in a new thread with the $\mt{spawn}$ function.
adamc@701	1678 $$\begin{array}{l}
adamc@701	1679 \mt{val} \; \mt{spawn} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit}
adamc@701	1680 \end{array}$$
adamc@701	1681
adamc@701	1682 The current thread can be paused for at least a specified number of milliseconds.
adamc@701	1683 $$\begin{array}{l}
adamc@701	1684 \mt{val} \; \mt{sleep} : \mt{int} \to \mt{transaction} \; \mt{unit}
adamc@701	1685 \end{array}$$
adamc@701	1686
adamc@787	1687 A few functions are available to registers callbacks for particular error events. Respectively, they are triggered on calls to $\mt{error}$, uncaught JavaScript exceptions, failure of remote procedure calls, the severance of the connection serving asynchronous messages, or the occurrence of some other error with that connection. If no handlers are registered for a kind of error, then occurrences of that error are ignored silently.
adamc@787	1688 $$\begin{array}{l}
adamc@787	1689 \mt{val} \; \mt{onError} : (\mt{xbody} \to \mt{transaction} \; \mt{unit}) \to \mt{transaction} \; \mt{unit} \\
adamc@787	1690 \mt{val} \; \mt{onFail} : (\mt{string} \to \mt{transaction} \; \mt{unit}) \to \mt{transaction} \; \mt{unit} \\
adamc@787	1691 \mt{val} \; \mt{onConnectFail} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit} \\
adamc@787	1692 \mt{val} \; \mt{onDisconnect} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit} \\
adamc@787	1693 \mt{val} \; \mt{onServerError} : (\mt{string} \to \mt{transaction} \; \mt{unit}) \to \mt{transaction} \; \mt{unit}
adamc@787	1694 \end{array}$$
adamc@787	1695
adamc@701	1696 \subsubsection{Functional-Reactive Page Generation}
adamc@701	1697
adamc@701	1698 Most approaches to ``AJAX''-style coding involve imperative manipulation of the DOM tree representing an HTML document's structure. Ur/Web follows the \emph{functional-reactive} approach instead. Programs may allocate mutable \emph{sources} of arbitrary types, and an HTML page is effectively a pure function over the latest values of the sources. The page is not mutated directly, but rather it changes automatically as the sources are mutated.
adamc@659	1699
adamc@659	1700 $$\begin{array}{l}
adamc@659	1701 \mt{con} \; \mt{source} :: \mt{Type} \to \mt{Type} \\
adamc@659	1702 \mt{val} \; \mt{source} : \mt{t} ::: \mt{Type} \to \mt{t} \to \mt{transaction} \; (\mt{source} \; \mt{t}) \\
adamc@659	1703 \mt{val} \; \mt{set} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit} \\
adamc@659	1704 \mt{val} \; \mt{get} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{transaction} \; \mt{t}
adamc@659	1705 \end{array}$$
adamc@659	1706
adamc@659	1707 Pure functions over sources are represented in a monad of \emph{signals}.
adamc@659	1708
adamc@659	1709 $$\begin{array}{l}
adamc@659	1710 \mt{con} \; \mt{signal} :: \mt{Type} \to \mt{Type} \\
adamc@659	1711 \mt{val} \; \mt{signal\_monad} : \mt{monad} \; \mt{signal} \\
adamc@659	1712 \mt{val} \; \mt{signal} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{signal} \; \mt{t}
adamc@659	1713 \end{array}$$
adamc@659	1714
adamc@659	1715 A reactive portion of an HTML page is injected with a $\mt{dyn}$ tag, which has a signal-valued attribute $\mt{Signal}$.
adamc@659	1716
adamc@659	1717 $$\begin{array}{l}
adamc@701	1718 \mt{val} \; \mt{dyn} : \mt{use} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \mt{unit} \\
adamc@701	1719 \hspace{.1in} \to \mt{tag} \; [\mt{Signal} = \mt{signal} \; (\mt{xml} \; \mt{body} \; \mt{use} \; \mt{bind})] \; \mt{body} \; [] \; \mt{use} \; \mt{bind}
adamc@659	1720 \end{array}$$
adamc@659	1721
adamc@701	1722 Transactions can be run on the client by including them in attributes like the $\mt{Onclick}$ attribute of $\mt{button}$, and GUI widgets like $\mt{ctextbox}$ have $\mt{Source}$ attributes that can be used to connect them to sources, so that their values can be read by code running because of, e.g., an $\mt{Onclick}$ event.
adamc@701	1723
adamc@701	1724 \subsubsection{Asynchronous Message-Passing}
adamc@701	1725
adamc@701	1726 To support asynchronous, ``server push'' delivery of messages to clients, any client that might need to receive an asynchronous message is assigned a unique ID. These IDs may be retrieved both on the client and on the server, during execution of code related to a client.
adamc@701	1727
adamc@701	1728 $$\begin{array}{l}
adamc@701	1729 \mt{type} \; \mt{client} \\
adamc@701	1730 \mt{val} \; \mt{self} : \mt{transaction} \; \mt{client}
adamc@701	1731 \end{array}$$
adamc@701	1732
adamc@701	1733 \emph{Channels} are the means of message-passing. Each channel is created in the context of a client and belongs to that client; no other client may receive the channel's messages. Each channel type includes the type of values that may be sent over the channel. Sending and receiving are asynchronous, in the sense that a client need not be ready to receive a message right away. Rather, sent messages may queue up, waiting to be processed.
adamc@701	1734
adamc@701	1735 $$\begin{array}{l}
adamc@701	1736 \mt{con} \; \mt{channel} :: \mt{Type} \to \mt{Type} \\
adamc@701	1737 \mt{val} \; \mt{channel} : \mt{t} ::: \mt{Type} \to \mt{transaction} \; (\mt{channel} \; \mt{t}) \\
adamc@701	1738 \mt{val} \; \mt{send} : \mt{t} ::: \mt{Type} \to \mt{channel} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit} \\
adamc@701	1739 \mt{val} \; \mt{recv} : \mt{t} ::: \mt{Type} \to \mt{channel} \; \mt{t} \to \mt{transaction} \; \mt{t}
adamc@701	1740 \end{array}$$
adamc@701	1741
adamc@701	1742 The $\mt{channel}$ and $\mt{send}$ operations may only be executed on the server, and $\mt{recv}$ may only be executed on a client. Neither clients nor channels may be passed as arguments from clients to server-side functions, so persistent channels can only be maintained by storing them in the database and looking them up using the current client ID or some application-specific value as a key.
adamc@701	1743
adamc@701	1744 Clients and channels live only as long as the web browser page views that they are associated with. When a user surfs away, his client and its channels will be garbage-collected, after that user is not heard from for the timeout period. Garbage collection deletes any database row that contains a client or channel directly. Any reference to one of these types inside an $\mt{option}$ is set to $\mt{None}$ instead. Both kinds of handling have the flavor of weak pointers, and that is a useful way to think about clients and channels in the database.
adamc@701	1745
adamc@659	1746
adamc@549	1747 \section{Ur/Web Syntax Extensions}
adamc@549	1748
adamc@549	1749 Ur/Web features some syntactic shorthands for building values using the functions from the last section. This section sketches the grammar of those extensions. We write spans of syntax inside brackets to indicate that they are optional.
adamc@549	1750
adamc@549	1751 \subsection{SQL}
adamc@549	1752
adamc@786	1753 \subsubsection{\label{tables}Table Declarations}
adamc@786	1754
adamc@788	1755 $\mt{table}$ declarations may include constraints, via these grammar rules.
adamc@788	1756 $$\begin{array}{rrcll}
adamc@788	1757 \textrm{Declarations} & d &::=& \mt{table} \; x : c \; [pk[,]] \; cts \\
adamc@788	1758 \textrm{Primary key constraints} & pk &::=& \mt{PRIMARY} \; \mt{KEY} \; K \\
adamc@788	1759 \textrm{Keys} & K &::=& f \mid (f, (f,)^+) \\
adamc@788	1760 \textrm{Constraint sets} & cts &::=& \mt{CONSTRAINT} f \; ct \mid cts, cts \mid \{\{e\}\} \\
adamc@788	1761 \textrm{Constraints} & ct &::=& \mt{UNIQUE} \; K \mid \mt{CHECK} \; E \\
adamc@788	1762 &&& \mid \mt{FOREIGN} \; \mt{KEY} \; K \; \mt{REFERENCES} \; F \; (K) \; [\mt{ON} \; \mt{DELETE} \; pr] \; [\mt{ON} \; \mt{UPDATE} \; pr] \\
adamc@788	1763 \textrm{Foreign tables} & F &::=& x \mid \{\{e\}\} \\
adamc@788	1764 \textrm{Propagation modes} & pr &::=& \mt{NO} \; \mt{ACTION} \mid \mt{RESTRICT} \mid \mt{CASCADE} \mid \mt{SET} \; \mt{NULL}
adamc@788	1765 \end{array}$$
adamc@788	1766
adamc@788	1767 A signature item $\mt{table} \; \mt{x} : \mt{c}$ is actually elaborated into two signature items: $\mt{con} \; \mt{x\_hidden\_constraints} :: \{\{\mt{Unit}\}\}$ and $\mt{val} \; \mt{x} : \mt{sql\_table} \; \mt{c} \; \mt{x\_hidden\_constraints}$. This is appropriate for common cases where client code doesn't care which keys a table has. It's also possible to include constraints after a $\mt{table}$ signature item, with the same syntax as for $\mt{table}$ declarations. This may look like dependent typing, but it's just a convenience. The constraints are type-checked to determine a constructor $u$ to include in $\mt{val} \; \mt{x} : \mt{sql\_table} \; \mt{c} \; (u \rc \mt{x\_hidden\_constraints})$, and then the expressions are thrown away. Nonetheless, it can be useful for documentation purposes to include table constraint details in signatures. Note that the automatic generation of $\mt{x\_hidden\_constraints}$ leads to a kind of free subtyping with respect to which constraints are defined.
adamc@788	1768
adamc@788	1769
adamc@549	1770 \subsubsection{Queries}
adamc@549	1771
adamc@550	1772 Queries $Q$ are added to the rules for expressions $e$.
adamc@550	1773
adamc@549	1774 $$\begin{array}{rrcll}
adamc@550	1775 \textrm{Queries} & Q &::=& (q \; [\mt{ORDER} \; \mt{BY} \; (E \; [o],)^+] \; [\mt{LIMIT} \; N] \; [\mt{OFFSET} \; N]) \\
adamc@549	1776 \textrm{Pre-queries} & q &::=& \mt{SELECT} \; P \; \mt{FROM} \; T,^+ \; [\mt{WHERE} \; E] \; [\mt{GROUP} \; \mt{BY} \; p,^+] \; [\mt{HAVING} \; E] \\
adamc@549	1777 &&& \mid q \; R \; q \\
adamc@549	1778 \textrm{Relational operators} & R &::=& \mt{UNION} \mid \mt{INTERSECT} \mid \mt{EXCEPT}
adamc@549	1779 \end{array}$$
adamc@549	1780
adamc@549	1781 $$\begin{array}{rrcll}
adamc@549	1782 \textrm{Projections} & P &::=& \ast & \textrm{all columns} \\
adamc@549	1783 &&& p,^+ & \textrm{particular columns} \\
adamc@549	1784 \textrm{Pre-projections} & p &::=& t.f & \textrm{one column from a table} \\
adamc@558	1785 &&& t.\{\{c\}\} & \textrm{a record of columns from a table (of kind $\{\mt{Type}\}$)} \\
adamc@549	1786 \textrm{Table names} & t &::=& x & \textrm{constant table name (automatically capitalized)} \\
adamc@549	1787 &&& X & \textrm{constant table name} \\
adamc@549	1788 &&& \{\{c\}\} & \textrm{computed table name (of kind $\mt{Name}$)} \\
adamc@549	1789 \textrm{Column names} & f &::=& X & \textrm{constant column name} \\
adamc@549	1790 &&& \{c\} & \textrm{computed column name (of kind $\mt{Name}$)} \\
adamc@549	1791 \textrm{Tables} & T &::=& x & \textrm{table variable, named locally by its own capitalization} \\
adamc@549	1792 &&& x \; \mt{AS} \; t & \textrm{table variable, with local name} \\
adamc@549	1793 &&& \{\{e\}\} \; \mt{AS} \; t & \textrm{computed table expression, with local name} \\
adamc@549	1794 \textrm{SQL expressions} & E &::=& p & \textrm{column references} \\
adamc@549	1795 &&& X & \textrm{named expression references} \\
adamc@549	1796 &&& \{\{e\}\} & \textrm{injected native Ur expressions} \\
adamc@549	1797 &&& \{e\} & \textrm{computed expressions, probably using $\mt{sql\_exp}$ directly} \\
adamc@549	1798 &&& \mt{TRUE} \mid \mt{FALSE} & \textrm{boolean constants} \\
adamc@549	1799 &&& \ell & \textrm{primitive type literals} \\
adamc@549	1800 &&& \mt{NULL} & \textrm{null value (injection of $\mt{None}$)} \\
adamc@549	1801 &&& E \; \mt{IS} \; \mt{NULL} & \textrm{nullness test} \\
adamc@549	1802 &&& n & \textrm{nullary operators} \\
adamc@549	1803 &&& u \; E & \textrm{unary operators} \\
adamc@549	1804 &&& E \; b \; E & \textrm{binary operators} \\
adamc@549	1805 &&& \mt{COUNT}(\ast) & \textrm{count number of rows} \\
adamc@549	1806 &&& a(E) & \textrm{other aggregate function} \\
adamc@549	1807 &&& (E) & \textrm{explicit precedence} \\
adamc@549	1808 \textrm{Nullary operators} & n &::=& \mt{CURRENT\_TIMESTAMP} \\
adamc@549	1809 \textrm{Unary operators} & u &::=& \mt{NOT} \\
adamc@549	1810 \textrm{Binary operators} & b &::=& \mt{AND} \mid \mt{OR} \mid \neq \mid < \mid \leq \mid > \mid \geq \\
adamc@549	1811 \textrm{Aggregate functions} & a &::=& \mt{AVG} \mid \mt{SUM} \mid \mt{MIN} \mid \mt{MAX} \\
adamc@550	1812 \textrm{Directions} & o &::=& \mt{ASC} \mid \mt{DESC} \\
adamc@549	1813 \textrm{SQL integer} & N &::=& n \mid \{e\} \\
adamc@549	1814 \end{array}$$
adamc@549	1815
adamc@549	1816 Additionally, an SQL expression may be inserted into normal Ur code with the syntax $(\mt{SQL} \; E)$ or $(\mt{WHERE} \; E)$.
adamc@549	1817
adamc@550	1818 \subsubsection{DML}
adamc@550	1819
adamc@550	1820 DML commands $D$ are added to the rules for expressions $e$.
adamc@550	1821
adamc@550	1822 $$\begin{array}{rrcll}
adamc@550	1823 \textrm{Commands} & D &::=& (\mt{INSERT} \; \mt{INTO} \; T^E \; (f,^+) \; \mt{VALUES} \; (E,^+)) \\
adamc@550	1824 &&& (\mt{UPDATE} \; T^E \; \mt{SET} \; (f = E,)^+ \; \mt{WHERE} \; E) \\
adamc@550	1825 &&& (\mt{DELETE} \; \mt{FROM} \; T^E \; \mt{WHERE} \; E) \\
adamc@550	1826 \textrm{Table expressions} & T^E &::=& x \mid \{\{e\}\}
adamc@550	1827 \end{array}$$
adamc@550	1828
adamc@550	1829 Inside $\mt{UPDATE}$ and $\mt{DELETE}$ commands, lone variables $X$ are interpreted as references to columns of the implicit table $\mt{T}$, rather than to named expressions.
adamc@549	1830
adamc@551	1831 \subsection{XML}
adamc@551	1832
adamc@551	1833 XML fragments $L$ are added to the rules for expressions $e$.
adamc@551	1834
adamc@551	1835 $$\begin{array}{rrcll}
adamc@551	1836 \textrm{XML fragments} & L &::=& \texttt{<xml/>} \mid \texttt{<xml>}l^*\texttt{</xml>} \\
adamc@551	1837 \textrm{XML pieces} & l &::=& \textrm{text} & \textrm{cdata} \\
adamc@551	1838 &&& \texttt{<}g\texttt{/>} & \textrm{tag with no children} \\
adamc@551	1839 &&& \texttt{<}g\texttt{>}l^*\texttt{</}x\texttt{>} & \textrm{tag with children} \\
adamc@559	1840 &&& \{e\} & \textrm{computed XML fragment} \\
adamc@559	1841 &&& \{[e]\} & \textrm{injection of an Ur expression, via the $\mt{Top}.\mt{txt}$ function} \\
adamc@551	1842 \textrm{Tag} & g &::=& h \; (x = v)^* \\
adamc@551	1843 \textrm{Tag head} & h &::=& x & \textrm{tag name} \\
adamc@551	1844 &&& h\{c\} & \textrm{constructor parameter} \\
adamc@551	1845 \textrm{Attribute value} & v &::=& \ell & \textrm{literal value} \\
adamc@551	1846 &&& \{e\} & \textrm{computed value} \\
adamc@551	1847 \end{array}$$
adamc@551	1848
adamc@552	1849
adamc@553	1850 \section{The Structure of Web Applications}
adamc@553	1851
adamc@553	1852 A web application is built from a series of modules, with one module, the last one appearing in the \texttt{.urp} file, designated as the main module. The signature of the main module determines the URL entry points to the application. Such an entry point should have type $\mt{unit} \to \mt{transaction} \; \mt{page}$, where $\mt{page}$ is a type synonym for top-level HTML pages, defined in $\mt{Basis}$. If such a function is at the top level of main module $M$, it will be accessible at URI \texttt{/M/f}, and so on for more deeply-nested functions, as described in Section \ref{tag} below.
adamc@553	1853
adamc@553	1854 When the standalone web server receives a request for a known page, it calls the function for that page, ``running'' the resulting transaction to produce the page to return to the client. Pages link to other pages with the \texttt{link} attribute of the \texttt{a} HTML tag. A link has type $\mt{transaction} \; \mt{page}$, and the semantics of a link are that this transaction should be run to compute the result page, when the link is followed. Link targets are assigned URL names in the same way as top-level entry points.
adamc@553	1855
adamc@553	1856 HTML forms are handled in a similar way. The $\mt{action}$ attribute of a $\mt{submit}$ form tag takes a value of type $\$\mt{use} \to \mt{transaction} \; \mt{page}$, where $\mt{use}$ is a kind-$\{\mt{Type}\}$ record of the form fields used by this action handler. Action handlers are assigned URL patterns in the same way as above.
adamc@553	1857
adamc@558	1858 For both links and actions, direct arguments and local variables mentioned implicitly via closures are automatically included in serialized form in URLs, in the order in which they appear in the source code.
adamc@553	1859
adamc@660	1860 Ur/Web programs generally mix server- and client-side code in a fairly transparent way. The one important restriction is that mixed client-server code must encapsulate all server-side pieces within named functions. This is because execution of such pieces will be implemented by explicit calls to the remote web server, and it is useful to get the programmer's help in designing the interface to be used. For example, this makes it easier to allow a client running an old version of an application to continue interacting with a server that has been upgraded to a new version, if the programmer took care to keep the interfaces of all of the old remote calls the same. The functions implementing these services are assigned names in the same way as normal web entry points, by using module structure.
adamc@660	1861
adamc@789	1862 \medskip
adamc@789	1863
adamc@789	1864 The HTTP standard suggests that GET requests only be used in ways that generate no side effects. Side effecting operations should use POST requests instead. The Ur/Web compiler enforces this rule strictly, via a simple conservative program analysis. Any page that may have a side effect must be accessed through a form, all of which use POST requests. A page is judged to have a side effect if its code depends syntactically on any of the side-effecting, server-side FFI functions. Links, forms, and most client-side event handlers are not followed during this syntactic traversal, but \texttt{<body onload=\{...\}>} handlers \emph{are} examined, since they run right away and could just as well be considered parts of main page handlers.
adamc@789	1865
adamc@789	1866 Ur/Web includes a kind of automatic protection against cross site request forgery attacks. Whenever any page execution can have side effects and can also read at least one cookie value, all cookie values must be signed cryptographically, to ensure that the user has come to the current page by submitting a form on a real page generated by the proper server. Signing and signature checking are inserted automatically by the compiler. This prevents attacks like phishing schemes where users are directed to counterfeit pages with forms that submit to your application, where a user's cookies might be submitted without his knowledge, causing some undesired side effect.
adamc@789	1867
adamc@553	1868
adamc@552	1869 \section{Compiler Phases}
adamc@552	1870
adamc@552	1871 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
adamc@552	1872
adamc@552	1873 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.
adamc@552	1874
adamc@552	1875 \subsection{Parse}
adamc@552	1876
adamc@552	1877 The compiler reads a \texttt{.urp} file, figures out which \texttt{.urs} and \texttt{.ur} files it references, and combines them all into what is conceptually a single sequence of declarations in the core language of Section \ref{core}.
adamc@552	1878
adamc@552	1879 \subsection{Elaborate}
adamc@552	1880
adamc@552	1881 This is where type inference takes place, translating programs into an explicit form with no more wildcards. This phase is the most likely source of compiler error messages.
adamc@552	1882
adamc@552	1883 \subsection{Unnest}
adamc@552	1884
adamc@552	1885 Named local function definitions are moved to the top level, to avoid the need to generate closures.
adamc@552	1886
adamc@552	1887 \subsection{Corify}
adamc@552	1888
adamc@552	1889 Module system features are compiled away, through inlining of functor definitions at application sites. Afterward, most abstraction boundaries are broken, facilitating optimization.
adamc@552	1890
adamc@552	1891 \subsection{Especialize}
adamc@552	1892
adamc@552	1893 Functions are specialized to particular argument patterns. This is an important trick for avoiding the need to maintain any closures at runtime.
adamc@552	1894
adamc@552	1895 \subsection{Untangle}
adamc@552	1896
adamc@552	1897 Remove unnecessary mutual recursion, splitting recursive groups into strongly-connected components.
adamc@552	1898
adamc@552	1899 \subsection{Shake}
adamc@552	1900
adamc@552	1901 Remove all definitions not needed to run the page handlers that are visible in the signature of the last module listed in the \texttt{.urp} file.
adamc@552	1902
adamc@661	1903 \subsection{Rpcify}
adamc@661	1904
adamc@661	1905 Pieces of code are determined to be client-side, server-side, neither, or both, by figuring out which standard library functions might be needed to execute them. Calls to server-side functions (e.g., $\mt{query}$) within mixed client-server code are identified and replaced with explicit remote calls. Some mixed functions may be converted to continuation-passing style to facilitate this transformation.
adamc@661	1906
adamc@661	1907 \subsection{Untangle, Shake}
adamc@661	1908
adamc@661	1909 Repeat these simplifications.
adamc@661	1910
adamc@553	1911 \subsection{\label{tag}Tag}
adamc@552	1912
adamc@552	1913 Assign a URL name to each link and form action. It is important that these links and actions are written as applications of named functions, because such names are used to generate URL patterns. A URL pattern has a name built from the full module path of the named function, followed by the function name, with all pieces separated by slashes. The path of a functor application is based on the name given to the result, rather than the path of the functor itself.
adamc@552	1914
adamc@552	1915 \subsection{Reduce}
adamc@552	1916
adamc@552	1917 Apply definitional equality rules to simplify the program as much as possible. This effectively includes inlining of every non-recursive definition.
adamc@552	1918
adamc@552	1919 \subsection{Unpoly}
adamc@552	1920
adamc@552	1921 This phase specializes polymorphic functions to the specific arguments passed to them in the program. If the program contains real polymorphic recursion, Unpoly will be insufficient to avoid later error messages about too much polymorphism.
adamc@552	1922
adamc@552	1923 \subsection{Specialize}
adamc@552	1924
adamc@558	1925 Replace uses of parameterized datatypes with versions specialized to specific parameters. As for Unpoly, this phase will not be effective enough in the presence of polymorphic recursion or other fancy uses of impredicative polymorphism.
adamc@552	1926
adamc@552	1927 \subsection{Shake}
adamc@552	1928
adamc@558	1929 Here the compiler repeats the earlier Shake phase.
adamc@552	1930
adamc@552	1931 \subsection{Monoize}
adamc@552	1932
adamc@552	1933 Programs are translated to a new intermediate language without polymorphism or non-$\mt{Type}$ constructors. Error messages may pop up here if earlier phases failed to remove such features.
adamc@552	1934
adamc@552	1935 This is the stage at which concrete names are generated for cookies, tables, and sequences. They are named following the same convention as for links and actions, based on module path information saved from earlier stages. Table and sequence names separate path elements with underscores instead of slashes, and they are prefixed by \texttt{uw\_}.
adamc@664	1936
adamc@552	1937 \subsection{MonoOpt}
adamc@552	1938
adamc@552	1939 Simple algebraic laws are applied to simplify the program, focusing especially on efficient imperative generation of HTML pages.
adamc@552	1940
adamc@552	1941 \subsection{MonoUntangle}
adamc@552	1942
adamc@552	1943 Unnecessary mutual recursion is broken up again.
adamc@552	1944
adamc@552	1945 \subsection{MonoReduce}
adamc@552	1946
adamc@552	1947 Equivalents of the definitional equality rules are applied to simplify programs, with inlining again playing a major role.
adamc@552	1948
adamc@552	1949 \subsection{MonoShake, MonoOpt}
adamc@552	1950
adamc@552	1951 Unneeded declarations are removed, and basic optimizations are repeated.
adamc@552	1952
adamc@552	1953 \subsection{Fuse}
adamc@552	1954
adamc@552	1955 The compiler tries to simplify calls to recursive functions whose results are immediately written as page output. The write action is pushed inside the function definitions to avoid allocation of intermediate results.
adamc@552	1956
adamc@552	1957 \subsection{MonoUntangle, MonoShake}
adamc@552	1958
adamc@552	1959 Fuse often creates more opportunities to remove spurious mutual recursion.
adamc@552	1960
adamc@552	1961 \subsection{Pathcheck}
adamc@552	1962
adamc@552	1963 The compiler checks that no link or action name has been used more than once.
adamc@552	1964
adamc@552	1965 \subsection{Cjrize}
adamc@552	1966
adamc@552	1967 The program is translated to what is more or less a subset of C. If any use of functions as data remains at this point, the compiler will complain.
adamc@552	1968
adamc@552	1969 \subsection{C Compilation and Linking}
adamc@552	1970
adamc@552	1971 The output of the last phase is pretty-printed as C source code and passed to GCC.
adamc@552	1972
adamc@552	1973
adamc@524	1974 \end{document}

Mercurial > urweb

annotate doc/manual.tex @ 873:41971801b62d