Ur/Web: doc/manual.tex annotate

annotate doc/manual.tex @ 788:f5fea13a60b9

Constraint syntax

author	Adam Chlipala <adamc@hcoop.net>
date	Tue, 05 May 2009 14:36:16 -0400
parents	c9c76aeaae64
children	d18f6f0bcf60

rev	line source
adamc@524	1 \documentclass{article}
adamc@554	2 \usepackage{fullpage,amsmath,amssymb,proof,url}
adamc@524	3
adamc@524	4 \newcommand{\cd}[1]{\texttt{#1}}
adamc@524	5 \newcommand{\mt}[1]{\mathsf{#1}}
adamc@524	6
adamc@524	7 \newcommand{\rc}{+ \hspace{-.075in} + \;}
adamc@527	8 \newcommand{\rcut}{\; \texttt{--} \;}
adamc@527	9 \newcommand{\rcutM}{\; \texttt{---} \;}
adamc@524	10
adamc@524	11 \begin{document}
adamc@524	12
adamc@524	13 \title{The Ur/Web Manual}
adamc@524	14 \author{Adam Chlipala}
adamc@524	15
adamc@524	16 \maketitle
adamc@524	17
adamc@540	18 \tableofcontents
adamc@540	19
adamc@554	20
adamc@554	21 \section{Introduction}
adamc@554	22
adamc@554	23 \emph{Ur} is a programming language designed to introduce richer type system features into functional programming in the tradition of ML and Haskell. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of \emph{metaprogramming} based on \emph{row types}.
adamc@554	24
adamc@554	25 \emph{Ur/Web} is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs ``don't go wrong'' in a very broad sense. Not only do they not crash during particular page generations, but they also may not:
adamc@554	26
adamc@554	27 \begin{itemize}
adamc@554	28 \item Suffer from any kinds of code-injection attacks
adamc@554	29 \item Return invalid HTML
adamc@554	30 \item Contain dead intra-application links
adamc@554	31 \item Have mismatches between HTML forms and the fields expected by their handlers
adamc@652	32 \item Include client-side code that makes incorrect assumptions about the ``AJAX''-style services that the remote web server provides
adamc@554	33 \item Attempt invalid SQL queries
adamc@652	34 \item Use improper marshaling or unmarshaling in communication with SQL databases or between browsers and web servers
adamc@554	35 \end{itemize}
adamc@554	36
adamc@554	37 This type safety is just the foundation of the Ur/Web methodology. It is also possible to use metaprogramming to build significant application pieces by analysis of type structure. For instance, the demo includes an ML-style functor for building an admin interface for an arbitrary SQL table. The type system guarantees that the admin interface sub-application that comes out will always be free of the above-listed bugs, no matter which well-typed table description is given as input.
adamc@554	38
adamc@652	39 The Ur/Web compiler also produces very efficient object code that does not use garbage collection. These compiled programs will often be even more efficient than what most programmers would bother to write in C. The compiler also generates JavaScript versions of client-side code, with no need to write those parts of applications in a different language.
adamc@554	40
adamc@554	41 \medskip
adamc@554	42
adamc@554	43 The official web site for Ur is:
adamc@554	44 \begin{center}
adamc@554	45 \url{http://www.impredicative.com/ur/}
adamc@554	46 \end{center}
adamc@554	47
adamc@555	48
adamc@555	49 \section{Installation}
adamc@555	50
adamc@555	51 If you are lucky, then the following standard command sequence will suffice for installation, in a directory to which you have unpacked the latest distribution tarball.
adamc@555	52
adamc@555	53 \begin{verbatim}
adamc@555	54 ./configure
adamc@555	55 make
adamc@555	56 sudo make install
adamc@555	57 \end{verbatim}
adamc@555	58
adamc@783	59 Some other packages must be installed for the above to work. At a minimum, you need a standard UNIX shell, with standard UNIX tools like sed and GCC in your execution path; MLton, the whole-program optimizing compiler for Standard ML; and the mhash C library. To build programs that access SQL databases, you also need libpq, the PostgreSQL client library. As of this writing, in the ``testing'' version of Debian Linux, this command will install the more uncommon of these dependencies:
adamc@555	60
adamc@555	61 \begin{verbatim}
adamc@783	62 apt-get install mlton libmhash-dev libpq-dev
adamc@555	63 \end{verbatim}
adamc@555	64
adamc@555	65 It is also possible to access the modules of the Ur/Web compiler interactively, within Standard ML of New Jersey. To install the prerequisites in Debian testing:
adamc@555	66
adamc@555	67 \begin{verbatim}
adamc@555	68 apt-get install smlnj libsmlnj-smlnj ml-yacc ml-lpt
adamc@555	69 \end{verbatim}
adamc@555	70
adamc@555	71 To begin an interactive session with the Ur compiler modules, run \texttt{make smlnj}, and then, from within an \texttt{sml} session, run \texttt{CM.make "src/urweb.cm";}. The \texttt{Compiler} module is the main entry point.
adamc@555	72
adamc@555	73 To run an SQL-backed application, you will probably want to install the PostgreSQL server. Version 8.3 or higher is required.
adamc@555	74
adamc@555	75 \begin{verbatim}
adamc@555	76 apt-get install postgresql-8.3
adamc@555	77 \end{verbatim}
adamc@555	78
adamc@555	79 To use the Emacs mode, you must have a modern Emacs installed. We assume that you already know how to do this, if you're in the business of looking for an Emacs mode. The demo generation facility of the compiler will also call out to Emacs to syntax-highlight code, and that process depends on the \texttt{htmlize} module, which can be installed in Debian testing via:
adamc@555	80
adamc@555	81 \begin{verbatim}
adamc@555	82 apt-get install emacs-goodies-el
adamc@555	83 \end{verbatim}
adamc@555	84
adamc@555	85 Even with the right packages installed, configuration and building might fail to work. After you run \texttt{./configure}, you will see the values of some named environment variables printed. You may need to adjust these values to get proper installation for your system. To change a value, store your preferred alternative in the corresponding UNIX environment variable, before running \texttt{./configure}. For instance, here is how to change the list of extra arguments that the Ur/Web compiler will pass to GCC on every invocation.
adamc@555	86
adamc@555	87 \begin{verbatim}
adamc@555	88 GCCARGS=-fnested-functions ./configure
adamc@555	89 \end{verbatim}
adamc@555	90
adamc@555	91 Some OSX users have reported needing to use this particular GCCARGS value.
adamc@555	92
adamc@555	93 The Emacs mode can be set to autoload by adding the following to your \texttt{.emacs} file.
adamc@555	94
adamc@555	95 \begin{verbatim}
adamc@555	96 (add-to-list 'load-path "/usr/local/share/emacs/site-lisp/urweb-mode")
adamc@555	97 (load "urweb-mode-startup")
adamc@555	98 \end{verbatim}
adamc@555	99
adamc@555	100 Change the path in the first line if you chose a different Emacs installation path during configuration.
adamc@555	101
adamc@555	102
adamc@556	103 \section{Command-Line Compiler}
adamc@556	104
adamc@556	105 \subsection{Project Files}
adamc@556	106
adamc@556	107 The basic inputs to the \texttt{urweb} compiler are project files, which have the extension \texttt{.urp}. Here is a sample \texttt{.urp} file.
adamc@556	108
adamc@556	109 \begin{verbatim}
adamc@556	110 database dbname=test
adamc@556	111 sql crud1.sql
adamc@556	112
adamc@556	113 crud
adamc@556	114 crud1
adamc@556	115 \end{verbatim}
adamc@556	116
adamc@556	117 The \texttt{database} line gives the database information string to pass to libpq. In this case, the string only says to connect to a local database named \texttt{test}.
adamc@556	118
adamc@556	119 The \texttt{sql} line asks for an SQL source file to be generated, giving the commands to run to create the tables and sequences that this application expects to find. After building this \texttt{.urp} file, the following commands could be used to initialize the database, assuming that the current UNIX user exists as a Postgres user with database creation privileges:
adamc@556	120
adamc@556	121 \begin{verbatim}
adamc@556	122 createdb test
adamc@556	123 psql -f crud1.sql test
adamc@556	124 \end{verbatim}
adamc@556	125
adamc@556	126 A blank line always separates the named directives from a list of modules to include in the project; if there are no named directives, a blank line must begin the file.
adamc@556	127
adamc@556	128 For each entry \texttt{M} in the module list, the file \texttt{M.urs} is included in the project if it exists, and the file \texttt{M.ur} must exist and is always included.
adamc@556	129
adamc@783	130 Here is the complete list of directive forms. ``FFI'' stands for ``foreign function interface,'' Ur's facility for interaction between Ur programs and C and JavaScript libraries.
adamc@783	131 \begin{itemize}
adamc@783	132 \item \texttt{[allow\|deny] [url\|mime] PATTERN} registers a rule governing which URLs or MIME types are allowed in this application. The first such rule to match a URL or MIME type determines the verdict. If \texttt{PATTERN} ends in \texttt{*}, it is interpreted as a prefix rule. Otherwise, a string must match it exactly.
adamc@783	133 \item \texttt{clientOnly Module.ident} registers an FFI function or transaction that may only be run in client browsers.
adamc@783	134 \item \texttt{clientToServer Module.ident} adds FFI type \texttt{Module.ident} to the list of types that are OK to marshal from clients to servers. Values like XML trees and SQL queries are hard to marshal without introducing expensive validity checks, so it's easier to ensure that the server never trusts clients to send such values. The file \texttt{include/urweb.h} shows examples of the C support functions that are required of any type that may be marshalled. These include \texttt{attrify}, \texttt{urlify}, and \texttt{unurlify} functions.
adamc@783	135 \item \texttt{database DBSTRING} sets the string to pass to libpq to open a database connection.
adamc@783	136 \item \texttt{debug} saves some intermediate C files, which is mostly useful to help in debugging the compiler itself.
adamc@783	137 \item \texttt{effectful Module.ident} registers an FFI function or transaction as having side effects. The optimizer avoids removing, moving, or duplicating calls to such functions. Every effectful FFI function must be registered, or the optimizer may make invalid transformations.
adamc@783	138 \item \texttt{exe FILENAME} sets the filename to which to write the output executable. The default for file \texttt{P.urp} is \texttt{P.exe}.
adamc@783	139 \item \texttt{ffi FILENAME} reads the file \texttt{FILENAME.urs} to determine the interface to a new FFI module. The name of the module is calculated from \texttt{FILENAME} in the same way as for normal source files. See the files \texttt{include/urweb.h} and \texttt{src/c/urweb.c} for examples of C headers and implementations for FFI modules. In general, every type or value \texttt{Module.ident} becomes \texttt{uw\_Module\_ident} in C.
adamc@783	140 \item \texttt{jsFunc Module.ident=name} gives the JavaScript name of an FFI value.
adamc@783	141 \item \texttt{library FILENAME} parses \texttt{FILENAME.urp} and merges its contents with the rest of the current file's contents.
adamc@783	142 \item \texttt{link FILENAME} adds \texttt{FILENAME} to the list of files to be passed to the GCC linker at the end of compilation. This is most useful for importing extra libraries needed by new FFI modules.
adamc@783	143 \item \texttt{prefix PREFIX} sets the prefix included before every URI within the generated application. The default is \texttt{/}.
adamc@783	144 \item \texttt{profile} generates an executable that may be used with gprof.
adamc@783	145 \item \texttt{rewrite KIND FROM TO} gives a rule for rewriting canonical module paths. For instance, the canonical path of a page may be \texttt{Mod1.Mod2.mypage}, while you would rather the page were accessed via a URL containing only \texttt{page}. The directive \texttt{rewrite url Mod1/Mod2/mypage page} would accomplish that. The possible values of \texttt{KIND} determine which kinds of objects are affected. The kind \texttt{all} matches any object, and \texttt{url} matches page URLs. The kinds \texttt{table}, \texttt{sequence}, and \texttt{view} match those sorts of SQL entities, and \texttt{relation} matches any of those three. \texttt{cookie} matches HTTP cookies, and \texttt{style} matches CSS class names. If \texttt{FROM} ends in \texttt{/*}, it is interpreted as a prefix matching rule, and rewriting occurs by replacing only the appropriate prefix of a path with \texttt{TO}. While the actual external names of relations and styles have parts separated by underscores instead of slashes, all rewrite rules must be written in terms of slashes.
adamc@783	146 \item \texttt{script URL} adds \texttt{URL} to the list of extra JavaScript files to be included at the beginning of any page that uses JavaScript. This is most useful for importing JavaScript versions of functions found in new FFI modules.
adamc@783	147 \item \texttt{serverOnly Module.ident} registers an FFI function or transaction that may only be run on the server.
adamc@783	148 \item \texttt{sql FILENAME} sets where to write an SQL file with the commands to create the expected database schema. The default is not to create such a file.
adamc@783	149 \item \texttt{timeout N} sets to \texttt{N} seconds the amount of time that the generated server will wait after the last contact from a client before determining that that client has exited the application. Clients that remain active will take the timeout setting into account in determining how often to ping the server, so it only makes sense to set a high timeout to cope with browser and network delays and failures. Higher timeouts can lead to more unnecessary client information taking up memory on the server. The timeout goes unused by any page that doesn't involve the \texttt{recv} function, since the server only needs to store per-client information for clients that receive asynchronous messages.
adamc@783	150 \end{itemize}
adamc@701	151
adamc@701	152
adamc@557	153 \subsection{Building an Application}
adamc@557	154
adamc@557	155 To compile project \texttt{P.urp}, simply run
adamc@557	156 \begin{verbatim}
adamc@557	157 urweb P
adamc@557	158 \end{verbatim}
adamc@558	159 The output executable is a standalone web server. Run it with the command-line argument \texttt{-h} to see which options it takes. If the project file lists a database, the web server will attempt to connect to that database on startup.
adamc@557	160
adamc@557	161 To time how long the different compiler phases run, without generating an executable, run
adamc@557	162 \begin{verbatim}
adamc@557	163 urweb -timing P
adamc@557	164 \end{verbatim}
adamc@557	165
adamc@556	166
adamc@529	167 \section{Ur Syntax}
adamc@529	168
adamc@784	169 In this section, we describe the syntax of Ur, deferring to a later section discussion of most of the syntax specific to SQL and XML. The sole exceptions are the declaration forms for relations, cookies, and styles.
adamc@524	170
adamc@524	171 \subsection{Lexical Conventions}
adamc@524	172
adamc@524	173 We give the Ur language definition in \LaTeX $\;$ math mode, since that is prettier than monospaced ASCII. The corresponding ASCII syntax can be read off directly. Here is the key for mapping math symbols to ASCII character sequences.
adamc@524	174
adamc@524	175 \begin{center}
adamc@524	176 \begin{tabular}{rl}
adamc@524	177 \textbf{\LaTeX} & \textbf{ASCII} \\
adamc@524	178 $\to$ & \cd{->} \\
adamc@652	179 $\longrightarrow$ & \cd{-->} \\
adamc@524	180 $\times$ & \cd{*} \\
adamc@524	181 $\lambda$ & \cd{fn} \\
adamc@524	182 $\Rightarrow$ & \cd{=>} \\
adamc@652	183 $\Longrightarrow$ & \cd{==>} \\
adamc@529	184 $\neq$ & \cd{<>} \\
adamc@529	185 $\leq$ & \cd{<=} \\
adamc@529	186 $\geq$ & \cd{>=} \\
adamc@524	187 \\
adamc@524	188 $x$ & Normal textual identifier, not beginning with an uppercase letter \\
adamc@525	189 $X$ & Normal textual identifier, beginning with an uppercase letter \\
adamc@524	190 \end{tabular}
adamc@524	191 \end{center}
adamc@524	192
adamc@525	193 We often write syntax like $e^$ to indicate zero or more copies of $e$, $e^+$ to indicate one or more copies, and $e,^$ and $e,^+$ to indicate multiple copies separated by commas. Another separator may be used in place of a comma. The $e$ term may be surrounded by parentheses to indicate grouping; those parentheses should not be included in the actual ASCII.
adamc@524	194
adamc@526	195 We write $\ell$ for literals of the primitive types, for the most part following C conventions. There are $\mt{int}$, $\mt{float}$, and $\mt{string}$ literals.
adamc@526	196
adamc@527	197 This version of the manual doesn't include operator precedences; see \texttt{src/urweb.grm} for that.
adamc@527	198
adamc@552	199 \subsection{\label{core}Core Syntax}
adamc@524	200
adamc@524	201 \emph{Kinds} classify types and other compile-time-only entities. Each kind in the grammar is listed with a description of the sort of data it classifies.
adamc@524	202 $$\begin{array}{rrcll}
adamc@524	203 \textrm{Kinds} & \kappa &::=& \mt{Type} & \textrm{proper types} \\
adamc@525	204 &&& \mt{Unit} & \textrm{the trivial constructor} \\
adamc@525	205 &&& \mt{Name} & \textrm{field names} \\
adamc@525	206 &&& \kappa \to \kappa & \textrm{type-level functions} \\
adamc@525	207 &&& \{\kappa\} & \textrm{type-level records} \\
adamc@525	208 &&& (\kappa\times^+) & \textrm{type-level tuples} \\
adamc@652	209 &&& X & \textrm{variable} \\
adamc@652	210 &&& X \longrightarrow k & \textrm{kind-polymorphic type-level function} \\
adamc@529	211 &&& \_\_ & \textrm{wildcard} \\
adamc@525	212 &&& (\kappa) & \textrm{explicit precedence} \\
adamc@524	213 \end{array}$$
adamc@524	214
adamc@524	215 Ur supports several different notions of functions that take types as arguments. These arguments can be either implicit, causing them to be inferred at use sites; or explicit, forcing them to be specified manually at use sites. There is a common explicitness annotation convention applied at the definitions of and in the types of such functions.
adamc@524	216 $$\begin{array}{rrcll}
adamc@524	217 \textrm{Explicitness} & ? &::=& :: & \textrm{explicit} \\
adamc@558	218 &&& ::: & \textrm{implicit}
adamc@524	219 \end{array}$$
adamc@524	220
adamc@524	221 \emph{Constructors} are the main class of compile-time-only data. They include proper types and are classified by kinds.
adamc@524	222 $$\begin{array}{rrcll}
adamc@524	223 \textrm{Constructors} & c, \tau &::=& (c) :: \kappa & \textrm{kind annotation} \\
adamc@530	224 &&& \hat{x} & \textrm{constructor variable} \\
adamc@524	225 \\
adamc@525	226 &&& \tau \to \tau & \textrm{function type} \\
adamc@525	227 &&& x \; ? \; \kappa \to \tau & \textrm{polymorphic function type} \\
adamc@652	228 &&& X \longrightarrow \tau & \textrm{kind-polymorphic function type} \\
adamc@525	229 &&& \$ c & \textrm{record type} \\
adamc@524	230 \\
adamc@525	231 &&& c \; c & \textrm{type-level function application} \\
adamc@530	232 &&& \lambda x \; :: \; \kappa \Rightarrow c & \textrm{type-level function abstraction} \\
adamc@524	233 \\
adamc@652	234 &&& X \Longrightarrow c & \textrm{type-level kind-polymorphic function abstraction} \\
adamc@655	235 &&& c [\kappa] & \textrm{type-level kind-polymorphic function application} \\
adamc@652	236 \\
adamc@525	237 &&& () & \textrm{type-level unit} \\
adamc@525	238 &&& \#X & \textrm{field name} \\
adamc@524	239 \\
adamc@525	240 &&& [(c = c)^*] & \textrm{known-length type-level record} \\
adamc@525	241 &&& c \rc c & \textrm{type-level record concatenation} \\
adamc@652	242 &&& \mt{map} & \textrm{type-level record map} \\
adamc@524	243 \\
adamc@558	244 &&& (c,^+) & \textrm{type-level tuple} \\
adamc@525	245 &&& c.n & \textrm{type-level tuple projection ($n \in \mathbb N^+$)} \\
adamc@524	246 \\
adamc@652	247 &&& [c \sim c] \Rightarrow \tau & \textrm{guarded type} \\
adamc@524	248 \\
adamc@529	249 &&& \_ :: \kappa & \textrm{wildcard} \\
adamc@525	250 &&& (c) & \textrm{explicit precedence} \\
adamc@530	251 \\
adamc@530	252 \textrm{Qualified uncapitalized variables} & \hat{x} &::=& x & \textrm{not from a module} \\
adamc@530	253 &&& M.x & \textrm{projection from a module} \\
adamc@525	254 \end{array}$$
adamc@525	255
adamc@655	256 We include both abstraction and application for kind polymorphism, but applications are only inferred internally; they may not be written explicitly in source programs.
adamc@655	257
adamc@525	258 Modules of the module system are described by \emph{signatures}.
adamc@525	259 $$\begin{array}{rrcll}
adamc@525	260 \textrm{Signatures} & S &::=& \mt{sig} \; s^* \; \mt{end} & \textrm{constant} \\
adamc@525	261 &&& X & \textrm{variable} \\
adamc@525	262 &&& \mt{functor}(X : S) : S & \textrm{functor} \\
adamc@529	263 &&& S \; \mt{where} \; \mt{con} \; x = c & \textrm{concretizing an abstract constructor} \\
adamc@525	264 &&& M.X & \textrm{projection from a module} \\
adamc@525	265 \\
adamc@525	266 \textrm{Signature items} & s &::=& \mt{con} \; x :: \kappa & \textrm{abstract constructor} \\
adamc@525	267 &&& \mt{con} \; x :: \kappa = c & \textrm{concrete constructor} \\
adamc@528	268 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529	269 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@525	270 &&& \mt{val} \; x : \tau & \textrm{value} \\
adamc@525	271 &&& \mt{structure} \; X : S & \textrm{sub-module} \\
adamc@525	272 &&& \mt{signature} \; X = S & \textrm{sub-signature} \\
adamc@525	273 &&& \mt{include} \; S & \textrm{signature inclusion} \\
adamc@525	274 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@654	275 &&& \mt{class} \; x :: \kappa & \textrm{abstract constructor class} \\
adamc@654	276 &&& \mt{class} \; x :: \kappa = c & \textrm{concrete constructor class} \\
adamc@525	277 \\
adamc@525	278 \textrm{Datatype constructors} & dc &::=& X & \textrm{nullary constructor} \\
adamc@525	279 &&& X \; \mt{of} \; \tau & \textrm{unary constructor} \\
adamc@524	280 \end{array}$$
adamc@524	281
adamc@526	282 \emph{Patterns} are used to describe structural conditions on expressions, such that expressions may be tested against patterns, generating assignments to pattern variables if successful.
adamc@526	283 $$\begin{array}{rrcll}
adamc@526	284 \textrm{Patterns} & p &::=& \_ & \textrm{wildcard} \\
adamc@526	285 &&& x & \textrm{variable} \\
adamc@526	286 &&& \ell & \textrm{constant} \\
adamc@526	287 &&& \hat{X} & \textrm{nullary constructor} \\
adamc@526	288 &&& \hat{X} \; p & \textrm{unary constructor} \\
adamc@526	289 &&& \{(x = p,)^*\} & \textrm{rigid record pattern} \\
adamc@526	290 &&& \{(x = p,)^+, \ldots\} & \textrm{flexible record pattern} \\
adamc@527	291 &&& (p) & \textrm{explicit precedence} \\
adamc@526	292 \\
adamc@529	293 \textrm{Qualified capitalized variables} & \hat{X} &::=& X & \textrm{not from a module} \\
adamc@526	294 &&& M.X & \textrm{projection from a module} \\
adamc@526	295 \end{array}$$
adamc@526	296
adamc@527	297 \emph{Expressions} are the main run-time entities, corresponding to both ``expressions'' and ``statements'' in mainstream imperative languages.
adamc@527	298 $$\begin{array}{rrcll}
adamc@527	299 \textrm{Expressions} & e &::=& e : \tau & \textrm{type annotation} \\
adamc@529	300 &&& \hat{x} & \textrm{variable} \\
adamc@529	301 &&& \hat{X} & \textrm{datatype constructor} \\
adamc@527	302 &&& \ell & \textrm{constant} \\
adamc@527	303 \\
adamc@527	304 &&& e \; e & \textrm{function application} \\
adamc@527	305 &&& \lambda x : \tau \Rightarrow e & \textrm{function abstraction} \\
adamc@527	306 &&& e [c] & \textrm{polymorphic function application} \\
adamc@527	307 &&& \lambda x \; ? \; \kappa \Rightarrow e & \textrm{polymorphic function abstraction} \\
adamc@655	308 &&& e [\kappa] & \textrm{kind-polymorphic function application} \\
adamc@652	309 &&& X \Longrightarrow e & \textrm{kind-polymorphic function abstraction} \\
adamc@527	310 \\
adamc@527	311 &&& \{(c = e,)^*\} & \textrm{known-length record} \\
adamc@527	312 &&& e.c & \textrm{record field projection} \\
adamc@527	313 &&& e \rc e & \textrm{record concatenation} \\
adamc@527	314 &&& e \rcut c & \textrm{removal of a single record field} \\
adamc@527	315 &&& e \rcutM c & \textrm{removal of multiple record fields} \\
adamc@527	316 \\
adamc@527	317 &&& \mt{let} \; ed^* \; \mt{in} \; e \; \mt{end} & \textrm{local definitions} \\
adamc@527	318 \\
adamc@527	319 &&& \mt{case} \; e \; \mt{of} \; (p \Rightarrow e\|)^+ & \textrm{pattern matching} \\
adamc@527	320 \\
adamc@654	321 &&& \lambda [c \sim c] \Rightarrow e & \textrm{guarded expression abstraction} \\
adamc@654	322 &&& e \; ! & \textrm{guarded expression application} \\
adamc@527	323 \\
adamc@527	324 &&& \_ & \textrm{wildcard} \\
adamc@527	325 &&& (e) & \textrm{explicit precedence} \\
adamc@527	326 \\
adamc@527	327 \textrm{Local declarations} & ed &::=& \cd{val} \; x : \tau = e & \textrm{non-recursive value} \\
adamc@527	328 &&& \cd{val} \; \cd{rec} \; (x : \tau = e \; \cd{and})^+ & \textrm{mutually-recursive values} \\
adamc@527	329 \end{array}$$
adamc@527	330
adamc@655	331 As with constructors, we include both abstraction and application for kind polymorphism, but applications are only inferred internally.
adamc@655	332
adamc@528	333 \emph{Declarations} primarily bring new symbols into context.
adamc@528	334 $$\begin{array}{rrcll}
adamc@528	335 \textrm{Declarations} & d &::=& \mt{con} \; x :: \kappa = c & \textrm{constructor synonym} \\
adamc@528	336 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529	337 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@528	338 &&& \mt{val} \; x : \tau = e & \textrm{value} \\
adamc@528	339 &&& \mt{val} \; \cd{rec} \; (x : \tau = e \; \mt{and})^+ & \textrm{mutually-recursive values} \\
adamc@528	340 &&& \mt{structure} \; X : S = M & \textrm{module definition} \\
adamc@528	341 &&& \mt{signature} \; X = S & \textrm{signature definition} \\
adamc@528	342 &&& \mt{open} \; M & \textrm{module inclusion} \\
adamc@528	343 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@528	344 &&& \mt{open} \; \mt{constraints} \; M & \textrm{inclusion of just the constraints from a module} \\
adamc@528	345 &&& \mt{table} \; x : c & \textrm{SQL table} \\
adamc@784	346 &&& \mt{view} \; x : c & \textrm{SQL view} \\
adamc@528	347 &&& \mt{sequence} \; x & \textrm{SQL sequence} \\
adamc@535	348 &&& \mt{cookie} \; x : \tau & \textrm{HTTP cookie} \\
adamc@784	349 &&& \mt{style} \; x : \tau & \textrm{CSS class} \\
adamc@654	350 &&& \mt{class} \; x :: \kappa = c & \textrm{concrete constructor class} \\
adamc@528	351 \\
adamc@529	352 \textrm{Modules} & M &::=& \mt{struct} \; d^* \; \mt{end} & \textrm{constant} \\
adamc@529	353 &&& X & \textrm{variable} \\
adamc@529	354 &&& M.X & \textrm{projection} \\
adamc@529	355 &&& M(M) & \textrm{functor application} \\
adamc@529	356 &&& \mt{functor}(X : S) : S = M & \textrm{functor abstraction} \\
adamc@528	357 \end{array}$$
adamc@528	358
adamc@528	359 There are two kinds of Ur files. A file named $M\texttt{.ur}$ is an \emph{implementation file}, and it should contain a sequence of declarations $d^$. A file named $M\texttt{.urs}$ is an \emph{interface file}; it must always have a matching $M\texttt{.ur}$ and should contain a sequence of signature items $s^$. When both files are present, the overall effect is the same as a monolithic declaration $\mt{structure} \; M : \mt{sig} \; s^* \; \mt{end} = \mt{struct} \; d^* \; \mt{end}$. When no interface file is included, the overall effect is similar, with a signature for module $M$ being inferred rather than just checked against an interface.
adamc@527	360
adamc@784	361 We omit some extra possibilities in $\mt{table}$ syntax, deferring them to Section \ref{tables}.
adamc@784	362
adamc@529	363 \subsection{Shorthands}
adamc@529	364
adamc@529	365 There are a variety of derived syntactic forms that elaborate into the core syntax from the last subsection. We will present the additional forms roughly following the order in which we presented the constructs that they elaborate into.
adamc@529	366
adamc@529	367 In many contexts where record fields are expected, like in a projection $e.c$, a constant field may be written as simply $X$, rather than $\#X$.
adamc@529	368
adamc@529	369 A record type may be written $\{(c = c,)^\}$, which elaborates to $\$[(c = c,)^]$.
adamc@529	370
adamc@533	371 The notation $[c_1, \ldots, c_n]$ is shorthand for $[c_1 = (), \ldots, c_n = ()]$.
adamc@533	372
adamc@529	373 A tuple type $(\tau_1, \ldots, \tau_n)$ expands to a record type $\{1 = \tau_1, \ldots, n = \tau_n\}$, with natural numbers as field names. A tuple pattern $(p_1, \ldots, p_n)$ expands to a rigid record pattern $\{1 = p_1, \ldots, n = p_n\}$. Positive natural numbers may be used in most places where field names would be allowed.
adamc@529	374
adamc@654	375 In general, several adjacent $\lambda$ forms may be combined into one, and kind and type annotations may be omitted, in which case they are implicitly included as wildcards. More formally, for constructor-level abstractions, we can define a new non-terminal $b ::= x \mid (x :: \kappa) \mid X$ and allow composite abstractions of the form $\lambda b^+ \Rightarrow c$, elaborating into the obvious sequence of one core $\lambda$ per element of $b^+$.
adamc@529	376
adamc@529	377 For any signature item or declaration that defines some entity to be equal to $A$ with classification annotation $B$ (e.g., $\mt{val} \; x : B = A$), $B$ and the preceding colon (or similar punctuation) may be omitted, in which case it is filled in as a wildcard.
adamc@529	378
adamc@529	379 A signature item or declaration $\mt{type} \; x$ or $\mt{type} \; x = \tau$ is elaborated into $\mt{con} \; x :: \mt{Type}$ or $\mt{con} \; x :: \mt{Type} = \tau$, respectively.
adamc@529	380
adamc@654	381 A signature item or declaration $\mt{class} \; x = \lambda y \Rightarrow c$ may be abbreviated $\mt{class} \; x \; y = c$.
adamc@529	382
adamc@654	383 Handling of implicit and explicit constructor arguments may be tweaked with some prefixes to variable references. An expression $@x$ is a version of $x$ where all implicit constructor arguments have been made explicit. An expression $@@x$ achieves the same effect, additionally halting automatic resolution of type class instances and automatic proving of disjointness constraints. The default is that any prefix of a variable's type consisting only of implicit polymorphism, type class instances, and disjointness obligations is resolved automatically, with the variable treated as having the type that starts after the last implicit element, with suitable unification variables substituted. The same syntax works for variables projected out of modules and for capitalized variables (datatype constructors).
adamc@529	384
adamc@654	385 At the expression level, an analogue is available of the composite $\lambda$ form for constructors. We define the language of binders as $b ::= x \mid (x : \tau) \mid (x \; ? \; \kappa) \mid X \mid [c \sim c]$. A lone variable $x$ as a binder stands for an expression variable of unspecified type.
adamc@529	386
adamc@529	387 A $\mt{val}$ or $\mt{val} \; \mt{rec}$ declaration may include expression binders before the equal sign, following the binder grammar from the last paragraph. Such declarations are elaborated into versions that add additional $\lambda$s to the fronts of the righthand sides, as appropriate. The keyword $\mt{fun}$ is a synonym for $\mt{val} \; \mt{rec}$.
adamc@529	388
adamc@529	389 A signature item $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2$. A declaration $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2 = M$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2 = \mt{functor}(X_2 : S_1) : S_2 = M$.
adamc@529	390
adamc@529	391 A declaration $\mt{table} \; x : \{(c = c,)^\}$ is elaborated into $\mt{table} \; x : [(c = c,)^]$
adamc@529	392
adamc@529	393 The syntax $\mt{where} \; \mt{type}$ is an alternate form of $\mt{where} \; \mt{con}$.
adamc@529	394
adamc@529	395 The syntax $\mt{if} \; e \; \mt{then} \; e_1 \; \mt{else} \; e_2$ expands to $\mt{case} \; e \; \mt{of} \; \mt{Basis}.\mt{True} \Rightarrow e_1 \mid \mt{Basis}.\mt{False} \Rightarrow e_2$.
adamc@529	396
adamc@529	397 There are infix operator syntaxes for a number of functions defined in the $\mt{Basis}$ module. There is $=$ for $\mt{eq}$, $\neq$ for $\mt{neq}$, $-$ for $\mt{neg}$ (as a prefix operator) and $\mt{minus}$, $+$ for $\mt{plus}$, $\times$ for $\mt{times}$, $/$ for $\mt{div}$, $\%$ for $\mt{mod}$, $<$ for $\mt{lt}$, $\leq$ for $\mt{le}$, $>$ for $\mt{gt}$, and $\geq$ for $\mt{ge}$.
adamc@529	398
adamc@784	399 A signature item $\mt{table} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_table} \; c \; []$. $\mt{view} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_view} \; c$, $\mt{sequence} \; x$ is short for $\mt{val} \; x : \mt{Basis}.\mt{sql\_sequence}$. $\mt{cookie} \; x : \tau$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{http\_cookie} \; \tau$, and $\mt{style} \; x$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{css\_class}$.
adamc@529	400
adamc@530	401
adamc@530	402 \section{Static Semantics}
adamc@530	403
adamc@530	404 In this section, we give a declarative presentation of Ur's typing rules and related judgments. Inference is the subject of the next section; here, we assume that an oracle has filled in all wildcards with concrete values.
adamc@530	405
adamc@530	406 Since there is significant mutual recursion among the judgments, we introduce them all before beginning to give rules. We use the same variety of contexts throughout this section, implicitly introducing new sorts of context entries as needed.
adamc@530	407 \begin{itemize}
adamc@655	408 \item $\Gamma \vdash \kappa$ expresses kind well-formedness.
adamc@530	409 \item $\Gamma \vdash c :: \kappa$ assigns a kind to a constructor in a context.
adamc@530	410 \item $\Gamma \vdash c \sim c$ proves the disjointness of two record constructors; that is, that they share no field names. We overload the judgment to apply to pairs of field names as well.
adamc@531	411 \item $\Gamma \vdash c \hookrightarrow C$ proves that record constructor $c$ decomposes into set $C$ of field names and record constructors.
adamc@530	412 \item $\Gamma \vdash c \equiv c$ proves the computational equivalence of two constructors. This is often called a \emph{definitional equality} in the world of type theory.
adamc@530	413 \item $\Gamma \vdash e : \tau$ is a standard typing judgment.
adamc@534	414 \item $\Gamma \vdash p \leadsto \Gamma; \tau$ combines typing of patterns with calculation of which new variables they bind.
adamc@537	415 \item $\Gamma \vdash d \leadsto \Gamma$ expresses how a declaration modifies a context. We overload this judgment to apply to sequences of declarations, as well as to signature items and sequences of signature items.
adamc@537	416 \item $\Gamma \vdash S \equiv S$ is the signature equivalence judgment.
adamc@536	417 \item $\Gamma \vdash S \leq S$ is the signature compatibility judgment. We write $\Gamma \vdash S$ as shorthand for $\Gamma \vdash S \leq S$.
adamc@530	418 \item $\Gamma \vdash M : S$ is the module signature checking judgment.
adamc@537	419 \item $\mt{proj}(M, \overline{s}, V)$ is a partial function for projecting a signature item from $\overline{s}$, given the module $M$ that we project from. $V$ may be $\mt{con} \; x$, $\mt{datatype} \; x$, $\mt{val} \; x$, $\mt{signature} \; X$, or $\mt{structure} \; X$. The parameter $M$ is needed because the projected signature item may refer to other items from $\overline{s}$.
adamc@539	420 \item $\mt{selfify}(M, \overline{s})$ adds information to signature items $\overline{s}$ to reflect the fact that we are concerned with the particular module $M$. This function is overloaded to work over individual signature items as well.
adamc@530	421 \end{itemize}
adamc@530	422
adamc@655	423
adamc@655	424 \subsection{Kind Well-Formedness}
adamc@655	425
adamc@655	426 $$\infer{\Gamma \vdash \mt{Type}}{}
adamc@655	427 \quad \infer{\Gamma \vdash \mt{Unit}}{}
adamc@655	428 \quad \infer{\Gamma \vdash \mt{Name}}{}
adamc@655	429 \quad \infer{\Gamma \vdash \kappa_1 \to \kappa_2}{
adamc@655	430 \Gamma \vdash \kappa_1
adamc@655	431 & \Gamma \vdash \kappa_2
adamc@655	432 }
adamc@655	433 \quad \infer{\Gamma \vdash \{\kappa\}}{
adamc@655	434 \Gamma \vdash \kappa
adamc@655	435 }
adamc@655	436 \quad \infer{\Gamma \vdash (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@655	437 \forall i: \Gamma \vdash \kappa_i
adamc@655	438 }$$
adamc@655	439
adamc@655	440 $$\infer{\Gamma \vdash X}{
adamc@655	441 X \in \Gamma
adamc@655	442 }
adamc@655	443 \quad \infer{\Gamma \vdash X \longrightarrow \kappa}{
adamc@655	444 \Gamma, X \vdash \kappa
adamc@655	445 }$$
adamc@655	446
adamc@530	447 \subsection{Kinding}
adamc@530	448
adamc@655	449 We write $[X \mapsto \kappa_1]\kappa_2$ for capture-avoiding substitution of $\kappa_1$ for $X$ in $\kappa_2$.
adamc@655	450
adamc@530	451 $$\infer{\Gamma \vdash (c) :: \kappa :: \kappa}{
adamc@530	452 \Gamma \vdash c :: \kappa
adamc@530	453 }
adamc@530	454 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530	455 x :: \kappa \in \Gamma
adamc@530	456 }
adamc@530	457 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530	458 x :: \kappa = c \in \Gamma
adamc@530	459 }$$
adamc@530	460
adamc@530	461 $$\infer{\Gamma \vdash M.x :: \kappa}{
adamc@537	462 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	463 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = \kappa
adamc@530	464 }
adamc@530	465 \quad \infer{\Gamma \vdash M.x :: \kappa}{
adamc@537	466 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	467 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@530	468 }$$
adamc@530	469
adamc@530	470 $$\infer{\Gamma \vdash \tau_1 \to \tau_2 :: \mt{Type}}{
adamc@530	471 \Gamma \vdash \tau_1 :: \mt{Type}
adamc@530	472 & \Gamma \vdash \tau_2 :: \mt{Type}
adamc@530	473 }
adamc@530	474 \quad \infer{\Gamma \vdash x \; ? \: \kappa \to \tau :: \mt{Type}}{
adamc@530	475 \Gamma, x :: \kappa \vdash \tau :: \mt{Type}
adamc@530	476 }
adamc@655	477 \quad \infer{\Gamma \vdash X \longrightarrow \tau :: \mt{Type}}{
adamc@655	478 \Gamma, X \vdash \tau :: \mt{Type}
adamc@655	479 }
adamc@530	480 \quad \infer{\Gamma \vdash \$c :: \mt{Type}}{
adamc@530	481 \Gamma \vdash c :: \{\mt{Type}\}
adamc@530	482 }$$
adamc@530	483
adamc@530	484 $$\infer{\Gamma \vdash c_1 \; c_2 :: \kappa_2}{
adamc@530	485 \Gamma \vdash c_1 :: \kappa_1 \to \kappa_2
adamc@530	486 & \Gamma \vdash c_2 :: \kappa_1
adamc@530	487 }
adamc@530	488 \quad \infer{\Gamma \vdash \lambda x \; :: \; \kappa_1 \Rightarrow c :: \kappa_1 \to \kappa_2}{
adamc@530	489 \Gamma, x :: \kappa_1 \vdash c :: \kappa_2
adamc@530	490 }$$
adamc@530	491
adamc@655	492 $$\infer{\Gamma \vdash c[\kappa'] :: [X \mapsto \kappa']\kappa}{
adamc@655	493 \Gamma \vdash c :: X \to \kappa
adamc@655	494 & \Gamma \vdash \kappa'
adamc@655	495 }
adamc@655	496 \quad \infer{\Gamma \vdash X \Longrightarrow c :: X \to \kappa}{
adamc@655	497 \Gamma, X \vdash c :: \kappa
adamc@655	498 }$$
adamc@655	499
adamc@530	500 $$\infer{\Gamma \vdash () :: \mt{Unit}}{}
adamc@530	501 \quad \infer{\Gamma \vdash \#X :: \mt{Name}}{}$$
adamc@530	502
adamc@530	503 $$\infer{\Gamma \vdash [\overline{c_i = c'_i}] :: \{\kappa\}}{
adamc@530	504 \forall i: \Gamma \vdash c_i : \mt{Name}
adamc@530	505 & \Gamma \vdash c'_i :: \kappa
adamc@530	506 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@530	507 }
adamc@530	508 \quad \infer{\Gamma \vdash c_1 \rc c_2 :: \{\kappa\}}{
adamc@530	509 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530	510 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@530	511 & \Gamma \vdash c_1 \sim c_2
adamc@530	512 }$$
adamc@530	513
adamc@655	514 $$\infer{\Gamma \vdash \mt{map} :: (\kappa_1 \to \kappa_2) \to \{\kappa_1\} \to \{\kappa_2\}}{}$$
adamc@530	515
adamc@573	516 $$\infer{\Gamma \vdash (\overline c) :: (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@573	517 \forall i: \Gamma \vdash c_i :: \kappa_i
adamc@530	518 }
adamc@573	519 \quad \infer{\Gamma \vdash c.i :: \kappa_i}{
adamc@573	520 \Gamma \vdash c :: (\kappa_1 \times \ldots \times \kappa_n)
adamc@530	521 }$$
adamc@530	522
adamc@655	523 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow \tau :: \mt{Type}}{
adamc@655	524 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530	525 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@655	526 & \Gamma, c_1 \sim c_2 \vdash \tau :: \mt{Type}
adamc@530	527 }$$
adamc@530	528
adamc@531	529 \subsection{Record Disjointness}
adamc@531	530
adamc@531	531 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@558	532 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@558	533 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@558	534 & \forall c'_1 \in C_1, c'_2 \in C_2: \Gamma \vdash c'_1 \sim c'_2
adamc@531	535 }
adamc@531	536 \quad \infer{\Gamma \vdash X \sim X'}{
adamc@531	537 X \neq X'
adamc@531	538 }$$
adamc@531	539
adamc@531	540 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531	541 c'_1 \sim c'_2 \in \Gamma
adamc@558	542 & \Gamma \vdash c'_1 \hookrightarrow C_1
adamc@558	543 & \Gamma \vdash c'_2 \hookrightarrow C_2
adamc@558	544 & c_1 \in C_1
adamc@558	545 & c_2 \in C_2
adamc@531	546 }$$
adamc@531	547
adamc@531	548 $$\infer{\Gamma \vdash c \hookrightarrow \{c\}}{}
adamc@531	549 \quad \infer{\Gamma \vdash [\overline{c = c'}] \hookrightarrow \{\overline{c}\}}{}
adamc@531	550 \quad \infer{\Gamma \vdash c_1 \rc c_2 \hookrightarrow C_1 \cup C_2}{
adamc@531	551 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@531	552 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@531	553 }
adamc@531	554 \quad \infer{\Gamma \vdash c \hookrightarrow C}{
adamc@531	555 \Gamma \vdash c \equiv c'
adamc@531	556 & \Gamma \vdash c' \hookrightarrow C
adamc@531	557 }
adamc@531	558 \quad \infer{\Gamma \vdash \mt{map} \; f \; c \hookrightarrow C}{
adamc@531	559 \Gamma \vdash c \hookrightarrow C
adamc@531	560 }$$
adamc@531	561
adamc@541	562 \subsection{\label{definitional}Definitional Equality}
adamc@532	563
adamc@655	564 We use $\mathcal C$ to stand for a one-hole context that, when filled, yields a constructor. The notation $\mathcal C[c]$ plugs $c$ into $\mathcal C$. We omit the standard definition of one-hole contexts. We write $[x \mapsto c_1]c_2$ for capture-avoiding substitution of $c_1$ for $x$ in $c_2$, with analogous notation for substituting a kind in a constructor.
adamc@532	565
adamc@532	566 $$\infer{\Gamma \vdash c \equiv c}{}
adamc@532	567 \quad \infer{\Gamma \vdash c_1 \equiv c_2}{
adamc@532	568 \Gamma \vdash c_2 \equiv c_1
adamc@532	569 }
adamc@532	570 \quad \infer{\Gamma \vdash c_1 \equiv c_3}{
adamc@532	571 \Gamma \vdash c_1 \equiv c_2
adamc@532	572 & \Gamma \vdash c_2 \equiv c_3
adamc@532	573 }
adamc@532	574 \quad \infer{\Gamma \vdash \mathcal C[c_1] \equiv \mathcal C[c_2]}{
adamc@532	575 \Gamma \vdash c_1 \equiv c_2
adamc@532	576 }$$
adamc@532	577
adamc@532	578 $$\infer{\Gamma \vdash x \equiv c}{
adamc@532	579 x :: \kappa = c \in \Gamma
adamc@532	580 }
adamc@532	581 \quad \infer{\Gamma \vdash M.x \equiv c}{
adamc@537	582 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	583 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@532	584 }
adamc@532	585 \quad \infer{\Gamma \vdash (\overline c).i \equiv c_i}{}$$
adamc@532	586
adamc@532	587 $$\infer{\Gamma \vdash (\lambda x :: \kappa \Rightarrow c) \; c' \equiv [x \mapsto c'] c}{}
adamc@655	588 \quad \infer{\Gamma \vdash (X \Longrightarrow c) [\kappa] \equiv [X \mapsto \kappa] c}{}$$
adamc@655	589
adamc@655	590 $$\infer{\Gamma \vdash c_1 \rc c_2 \equiv c_2 \rc c_1}{}
adamc@532	591 \quad \infer{\Gamma \vdash c_1 \rc (c_2 \rc c_3) \equiv (c_1 \rc c_2) \rc c_3}{}$$
adamc@532	592
adamc@532	593 $$\infer{\Gamma \vdash [] \rc c \equiv c}{}
adamc@532	594 \quad \infer{\Gamma \vdash [\overline{c_1 = c'_1}] \rc [\overline{c_2 = c'_2}] \equiv [\overline{c_1 = c'_1}, \overline{c_2 = c'_2}]}{}$$
adamc@532	595
adamc@655	596 $$\infer{\Gamma \vdash \mt{map} \; f \; [] \equiv []}{}
adamc@655	597 \quad \infer{\Gamma \vdash \mt{map} \; f \; ([c_1 = c_2] \rc c) \equiv [c_1 = f \; c_2] \rc \mt{map} \; f \; c}{}$$
adamc@532	598
adamc@532	599 $$\infer{\Gamma \vdash \mt{map} \; (\lambda x \Rightarrow x) \; c \equiv c}{}
adamc@655	600 \quad \infer{\Gamma \vdash \mt{map} \; f \; (\mt{map} \; f' \; c)
adamc@655	601 \equiv \mt{map} \; (\lambda x \Rightarrow f \; (f' \; x)) \; c}{}$$
adamc@532	602
adamc@532	603 $$\infer{\Gamma \vdash \mt{map} \; f \; (c_1 \rc c_2) \equiv \mt{map} \; f \; c_1 \rc \mt{map} \; f \; c_2}{}$$
adamc@531	604
adamc@534	605 \subsection{Expression Typing}
adamc@533	606
adamc@533	607 We assume the existence of a function $T$ assigning types to literal constants. It maps integer constants to $\mt{Basis}.\mt{int}$, float constants to $\mt{Basis}.\mt{float}$, and string constants to $\mt{Basis}.\mt{string}$.
adamc@533	608
adamc@533	609 We also refer to a function $\mathcal I$, such that $\mathcal I(\tau)$ ``uses an oracle'' to instantiate all constructor function arguments at the beginning of $\tau$ that are marked implicit; i.e., replace $x_1 ::: \kappa_1 \to \ldots \to x_n ::: \kappa_n \to \tau$ with $[x_1 \mapsto c_1]\ldots[x_n \mapsto c_n]\tau$, where the $c_i$s are inferred and $\tau$ does not start like $x ::: \kappa \to \tau'$.
adamc@533	610
adamc@533	611 $$\infer{\Gamma \vdash e : \tau : \tau}{
adamc@533	612 \Gamma \vdash e : \tau
adamc@533	613 }
adamc@533	614 \quad \infer{\Gamma \vdash e : \tau}{
adamc@533	615 \Gamma \vdash e : \tau'
adamc@533	616 & \Gamma \vdash \tau' \equiv \tau
adamc@533	617 }
adamc@533	618 \quad \infer{\Gamma \vdash \ell : T(\ell)}{}$$
adamc@533	619
adamc@533	620 $$\infer{\Gamma \vdash x : \mathcal I(\tau)}{
adamc@533	621 x : \tau \in \Gamma
adamc@533	622 }
adamc@533	623 \quad \infer{\Gamma \vdash M.x : \mathcal I(\tau)}{
adamc@537	624 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	625 & \mt{proj}(M, \overline{s}, \mt{val} \; x) = \tau
adamc@533	626 }
adamc@533	627 \quad \infer{\Gamma \vdash X : \mathcal I(\tau)}{
adamc@533	628 X : \tau \in \Gamma
adamc@533	629 }
adamc@533	630 \quad \infer{\Gamma \vdash M.X : \mathcal I(\tau)}{
adamc@537	631 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	632 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \tau
adamc@533	633 }$$
adamc@533	634
adamc@533	635 $$\infer{\Gamma \vdash e_1 \; e_2 : \tau_2}{
adamc@533	636 \Gamma \vdash e_1 : \tau_1 \to \tau_2
adamc@533	637 & \Gamma \vdash e_2 : \tau_1
adamc@533	638 }
adamc@533	639 \quad \infer{\Gamma \vdash \lambda x : \tau_1 \Rightarrow e : \tau_1 \to \tau_2}{
adamc@533	640 \Gamma, x : \tau_1 \vdash e : \tau_2
adamc@533	641 }$$
adamc@533	642
adamc@533	643 $$\infer{\Gamma \vdash e [c] : [x \mapsto c]\tau}{
adamc@533	644 \Gamma \vdash e : x :: \kappa \to \tau
adamc@533	645 & \Gamma \vdash c :: \kappa
adamc@533	646 }
adamc@533	647 \quad \infer{\Gamma \vdash \lambda x \; ? \; \kappa \Rightarrow e : x \; ? \; \kappa \to \tau}{
adamc@533	648 \Gamma, x :: \kappa \vdash e : \tau
adamc@533	649 }$$
adamc@533	650
adamc@655	651 $$\infer{\Gamma \vdash e [\kappa] : [X \mapsto \kappa]\tau}{
adamc@655	652 \Gamma \vdash e : X \longrightarrow \tau
adamc@655	653 & \Gamma \vdash \kappa
adamc@655	654 }
adamc@655	655 \quad \infer{\Gamma \vdash X \Longrightarrow e : X \longrightarrow \tau}{
adamc@655	656 \Gamma, X \vdash e : \tau
adamc@655	657 }$$
adamc@655	658
adamc@533	659 $$\infer{\Gamma \vdash \{\overline{c = e}\} : \{\overline{c : \tau}\}}{
adamc@533	660 \forall i: \Gamma \vdash c_i :: \mt{Name}
adamc@533	661 & \Gamma \vdash e_i : \tau_i
adamc@533	662 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@533	663 }
adamc@533	664 \quad \infer{\Gamma \vdash e.c : \tau}{
adamc@533	665 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533	666 }
adamc@533	667 \quad \infer{\Gamma \vdash e_1 \rc e_2 : \$(c_1 \rc c_2)}{
adamc@533	668 \Gamma \vdash e_1 : \$c_1
adamc@533	669 & \Gamma \vdash e_2 : \$c_2
adamc@573	670 & \Gamma \vdash c_1 \sim c_2
adamc@533	671 }$$
adamc@533	672
adamc@533	673 $$\infer{\Gamma \vdash e \rcut c : \$c'}{
adamc@533	674 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533	675 }
adamc@533	676 \quad \infer{\Gamma \vdash e \rcutM c : \$c'}{
adamc@533	677 \Gamma \vdash e : \$(c \rc c')
adamc@533	678 }$$
adamc@533	679
adamc@533	680 $$\infer{\Gamma \vdash \mt{let} \; \overline{ed} \; \mt{in} \; e \; \mt{end} : \tau}{
adamc@533	681 \Gamma \vdash \overline{ed} \leadsto \Gamma'
adamc@533	682 & \Gamma' \vdash e : \tau
adamc@533	683 }
adamc@533	684 \quad \infer{\Gamma \vdash \mt{case} \; e \; \mt{of} \; \overline{p \Rightarrow e} : \tau}{
adamc@533	685 \forall i: \Gamma \vdash p_i \leadsto \Gamma_i, \tau'
adamc@533	686 & \Gamma_i \vdash e_i : \tau
adamc@533	687 }$$
adamc@533	688
adamc@573	689 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow e : \lambda [c_1 \sim c_2] \Rightarrow \tau}{
adamc@533	690 \Gamma \vdash c_1 :: \{\kappa\}
adamc@655	691 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@533	692 & \Gamma, c_1 \sim c_2 \vdash e : \tau
adamc@662	693 }
adamc@662	694 \quad \infer{\Gamma \vdash e \; ! : \tau}{
adamc@662	695 \Gamma \vdash e : [c_1 \sim c_2] \Rightarrow \tau
adamc@662	696 & \Gamma \vdash c_1 \sim c_2
adamc@533	697 }$$
adamc@533	698
adamc@534	699 \subsection{Pattern Typing}
adamc@534	700
adamc@534	701 $$\infer{\Gamma \vdash \_ \leadsto \Gamma; \tau}{}
adamc@534	702 \quad \infer{\Gamma \vdash x \leadsto \Gamma, x : \tau; \tau}{}
adamc@534	703 \quad \infer{\Gamma \vdash \ell \leadsto \Gamma; T(\ell)}{}$$
adamc@534	704
adamc@534	705 $$\infer{\Gamma \vdash X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534	706 X : \overline{x ::: \mt{Type}} \to \tau \in \Gamma
adamc@534	707 & \textrm{$\tau$ not a function type}
adamc@534	708 }
adamc@534	709 \quad \infer{\Gamma \vdash X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534	710 X : \overline{x ::: \mt{Type}} \to \tau'' \to \tau \in \Gamma
adamc@534	711 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534	712 }$$
adamc@534	713
adamc@534	714 $$\infer{\Gamma \vdash M.X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537	715 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	716 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau
adamc@534	717 & \textrm{$\tau$ not a function type}
adamc@534	718 }$$
adamc@534	719
adamc@534	720 $$\infer{\Gamma \vdash M.X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537	721 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	722 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau'' \to \tau
adamc@534	723 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534	724 }$$
adamc@534	725
adamc@534	726 $$\infer{\Gamma \vdash \{\overline{x = p}\} \leadsto \Gamma_n; \{\overline{x = \tau}\}}{
adamc@534	727 \Gamma_0 = \Gamma
adamc@534	728 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534	729 }
adamc@534	730 \quad \infer{\Gamma \vdash \{\overline{x = p}, \ldots\} \leadsto \Gamma_n; \$([\overline{x = \tau}] \rc c)}{
adamc@534	731 \Gamma_0 = \Gamma
adamc@534	732 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534	733 }$$
adamc@534	734
adamc@535	735 \subsection{Declaration Typing}
adamc@535	736
adamc@535	737 We use an auxiliary judgment $\overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'$, expressing the enrichment of $\Gamma$ with the types of the datatype constructors $\overline{dc}$, when they are known to belong to datatype $x$ with type parameters $\overline{y}$.
adamc@535	738
adamc@655	739 This is the first judgment where we deal with constructor classes, for the $\mt{class}$ declaration form. We will omit their special handling in this formal specification. Section \ref{typeclasses} gives an informal description of how constructor classes influence type inference.
adamc@535	740
adamc@558	741 We presuppose the existence of a function $\mathcal O$, where $\mathcal O(M, \overline{s})$ implements the $\mt{open}$ declaration by producing a context with the appropriate entry for each available component of module $M$ with signature items $\overline{s}$. Where possible, $\mathcal O$ uses ``transparent'' entries (e.g., an abstract type $M.x$ is mapped to $x :: \mt{Type} = M.x$), so that the relationship with $M$ is maintained. A related function $\mathcal O_c$ builds a context containing the disjointness constraints found in $\overline s$.
adamc@537	742 We write $\kappa_1^n \to \kappa$ as a shorthand, where $\kappa_1^0 \to \kappa = \kappa$ and $\kappa_1^{n+1} \to \kappa_2 = \kappa_1 \to (\kappa_1^n \to \kappa_2)$. We write $\mt{len}(\overline{y})$ for the length of vector $\overline{y}$ of variables.
adamc@535	743
adamc@535	744 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535	745 \quad \infer{\Gamma \vdash d, \overline{d} \leadsto \Gamma''}{
adamc@535	746 \Gamma \vdash d \leadsto \Gamma'
adamc@535	747 & \Gamma' \vdash \overline{d} \leadsto \Gamma''
adamc@535	748 }$$
adamc@535	749
adamc@535	750 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@535	751 \Gamma \vdash c :: \kappa
adamc@535	752 }
adamc@535	753 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@535	754 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@535	755 }$$
adamc@535	756
adamc@535	757 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537	758 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	759 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@535	760 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@535	761 }$$
adamc@535	762
adamc@535	763 $$\infer{\Gamma \vdash \mt{val} \; x : \tau = e \leadsto \Gamma, x : \tau}{
adamc@535	764 \Gamma \vdash e : \tau
adamc@535	765 }$$
adamc@535	766
adamc@535	767 $$\infer{\Gamma \vdash \mt{val} \; \mt{rec} \; \overline{x : \tau = e} \leadsto \Gamma, \overline{x : \tau}}{
adamc@535	768 \forall i: \Gamma, \overline{x : \tau} \vdash e_i : \tau_i
adamc@535	769 & \textrm{$e_i$ starts with an expression $\lambda$, optionally preceded by constructor and disjointness $\lambda$s}
adamc@535	770 }$$
adamc@535	771
adamc@535	772 $$\infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : S}{
adamc@535	773 \Gamma \vdash M : S
adamc@558	774 & \textrm{ $M$ not a constant or application}
adamc@535	775 }
adamc@558	776 \quad \infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : \mt{selfify}(X, \overline{s})}{
adamc@558	777 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@539	778 }$$
adamc@539	779
adamc@539	780 $$\infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@535	781 \Gamma \vdash S
adamc@535	782 }$$
adamc@535	783
adamc@537	784 $$\infer{\Gamma \vdash \mt{open} \; M \leadsto \Gamma, \mathcal O(M, \overline{s})}{
adamc@537	785 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535	786 }$$
adamc@535	787
adamc@535	788 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma}{
adamc@535	789 \Gamma \vdash c_1 :: \{\kappa\}
adamc@535	790 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@535	791 & \Gamma \vdash c_1 \sim c_2
adamc@535	792 }
adamc@537	793 \quad \infer{\Gamma \vdash \mt{open} \; \mt{constraints} \; M \leadsto \Gamma, \mathcal O_c(M, \overline{s})}{
adamc@537	794 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535	795 }$$
adamc@535	796
adamc@784	797 $$\infer{\Gamma \vdash \mt{table} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_table} \; c \; []}{
adamc@535	798 \Gamma \vdash c :: \{\mt{Type}\}
adamc@535	799 }
adamc@784	800 \quad \infer{\Gamma \vdash \mt{view} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_view} \; c}{
adamc@784	801 \Gamma \vdash c :: \{\mt{Type}\}
adamc@784	802 }$$
adamc@784	803
adamc@784	804 $$\infer{\Gamma \vdash \mt{sequence} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_sequence}}{}$$
adamc@535	805
adamc@535	806 $$\infer{\Gamma \vdash \mt{cookie} \; x : \tau \leadsto \Gamma, x : \mt{Basis}.\mt{http\_cookie} \; \tau}{
adamc@535	807 \Gamma \vdash \tau :: \mt{Type}
adamc@784	808 }
adamc@784	809 \quad \infer{\Gamma \vdash \mt{style} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{css\_class}}{}$$
adamc@535	810
adamc@784	811 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@784	812 \Gamma \vdash c :: \kappa
adamc@535	813 }$$
adamc@535	814
adamc@535	815 $$\infer{\overline{y}; x; \Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535	816 \quad \infer{\overline{y}; x; \Gamma \vdash X \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to x \; \overline{y}}{
adamc@535	817 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535	818 }
adamc@535	819 \quad \infer{\overline{y}; x; \Gamma \vdash X \; \mt{of} \; \tau \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to \tau \to x \; \overline{y}}{
adamc@535	820 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535	821 }$$
adamc@535	822
adamc@537	823 \subsection{Signature Item Typing}
adamc@537	824
adamc@537	825 We appeal to a signature item analogue of the $\mathcal O$ function from the last subsection.
adamc@537	826
adamc@537	827 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@537	828 \quad \infer{\Gamma \vdash s, \overline{s} \leadsto \Gamma''}{
adamc@537	829 \Gamma \vdash s \leadsto \Gamma'
adamc@537	830 & \Gamma' \vdash \overline{s} \leadsto \Gamma''
adamc@537	831 }$$
adamc@537	832
adamc@537	833 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}
adamc@537	834 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@537	835 \Gamma \vdash c :: \kappa
adamc@537	836 }
adamc@537	837 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@537	838 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@537	839 }$$
adamc@537	840
adamc@537	841 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537	842 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	843 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	844 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@537	845 }$$
adamc@537	846
adamc@537	847 $$\infer{\Gamma \vdash \mt{val} \; x : \tau \leadsto \Gamma, x : \tau}{
adamc@537	848 \Gamma \vdash \tau :: \mt{Type}
adamc@537	849 }$$
adamc@537	850
adamc@537	851 $$\infer{\Gamma \vdash \mt{structure} \; X : S \leadsto \Gamma, X : S}{
adamc@537	852 \Gamma \vdash S
adamc@537	853 }
adamc@537	854 \quad \infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@537	855 \Gamma \vdash S
adamc@537	856 }$$
adamc@537	857
adamc@537	858 $$\infer{\Gamma \vdash \mt{include} \; S \leadsto \Gamma, \mathcal O(\overline{s})}{
adamc@537	859 \Gamma \vdash S
adamc@537	860 & \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	861 }$$
adamc@537	862
adamc@537	863 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma, c_1 \sim c_2}{
adamc@537	864 \Gamma \vdash c_1 :: \{\kappa\}
adamc@537	865 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@537	866 }$$
adamc@537	867
adamc@784	868 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@784	869 \Gamma \vdash c :: \kappa
adamc@537	870 }
adamc@784	871 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}$$
adamc@537	872
adamc@536	873 \subsection{Signature Compatibility}
adamc@536	874
adamc@558	875 To simplify the judgments in this section, we assume that all signatures are alpha-varied as necessary to avoid including multiple bindings for the same identifier. This is in addition to the usual alpha-variation of locally-bound variables.
adamc@537	876
adamc@537	877 We rely on a judgment $\Gamma \vdash \overline{s} \leq s'$, which expresses the occurrence in signature items $\overline{s}$ of an item compatible with $s'$. We also use a judgment $\Gamma \vdash \overline{dc} \leq \overline{dc}$, which expresses compatibility of datatype definitions.
adamc@537	878
adamc@536	879 $$\infer{\Gamma \vdash S \equiv S}{}
adamc@536	880 \quad \infer{\Gamma \vdash S_1 \equiv S_2}{
adamc@536	881 \Gamma \vdash S_2 \equiv S_1
adamc@536	882 }
adamc@536	883 \quad \infer{\Gamma \vdash X \equiv S}{
adamc@536	884 X = S \in \Gamma
adamc@536	885 }
adamc@536	886 \quad \infer{\Gamma \vdash M.X \equiv S}{
adamc@537	887 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	888 & \mt{proj}(M, \overline{s}, \mt{signature} \; X) = S
adamc@536	889 }$$
adamc@536	890
adamc@536	891 $$\infer{\Gamma \vdash S \; \mt{where} \; \mt{con} \; x = c \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa = c \; \overline{s_2} \; \mt{end}}{
adamc@536	892 \Gamma \vdash S \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa \; \overline{s_2} \; \mt{end}
adamc@536	893 & \Gamma \vdash c :: \kappa
adamc@537	894 }
adamc@537	895 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s^1} \; \mt{include} \; S \; \overline{s^2} \; \mt{end} \equiv \mt{sig} \; \overline{s^1} \; \overline{s} \; \overline{s^2} \; \mt{end}}{
adamc@537	896 \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@536	897 }$$
adamc@536	898
adamc@536	899 $$\infer{\Gamma \vdash S_1 \leq S_2}{
adamc@536	900 \Gamma \vdash S_1 \equiv S_2
adamc@536	901 }
adamc@536	902 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \mt{end}}{}
adamc@537	903 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; s' \; \overline{s'} \; \mt{end}}{
adamc@537	904 \Gamma \vdash \overline{s} \leq s'
adamc@537	905 & \Gamma \vdash s' \leadsto \Gamma'
adamc@537	906 & \Gamma' \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \overline{s'} \; \mt{end}
adamc@537	907 }$$
adamc@537	908
adamc@537	909 $$\infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537	910 \Gamma \vdash s \leq s'
adamc@537	911 }
adamc@537	912 \quad \infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537	913 \Gamma \vdash s \leadsto \Gamma'
adamc@537	914 & \Gamma' \vdash \overline{s} \leq s'
adamc@536	915 }$$
adamc@536	916
adamc@536	917 $$\infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 \leq \mt{functor} (X : S'_1) : S'_2}{
adamc@536	918 \Gamma \vdash S'_1 \leq S_1
adamc@536	919 & \Gamma, X : S'_1 \vdash S_2 \leq S'_2
adamc@536	920 }$$
adamc@536	921
adamc@537	922 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@537	923 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}
adamc@558	924 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type}}{}$$
adamc@537	925
adamc@537	926 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(y)} \to \mt{Type}}{
adamc@537	927 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	928 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	929 }$$
adamc@537	930
adamc@784	931 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@784	932 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}$$
adamc@537	933
adamc@537	934 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \mt{\kappa} = c_2}{
adamc@537	935 \Gamma \vdash c_1 \equiv c_2
adamc@537	936 }
adamc@784	937 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \kappa = c_2}{
adamc@537	938 \Gamma \vdash c_1 \equiv c_2
adamc@537	939 }$$
adamc@537	940
adamc@537	941 $$\infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537	942 \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537	943 }$$
adamc@537	944
adamc@537	945 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537	946 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	947 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	948 & \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537	949 }$$
adamc@537	950
adamc@537	951 $$\infer{\Gamma \vdash \cdot \leq \cdot}{}
adamc@537	952 \quad \infer{\Gamma \vdash X; \overline{dc} \leq X; \overline{dc'}}{
adamc@537	953 \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537	954 }
adamc@537	955 \quad \infer{\Gamma \vdash X \; \mt{of} \; \tau_1; \overline{dc} \leq X \; \mt{of} \; \tau_2; \overline{dc'}}{
adamc@537	956 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537	957 & \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537	958 }$$
adamc@537	959
adamc@537	960 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x = \mt{datatype} \; M'.z'}{
adamc@537	961 \Gamma \vdash M.z \equiv M'.z'
adamc@537	962 }$$
adamc@537	963
adamc@537	964 $$\infer{\Gamma \vdash \mt{val} \; x : \tau_1 \leq \mt{val} \; x : \tau_2}{
adamc@537	965 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537	966 }
adamc@537	967 \quad \infer{\Gamma \vdash \mt{structure} \; X : S_1 \leq \mt{structure} \; X : S_2}{
adamc@537	968 \Gamma \vdash S_1 \leq S_2
adamc@537	969 }
adamc@537	970 \quad \infer{\Gamma \vdash \mt{signature} \; X = S_1 \leq \mt{signature} \; X = S_2}{
adamc@537	971 \Gamma \vdash S_1 \leq S_2
adamc@537	972 & \Gamma \vdash S_2 \leq S_1
adamc@537	973 }$$
adamc@537	974
adamc@537	975 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leq \mt{constraint} \; c'_1 \sim c'_2}{
adamc@537	976 \Gamma \vdash c_1 \equiv c'_1
adamc@537	977 & \Gamma \vdash c_2 \equiv c'_2
adamc@537	978 }$$
adamc@537	979
adamc@655	980 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa \leq \mt{class} \; x :: \kappa}{}
adamc@655	981 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leq \mt{class} \; x :: \kappa}{}
adamc@655	982 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c_1 \leq \mt{class} \; x :: \kappa = c_2}{
adamc@537	983 \Gamma \vdash c_1 \equiv c_2
adamc@537	984 }$$
adamc@537	985
adamc@538	986 \subsection{Module Typing}
adamc@538	987
adamc@538	988 We use a helper function $\mt{sigOf}$, which converts declarations and sequences of declarations into their principal signature items and sequences of signature items, respectively.
adamc@538	989
adamc@538	990 $$\infer{\Gamma \vdash M : S}{
adamc@538	991 \Gamma \vdash M : S'
adamc@538	992 & \Gamma \vdash S' \leq S
adamc@538	993 }
adamc@538	994 \quad \infer{\Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \mt{sigOf}(\overline{d}) \; \mt{end}}{
adamc@538	995 \Gamma \vdash \overline{d} \leadsto \Gamma'
adamc@538	996 }
adamc@538	997 \quad \infer{\Gamma \vdash X : S}{
adamc@538	998 X : S \in \Gamma
adamc@538	999 }$$
adamc@538	1000
adamc@538	1001 $$\infer{\Gamma \vdash M.X : S}{
adamc@538	1002 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@538	1003 & \mt{proj}(M, \overline{s}, \mt{structure} \; X) = S
adamc@538	1004 }$$
adamc@538	1005
adamc@538	1006 $$\infer{\Gamma \vdash M_1(M_2) : [X \mapsto M_2]S_2}{
adamc@538	1007 \Gamma \vdash M_1 : \mt{functor}(X : S_1) : S_2
adamc@538	1008 & \Gamma \vdash M_2 : S_1
adamc@538	1009 }
adamc@538	1010 \quad \infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 = M : \mt{functor} (X : S_1) : S_2}{
adamc@538	1011 \Gamma \vdash S_1
adamc@538	1012 & \Gamma, X : S_1 \vdash S_2
adamc@538	1013 & \Gamma, X : S_1 \vdash M : S_2
adamc@538	1014 }$$
adamc@538	1015
adamc@538	1016 \begin{eqnarray*}
adamc@538	1017 \mt{sigOf}(\cdot) &=& \cdot \\
adamc@538	1018 \mt{sigOf}(s \; \overline{s'}) &=& \mt{sigOf}(s) \; \mt{sigOf}(\overline{s'}) \\
adamc@538	1019 \\
adamc@538	1020 \mt{sigOf}(\mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@538	1021 \mt{sigOf}(\mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \overline{dc} \\
adamc@538	1022 \mt{sigOf}(\mt{datatype} \; x = \mt{datatype} \; M.z) &=& \mt{datatype} \; x = \mt{datatype} \; M.z \\
adamc@538	1023 \mt{sigOf}(\mt{val} \; x : \tau = e) &=& \mt{val} \; x : \tau \\
adamc@538	1024 \mt{sigOf}(\mt{val} \; \mt{rec} \; \overline{x : \tau = e}) &=& \overline{\mt{val} \; x : \tau} \\
adamc@538	1025 \mt{sigOf}(\mt{structure} \; X : S = M) &=& \mt{structure} \; X : S \\
adamc@538	1026 \mt{sigOf}(\mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@538	1027 \mt{sigOf}(\mt{open} \; M) &=& \mt{include} \; S \textrm{ (where $\Gamma \vdash M : S$)} \\
adamc@538	1028 \mt{sigOf}(\mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@538	1029 \mt{sigOf}(\mt{open} \; \mt{constraints} \; M) &=& \cdot \\
adamc@538	1030 \mt{sigOf}(\mt{table} \; x : c) &=& \mt{table} \; x : c \\
adamc@784	1031 \mt{sigOf}(\mt{view} \; x : c) &=& \mt{view} \; x : c \\
adamc@538	1032 \mt{sigOf}(\mt{sequence} \; x) &=& \mt{sequence} \; x \\
adamc@538	1033 \mt{sigOf}(\mt{cookie} \; x : \tau) &=& \mt{cookie} \; x : \tau \\
adamc@784	1034 \mt{sigOf}(\mt{style} \; x) &=& \mt{style} \; x \\
adamc@655	1035 \mt{sigOf}(\mt{class} \; x :: \kappa = c) &=& \mt{class} \; x :: \kappa = c \\
adamc@538	1036 \end{eqnarray*}
adamc@539	1037 \begin{eqnarray*}
adamc@539	1038 \mt{selfify}(M, \cdot) &=& \cdot \\
adamc@558	1039 \mt{selfify}(M, s \; \overline{s'}) &=& \mt{selfify}(M, s) \; \mt{selfify}(M, \overline{s'}) \\
adamc@539	1040 \\
adamc@539	1041 \mt{selfify}(M, \mt{con} \; x :: \kappa) &=& \mt{con} \; x :: \kappa = M.x \\
adamc@539	1042 \mt{selfify}(M, \mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@539	1043 \mt{selfify}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \mt{datatype} \; M.x \\
adamc@539	1044 \mt{selfify}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z) &=& \mt{datatype} \; x = \mt{datatype} \; M'.z \\
adamc@539	1045 \mt{selfify}(M, \mt{val} \; x : \tau) &=& \mt{val} \; x : \tau \\
adamc@539	1046 \mt{selfify}(M, \mt{structure} \; X : S) &=& \mt{structure} \; X : \mt{selfify}(M.X, \overline{s}) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}$)} \\
adamc@539	1047 \mt{selfify}(M, \mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@539	1048 \mt{selfify}(M, \mt{include} \; S) &=& \mt{include} \; S \\
adamc@539	1049 \mt{selfify}(M, \mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@655	1050 \mt{selfify}(M, \mt{class} \; x :: \kappa) &=& \mt{class} \; x :: \kappa = M.x \\
adamc@655	1051 \mt{selfify}(M, \mt{class} \; x :: \kappa = c) &=& \mt{class} \; x :: \kappa = c \\
adamc@539	1052 \end{eqnarray*}
adamc@539	1053
adamc@540	1054 \subsection{Module Projection}
adamc@540	1055
adamc@540	1056 \begin{eqnarray*}
adamc@540	1057 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \\
adamc@540	1058 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa, c) \\
adamc@540	1059 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{con} \; x) &=& \mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type} \\
adamc@540	1060 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& (\mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type}, M'.z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540	1061 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})$)} \\
adamc@655	1062 \mt{proj}(M, \mt{class} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \to \mt{Type} \\
adamc@655	1063 \mt{proj}(M, \mt{class} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa \to \mt{Type}, c) \\
adamc@540	1064 \\
adamc@540	1065 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{datatype} \; x) &=& (\overline{y}, \overline{dc}) \\
adamc@540	1066 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& \mt{proj}(M', \overline{s'}, \mt{datatype} \; z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540	1067 \\
adamc@540	1068 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, \mt{val} \; x) &=& \tau \\
adamc@540	1069 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $X \in \overline{dc}$)} \\
adamc@540	1070 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540	1071 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540	1072 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \in \overline{dc}$)} \\
adamc@540	1073 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@558	1074 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540	1075 \\
adamc@540	1076 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, \mt{structure} \; X) &=& S \\
adamc@540	1077 \\
adamc@540	1078 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, \mt{signature} \; X) &=& S \\
adamc@540	1079 \\
adamc@540	1080 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1081 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1082 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1083 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1084 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540	1085 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1086 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1087 \mt{proj}(M, \mt{include} \; S \; \overline{s}, V) &=& \mt{proj}(M, \overline{s'} \; \overline{s}, V) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540	1088 \mt{proj}(M, \mt{constraint} \; c_1 \sim c_2 \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@655	1089 \mt{proj}(M, \mt{class} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@655	1090 \mt{proj}(M, \mt{class} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1091 \end{eqnarray*}
adamc@540	1092
adamc@541	1093
adamc@541	1094 \section{Type Inference}
adamc@541	1095
adamc@541	1096 The Ur/Web compiler uses \emph{heuristic type inference}, with no claims of completeness with respect to the declarative specification of the last section. The rules in use seem to work well in practice. This section summarizes those rules, to help Ur programmers predict what will work and what won't.
adamc@541	1097
adamc@541	1098 \subsection{Basic Unification}
adamc@541	1099
adamc@560	1100 Type-checkers for languages based on the Hindley-Milner type discipline, like ML and Haskell, take advantage of \emph{principal typing} properties, making complete type inference relatively straightforward. Inference algorithms are traditionally implemented using type unification variables, at various points asserting equalities between types, in the process discovering the values of type variables. The Ur/Web compiler uses the same basic strategy, but the complexity of the type system rules out easy completeness.
adamc@541	1101
adamc@656	1102 Type-checking can require evaluating recursive functional programs, thanks to the type-level $\mt{map}$ operator. When a unification variable appears in such a type, the next step of computation can be undetermined. The value of that variable might be determined later, but this would be ``too late'' for the unification problems generated at the first occurrence. This is the essential source of incompleteness.
adamc@541	1103
adamc@541	1104 Nonetheless, the unification engine tends to do reasonably well. Unlike in ML, polymorphism is never inferred in definitions; it must be indicated explicitly by writing out constructor-level parameters. By writing these and other annotations, the programmer can generally get the type inference engine to do most of the type reconstruction work.
adamc@541	1105
adamc@541	1106 \subsection{Unifying Record Types}
adamc@541	1107
adamc@570	1108 The type inference engine tries to take advantage of the algebraic rules governing type-level records, as shown in Section \ref{definitional}. When two constructors of record kind are unified, they are reduced to normal forms, with like terms crossed off from each normal form until, hopefully, nothing remains. This cannot be complete, with the inclusion of unification variables. The type-checker can help you understand what goes wrong when the process fails, as it outputs the unmatched remainders of the two normal forms.
adamc@541	1109
adamc@656	1110 \subsection{\label{typeclasses}Constructor Classes}
adamc@541	1111
adamc@784	1112 Ur includes a constructor class facility inspired by Haskell's. The current version is experimental, with very general Prolog-like facilities that can lead to compile-time non-termination.
adamc@541	1113
adamc@784	1114 Constructor classes are integrated with the module system. A constructor class of kind $\kappa$ is just a constructor of kind $\kappa$. By marking such a constructor $c$ as a constructor class, the programmer instructs the type inference engine to, in each scope, record all values of types $c \; c_1 \; \ldots \; c_n$ as \emph{instances}. Any function argument whose type is of such a form is treated as implicit, to be determined by examining the current instance database.
adamc@541	1115
adamc@656	1116 The ``dictionary encoding'' often used in Haskell implementations is made explicit in Ur. Constructor class instances are just properly-typed values, and they can also be considered as ``proofs'' of membership in the class. In some cases, it is useful to pass these proofs around explicitly. An underscore written where a proof is expected will also be inferred, if possible, from the current instance database.
adamc@541	1117
adamc@656	1118 Just as for constructors, constructors classes may be exported from modules, and they may be exported as concrete or abstract. Concrete constructor classes have their ``real'' definitions exposed, so that client code may add new instances freely. Abstract constructor classes are useful as ``predicates'' that can be used to enforce invariants, as we will see in some definitions of SQL syntax in the Ur/Web standard library.
adamc@541	1119
adamc@541	1120 \subsection{Reverse-Engineering Record Types}
adamc@541	1121
adamc@656	1122 It's useful to write Ur functions and functors that take record constructors as inputs, but these constructors can grow quite long, even though their values are often implied by other arguments. The compiler uses a simple heuristic to infer the values of unification variables that are mapped over, yielding known results. If the result is empty, we're done; if it's not empty, we replace a single unification variable with a new constructor formed from three new unification variables, as in $[\alpha = \beta] \rc \gamma$. This process can often be repeated to determine a unification variable fully.
adamc@541	1123
adamc@541	1124 \subsection{Implicit Arguments in Functor Applications}
adamc@541	1125
adamc@656	1126 Constructor, constraint, and constructor class witness members of structures may be omitted, when those structures are used in contexts where their assigned signatures imply how to fill in those missing members. This feature combines well with reverse-engineering to allow for uses of complicated meta-programming functors with little more code than would be necessary to invoke an untyped, ad-hoc code generator.
adamc@541	1127
adamc@541	1128
adamc@542	1129 \section{The Ur Standard Library}
adamc@542	1130
adamc@542	1131 The built-in parts of the Ur/Web standard library are described by the signature in \texttt{lib/basis.urs} in the distribution. A module $\mt{Basis}$ ascribing to that signature is available in the initial environment, and every program is implicitly prefixed by $\mt{open} \; \mt{Basis}$.
adamc@542	1132
adamc@542	1133 Additionally, other common functions that are definable within Ur are included in \texttt{lib/top.urs} and \texttt{lib/top.ur}. This $\mt{Top}$ module is also opened implicitly.
adamc@542	1134
adamc@542	1135 The idea behind Ur is to serve as the ideal host for embedded domain-specific languages. For now, however, the ``generic'' functionality is intermixed with Ur/Web-specific functionality, including in these two library modules. We hope that these generic library components have types that speak for themselves. The next section introduces the Ur/Web-specific elements. Here, we only give the type declarations from the beginning of $\mt{Basis}$.
adamc@542	1136 $$\begin{array}{l}
adamc@542	1137 \mt{type} \; \mt{int} \\
adamc@542	1138 \mt{type} \; \mt{float} \\
adamc@542	1139 \mt{type} \; \mt{string} \\
adamc@542	1140 \mt{type} \; \mt{time} \\
adamc@785	1141 \mt{type} \; \mt{blob} \\
adamc@542	1142 \\
adamc@542	1143 \mt{type} \; \mt{unit} = \{\} \\
adamc@542	1144 \\
adamc@542	1145 \mt{datatype} \; \mt{bool} = \mt{False} \mid \mt{True} \\
adamc@542	1146 \\
adamc@785	1147 \mt{datatype} \; \mt{option} \; \mt{t} = \mt{None} \mid \mt{Some} \; \mt{of} \; \mt{t} \\
adamc@785	1148 \\
adamc@785	1149 \mt{datatype} \; \mt{list} \; \mt{t} = \mt{Nil} \mid \mt{Cons} \; \mt{of} \; \mt{t} \times \mt{list} \; \mt{t}
adamc@542	1150 \end{array}$$
adamc@542	1151
adamc@785	1152 The only unusual element of this list is the $\mt{blob}$ type, which stands for binary sequences.
adamc@785	1153
adamc@657	1154 Another important generic Ur element comes at the beginning of \texttt{top.urs}.
adamc@657	1155
adamc@657	1156 $$\begin{array}{l}
adamc@657	1157 \mt{con} \; \mt{folder} :: \mt{K} \longrightarrow \{\mt{K}\} \to \mt{Type} \\
adamc@657	1158 \\
adamc@657	1159 \mt{val} \; \mt{fold} : \mt{K} \longrightarrow \mt{tf} :: (\{\mt{K}\} \to \mt{Type}) \\
adamc@657	1160 \hspace{.1in} \to (\mt{nm} :: \mt{Name} \to \mt{v} :: \mt{K} \to \mt{r} :: \{\mt{K}\} \to [[\mt{nm}] \sim \mt{r}] \Rightarrow \\
adamc@657	1161 \hspace{.2in} \mt{tf} \; \mt{r} \to \mt{tf} \; ([\mt{nm} = \mt{v}] \rc \mt{r})) \\
adamc@657	1162 \hspace{.1in} \to \mt{tf} \; [] \\
adamc@657	1163 \hspace{.1in} \to \mt{r} :: \{\mt{K}\} \to \mt{folder} \; \mt{r} \to \mt{tf} \; \mt{r}
adamc@657	1164 \end{array}$$
adamc@657	1165
adamc@657	1166 For a type-level record $\mt{r}$, a $\mt{folder} \; \mt{r}$ encodes a permutation of $\mt{r}$'s elements. The $\mt{fold}$ function can be called on a $\mt{folder}$ to iterate over the elements of $\mt{r}$ in that order. $\mt{fold}$ is parameterized on a type-level function to be used to calculate the type of each intermediate result of folding. After processing a subset $\mt{r'}$ of $\mt{r}$'s entries, the type of the accumulator should be $\mt{tf} \; \mt{r'}$. The next two expression arguments to $\mt{fold}$ are the usual step function and initial accumulator, familiar from fold functions over lists. The final two arguments are the record to fold over and a $\mt{folder}$ for it.
adamc@657	1167
adamc@664	1168 The Ur compiler treats $\mt{folder}$ like a constructor class, using built-in rules to infer $\mt{folder}$s for records with known structure. The order in which field names are mentioned in source code is used as a hint about the permutation that the programmer would like.
adamc@657	1169
adamc@542	1170
adamc@542	1171 \section{The Ur/Web Standard Library}
adamc@542	1172
adamc@658	1173 \subsection{Monads}
adamc@658	1174
adamc@658	1175 The Ur Basis defines the monad constructor class from Haskell.
adamc@658	1176
adamc@658	1177 $$\begin{array}{l}
adamc@658	1178 \mt{class} \; \mt{monad} :: \mt{Type} \to \mt{Type} \\
adamc@658	1179 \mt{val} \; \mt{return} : \mt{m} ::: (\mt{Type} \to \mt{Type}) \to \mt{t} ::: \mt{Type} \\
adamc@658	1180 \hspace{.1in} \to \mt{monad} \; \mt{m} \\
adamc@658	1181 \hspace{.1in} \to \mt{t} \to \mt{m} \; \mt{t} \\
adamc@658	1182 \mt{val} \; \mt{bind} : \mt{m} ::: (\mt{Type} \to \mt{Type}) \to \mt{t1} ::: \mt{Type} \to \mt{t2} ::: \mt{Type} \\
adamc@658	1183 \hspace{.1in} \to \mt{monad} \; \mt{m} \\
adamc@658	1184 \hspace{.1in} \to \mt{m} \; \mt{t1} \to (\mt{t1} \to \mt{m} \; \mt{t2}) \\
adamc@658	1185 \hspace{.1in} \to \mt{m} \; \mt{t2}
adamc@658	1186 \end{array}$$
adamc@658	1187
adamc@542	1188 \subsection{Transactions}
adamc@542	1189
adamc@542	1190 Ur is a pure language; we use Haskell's trick to support controlled side effects. The standard library defines a monad $\mt{transaction}$, meant to stand for actions that may be undone cleanly. By design, no other kinds of actions are supported.
adamc@542	1191 $$\begin{array}{l}
adamc@542	1192 \mt{con} \; \mt{transaction} :: \mt{Type} \to \mt{Type} \\
adamc@658	1193 \mt{val} \; \mt{transaction\_monad} : \mt{monad} \; \mt{transaction}
adamc@542	1194 \end{array}$$
adamc@542	1195
adamc@542	1196 \subsection{HTTP}
adamc@542	1197
adamc@542	1198 There are transactions for reading an HTTP header by name and for getting and setting strongly-typed cookies. Cookies may only be created by the $\mt{cookie}$ declaration form, ensuring that they be named consistently based on module structure.
adamc@542	1199 $$\begin{array}{l}
adamc@786	1200 \mt{val} \; \mt{requestHeader} : \mt{string} \to \mt{transaction} \; (\mt{option} \; \mt{string}) \\
adamc@786	1201 \\
adamc@786	1202 \mt{con} \; \mt{http\_cookie} :: \mt{Type} \to \mt{Type} \\
adamc@786	1203 \mt{val} \; \mt{getCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{transaction} \; (\mt{option} \; \mt{t}) \\
adamc@786	1204 \mt{val} \; \mt{setCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit}
adamc@786	1205 \end{array}$$
adamc@786	1206
adamc@786	1207 There are also an abstract $\mt{url}$ type and functions for converting to it, based on the policy defined by \texttt{[allow\|deny] url} directives in the project file.
adamc@786	1208 $$\begin{array}{l}
adamc@786	1209 \mt{type} \; \mt{url} \\
adamc@786	1210 \mt{val} \; \mt{bless} : \mt{string} \to \mt{url} \\
adamc@786	1211 \mt{val} \; \mt{checkUrl} : \mt{string} \to \mt{option} \; \mt{url}
adamc@786	1212 \end{array}$$
adamc@786	1213 $\mt{bless}$ raises a runtime error if the string passed to it fails the URL policy.
adamc@786	1214
adamc@786	1215 It's possible for pages to return files of arbitrary MIME types. A file can be input from the user using this data type, along with the $\mt{upload}$ form tag.
adamc@786	1216 $$\begin{array}{l}
adamc@786	1217 \mt{type} \; \mt{file} \\
adamc@786	1218 \mt{val} \; \mt{fileName} : \mt{file} \to \mt{option} \; \mt{string} \\
adamc@786	1219 \mt{val} \; \mt{fileMimeType} : \mt{file} \to \mt{string} \\
adamc@786	1220 \mt{val} \; \mt{fileData} : \mt{file} \to \mt{blob}
adamc@786	1221 \end{array}$$
adamc@786	1222
adamc@786	1223 A blob can be extracted from a file and returned as the page result. There are bless and check functions for MIME types analogous to those for URLs.
adamc@786	1224 $$\begin{array}{l}
adamc@786	1225 \mt{type} \; \mt{mimeType} \\
adamc@786	1226 \mt{val} \; \mt{blessMime} : \mt{string} \to \mt{mimeType} \\
adamc@786	1227 \mt{val} \; \mt{checkMime} : \mt{string} \to \mt{option} \; \mt{mimeType} \\
adamc@786	1228 \mt{val} \; \mt{returnBlob} : \mt{t} ::: \mt{Type} \to \mt{blob} \to \mt{mimeType} \to \mt{transaction} \; \mt{t}
adamc@542	1229 \end{array}$$
adamc@542	1230
adamc@543	1231 \subsection{SQL}
adamc@543	1232
adamc@543	1233 The fundamental unit of interest in the embedding of SQL is tables, described by a type family and creatable only via the $\mt{table}$ declaration form.
adamc@543	1234 $$\begin{array}{l}
adamc@785	1235 \mt{con} \; \mt{sql\_table} :: \{\mt{Type}\} \to \{\{\mt{Unit}\}\} \to \mt{Type}
adamc@785	1236 \end{array}$$
adamc@785	1237 The first argument to this constructor gives the names and types of a table's columns, and the second argument gives the set of valid keys. Keys are the only subsets of the columns that may be referenced as foreign keys. Each key has a name.
adamc@785	1238
adamc@785	1239 We also have the simpler type family of SQL views, which have no keys.
adamc@785	1240 $$\begin{array}{l}
adamc@785	1241 \mt{con} \; \mt{sql\_view} :: \{\mt{Type}\} \to \mt{Type}
adamc@543	1242 \end{array}$$
adamc@543	1243
adamc@785	1244 A multi-parameter type class is used to allow tables and views to be used interchangeably, with a way of extracting the set of columns from each.
adamc@785	1245 $$\begin{array}{l}
adamc@785	1246 \mt{class} \; \mt{fieldsOf} :: \mt{Type} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@785	1247 \mt{val} \; \mt{fieldsOf\_table} : \mt{fs} ::: \{\mt{Type}\} \to \mt{keys} ::: \{\{\mt{Unit}\}\} \to \mt{fieldsOf} \; (\mt{sql\_table} \; \mt{fs} \; \mt{keys}) \; \mt{fs} \\
adamc@785	1248 \mt{val} \; \mt{fieldsOf\_view} : \mt{fs} ::: \{\mt{Type}\} \to \mt{fieldsOf} \; (\mt{sql\_view} \; \mt{fs}) \; \mt{fs}
adamc@785	1249 \end{array}$$
adamc@785	1250
adamc@785	1251 \subsubsection{Table Constraints}
adamc@785	1252
adamc@785	1253 Tables may be declared with constraints, such that database modifications that violate the constraints are blocked. A table may have at most one \texttt{PRIMARY KEY} constraint, which gives the subset of columns that will most often be used to look up individual rows in the table.
adamc@785	1254
adamc@785	1255 $$\begin{array}{l}
adamc@785	1256 \mt{con} \; \mt{primary\_key} :: \{\mt{Type}\} \to \{\{\mt{Unit}\}\} \to \mt{Type} \\
adamc@785	1257 \mt{val} \; \mt{no\_primary\_key} : \mt{fs} ::: \{\mt{Type}\} \to \mt{primary\_key} \; \mt{fs} \; [] \\
adamc@785	1258 \mt{val} \; \mt{primary\_key} : \mt{rest} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{key1} :: \mt{Name} \to \mt{keys} :: \{\mt{Type}\} \\
adamc@785	1259 \hspace{.1in} \to [[\mt{key1}] \sim \mt{keys}] \Rightarrow [[\mt{key1} = \mt{t}] \rc \mt{keys} \sim \mt{rest}] \\
adamc@785	1260 \hspace{.1in} \Rightarrow \$([\mt{key1} = \mt{sql\_injectable\_prim} \; \mt{t}] \rc \mt{map} \; \mt{sql\_injectable\_prim} \; \mt{keys}) \\
adamc@785	1261 \hspace{.1in} \to \mt{primary\_key} \; ([\mt{key1} = \mt{t}] \rc \mt{keys} \rc \mt{rest}) \; [\mt{Pkey} = [\mt{key1}] \rc \mt{map} \; (\lambda \_ \Rightarrow ()) \; \mt{keys}]
adamc@785	1262 \end{array}$$
adamc@785	1263 The type class $\mt{sql\_injectable\_prim}$ characterizes which types are allowed in SQL and are not $\mt{option}$ types. In SQL, a \texttt{PRIMARY KEY} constraint enforces after-the-fact that a column may not contain \texttt{NULL}s, but Ur/Web forces that information to be included in table types from the beginning. Thus, the only effect of this kind of constraint in Ur/Web is to enforce uniqueness of the given key within the table.
adamc@785	1264
adamc@785	1265 A type family stands for sets of named constraints of the remaining varieties.
adamc@785	1266 $$\begin{array}{l}
adamc@785	1267 \mt{con} \; \mt{sql\_constraints} :: \{\mt{Type}\} \to \{\{\mt{Unit}\}\} \to \mt{Type}
adamc@785	1268 \end{array}$$
adamc@785	1269 The first argument gives the column types of the table being constrained, and the second argument maps constraint names to the keys that they define. Constraints that don't define keys are mapped to ``empty keys.''
adamc@785	1270
adamc@785	1271 There is a type family of individual, unnamed constraints.
adamc@785	1272 $$\begin{array}{l}
adamc@785	1273 \mt{con} \; \mt{sql\_constraint} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \mt{Type}
adamc@785	1274 \end{array}$$
adamc@785	1275 The first argument is the same as above, and the second argument gives the key columns for just this constraint.
adamc@785	1276
adamc@785	1277 We have operations for assembling constraints into constraint sets.
adamc@785	1278 $$\begin{array}{l}
adamc@785	1279 \mt{val} \; \mt{no\_constraint} : \mt{fs} ::: \{\mt{Type}\} \to \mt{sql\_constraints} \; \mt{fs} \; [] \\
adamc@785	1280 \mt{val} \; \mt{one\_constraint} : \mt{fs} ::: \{\mt{Type}\} \to \mt{unique} ::: \{\mt{Unit}\} \to \mt{name} :: \mt{Name} \\
adamc@785	1281 \hspace{.1in} \to \mt{sql\_constraint} \; \mt{fs} \; \mt{unique} \to \mt{sql\_constraints} \; \mt{fs} \; [\mt{name} = \mt{unique}] \\
adamc@785	1282 \mt{val} \; \mt{join\_constraints} : \mt{fs} ::: \{\mt{Type}\} \to \mt{uniques1} ::: \{\{\mt{Unit}\}\} \to \mt{uniques2} ::: \{\{\mt{Unit}\}\} \to [\mt{uniques1} \sim \mt{uniques2}] \\
adamc@785	1283 \hspace{.1in} \Rightarrow \mt{sql\_constraints} \; \mt{fs} \; \mt{uniques1} \to \mt{sql\_constraints} \; \mt{fs} \; \mt{uniques2} \to \mt{sql\_constraints} \; \mt{fs} \; (\mt{uniques1} \rc \mt{uniques2})
adamc@785	1284 \end{array}$$
adamc@785	1285
adamc@785	1286 A \texttt{UNIQUE} constraint forces a set of columns to be a key, which means that no combination of column values may occur more than once in the table. The $\mt{unique1}$ and $\mt{unique}$ arguments are separated out only to ensure that empty \texttt{UNIQUE} constraints are rejected.
adamc@785	1287 $$\begin{array}{l}
adamc@785	1288 \mt{val} \; \mt{unique} : \mt{rest} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{unique1} :: \mt{Name} \to \mt{unique} :: \{\mt{Type}\} \\
adamc@785	1289 \hspace{.1in} \to [[\mt{unique1}] \sim \mt{unique}] \Rightarrow [[\mt{unique1} = \mt{t}] \rc \mt{unique} \sim \mt{rest}] \\
adamc@785	1290 \hspace{.1in} \Rightarrow \mt{sql\_constraint} \; ([\mt{unique1} = \mt{t}] \rc \mt{unique} \rc \mt{rest}) \; ([\mt{unique1}] \rc \mt{map} \; (\lambda \_ \Rightarrow ()) \; \mt{unique})
adamc@785	1291 \end{array}$$
adamc@785	1292
adamc@785	1293 A \texttt{FOREIGN KEY} constraint connects a set of local columns to a local or remote key, enforcing that the local columns always reference an existent row of the foreign key's table. A local column of type $\mt{t}$ may be linked to a foreign column of type $\mt{option} \; \mt{t}$, and vice versa. We formalize that notion with a type class.
adamc@785	1294 $$\begin{array}{l}
adamc@785	1295 \mt{class} \; \mt{linkable} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@785	1296 \mt{val} \; \mt{linkable\_same} : \mt{t} ::: \mt{Type} \to \mt{linkable} \; \mt{t} \; \mt{t} \\
adamc@785	1297 \mt{val} \; \mt{linkable\_from\_nullable} : \mt{t} ::: \mt{Type} \to \mt{linkable} \; (\mt{option} \; \mt{t}) \; \mt{t} \\
adamc@785	1298 \mt{val} \; \mt{linkable\_to\_nullable} : \mt{t} ::: \mt{Type} \to \mt{linkable} \; \mt{t} \; (\mt{option} \; \mt{t})
adamc@785	1299 \end{array}$$
adamc@785	1300
adamc@785	1301 The $\mt{matching}$ type family uses $\mt{linkable}$ to define when two keys match up type-wise.
adamc@785	1302 $$\begin{array}{l}
adamc@785	1303 \mt{con} \; \mt{matching} :: \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@785	1304 \mt{val} \; \mt{mat\_nil} : \mt{matching} \; [] \; [] \\
adamc@785	1305 \mt{val} \; \mt{mat\_cons} : \mt{t1} ::: \mt{Type} \to \mt{rest1} ::: \{\mt{Type}\} \to \mt{t2} ::: \mt{Type} \to \mt{rest2} ::: \{\mt{Type}\} \to \mt{nm1} :: \mt{Name} \to \mt{nm2} :: \mt{Name} \\
adamc@785	1306 \hspace{.1in} \to [[\mt{nm1}] \sim \mt{rest1}] \Rightarrow [[\mt{nm2}] \sim \mt{rest2}] \Rightarrow \mt{linkable} \; \mt{t1} \; \mt{t2} \to \mt{matching} \; \mt{rest1} \; \mt{rest2} \\
adamc@785	1307 \hspace{.1in} \to \mt{matching} \; ([\mt{nm1} = \mt{t1}] \rc \mt{rest1}) \; ([\mt{nm2} = \mt{t2}] \rc \mt{rest2})
adamc@785	1308 \end{array}$$
adamc@785	1309
adamc@785	1310 SQL provides a number of different propagation modes for \texttt{FOREIGN KEY} constraints, governing what happens when a row containing a still-referenced foreign key value is deleted or modified to have a different key value. The argument of a propagation mode's type gives the local key type.
adamc@785	1311 $$\begin{array}{l}
adamc@785	1312 \mt{con} \; \mt{propagation\_mode} :: \{\mt{Type}\} \to \mt{Type} \\
adamc@785	1313 \mt{val} \; \mt{restrict} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; \mt{fs} \\
adamc@785	1314 \mt{val} \; \mt{cascade} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; \mt{fs} \\
adamc@785	1315 \mt{val} \; \mt{no\_action} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; \mt{fs} \\
adamc@785	1316 \mt{val} \; \mt{set\_null} : \mt{fs} ::: \{\mt{Type}\} \to \mt{propagation\_mode} \; (\mt{map} \; \mt{option} \; \mt{fs})
adamc@785	1317 \end{array}$$
adamc@785	1318
adamc@785	1319 Finally, we put these ingredient together to define the \texttt{FOREIGN KEY} constraint function.
adamc@785	1320 $$\begin{array}{l}
adamc@785	1321 \mt{val} \; \mt{foreign\_key} : \mt{mine1} ::: \mt{Name} \to \mt{t} ::: \mt{Type} \to \mt{mine} ::: \{\mt{Type}\} \to \mt{munused} ::: \{\mt{Type}\} \to \mt{foreign} ::: \{\mt{Type}\} \\
adamc@785	1322 \hspace{.1in} \to \mt{funused} ::: \{\mt{Type}\} \to \mt{nm} ::: \mt{Name} \to \mt{uniques} ::: \{\{\mt{Unit}\}\} \\
adamc@785	1323 \hspace{.1in} \to [[\mt{mine1}] \sim \mt{mine}] \Rightarrow [[\mt{mine1} = \mt{t}] \rc \mt{mine} \sim \mt{munused}] \Rightarrow [\mt{foreign} \sim \mt{funused}] \Rightarrow [[\mt{nm}] \sim \mt{uniques}] \\
adamc@785	1324 \hspace{.1in} \Rightarrow \mt{matching} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine}) \; \mt{foreign} \\
adamc@785	1325 \hspace{.1in} \to \mt{sql\_table} \; (\mt{foreign} \rc \mt{funused}) \; ([\mt{nm} = \mt{map} \; (\lambda \_ \Rightarrow ()) \; \mt{foreign}] \rc \mt{uniques}) \\
adamc@785	1326 \hspace{.1in} \to \{\mt{OnDelete} : \mt{propagation\_mode} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine}), \\
adamc@785	1327 \hspace{.2in} \mt{OnUpdate} : \mt{propagation\_mode} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine})\} \\
adamc@785	1328 \hspace{.1in} \to \mt{sql\_constraint} \; ([\mt{mine1} = \mt{t}] \rc \mt{mine} \rc \mt{munused}) \; []
adamc@785	1329 \end{array}$$
adamc@785	1330
adamc@785	1331 The last kind of constraint is a \texttt{CHECK} constraint, which attaches a boolean invariant over a row's contents. It is defined using the $\mt{sql\_exp}$ type family, which we discuss in more detail below.
adamc@785	1332 $$\begin{array}{l}
adamc@785	1333 \mt{val} \; \mt{check} : \mt{fs} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; [] \; [] \; \mt{fs} \; \mt{bool} \to \mt{sql\_constraint} \; \mt{fs} \; []
adamc@785	1334 \end{array}$$
adamc@785	1335
adamc@785	1336 Section \ref{tables} shows the expanded syntax of the $\mt{table}$ declaration and signature item that includes constraints. There is no other way to use constraints with SQL in Ur/Web.
adamc@785	1337
adamc@784	1338
adamc@543	1339 \subsubsection{Queries}
adamc@543	1340
adamc@543	1341 A final query is constructed via the $\mt{sql\_query}$ function. Constructor arguments respectively specify the table fields we select (as records mapping tables to the subsets of their fields that we choose) and the (always named) extra expressions that we select.
adamc@543	1342 $$\begin{array}{l}
adamc@543	1343 \mt{con} \; \mt{sql\_query} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543	1344 \mt{val} \; \mt{sql\_query} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543	1345 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1346 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543	1347 \hspace{.1in} \to \{\mt{Rows} : \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}, \\
adamc@543	1348 \hspace{.2in} \mt{OrderBy} : \mt{sql\_order\_by} \; \mt{tables} \; \mt{selectedExps}, \\
adamc@543	1349 \hspace{.2in} \mt{Limit} : \mt{sql\_limit}, \\
adamc@543	1350 \hspace{.2in} \mt{Offset} : \mt{sql\_offset}\} \\
adamc@543	1351 \hspace{.1in} \to \mt{sql\_query} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1352 \end{array}$$
adamc@543	1353
adamc@545	1354 Queries are used by folding over their results inside transactions.
adamc@545	1355 $$\begin{array}{l}
adamc@545	1356 \mt{val} \; \mt{query} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \lambda [\mt{tables} \sim \mt{exps}] \Rightarrow \mt{state} ::: \mt{Type} \to \mt{sql\_query} \; \mt{tables} \; \mt{exps} \\
adamc@658	1357 \hspace{.1in} \to (\$(\mt{exps} \rc \mt{map} \; (\lambda \mt{fields} :: \{\mt{Type}\} \Rightarrow \$\mt{fields}) \; \mt{tables}) \\
adamc@545	1358 \hspace{.2in} \to \mt{state} \to \mt{transaction} \; \mt{state}) \\
adamc@545	1359 \hspace{.1in} \to \mt{state} \to \mt{transaction} \; \mt{state}
adamc@545	1360 \end{array}$$
adamc@545	1361
adamc@543	1362 Most of the complexity of the query encoding is in the type $\mt{sql\_query1}$, which includes simple queries and derived queries based on relational operators. Constructor arguments respectively specify the tables we select from, the subset of fields that we keep from each table for the result rows, and the extra expressions that we select.
adamc@543	1363 $$\begin{array}{l}
adamc@543	1364 \mt{con} \; \mt{sql\_query1} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543	1365 \\
adamc@543	1366 \mt{type} \; \mt{sql\_relop} \\
adamc@543	1367 \mt{val} \; \mt{sql\_union} : \mt{sql\_relop} \\
adamc@543	1368 \mt{val} \; \mt{sql\_intersect} : \mt{sql\_relop} \\
adamc@543	1369 \mt{val} \; \mt{sql\_except} : \mt{sql\_relop} \\
adamc@543	1370 \mt{val} \; \mt{sql\_relop} : \mt{tables1} ::: \{\{\mt{Type}\}\} \\
adamc@543	1371 \hspace{.1in} \to \mt{tables2} ::: \{\{\mt{Type}\}\} \\
adamc@543	1372 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1373 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543	1374 \hspace{.1in} \to \mt{sql\_relop} \\
adamc@543	1375 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables1} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543	1376 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables2} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543	1377 \hspace{.1in} \to \mt{sql\_query1} \; \mt{selectedFields} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1378 \end{array}$$
adamc@543	1379
adamc@543	1380 $$\begin{array}{l}
adamc@543	1381 \mt{val} \; \mt{sql\_query1} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543	1382 \hspace{.1in} \to \mt{grouped} ::: \{\{\mt{Type}\}\} \\
adamc@543	1383 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1384 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@786	1385 \hspace{.1in} \to \{\mt{From} : \mt{sql\_from\_items} \; \mt{tables}, \\
adamc@543	1386 \hspace{.2in} \mt{Where} : \mt{sql\_exp} \; \mt{tables} \; [] \; [] \; \mt{bool}, \\
adamc@543	1387 \hspace{.2in} \mt{GroupBy} : \mt{sql\_subset} \; \mt{tables} \; \mt{grouped}, \\
adamc@543	1388 \hspace{.2in} \mt{Having} : \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{bool}, \\
adamc@543	1389 \hspace{.2in} \mt{SelectFields} : \mt{sql\_subset} \; \mt{grouped} \; \mt{selectedFields}, \\
adamc@658	1390 \hspace{.2in} \mt {SelectExps} : \$(\mt{map} \; (\mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; []) \; \mt{selectedExps}) \} \\
adamc@543	1391 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1392 \end{array}$$
adamc@543	1393
adamc@543	1394 To encode projection of subsets of fields in $\mt{SELECT}$ clauses, and to encode $\mt{GROUP} \; \mt{BY}$ clauses, we rely on a type family $\mt{sql\_subset}$, capturing what it means for one record of table fields to be a subset of another. The main constructor $\mt{sql\_subset}$ ``proves subset facts'' by requiring a split of a record into kept and dropped parts. The extra constructor $\mt{sql\_subset\_all}$ is a convenience for keeping all fields of a record.
adamc@543	1395 $$\begin{array}{l}
adamc@543	1396 \mt{con} \; \mt{sql\_subset} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@543	1397 \mt{val} \; \mt{sql\_subset} : \mt{keep\_drop} :: \{(\{\mt{Type}\} \times \{\mt{Type}\})\} \\
adamc@543	1398 \hspace{.1in} \to \mt{sql\_subset} \\
adamc@658	1399 \hspace{.2in} (\mt{map} \; (\lambda \mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\}) \Rightarrow \mt{fields}.1 \rc \mt{fields}.2)\; \mt{keep\_drop}) \\
adamc@658	1400 \hspace{.2in} (\mt{map} \; (\lambda \mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\}) \Rightarrow \mt{fields}.1) \; \mt{keep\_drop}) \\
adamc@543	1401 \mt{val} \; \mt{sql\_subset\_all} : \mt{tables} :: \{\{\mt{Type}\}\} \to \mt{sql\_subset} \; \mt{tables} \; \mt{tables}
adamc@543	1402 \end{array}$$
adamc@543	1403
adamc@560	1404 SQL expressions are used in several places, including $\mt{SELECT}$, $\mt{WHERE}$, $\mt{HAVING}$, and $\mt{ORDER} \; \mt{BY}$ clauses. They reify a fragment of the standard SQL expression language, while making it possible to inject ``native'' Ur values in some places. The arguments to the $\mt{sql\_exp}$ type family respectively give the unrestricted-availability table fields, the table fields that may only be used in arguments to aggregate functions, the available selected expressions, and the type of the expression.
adamc@543	1405 $$\begin{array}{l}
adamc@543	1406 \mt{con} \; \mt{sql\_exp} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \to \mt{Type}
adamc@543	1407 \end{array}$$
adamc@543	1408
adamc@543	1409 Any field in scope may be converted to an expression.
adamc@543	1410 $$\begin{array}{l}
adamc@543	1411 \mt{val} \; \mt{sql\_field} : \mt{otherTabs} ::: \{\{\mt{Type}\}\} \to \mt{otherFields} ::: \{\mt{Type}\} \\
adamc@543	1412 \hspace{.1in} \to \mt{fieldType} ::: \mt{Type} \to \mt{agg} ::: \{\{\mt{Type}\}\} \\
adamc@543	1413 \hspace{.1in} \to \mt{exps} ::: \{\mt{Type}\} \\
adamc@543	1414 \hspace{.1in} \to \mt{tab} :: \mt{Name} \to \mt{field} :: \mt{Name} \\
adamc@543	1415 \hspace{.1in} \to \mt{sql\_exp} \; ([\mt{tab} = [\mt{field} = \mt{fieldType}] \rc \mt{otherFields}] \rc \mt{otherTabs}) \; \mt{agg} \; \mt{exps} \; \mt{fieldType}
adamc@543	1416 \end{array}$$
adamc@543	1417
adamc@544	1418 There is an analogous function for referencing named expressions.
adamc@544	1419 $$\begin{array}{l}
adamc@544	1420 \mt{val} \; \mt{sql\_exp} : \mt{tabs} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{t} ::: \mt{Type} \to \mt{rest} ::: \{\mt{Type}\} \to \mt{nm} :: \mt{Name} \\
adamc@544	1421 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tabs} \; \mt{agg} \; ([\mt{nm} = \mt{t}] \rc \mt{rest}) \; \mt{t}
adamc@544	1422 \end{array}$$
adamc@544	1423
adamc@544	1424 Ur values of appropriate types may be injected into SQL expressions.
adamc@544	1425 $$\begin{array}{l}
adamc@786	1426 \mt{class} \; \mt{sql\_injectable\_prim} \\
adamc@786	1427 \mt{val} \; \mt{sql\_bool} : \mt{sql\_injectable\_prim} \; \mt{bool} \\
adamc@786	1428 \mt{val} \; \mt{sql\_int} : \mt{sql\_injectable\_prim} \; \mt{int} \\
adamc@786	1429 \mt{val} \; \mt{sql\_float} : \mt{sql\_injectable\_prim} \; \mt{float} \\
adamc@786	1430 \mt{val} \; \mt{sql\_string} : \mt{sql\_injectable\_prim} \; \mt{string} \\
adamc@786	1431 \mt{val} \; \mt{sql\_time} : \mt{sql\_injectable\_prim} \; \mt{time} \\
adamc@786	1432 \mt{val} \; \mt{sql\_blob} : \mt{sql\_injectable\_prim} \; \mt{blob} \\
adamc@786	1433 \mt{val} \; \mt{sql\_channel} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; (\mt{channel} \; \mt{t}) \\
adamc@786	1434 \mt{val} \; \mt{sql\_client} : \mt{sql\_injectable\_prim} \; \mt{client} \\
adamc@786	1435 \\
adamc@544	1436 \mt{class} \; \mt{sql\_injectable} \\
adamc@786	1437 \mt{val} \; \mt{sql\_prim} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; \mt{t} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@786	1438 \mt{val} \; \mt{sql\_option\_prim} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; \mt{t} \to \mt{sql\_injectable} \; (\mt{option} \; \mt{t}) \\
adamc@786	1439 \\
adamc@544	1440 \mt{val} \; \mt{sql\_inject} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@544	1441 \hspace{.1in} \to \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544	1442 \end{array}$$
adamc@544	1443
adamc@544	1444 We have the SQL nullness test, which is necessary because of the strange SQL semantics of equality in the presence of null values.
adamc@544	1445 $$\begin{array}{l}
adamc@544	1446 \mt{val} \; \mt{sql\_is\_null} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1447 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; (\mt{option} \; \mt{t}) \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544	1448 \end{array}$$
adamc@544	1449
adamc@559	1450 We have generic nullary, unary, and binary operators.
adamc@544	1451 $$\begin{array}{l}
adamc@544	1452 \mt{con} \; \mt{sql\_nfunc} :: \mt{Type} \to \mt{Type} \\
adamc@544	1453 \mt{val} \; \mt{sql\_current\_timestamp} : \mt{sql\_nfunc} \; \mt{time} \\
adamc@544	1454 \mt{val} \; \mt{sql\_nfunc} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1455 \hspace{.1in} \to \mt{sql\_nfunc} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \\\end{array}$$
adamc@544	1456
adamc@544	1457 $$\begin{array}{l}
adamc@544	1458 \mt{con} \; \mt{sql\_unary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544	1459 \mt{val} \; \mt{sql\_not} : \mt{sql\_unary} \; \mt{bool} \; \mt{bool} \\
adamc@544	1460 \mt{val} \; \mt{sql\_unary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544	1461 \hspace{.1in} \to \mt{sql\_unary} \; \mt{arg} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res} \\
adamc@544	1462 \end{array}$$
adamc@544	1463
adamc@544	1464 $$\begin{array}{l}
adamc@544	1465 \mt{con} \; \mt{sql\_binary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544	1466 \mt{val} \; \mt{sql\_and} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544	1467 \mt{val} \; \mt{sql\_or} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544	1468 \mt{val} \; \mt{sql\_binary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg_1} ::: \mt{Type} \to \mt{arg_2} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544	1469 \hspace{.1in} \to \mt{sql\_binary} \; \mt{arg_1} \; \mt{arg_2} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_1} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_2} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res}
adamc@544	1470 \end{array}$$
adamc@544	1471
adamc@544	1472 $$\begin{array}{l}
adamc@559	1473 \mt{class} \; \mt{sql\_arith} \\
adamc@559	1474 \mt{val} \; \mt{sql\_int\_arith} : \mt{sql\_arith} \; \mt{int} \\
adamc@559	1475 \mt{val} \; \mt{sql\_float\_arith} : \mt{sql\_arith} \; \mt{float} \\
adamc@559	1476 \mt{val} \; \mt{sql\_neg} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_unary} \; \mt{t} \; \mt{t} \\
adamc@559	1477 \mt{val} \; \mt{sql\_plus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1478 \mt{val} \; \mt{sql\_minus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1479 \mt{val} \; \mt{sql\_times} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1480 \mt{val} \; \mt{sql\_div} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1481 \mt{val} \; \mt{sql\_mod} : \mt{sql\_binary} \; \mt{int} \; \mt{int} \; \mt{int}
adamc@559	1482 \end{array}$$
adamc@544	1483
adamc@656	1484 Finally, we have aggregate functions. The $\mt{COUNT(\ast)}$ syntax is handled specially, since it takes no real argument. The other aggregate functions are placed into a general type family, using constructor classes to restrict usage to properly-typed arguments. The key aspect of the $\mt{sql\_aggregate}$ function's type is the shift of aggregate-function-only fields into unrestricted fields.
adamc@544	1485 $$\begin{array}{l}
adamc@544	1486 \mt{val} \; \mt{sql\_count} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{int}
adamc@544	1487 \end{array}$$
adamc@544	1488
adamc@544	1489 $$\begin{array}{l}
adamc@544	1490 \mt{con} \; \mt{sql\_aggregate} :: \mt{Type} \to \mt{Type} \\
adamc@544	1491 \mt{val} \; \mt{sql\_aggregate} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1492 \hspace{.1in} \to \mt{sql\_aggregate} \; \mt{t} \to \mt{sql\_exp} \; \mt{agg} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544	1493 \end{array}$$
adamc@544	1494
adamc@544	1495 $$\begin{array}{l}
adamc@544	1496 \mt{class} \; \mt{sql\_summable} \\
adamc@544	1497 \mt{val} \; \mt{sql\_summable\_int} : \mt{sql\_summable} \; \mt{int} \\
adamc@544	1498 \mt{val} \; \mt{sql\_summable\_float} : \mt{sql\_summable} \; \mt{float} \\
adamc@544	1499 \mt{val} \; \mt{sql\_avg} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544	1500 \mt{val} \; \mt{sql\_sum} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544	1501 \end{array}$$
adamc@544	1502
adamc@544	1503 $$\begin{array}{l}
adamc@544	1504 \mt{class} \; \mt{sql\_maxable} \\
adamc@544	1505 \mt{val} \; \mt{sql\_maxable\_int} : \mt{sql\_maxable} \; \mt{int} \\
adamc@544	1506 \mt{val} \; \mt{sql\_maxable\_float} : \mt{sql\_maxable} \; \mt{float} \\
adamc@544	1507 \mt{val} \; \mt{sql\_maxable\_string} : \mt{sql\_maxable} \; \mt{string} \\
adamc@544	1508 \mt{val} \; \mt{sql\_maxable\_time} : \mt{sql\_maxable} \; \mt{time} \\
adamc@544	1509 \mt{val} \; \mt{sql\_max} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544	1510 \mt{val} \; \mt{sql\_min} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544	1511 \end{array}$$
adamc@544	1512
adamc@786	1513 \texttt{FROM} clauses are specified using a type family.
adamc@786	1514 $$\begin{array}{l}
adamc@786	1515 \mt{con} \; \mt{sql\_from\_items} :: \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@786	1516 \mt{val} \; \mt{sql\_from\_table} : \mt{t} ::: \mt{Type} \to \mt{fs} ::: \{\mt{Type}\} \to \mt{fieldsOf} \; \mt{t} \; \mt{fs} \to \mt{name} :: \mt{Name} \to \mt{t} \to \mt{sql\_from\_items} \; [\mt{name} = \mt{fs}] \\
adamc@786	1517 \mt{val} \; \mt{sql\_from\_comma} : \mt{tabs1} ::: \{\{\mt{Type}\}\} \to \mt{tabs2} ::: \{\{\mt{Type}\}\} \to [\mt{tabs1} \sim \mt{tabs2}] \\
adamc@786	1518 \hspace{.1in} \Rightarrow \mt{sql\_from\_items} \; \mt{tabs1} \to \mt{sql\_from\_items} \; \mt{tabs2} \\
adamc@786	1519 \hspace{.1in} \to \mt{sql\_from\_items} \; (\mt{tabs1} \rc \mt{tabs2}) \\
adamc@786	1520 \mt{val} \; \mt{sql\_inner\_join} : \mt{tabs1} ::: \{\{\mt{Type}\}\} \to \mt{tabs2} ::: \{\{\mt{Type}\}\} \to [\mt{tabs1} \sim \mt{tabs2}] \\
adamc@786	1521 \hspace{.1in} \Rightarrow \mt{sql\_from\_items} \; \mt{tabs1} \to \mt{sql\_from\_items} \; \mt{tabs2} \\
adamc@786	1522 \hspace{.1in} \to \mt{sql\_exp} \; (\mt{tabs1} \rc \mt{tabs2}) \; [] \; [] \; \mt{bool} \\
adamc@786	1523 \hspace{.1in} \to \mt{sql\_from\_items} \; (\mt{tabs1} \rc \mt{tabs2})
adamc@786	1524 \end{array}$$
adamc@786	1525
adamc@786	1526 Besides these basic cases, outer joins are supported, which requires a type class for turning non-$\mt{option}$ columns into $\mt{option}$ columns.
adamc@786	1527 $$\begin{array}{l}
adamc@786	1528 \mt{class} \; \mt{nullify} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@786	1529 \mt{val} \; \mt{nullify\_option} : \mt{t} ::: \mt{Type} \to \mt{nullify} \; (\mt{option} \; \mt{t}) \; (\mt{option} \; \mt{t}) \\
adamc@786	1530 \mt{val} \; \mt{nullify\_prim} : \mt{t} ::: \mt{Type} \to \mt{sql\_injectable\_prim} \; \mt{t} \to \mt{nullify} \; \mt{t} \; (\mt{option} \; \mt{t})
adamc@786	1531 \end{array}$$
adamc@786	1532
adamc@786	1533 Left, right, and full outer joins can now be expressed using functions that accept records of $\mt{nullify}$ instances. Here, we give only the type for a left join as an example.
adamc@786	1534
adamc@786	1535 $$\begin{array}{l}
adamc@786	1536 \mt{val} \; \mt{sql\_left\_join} : \mt{tabs1} ::: \{\{\mt{Type}\}\} \to \mt{tabs2} ::: \{\{(\mt{Type} \times \mt{Type})\}\} \to [\mt{tabs1} \sim \mt{tabs2}] \\
adamc@786	1537 \hspace{.1in} \Rightarrow \$(\mt{map} \; (\lambda \mt{r} \Rightarrow \$(\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{nullify} \; \mt{p}.1 \; \mt{p}.2) \; \mt{r})) \; \mt{tabs2}) \\
adamc@786	1538 \hspace{.1in} \to \mt{sql\_from\_items} \; \mt{tabs1} \to \mt{sql\_from\_items} \; (\mt{map} \; (\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{p}.1)) \; \mt{tabs2}) \\
adamc@786	1539 \hspace{.1in} \to \mt{sql\_exp} \; (\mt{tabs1} \rc \mt{map} \; (\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{p}.1)) \; \mt{tabs2}) \; [] \; [] \; \mt{bool} \\
adamc@786	1540 \hspace{.1in} \to \mt{sql\_from\_items} \; (\mt{tabs1} \rc \mt{map} \; (\mt{map} \; (\lambda \mt{p} :: (\mt{Type} \times \mt{Type}) \Rightarrow \mt{p}.2)) \; \mt{tabs2})
adamc@786	1541 \end{array}$$
adamc@786	1542
adamc@544	1543 We wrap up the definition of query syntax with the types used in representing $\mt{ORDER} \; \mt{BY}$, $\mt{LIMIT}$, and $\mt{OFFSET}$ clauses.
adamc@544	1544 $$\begin{array}{l}
adamc@544	1545 \mt{type} \; \mt{sql\_direction} \\
adamc@544	1546 \mt{val} \; \mt{sql\_asc} : \mt{sql\_direction} \\
adamc@544	1547 \mt{val} \; \mt{sql\_desc} : \mt{sql\_direction} \\
adamc@544	1548 \\
adamc@544	1549 \mt{con} \; \mt{sql\_order\_by} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@544	1550 \mt{val} \; \mt{sql\_order\_by\_Nil} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} :: \{\mt{Type}\} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544	1551 \mt{val} \; \mt{sql\_order\_by\_Cons} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1552 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; [] \; \mt{exps} \; \mt{t} \to \mt{sql\_direction} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544	1553 \\
adamc@544	1554 \mt{type} \; \mt{sql\_limit} \\
adamc@544	1555 \mt{val} \; \mt{sql\_no\_limit} : \mt{sql\_limit} \\
adamc@544	1556 \mt{val} \; \mt{sql\_limit} : \mt{int} \to \mt{sql\_limit} \\
adamc@544	1557 \\
adamc@544	1558 \mt{type} \; \mt{sql\_offset} \\
adamc@544	1559 \mt{val} \; \mt{sql\_no\_offset} : \mt{sql\_offset} \\
adamc@544	1560 \mt{val} \; \mt{sql\_offset} : \mt{int} \to \mt{sql\_offset}
adamc@544	1561 \end{array}$$
adamc@544	1562
adamc@545	1563
adamc@545	1564 \subsubsection{DML}
adamc@545	1565
adamc@545	1566 The Ur/Web library also includes an embedding of a fragment of SQL's DML, the Data Manipulation Language, for modifying database tables. Any piece of DML may be executed in a transaction.
adamc@545	1567
adamc@545	1568 $$\begin{array}{l}
adamc@545	1569 \mt{type} \; \mt{dml} \\
adamc@545	1570 \mt{val} \; \mt{dml} : \mt{dml} \to \mt{transaction} \; \mt{unit}
adamc@545	1571 \end{array}$$
adamc@545	1572
adamc@545	1573 Properly-typed records may be used to form $\mt{INSERT}$ commands.
adamc@545	1574 $$\begin{array}{l}
adamc@545	1575 \mt{val} \; \mt{insert} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \\
adamc@658	1576 \hspace{.1in} \to \$(\mt{map} \; (\mt{sql\_exp} \; [] \; [] \; []) \; \mt{fields}) \to \mt{dml}
adamc@545	1577 \end{array}$$
adamc@545	1578
adamc@545	1579 An $\mt{UPDATE}$ command is formed from a choice of which table fields to leave alone and which to change, along with an expression to use to compute the new value of each changed field and a $\mt{WHERE}$ clause.
adamc@545	1580 $$\begin{array}{l}
adamc@545	1581 \mt{val} \; \mt{update} : \mt{unchanged} ::: \{\mt{Type}\} \to \mt{changed} :: \{\mt{Type}\} \to \lambda [\mt{changed} \sim \mt{unchanged}] \\
adamc@658	1582 \hspace{.1in} \Rightarrow \$(\mt{map} \; (\mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; []) \; \mt{changed}) \\
adamc@545	1583 \hspace{.1in} \to \mt{sql\_table} \; (\mt{changed} \rc \mt{unchanged}) \to \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545	1584 \end{array}$$
adamc@545	1585
adamc@545	1586 A $\mt{DELETE}$ command is formed from a table and a $\mt{WHERE}$ clause.
adamc@545	1587 $$\begin{array}{l}
adamc@545	1588 \mt{val} \; \mt{delete} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \to \mt{sql\_exp} \; [\mt{T} = \mt{fields}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545	1589 \end{array}$$
adamc@545	1590
adamc@546	1591 \subsubsection{Sequences}
adamc@546	1592
adamc@546	1593 SQL sequences are counters with concurrency control, often used to assign unique IDs. Ur/Web supports them via a simple interface. The only way to create a sequence is with the $\mt{sequence}$ declaration form.
adamc@546	1594
adamc@546	1595 $$\begin{array}{l}
adamc@546	1596 \mt{type} \; \mt{sql\_sequence} \\
adamc@546	1597 \mt{val} \; \mt{nextval} : \mt{sql\_sequence} \to \mt{transaction} \; \mt{int}
adamc@546	1598 \end{array}$$
adamc@546	1599
adamc@546	1600
adamc@547	1601 \subsection{XML}
adamc@547	1602
adamc@547	1603 Ur/Web's library contains an encoding of XML syntax and semantic constraints. We make no effort to follow the standards governing XML schemas. Rather, XML fragments are viewed more as values of ML datatypes, and we only track which tags are allowed inside which other tags.
adamc@547	1604
adamc@547	1605 The basic XML type family has arguments respectively indicating the \emph{context} of a fragment, the fields that the fragment expects to be bound on entry (and their types), and the fields that the fragment will bind (and their types). Contexts are a record-based ``poor man's subtyping'' encoding, with each possible set of valid tags corresponding to a different context record. The arguments dealing with field binding are only relevant to HTML forms.
adamc@547	1606 $$\begin{array}{l}
adamc@547	1607 \mt{con} \; \mt{xml} :: \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547	1608 \end{array}$$
adamc@547	1609
adamc@547	1610 We also have a type family of XML tags, indexed respectively by the record of optional attributes accepted by the tag, the context in which the tag may be placed, the context required of children of the tag, which form fields the tag uses, and which fields the tag defines.
adamc@547	1611 $$\begin{array}{l}
adamc@547	1612 \mt{con} \; \mt{tag} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547	1613 \end{array}$$
adamc@547	1614
adamc@547	1615 Literal text may be injected into XML as ``CDATA.''
adamc@547	1616 $$\begin{array}{l}
adamc@547	1617 \mt{val} \; \mt{cdata} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use} ::: \{\mt{Type}\} \to \mt{string} \to \mt{xml} \; \mt{ctx} \; \mt{use} \; []
adamc@547	1618 \end{array}$$
adamc@547	1619
adamc@547	1620 There is a function for producing an XML tree with a particular tag at its root.
adamc@547	1621 $$\begin{array}{l}
adamc@547	1622 \mt{val} \; \mt{tag} : \mt{attrsGiven} ::: \{\mt{Type}\} \to \mt{attrsAbsent} ::: \{\mt{Type}\} \to \mt{ctxOuter} ::: \{\mt{Unit}\} \to \mt{ctxInner} ::: \{\mt{Unit}\} \\
adamc@547	1623 \hspace{.1in} \to \mt{useOuter} ::: \{\mt{Type}\} \to \mt{useInner} ::: \{\mt{Type}\} \to \mt{bindOuter} ::: \{\mt{Type}\} \to \mt{bindInner} ::: \{\mt{Type}\} \\
adamc@787	1624 \hspace{.1in} \to \lambda [\mt{attrsGiven} \sim \mt{attrsAbsent}] \; [\mt{useOuter} \sim \mt{useInner}] \; [\mt{bindOuter} \sim \mt{bindInner}] \\
adamc@787	1625 \hspace{.1in} \Rightarrow \mt{option} \; \mt{css\_class} \\
adamc@787	1626 \hspace{.1in} \to \$\mt{attrsGiven} \\
adamc@547	1627 \hspace{.1in} \to \mt{tag} \; (\mt{attrsGiven} \rc \mt{attrsAbsent}) \; \mt{ctxOuter} \; \mt{ctxInner} \; \mt{useOuter} \; \mt{bindOuter} \\
adamc@547	1628 \hspace{.1in} \to \mt{xml} \; \mt{ctxInner} \; \mt{useInner} \; \mt{bindInner} \to \mt{xml} \; \mt{ctxOuter} \; (\mt{useOuter} \rc \mt{useInner}) \; (\mt{bindOuter} \rc \mt{bindInner})
adamc@547	1629 \end{array}$$
adamc@787	1630 Note that any tag may be assigned a CSS class. This is the sole way of making use of the values produced by $\mt{style}$ declarations. Ur/Web itself doesn't deal with the syntax or semantics of style sheets; they can be linked via URLs with \texttt{link} tags. However, Ur/Web does make it easy to calculate upper bounds on usage of CSS classes through program analysis.
adamc@547	1631
adamc@547	1632 Two XML fragments may be concatenated.
adamc@547	1633 $$\begin{array}{l}
adamc@547	1634 \mt{val} \; \mt{join} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{bind_1} ::: \{\mt{Type}\} \to \mt{bind_2} ::: \{\mt{Type}\} \\
adamc@547	1635 \hspace{.1in} \to \lambda [\mt{use_1} \sim \mt{bind_1}] \; [\mt{bind_1} \sim \mt{bind_2}] \\
adamc@547	1636 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind_1} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{bind_1}) \; \mt{bind_2} \to \mt{xml} \; \mt{ctx} \; \mt{use_1} \; (\mt{bind_1} \rc \mt{bind_2})
adamc@547	1637 \end{array}$$
adamc@547	1638
adamc@547	1639 Finally, any XML fragment may be updated to ``claim'' to use more form fields than it does.
adamc@547	1640 $$\begin{array}{l}
adamc@547	1641 \mt{val} \; \mt{useMore} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{use_2} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \lambda [\mt{use_1} \sim \mt{use_2}] \\
adamc@547	1642 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{use_2}) \; \mt{bind}
adamc@547	1643 \end{array}$$
adamc@547	1644
adamc@547	1645 We will not list here the different HTML tags and related functions from the standard library. They should be easy enough to understand from the code in \texttt{basis.urs}. The set of tags in the library is not yet claimed to be complete for HTML standards.
adamc@547	1646
adamc@547	1647 One last useful function is for aborting any page generation, returning some XML as an error message. This function takes the place of some uses of a general exception mechanism.
adamc@547	1648 $$\begin{array}{l}
adamc@547	1649 \mt{val} \; \mt{error} : \mt{t} ::: \mt{Type} \to \mt{xml} \; [\mt{Body}] \; [] \; [] \to \mt{t}
adamc@547	1650 \end{array}$$
adamc@547	1651
adamc@549	1652
adamc@701	1653 \subsection{Client-Side Programming}
adamc@659	1654
adamc@701	1655 Ur/Web supports running code on web browsers, via automatic compilation to JavaScript.
adamc@701	1656
adamc@701	1657 \subsubsection{The Basics}
adamc@701	1658
adamc@701	1659 Clients can open alert dialog boxes, in the usual annoying JavaScript way.
adamc@701	1660 $$\begin{array}{l}
adamc@701	1661 \mt{val} \; \mt{alert} : \mt{string} \to \mt{transaction} \; \mt{unit}
adamc@701	1662 \end{array}$$
adamc@701	1663
adamc@701	1664 Any transaction may be run in a new thread with the $\mt{spawn}$ function.
adamc@701	1665 $$\begin{array}{l}
adamc@701	1666 \mt{val} \; \mt{spawn} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit}
adamc@701	1667 \end{array}$$
adamc@701	1668
adamc@701	1669 The current thread can be paused for at least a specified number of milliseconds.
adamc@701	1670 $$\begin{array}{l}
adamc@701	1671 \mt{val} \; \mt{sleep} : \mt{int} \to \mt{transaction} \; \mt{unit}
adamc@701	1672 \end{array}$$
adamc@701	1673
adamc@787	1674 A few functions are available to registers callbacks for particular error events. Respectively, they are triggered on calls to $\mt{error}$, uncaught JavaScript exceptions, failure of remote procedure calls, the severance of the connection serving asynchronous messages, or the occurrence of some other error with that connection. If no handlers are registered for a kind of error, then occurrences of that error are ignored silently.
adamc@787	1675 $$\begin{array}{l}
adamc@787	1676 \mt{val} \; \mt{onError} : (\mt{xbody} \to \mt{transaction} \; \mt{unit}) \to \mt{transaction} \; \mt{unit} \\
adamc@787	1677 \mt{val} \; \mt{onFail} : (\mt{string} \to \mt{transaction} \; \mt{unit}) \to \mt{transaction} \; \mt{unit} \\
adamc@787	1678 \mt{val} \; \mt{onConnectFail} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit} \\
adamc@787	1679 \mt{val} \; \mt{onDisconnect} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit} \\
adamc@787	1680 \mt{val} \; \mt{onServerError} : (\mt{string} \to \mt{transaction} \; \mt{unit}) \to \mt{transaction} \; \mt{unit}
adamc@787	1681 \end{array}$$
adamc@787	1682
adamc@701	1683 \subsubsection{Functional-Reactive Page Generation}
adamc@701	1684
adamc@701	1685 Most approaches to ``AJAX''-style coding involve imperative manipulation of the DOM tree representing an HTML document's structure. Ur/Web follows the \emph{functional-reactive} approach instead. Programs may allocate mutable \emph{sources} of arbitrary types, and an HTML page is effectively a pure function over the latest values of the sources. The page is not mutated directly, but rather it changes automatically as the sources are mutated.
adamc@659	1686
adamc@659	1687 $$\begin{array}{l}
adamc@659	1688 \mt{con} \; \mt{source} :: \mt{Type} \to \mt{Type} \\
adamc@659	1689 \mt{val} \; \mt{source} : \mt{t} ::: \mt{Type} \to \mt{t} \to \mt{transaction} \; (\mt{source} \; \mt{t}) \\
adamc@659	1690 \mt{val} \; \mt{set} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit} \\
adamc@659	1691 \mt{val} \; \mt{get} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{transaction} \; \mt{t}
adamc@659	1692 \end{array}$$
adamc@659	1693
adamc@659	1694 Pure functions over sources are represented in a monad of \emph{signals}.
adamc@659	1695
adamc@659	1696 $$\begin{array}{l}
adamc@659	1697 \mt{con} \; \mt{signal} :: \mt{Type} \to \mt{Type} \\
adamc@659	1698 \mt{val} \; \mt{signal\_monad} : \mt{monad} \; \mt{signal} \\
adamc@659	1699 \mt{val} \; \mt{signal} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{signal} \; \mt{t}
adamc@659	1700 \end{array}$$
adamc@659	1701
adamc@659	1702 A reactive portion of an HTML page is injected with a $\mt{dyn}$ tag, which has a signal-valued attribute $\mt{Signal}$.
adamc@659	1703
adamc@659	1704 $$\begin{array}{l}
adamc@701	1705 \mt{val} \; \mt{dyn} : \mt{use} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \mt{unit} \\
adamc@701	1706 \hspace{.1in} \to \mt{tag} \; [\mt{Signal} = \mt{signal} \; (\mt{xml} \; \mt{body} \; \mt{use} \; \mt{bind})] \; \mt{body} \; [] \; \mt{use} \; \mt{bind}
adamc@659	1707 \end{array}$$
adamc@659	1708
adamc@701	1709 Transactions can be run on the client by including them in attributes like the $\mt{Onclick}$ attribute of $\mt{button}$, and GUI widgets like $\mt{ctextbox}$ have $\mt{Source}$ attributes that can be used to connect them to sources, so that their values can be read by code running because of, e.g., an $\mt{Onclick}$ event.
adamc@701	1710
adamc@701	1711 \subsubsection{Asynchronous Message-Passing}
adamc@701	1712
adamc@701	1713 To support asynchronous, ``server push'' delivery of messages to clients, any client that might need to receive an asynchronous message is assigned a unique ID. These IDs may be retrieved both on the client and on the server, during execution of code related to a client.
adamc@701	1714
adamc@701	1715 $$\begin{array}{l}
adamc@701	1716 \mt{type} \; \mt{client} \\
adamc@701	1717 \mt{val} \; \mt{self} : \mt{transaction} \; \mt{client}
adamc@701	1718 \end{array}$$
adamc@701	1719
adamc@701	1720 \emph{Channels} are the means of message-passing. Each channel is created in the context of a client and belongs to that client; no other client may receive the channel's messages. Each channel type includes the type of values that may be sent over the channel. Sending and receiving are asynchronous, in the sense that a client need not be ready to receive a message right away. Rather, sent messages may queue up, waiting to be processed.
adamc@701	1721
adamc@701	1722 $$\begin{array}{l}
adamc@701	1723 \mt{con} \; \mt{channel} :: \mt{Type} \to \mt{Type} \\
adamc@701	1724 \mt{val} \; \mt{channel} : \mt{t} ::: \mt{Type} \to \mt{transaction} \; (\mt{channel} \; \mt{t}) \\
adamc@701	1725 \mt{val} \; \mt{send} : \mt{t} ::: \mt{Type} \to \mt{channel} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit} \\
adamc@701	1726 \mt{val} \; \mt{recv} : \mt{t} ::: \mt{Type} \to \mt{channel} \; \mt{t} \to \mt{transaction} \; \mt{t}
adamc@701	1727 \end{array}$$
adamc@701	1728
adamc@701	1729 The $\mt{channel}$ and $\mt{send}$ operations may only be executed on the server, and $\mt{recv}$ may only be executed on a client. Neither clients nor channels may be passed as arguments from clients to server-side functions, so persistent channels can only be maintained by storing them in the database and looking them up using the current client ID or some application-specific value as a key.
adamc@701	1730
adamc@701	1731 Clients and channels live only as long as the web browser page views that they are associated with. When a user surfs away, his client and its channels will be garbage-collected, after that user is not heard from for the timeout period. Garbage collection deletes any database row that contains a client or channel directly. Any reference to one of these types inside an $\mt{option}$ is set to $\mt{None}$ instead. Both kinds of handling have the flavor of weak pointers, and that is a useful way to think about clients and channels in the database.
adamc@701	1732
adamc@659	1733
adamc@549	1734 \section{Ur/Web Syntax Extensions}
adamc@549	1735
adamc@549	1736 Ur/Web features some syntactic shorthands for building values using the functions from the last section. This section sketches the grammar of those extensions. We write spans of syntax inside brackets to indicate that they are optional.
adamc@549	1737
adamc@549	1738 \subsection{SQL}
adamc@549	1739
adamc@786	1740 \subsubsection{\label{tables}Table Declarations}
adamc@786	1741
adamc@788	1742 $\mt{table}$ declarations may include constraints, via these grammar rules.
adamc@788	1743 $$\begin{array}{rrcll}
adamc@788	1744 \textrm{Declarations} & d &::=& \mt{table} \; x : c \; [pk[,]] \; cts \\
adamc@788	1745 \textrm{Primary key constraints} & pk &::=& \mt{PRIMARY} \; \mt{KEY} \; K \\
adamc@788	1746 \textrm{Keys} & K &::=& f \mid (f, (f,)^+) \\
adamc@788	1747 \textrm{Constraint sets} & cts &::=& \mt{CONSTRAINT} f \; ct \mid cts, cts \mid \{\{e\}\} \\
adamc@788	1748 \textrm{Constraints} & ct &::=& \mt{UNIQUE} \; K \mid \mt{CHECK} \; E \\
adamc@788	1749 &&& \mid \mt{FOREIGN} \; \mt{KEY} \; K \; \mt{REFERENCES} \; F \; (K) \; [\mt{ON} \; \mt{DELETE} \; pr] \; [\mt{ON} \; \mt{UPDATE} \; pr] \\
adamc@788	1750 \textrm{Foreign tables} & F &::=& x \mid \{\{e\}\} \\
adamc@788	1751 \textrm{Propagation modes} & pr &::=& \mt{NO} \; \mt{ACTION} \mid \mt{RESTRICT} \mid \mt{CASCADE} \mid \mt{SET} \; \mt{NULL}
adamc@788	1752 \end{array}$$
adamc@788	1753
adamc@788	1754 A signature item $\mt{table} \; \mt{x} : \mt{c}$ is actually elaborated into two signature items: $\mt{con} \; \mt{x\_hidden\_constraints} :: \{\{\mt{Unit}\}\}$ and $\mt{val} \; \mt{x} : \mt{sql\_table} \; \mt{c} \; \mt{x\_hidden\_constraints}$. This is appropriate for common cases where client code doesn't care which keys a table has. It's also possible to include constraints after a $\mt{table}$ signature item, with the same syntax as for $\mt{table}$ declarations. This may look like dependent typing, but it's just a convenience. The constraints are type-checked to determine a constructor $u$ to include in $\mt{val} \; \mt{x} : \mt{sql\_table} \; \mt{c} \; (u \rc \mt{x\_hidden\_constraints})$, and then the expressions are thrown away. Nonetheless, it can be useful for documentation purposes to include table constraint details in signatures. Note that the automatic generation of $\mt{x\_hidden\_constraints}$ leads to a kind of free subtyping with respect to which constraints are defined.
adamc@788	1755
adamc@788	1756
adamc@549	1757 \subsubsection{Queries}
adamc@549	1758
adamc@550	1759 Queries $Q$ are added to the rules for expressions $e$.
adamc@550	1760
adamc@549	1761 $$\begin{array}{rrcll}
adamc@550	1762 \textrm{Queries} & Q &::=& (q \; [\mt{ORDER} \; \mt{BY} \; (E \; [o],)^+] \; [\mt{LIMIT} \; N] \; [\mt{OFFSET} \; N]) \\
adamc@549	1763 \textrm{Pre-queries} & q &::=& \mt{SELECT} \; P \; \mt{FROM} \; T,^+ \; [\mt{WHERE} \; E] \; [\mt{GROUP} \; \mt{BY} \; p,^+] \; [\mt{HAVING} \; E] \\
adamc@549	1764 &&& \mid q \; R \; q \\
adamc@549	1765 \textrm{Relational operators} & R &::=& \mt{UNION} \mid \mt{INTERSECT} \mid \mt{EXCEPT}
adamc@549	1766 \end{array}$$
adamc@549	1767
adamc@549	1768 $$\begin{array}{rrcll}
adamc@549	1769 \textrm{Projections} & P &::=& \ast & \textrm{all columns} \\
adamc@549	1770 &&& p,^+ & \textrm{particular columns} \\
adamc@549	1771 \textrm{Pre-projections} & p &::=& t.f & \textrm{one column from a table} \\
adamc@558	1772 &&& t.\{\{c\}\} & \textrm{a record of columns from a table (of kind $\{\mt{Type}\}$)} \\
adamc@549	1773 \textrm{Table names} & t &::=& x & \textrm{constant table name (automatically capitalized)} \\
adamc@549	1774 &&& X & \textrm{constant table name} \\
adamc@549	1775 &&& \{\{c\}\} & \textrm{computed table name (of kind $\mt{Name}$)} \\
adamc@549	1776 \textrm{Column names} & f &::=& X & \textrm{constant column name} \\
adamc@549	1777 &&& \{c\} & \textrm{computed column name (of kind $\mt{Name}$)} \\
adamc@549	1778 \textrm{Tables} & T &::=& x & \textrm{table variable, named locally by its own capitalization} \\
adamc@549	1779 &&& x \; \mt{AS} \; t & \textrm{table variable, with local name} \\
adamc@549	1780 &&& \{\{e\}\} \; \mt{AS} \; t & \textrm{computed table expression, with local name} \\
adamc@549	1781 \textrm{SQL expressions} & E &::=& p & \textrm{column references} \\
adamc@549	1782 &&& X & \textrm{named expression references} \\
adamc@549	1783 &&& \{\{e\}\} & \textrm{injected native Ur expressions} \\
adamc@549	1784 &&& \{e\} & \textrm{computed expressions, probably using $\mt{sql\_exp}$ directly} \\
adamc@549	1785 &&& \mt{TRUE} \mid \mt{FALSE} & \textrm{boolean constants} \\
adamc@549	1786 &&& \ell & \textrm{primitive type literals} \\
adamc@549	1787 &&& \mt{NULL} & \textrm{null value (injection of $\mt{None}$)} \\
adamc@549	1788 &&& E \; \mt{IS} \; \mt{NULL} & \textrm{nullness test} \\
adamc@549	1789 &&& n & \textrm{nullary operators} \\
adamc@549	1790 &&& u \; E & \textrm{unary operators} \\
adamc@549	1791 &&& E \; b \; E & \textrm{binary operators} \\
adamc@549	1792 &&& \mt{COUNT}(\ast) & \textrm{count number of rows} \\
adamc@549	1793 &&& a(E) & \textrm{other aggregate function} \\
adamc@549	1794 &&& (E) & \textrm{explicit precedence} \\
adamc@549	1795 \textrm{Nullary operators} & n &::=& \mt{CURRENT\_TIMESTAMP} \\
adamc@549	1796 \textrm{Unary operators} & u &::=& \mt{NOT} \\
adamc@549	1797 \textrm{Binary operators} & b &::=& \mt{AND} \mid \mt{OR} \mid \neq \mid < \mid \leq \mid > \mid \geq \\
adamc@549	1798 \textrm{Aggregate functions} & a &::=& \mt{AVG} \mid \mt{SUM} \mid \mt{MIN} \mid \mt{MAX} \\
adamc@550	1799 \textrm{Directions} & o &::=& \mt{ASC} \mid \mt{DESC} \\
adamc@549	1800 \textrm{SQL integer} & N &::=& n \mid \{e\} \\
adamc@549	1801 \end{array}$$
adamc@549	1802
adamc@549	1803 Additionally, an SQL expression may be inserted into normal Ur code with the syntax $(\mt{SQL} \; E)$ or $(\mt{WHERE} \; E)$.
adamc@549	1804
adamc@550	1805 \subsubsection{DML}
adamc@550	1806
adamc@550	1807 DML commands $D$ are added to the rules for expressions $e$.
adamc@550	1808
adamc@550	1809 $$\begin{array}{rrcll}
adamc@550	1810 \textrm{Commands} & D &::=& (\mt{INSERT} \; \mt{INTO} \; T^E \; (f,^+) \; \mt{VALUES} \; (E,^+)) \\
adamc@550	1811 &&& (\mt{UPDATE} \; T^E \; \mt{SET} \; (f = E,)^+ \; \mt{WHERE} \; E) \\
adamc@550	1812 &&& (\mt{DELETE} \; \mt{FROM} \; T^E \; \mt{WHERE} \; E) \\
adamc@550	1813 \textrm{Table expressions} & T^E &::=& x \mid \{\{e\}\}
adamc@550	1814 \end{array}$$
adamc@550	1815
adamc@550	1816 Inside $\mt{UPDATE}$ and $\mt{DELETE}$ commands, lone variables $X$ are interpreted as references to columns of the implicit table $\mt{T}$, rather than to named expressions.
adamc@549	1817
adamc@551	1818 \subsection{XML}
adamc@551	1819
adamc@551	1820 XML fragments $L$ are added to the rules for expressions $e$.
adamc@551	1821
adamc@551	1822 $$\begin{array}{rrcll}
adamc@551	1823 \textrm{XML fragments} & L &::=& \texttt{<xml/>} \mid \texttt{<xml>}l^*\texttt{</xml>} \\
adamc@551	1824 \textrm{XML pieces} & l &::=& \textrm{text} & \textrm{cdata} \\
adamc@551	1825 &&& \texttt{<}g\texttt{/>} & \textrm{tag with no children} \\
adamc@551	1826 &&& \texttt{<}g\texttt{>}l^*\texttt{</}x\texttt{>} & \textrm{tag with children} \\
adamc@559	1827 &&& \{e\} & \textrm{computed XML fragment} \\
adamc@559	1828 &&& \{[e]\} & \textrm{injection of an Ur expression, via the $\mt{Top}.\mt{txt}$ function} \\
adamc@551	1829 \textrm{Tag} & g &::=& h \; (x = v)^* \\
adamc@551	1830 \textrm{Tag head} & h &::=& x & \textrm{tag name} \\
adamc@551	1831 &&& h\{c\} & \textrm{constructor parameter} \\
adamc@551	1832 \textrm{Attribute value} & v &::=& \ell & \textrm{literal value} \\
adamc@551	1833 &&& \{e\} & \textrm{computed value} \\
adamc@551	1834 \end{array}$$
adamc@551	1835
adamc@552	1836
adamc@553	1837 \section{The Structure of Web Applications}
adamc@553	1838
adamc@553	1839 A web application is built from a series of modules, with one module, the last one appearing in the \texttt{.urp} file, designated as the main module. The signature of the main module determines the URL entry points to the application. Such an entry point should have type $\mt{unit} \to \mt{transaction} \; \mt{page}$, where $\mt{page}$ is a type synonym for top-level HTML pages, defined in $\mt{Basis}$. If such a function is at the top level of main module $M$, it will be accessible at URI \texttt{/M/f}, and so on for more deeply-nested functions, as described in Section \ref{tag} below.
adamc@553	1840
adamc@553	1841 When the standalone web server receives a request for a known page, it calls the function for that page, ``running'' the resulting transaction to produce the page to return to the client. Pages link to other pages with the \texttt{link} attribute of the \texttt{a} HTML tag. A link has type $\mt{transaction} \; \mt{page}$, and the semantics of a link are that this transaction should be run to compute the result page, when the link is followed. Link targets are assigned URL names in the same way as top-level entry points.
adamc@553	1842
adamc@553	1843 HTML forms are handled in a similar way. The $\mt{action}$ attribute of a $\mt{submit}$ form tag takes a value of type $\$\mt{use} \to \mt{transaction} \; \mt{page}$, where $\mt{use}$ is a kind-$\{\mt{Type}\}$ record of the form fields used by this action handler. Action handlers are assigned URL patterns in the same way as above.
adamc@553	1844
adamc@558	1845 For both links and actions, direct arguments and local variables mentioned implicitly via closures are automatically included in serialized form in URLs, in the order in which they appear in the source code.
adamc@553	1846
adamc@660	1847 Ur/Web programs generally mix server- and client-side code in a fairly transparent way. The one important restriction is that mixed client-server code must encapsulate all server-side pieces within named functions. This is because execution of such pieces will be implemented by explicit calls to the remote web server, and it is useful to get the programmer's help in designing the interface to be used. For example, this makes it easier to allow a client running an old version of an application to continue interacting with a server that has been upgraded to a new version, if the programmer took care to keep the interfaces of all of the old remote calls the same. The functions implementing these services are assigned names in the same way as normal web entry points, by using module structure.
adamc@660	1848
adamc@553	1849
adamc@552	1850 \section{Compiler Phases}
adamc@552	1851
adamc@552	1852 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
adamc@552	1853
adamc@552	1854 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.
adamc@552	1855
adamc@552	1856 \subsection{Parse}
adamc@552	1857
adamc@552	1858 The compiler reads a \texttt{.urp} file, figures out which \texttt{.urs} and \texttt{.ur} files it references, and combines them all into what is conceptually a single sequence of declarations in the core language of Section \ref{core}.
adamc@552	1859
adamc@552	1860 \subsection{Elaborate}
adamc@552	1861
adamc@552	1862 This is where type inference takes place, translating programs into an explicit form with no more wildcards. This phase is the most likely source of compiler error messages.
adamc@552	1863
adamc@552	1864 \subsection{Unnest}
adamc@552	1865
adamc@552	1866 Named local function definitions are moved to the top level, to avoid the need to generate closures.
adamc@552	1867
adamc@552	1868 \subsection{Corify}
adamc@552	1869
adamc@552	1870 Module system features are compiled away, through inlining of functor definitions at application sites. Afterward, most abstraction boundaries are broken, facilitating optimization.
adamc@552	1871
adamc@552	1872 \subsection{Especialize}
adamc@552	1873
adamc@552	1874 Functions are specialized to particular argument patterns. This is an important trick for avoiding the need to maintain any closures at runtime.
adamc@552	1875
adamc@552	1876 \subsection{Untangle}
adamc@552	1877
adamc@552	1878 Remove unnecessary mutual recursion, splitting recursive groups into strongly-connected components.
adamc@552	1879
adamc@552	1880 \subsection{Shake}
adamc@552	1881
adamc@552	1882 Remove all definitions not needed to run the page handlers that are visible in the signature of the last module listed in the \texttt{.urp} file.
adamc@552	1883
adamc@661	1884 \subsection{Rpcify}
adamc@661	1885
adamc@661	1886 Pieces of code are determined to be client-side, server-side, neither, or both, by figuring out which standard library functions might be needed to execute them. Calls to server-side functions (e.g., $\mt{query}$) within mixed client-server code are identified and replaced with explicit remote calls. Some mixed functions may be converted to continuation-passing style to facilitate this transformation.
adamc@661	1887
adamc@661	1888 \subsection{Untangle, Shake}
adamc@661	1889
adamc@661	1890 Repeat these simplifications.
adamc@661	1891
adamc@553	1892 \subsection{\label{tag}Tag}
adamc@552	1893
adamc@552	1894 Assign a URL name to each link and form action. It is important that these links and actions are written as applications of named functions, because such names are used to generate URL patterns. A URL pattern has a name built from the full module path of the named function, followed by the function name, with all pieces separated by slashes. The path of a functor application is based on the name given to the result, rather than the path of the functor itself.
adamc@552	1895
adamc@552	1896 \subsection{Reduce}
adamc@552	1897
adamc@552	1898 Apply definitional equality rules to simplify the program as much as possible. This effectively includes inlining of every non-recursive definition.
adamc@552	1899
adamc@552	1900 \subsection{Unpoly}
adamc@552	1901
adamc@552	1902 This phase specializes polymorphic functions to the specific arguments passed to them in the program. If the program contains real polymorphic recursion, Unpoly will be insufficient to avoid later error messages about too much polymorphism.
adamc@552	1903
adamc@552	1904 \subsection{Specialize}
adamc@552	1905
adamc@558	1906 Replace uses of parameterized datatypes with versions specialized to specific parameters. As for Unpoly, this phase will not be effective enough in the presence of polymorphic recursion or other fancy uses of impredicative polymorphism.
adamc@552	1907
adamc@552	1908 \subsection{Shake}
adamc@552	1909
adamc@558	1910 Here the compiler repeats the earlier Shake phase.
adamc@552	1911
adamc@552	1912 \subsection{Monoize}
adamc@552	1913
adamc@552	1914 Programs are translated to a new intermediate language without polymorphism or non-$\mt{Type}$ constructors. Error messages may pop up here if earlier phases failed to remove such features.
adamc@552	1915
adamc@552	1916 This is the stage at which concrete names are generated for cookies, tables, and sequences. They are named following the same convention as for links and actions, based on module path information saved from earlier stages. Table and sequence names separate path elements with underscores instead of slashes, and they are prefixed by \texttt{uw\_}.
adamc@664	1917
adamc@552	1918 \subsection{MonoOpt}
adamc@552	1919
adamc@552	1920 Simple algebraic laws are applied to simplify the program, focusing especially on efficient imperative generation of HTML pages.
adamc@552	1921
adamc@552	1922 \subsection{MonoUntangle}
adamc@552	1923
adamc@552	1924 Unnecessary mutual recursion is broken up again.
adamc@552	1925
adamc@552	1926 \subsection{MonoReduce}
adamc@552	1927
adamc@552	1928 Equivalents of the definitional equality rules are applied to simplify programs, with inlining again playing a major role.
adamc@552	1929
adamc@552	1930 \subsection{MonoShake, MonoOpt}
adamc@552	1931
adamc@552	1932 Unneeded declarations are removed, and basic optimizations are repeated.
adamc@552	1933
adamc@552	1934 \subsection{Fuse}
adamc@552	1935
adamc@552	1936 The compiler tries to simplify calls to recursive functions whose results are immediately written as page output. The write action is pushed inside the function definitions to avoid allocation of intermediate results.
adamc@552	1937
adamc@552	1938 \subsection{MonoUntangle, MonoShake}
adamc@552	1939
adamc@552	1940 Fuse often creates more opportunities to remove spurious mutual recursion.
adamc@552	1941
adamc@552	1942 \subsection{Pathcheck}
adamc@552	1943
adamc@552	1944 The compiler checks that no link or action name has been used more than once.
adamc@552	1945
adamc@552	1946 \subsection{Cjrize}
adamc@552	1947
adamc@552	1948 The program is translated to what is more or less a subset of C. If any use of functions as data remains at this point, the compiler will complain.
adamc@552	1949
adamc@552	1950 \subsection{C Compilation and Linking}
adamc@552	1951
adamc@552	1952 The output of the last phase is pretty-printed as C source code and passed to GCC.
adamc@552	1953
adamc@552	1954
adamc@524	1955 \end{document}

Mercurial > urweb

annotate doc/manual.tex @ 788:f5fea13a60b9