Ur/Web: doc/manual.tex annotate

annotate doc/manual.tex @ 756:8ce31c052dce

Subforms

author	Adam Chlipala <adamc@hcoop.net>
date	Tue, 28 Apr 2009 17:26:53 -0400
parents	f963356b53fd
children	ec0a0dd0ca12

rev	line source
adamc@524	1 \documentclass{article}
adamc@554	2 \usepackage{fullpage,amsmath,amssymb,proof,url}
adamc@524	3
adamc@524	4 \newcommand{\cd}[1]{\texttt{#1}}
adamc@524	5 \newcommand{\mt}[1]{\mathsf{#1}}
adamc@524	6
adamc@524	7 \newcommand{\rc}{+ \hspace{-.075in} + \;}
adamc@527	8 \newcommand{\rcut}{\; \texttt{--} \;}
adamc@527	9 \newcommand{\rcutM}{\; \texttt{---} \;}
adamc@524	10
adamc@524	11 \begin{document}
adamc@524	12
adamc@524	13 \title{The Ur/Web Manual}
adamc@524	14 \author{Adam Chlipala}
adamc@524	15
adamc@524	16 \maketitle
adamc@524	17
adamc@540	18 \tableofcontents
adamc@540	19
adamc@554	20
adamc@554	21 \section{Introduction}
adamc@554	22
adamc@554	23 \emph{Ur} is a programming language designed to introduce richer type system features into functional programming in the tradition of ML and Haskell. Ur is functional, pure, statically-typed, and strict. Ur supports a powerful kind of \emph{metaprogramming} based on \emph{row types}.
adamc@554	24
adamc@554	25 \emph{Ur/Web} is Ur plus a special standard library and associated rules for parsing and optimization. Ur/Web supports construction of dynamic web applications backed by SQL databases. The signature of the standard library is such that well-typed Ur/Web programs ``don't go wrong'' in a very broad sense. Not only do they not crash during particular page generations, but they also may not:
adamc@554	26
adamc@554	27 \begin{itemize}
adamc@554	28 \item Suffer from any kinds of code-injection attacks
adamc@554	29 \item Return invalid HTML
adamc@554	30 \item Contain dead intra-application links
adamc@554	31 \item Have mismatches between HTML forms and the fields expected by their handlers
adamc@652	32 \item Include client-side code that makes incorrect assumptions about the ``AJAX''-style services that the remote web server provides
adamc@554	33 \item Attempt invalid SQL queries
adamc@652	34 \item Use improper marshaling or unmarshaling in communication with SQL databases or between browsers and web servers
adamc@554	35 \end{itemize}
adamc@554	36
adamc@554	37 This type safety is just the foundation of the Ur/Web methodology. It is also possible to use metaprogramming to build significant application pieces by analysis of type structure. For instance, the demo includes an ML-style functor for building an admin interface for an arbitrary SQL table. The type system guarantees that the admin interface sub-application that comes out will always be free of the above-listed bugs, no matter which well-typed table description is given as input.
adamc@554	38
adamc@652	39 The Ur/Web compiler also produces very efficient object code that does not use garbage collection. These compiled programs will often be even more efficient than what most programmers would bother to write in C. The compiler also generates JavaScript versions of client-side code, with no need to write those parts of applications in a different language.
adamc@554	40
adamc@554	41 \medskip
adamc@554	42
adamc@554	43 The official web site for Ur is:
adamc@554	44 \begin{center}
adamc@554	45 \url{http://www.impredicative.com/ur/}
adamc@554	46 \end{center}
adamc@554	47
adamc@555	48
adamc@555	49 \section{Installation}
adamc@555	50
adamc@555	51 If you are lucky, then the following standard command sequence will suffice for installation, in a directory to which you have unpacked the latest distribution tarball.
adamc@555	52
adamc@555	53 \begin{verbatim}
adamc@555	54 ./configure
adamc@555	55 make
adamc@555	56 sudo make install
adamc@555	57 \end{verbatim}
adamc@555	58
adamc@555	59 Some other packages must be installed for the above to work. At a minimum, you need a standard UNIX shell, with standard UNIX tools like sed and GCC in your execution path; and MLton, the whole-program optimizing compiler for Standard ML. To build programs that access SQL databases, you also need libpq, the PostgreSQL client library. As of this writing, in the ``testing'' version of Debian Linux, this command will install the more uncommon of these dependencies:
adamc@555	60
adamc@555	61 \begin{verbatim}
adamc@555	62 apt-get install mlton libpq-dev
adamc@555	63 \end{verbatim}
adamc@555	64
adamc@555	65 It is also possible to access the modules of the Ur/Web compiler interactively, within Standard ML of New Jersey. To install the prerequisites in Debian testing:
adamc@555	66
adamc@555	67 \begin{verbatim}
adamc@555	68 apt-get install smlnj libsmlnj-smlnj ml-yacc ml-lpt
adamc@555	69 \end{verbatim}
adamc@555	70
adamc@555	71 To begin an interactive session with the Ur compiler modules, run \texttt{make smlnj}, and then, from within an \texttt{sml} session, run \texttt{CM.make "src/urweb.cm";}. The \texttt{Compiler} module is the main entry point.
adamc@555	72
adamc@555	73 To run an SQL-backed application, you will probably want to install the PostgreSQL server. Version 8.3 or higher is required.
adamc@555	74
adamc@555	75 \begin{verbatim}
adamc@555	76 apt-get install postgresql-8.3
adamc@555	77 \end{verbatim}
adamc@555	78
adamc@555	79 To use the Emacs mode, you must have a modern Emacs installed. We assume that you already know how to do this, if you're in the business of looking for an Emacs mode. The demo generation facility of the compiler will also call out to Emacs to syntax-highlight code, and that process depends on the \texttt{htmlize} module, which can be installed in Debian testing via:
adamc@555	80
adamc@555	81 \begin{verbatim}
adamc@555	82 apt-get install emacs-goodies-el
adamc@555	83 \end{verbatim}
adamc@555	84
adamc@555	85 Even with the right packages installed, configuration and building might fail to work. After you run \texttt{./configure}, you will see the values of some named environment variables printed. You may need to adjust these values to get proper installation for your system. To change a value, store your preferred alternative in the corresponding UNIX environment variable, before running \texttt{./configure}. For instance, here is how to change the list of extra arguments that the Ur/Web compiler will pass to GCC on every invocation.
adamc@555	86
adamc@555	87 \begin{verbatim}
adamc@555	88 GCCARGS=-fnested-functions ./configure
adamc@555	89 \end{verbatim}
adamc@555	90
adamc@555	91 Some OSX users have reported needing to use this particular GCCARGS value.
adamc@555	92
adamc@555	93 The Emacs mode can be set to autoload by adding the following to your \texttt{.emacs} file.
adamc@555	94
adamc@555	95 \begin{verbatim}
adamc@555	96 (add-to-list 'load-path "/usr/local/share/emacs/site-lisp/urweb-mode")
adamc@555	97 (load "urweb-mode-startup")
adamc@555	98 \end{verbatim}
adamc@555	99
adamc@555	100 Change the path in the first line if you chose a different Emacs installation path during configuration.
adamc@555	101
adamc@555	102
adamc@556	103 \section{Command-Line Compiler}
adamc@556	104
adamc@556	105 \subsection{Project Files}
adamc@556	106
adamc@556	107 The basic inputs to the \texttt{urweb} compiler are project files, which have the extension \texttt{.urp}. Here is a sample \texttt{.urp} file.
adamc@556	108
adamc@556	109 \begin{verbatim}
adamc@556	110 database dbname=test
adamc@556	111 sql crud1.sql
adamc@556	112
adamc@556	113 crud
adamc@556	114 crud1
adamc@556	115 \end{verbatim}
adamc@556	116
adamc@556	117 The \texttt{database} line gives the database information string to pass to libpq. In this case, the string only says to connect to a local database named \texttt{test}.
adamc@556	118
adamc@556	119 The \texttt{sql} line asks for an SQL source file to be generated, giving the commands to run to create the tables and sequences that this application expects to find. After building this \texttt{.urp} file, the following commands could be used to initialize the database, assuming that the current UNIX user exists as a Postgres user with database creation privileges:
adamc@556	120
adamc@556	121 \begin{verbatim}
adamc@556	122 createdb test
adamc@556	123 psql -f crud1.sql test
adamc@556	124 \end{verbatim}
adamc@556	125
adamc@556	126 A blank line always separates the named directives from a list of modules to include in the project; if there are no named directives, a blank line must begin the file.
adamc@556	127
adamc@556	128 For each entry \texttt{M} in the module list, the file \texttt{M.urs} is included in the project if it exists, and the file \texttt{M.ur} must exist and is always included.
adamc@556	129
adamc@556	130 A few other named directives are supported. \texttt{prefix PREFIX} sets the prefix included before every URI within the generated application; the default is \texttt{/}. \texttt{exe FILENAME} sets the filename to which to write the output executable; the default for file \texttt{P.urp} is \texttt{P.exe}. \texttt{debug} saves some intermediate C files, which is mostly useful to help in debugging the compiler itself. \texttt{profile} generates an executable that may be used with gprof.
adamc@556	131
adamc@701	132 \texttt{timeout N} sets to \texttt{N} seconds the amount of time that the generated server will wait after the last contact from a client before determining that that client has exited the application. Clients that remain active will take the timeout setting into account in determining how often to ping the server, so it only makes sense to set a high timeout to cope with browser and network delays and failures. Higher timeouts can lead to more unnecessary client information taking up memory on the server. The timeout goes unused by any page that doesn't involve the \texttt{recv} function, since the server only needs to store per-client information for clients that receive asynchronous messages.
adamc@701	133
adamc@701	134
adamc@557	135 \subsection{Building an Application}
adamc@557	136
adamc@557	137 To compile project \texttt{P.urp}, simply run
adamc@557	138 \begin{verbatim}
adamc@557	139 urweb P
adamc@557	140 \end{verbatim}
adamc@558	141 The output executable is a standalone web server. Run it with the command-line argument \texttt{-h} to see which options it takes. If the project file lists a database, the web server will attempt to connect to that database on startup.
adamc@557	142
adamc@557	143 To time how long the different compiler phases run, without generating an executable, run
adamc@557	144 \begin{verbatim}
adamc@557	145 urweb -timing P
adamc@557	146 \end{verbatim}
adamc@557	147
adamc@556	148
adamc@529	149 \section{Ur Syntax}
adamc@529	150
adamc@529	151 In this section, we describe the syntax of Ur, deferring to a later section discussion of most of the syntax specific to SQL and XML. The sole exceptions are the declaration forms for tables, sequences, and cookies.
adamc@524	152
adamc@524	153 \subsection{Lexical Conventions}
adamc@524	154
adamc@524	155 We give the Ur language definition in \LaTeX $\;$ math mode, since that is prettier than monospaced ASCII. The corresponding ASCII syntax can be read off directly. Here is the key for mapping math symbols to ASCII character sequences.
adamc@524	156
adamc@524	157 \begin{center}
adamc@524	158 \begin{tabular}{rl}
adamc@524	159 \textbf{\LaTeX} & \textbf{ASCII} \\
adamc@524	160 $\to$ & \cd{->} \\
adamc@652	161 $\longrightarrow$ & \cd{-->} \\
adamc@524	162 $\times$ & \cd{*} \\
adamc@524	163 $\lambda$ & \cd{fn} \\
adamc@524	164 $\Rightarrow$ & \cd{=>} \\
adamc@652	165 $\Longrightarrow$ & \cd{==>} \\
adamc@529	166 $\neq$ & \cd{<>} \\
adamc@529	167 $\leq$ & \cd{<=} \\
adamc@529	168 $\geq$ & \cd{>=} \\
adamc@524	169 \\
adamc@524	170 $x$ & Normal textual identifier, not beginning with an uppercase letter \\
adamc@525	171 $X$ & Normal textual identifier, beginning with an uppercase letter \\
adamc@524	172 \end{tabular}
adamc@524	173 \end{center}
adamc@524	174
adamc@525	175 We often write syntax like $e^$ to indicate zero or more copies of $e$, $e^+$ to indicate one or more copies, and $e,^$ and $e,^+$ to indicate multiple copies separated by commas. Another separator may be used in place of a comma. The $e$ term may be surrounded by parentheses to indicate grouping; those parentheses should not be included in the actual ASCII.
adamc@524	176
adamc@526	177 We write $\ell$ for literals of the primitive types, for the most part following C conventions. There are $\mt{int}$, $\mt{float}$, and $\mt{string}$ literals.
adamc@526	178
adamc@527	179 This version of the manual doesn't include operator precedences; see \texttt{src/urweb.grm} for that.
adamc@527	180
adamc@552	181 \subsection{\label{core}Core Syntax}
adamc@524	182
adamc@524	183 \emph{Kinds} classify types and other compile-time-only entities. Each kind in the grammar is listed with a description of the sort of data it classifies.
adamc@524	184 $$\begin{array}{rrcll}
adamc@524	185 \textrm{Kinds} & \kappa &::=& \mt{Type} & \textrm{proper types} \\
adamc@525	186 &&& \mt{Unit} & \textrm{the trivial constructor} \\
adamc@525	187 &&& \mt{Name} & \textrm{field names} \\
adamc@525	188 &&& \kappa \to \kappa & \textrm{type-level functions} \\
adamc@525	189 &&& \{\kappa\} & \textrm{type-level records} \\
adamc@525	190 &&& (\kappa\times^+) & \textrm{type-level tuples} \\
adamc@652	191 &&& X & \textrm{variable} \\
adamc@652	192 &&& X \longrightarrow k & \textrm{kind-polymorphic type-level function} \\
adamc@529	193 &&& \_\_ & \textrm{wildcard} \\
adamc@525	194 &&& (\kappa) & \textrm{explicit precedence} \\
adamc@524	195 \end{array}$$
adamc@524	196
adamc@524	197 Ur supports several different notions of functions that take types as arguments. These arguments can be either implicit, causing them to be inferred at use sites; or explicit, forcing them to be specified manually at use sites. There is a common explicitness annotation convention applied at the definitions of and in the types of such functions.
adamc@524	198 $$\begin{array}{rrcll}
adamc@524	199 \textrm{Explicitness} & ? &::=& :: & \textrm{explicit} \\
adamc@558	200 &&& ::: & \textrm{implicit}
adamc@524	201 \end{array}$$
adamc@524	202
adamc@524	203 \emph{Constructors} are the main class of compile-time-only data. They include proper types and are classified by kinds.
adamc@524	204 $$\begin{array}{rrcll}
adamc@524	205 \textrm{Constructors} & c, \tau &::=& (c) :: \kappa & \textrm{kind annotation} \\
adamc@530	206 &&& \hat{x} & \textrm{constructor variable} \\
adamc@524	207 \\
adamc@525	208 &&& \tau \to \tau & \textrm{function type} \\
adamc@525	209 &&& x \; ? \; \kappa \to \tau & \textrm{polymorphic function type} \\
adamc@652	210 &&& X \longrightarrow \tau & \textrm{kind-polymorphic function type} \\
adamc@525	211 &&& \$ c & \textrm{record type} \\
adamc@524	212 \\
adamc@525	213 &&& c \; c & \textrm{type-level function application} \\
adamc@530	214 &&& \lambda x \; :: \; \kappa \Rightarrow c & \textrm{type-level function abstraction} \\
adamc@524	215 \\
adamc@652	216 &&& X \Longrightarrow c & \textrm{type-level kind-polymorphic function abstraction} \\
adamc@655	217 &&& c [\kappa] & \textrm{type-level kind-polymorphic function application} \\
adamc@652	218 \\
adamc@525	219 &&& () & \textrm{type-level unit} \\
adamc@525	220 &&& \#X & \textrm{field name} \\
adamc@524	221 \\
adamc@525	222 &&& [(c = c)^*] & \textrm{known-length type-level record} \\
adamc@525	223 &&& c \rc c & \textrm{type-level record concatenation} \\
adamc@652	224 &&& \mt{map} & \textrm{type-level record map} \\
adamc@524	225 \\
adamc@558	226 &&& (c,^+) & \textrm{type-level tuple} \\
adamc@525	227 &&& c.n & \textrm{type-level tuple projection ($n \in \mathbb N^+$)} \\
adamc@524	228 \\
adamc@652	229 &&& [c \sim c] \Rightarrow \tau & \textrm{guarded type} \\
adamc@524	230 \\
adamc@529	231 &&& \_ :: \kappa & \textrm{wildcard} \\
adamc@525	232 &&& (c) & \textrm{explicit precedence} \\
adamc@530	233 \\
adamc@530	234 \textrm{Qualified uncapitalized variables} & \hat{x} &::=& x & \textrm{not from a module} \\
adamc@530	235 &&& M.x & \textrm{projection from a module} \\
adamc@525	236 \end{array}$$
adamc@525	237
adamc@655	238 We include both abstraction and application for kind polymorphism, but applications are only inferred internally; they may not be written explicitly in source programs.
adamc@655	239
adamc@525	240 Modules of the module system are described by \emph{signatures}.
adamc@525	241 $$\begin{array}{rrcll}
adamc@525	242 \textrm{Signatures} & S &::=& \mt{sig} \; s^* \; \mt{end} & \textrm{constant} \\
adamc@525	243 &&& X & \textrm{variable} \\
adamc@525	244 &&& \mt{functor}(X : S) : S & \textrm{functor} \\
adamc@529	245 &&& S \; \mt{where} \; \mt{con} \; x = c & \textrm{concretizing an abstract constructor} \\
adamc@525	246 &&& M.X & \textrm{projection from a module} \\
adamc@525	247 \\
adamc@525	248 \textrm{Signature items} & s &::=& \mt{con} \; x :: \kappa & \textrm{abstract constructor} \\
adamc@525	249 &&& \mt{con} \; x :: \kappa = c & \textrm{concrete constructor} \\
adamc@528	250 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529	251 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@525	252 &&& \mt{val} \; x : \tau & \textrm{value} \\
adamc@525	253 &&& \mt{structure} \; X : S & \textrm{sub-module} \\
adamc@525	254 &&& \mt{signature} \; X = S & \textrm{sub-signature} \\
adamc@525	255 &&& \mt{include} \; S & \textrm{signature inclusion} \\
adamc@525	256 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@654	257 &&& \mt{class} \; x :: \kappa & \textrm{abstract constructor class} \\
adamc@654	258 &&& \mt{class} \; x :: \kappa = c & \textrm{concrete constructor class} \\
adamc@525	259 \\
adamc@525	260 \textrm{Datatype constructors} & dc &::=& X & \textrm{nullary constructor} \\
adamc@525	261 &&& X \; \mt{of} \; \tau & \textrm{unary constructor} \\
adamc@524	262 \end{array}$$
adamc@524	263
adamc@526	264 \emph{Patterns} are used to describe structural conditions on expressions, such that expressions may be tested against patterns, generating assignments to pattern variables if successful.
adamc@526	265 $$\begin{array}{rrcll}
adamc@526	266 \textrm{Patterns} & p &::=& \_ & \textrm{wildcard} \\
adamc@526	267 &&& x & \textrm{variable} \\
adamc@526	268 &&& \ell & \textrm{constant} \\
adamc@526	269 &&& \hat{X} & \textrm{nullary constructor} \\
adamc@526	270 &&& \hat{X} \; p & \textrm{unary constructor} \\
adamc@526	271 &&& \{(x = p,)^*\} & \textrm{rigid record pattern} \\
adamc@526	272 &&& \{(x = p,)^+, \ldots\} & \textrm{flexible record pattern} \\
adamc@527	273 &&& (p) & \textrm{explicit precedence} \\
adamc@526	274 \\
adamc@529	275 \textrm{Qualified capitalized variables} & \hat{X} &::=& X & \textrm{not from a module} \\
adamc@526	276 &&& M.X & \textrm{projection from a module} \\
adamc@526	277 \end{array}$$
adamc@526	278
adamc@527	279 \emph{Expressions} are the main run-time entities, corresponding to both ``expressions'' and ``statements'' in mainstream imperative languages.
adamc@527	280 $$\begin{array}{rrcll}
adamc@527	281 \textrm{Expressions} & e &::=& e : \tau & \textrm{type annotation} \\
adamc@529	282 &&& \hat{x} & \textrm{variable} \\
adamc@529	283 &&& \hat{X} & \textrm{datatype constructor} \\
adamc@527	284 &&& \ell & \textrm{constant} \\
adamc@527	285 \\
adamc@527	286 &&& e \; e & \textrm{function application} \\
adamc@527	287 &&& \lambda x : \tau \Rightarrow e & \textrm{function abstraction} \\
adamc@527	288 &&& e [c] & \textrm{polymorphic function application} \\
adamc@527	289 &&& \lambda x \; ? \; \kappa \Rightarrow e & \textrm{polymorphic function abstraction} \\
adamc@655	290 &&& e [\kappa] & \textrm{kind-polymorphic function application} \\
adamc@652	291 &&& X \Longrightarrow e & \textrm{kind-polymorphic function abstraction} \\
adamc@527	292 \\
adamc@527	293 &&& \{(c = e,)^*\} & \textrm{known-length record} \\
adamc@527	294 &&& e.c & \textrm{record field projection} \\
adamc@527	295 &&& e \rc e & \textrm{record concatenation} \\
adamc@527	296 &&& e \rcut c & \textrm{removal of a single record field} \\
adamc@527	297 &&& e \rcutM c & \textrm{removal of multiple record fields} \\
adamc@527	298 \\
adamc@527	299 &&& \mt{let} \; ed^* \; \mt{in} \; e \; \mt{end} & \textrm{local definitions} \\
adamc@527	300 \\
adamc@527	301 &&& \mt{case} \; e \; \mt{of} \; (p \Rightarrow e\|)^+ & \textrm{pattern matching} \\
adamc@527	302 \\
adamc@654	303 &&& \lambda [c \sim c] \Rightarrow e & \textrm{guarded expression abstraction} \\
adamc@654	304 &&& e \; ! & \textrm{guarded expression application} \\
adamc@527	305 \\
adamc@527	306 &&& \_ & \textrm{wildcard} \\
adamc@527	307 &&& (e) & \textrm{explicit precedence} \\
adamc@527	308 \\
adamc@527	309 \textrm{Local declarations} & ed &::=& \cd{val} \; x : \tau = e & \textrm{non-recursive value} \\
adamc@527	310 &&& \cd{val} \; \cd{rec} \; (x : \tau = e \; \cd{and})^+ & \textrm{mutually-recursive values} \\
adamc@527	311 \end{array}$$
adamc@527	312
adamc@655	313 As with constructors, we include both abstraction and application for kind polymorphism, but applications are only inferred internally.
adamc@655	314
adamc@528	315 \emph{Declarations} primarily bring new symbols into context.
adamc@528	316 $$\begin{array}{rrcll}
adamc@528	317 \textrm{Declarations} & d &::=& \mt{con} \; x :: \kappa = c & \textrm{constructor synonym} \\
adamc@528	318 &&& \mt{datatype} \; x \; x^* = dc\mid^+ & \textrm{algebraic datatype definition} \\
adamc@529	319 &&& \mt{datatype} \; x = \mt{datatype} \; M.x & \textrm{algebraic datatype import} \\
adamc@528	320 &&& \mt{val} \; x : \tau = e & \textrm{value} \\
adamc@528	321 &&& \mt{val} \; \cd{rec} \; (x : \tau = e \; \mt{and})^+ & \textrm{mutually-recursive values} \\
adamc@528	322 &&& \mt{structure} \; X : S = M & \textrm{module definition} \\
adamc@528	323 &&& \mt{signature} \; X = S & \textrm{signature definition} \\
adamc@528	324 &&& \mt{open} \; M & \textrm{module inclusion} \\
adamc@528	325 &&& \mt{constraint} \; c \sim c & \textrm{record disjointness constraint} \\
adamc@528	326 &&& \mt{open} \; \mt{constraints} \; M & \textrm{inclusion of just the constraints from a module} \\
adamc@528	327 &&& \mt{table} \; x : c & \textrm{SQL table} \\
adamc@528	328 &&& \mt{sequence} \; x & \textrm{SQL sequence} \\
adamc@535	329 &&& \mt{cookie} \; x : \tau & \textrm{HTTP cookie} \\
adamc@654	330 &&& \mt{class} \; x :: \kappa = c & \textrm{concrete constructor class} \\
adamc@528	331 \\
adamc@529	332 \textrm{Modules} & M &::=& \mt{struct} \; d^* \; \mt{end} & \textrm{constant} \\
adamc@529	333 &&& X & \textrm{variable} \\
adamc@529	334 &&& M.X & \textrm{projection} \\
adamc@529	335 &&& M(M) & \textrm{functor application} \\
adamc@529	336 &&& \mt{functor}(X : S) : S = M & \textrm{functor abstraction} \\
adamc@528	337 \end{array}$$
adamc@528	338
adamc@528	339 There are two kinds of Ur files. A file named $M\texttt{.ur}$ is an \emph{implementation file}, and it should contain a sequence of declarations $d^$. A file named $M\texttt{.urs}$ is an \emph{interface file}; it must always have a matching $M\texttt{.ur}$ and should contain a sequence of signature items $s^$. When both files are present, the overall effect is the same as a monolithic declaration $\mt{structure} \; M : \mt{sig} \; s^* \; \mt{end} = \mt{struct} \; d^* \; \mt{end}$. When no interface file is included, the overall effect is similar, with a signature for module $M$ being inferred rather than just checked against an interface.
adamc@527	340
adamc@529	341 \subsection{Shorthands}
adamc@529	342
adamc@529	343 There are a variety of derived syntactic forms that elaborate into the core syntax from the last subsection. We will present the additional forms roughly following the order in which we presented the constructs that they elaborate into.
adamc@529	344
adamc@529	345 In many contexts where record fields are expected, like in a projection $e.c$, a constant field may be written as simply $X$, rather than $\#X$.
adamc@529	346
adamc@529	347 A record type may be written $\{(c = c,)^\}$, which elaborates to $\$[(c = c,)^]$.
adamc@529	348
adamc@533	349 The notation $[c_1, \ldots, c_n]$ is shorthand for $[c_1 = (), \ldots, c_n = ()]$.
adamc@533	350
adamc@529	351 A tuple type $(\tau_1, \ldots, \tau_n)$ expands to a record type $\{1 = \tau_1, \ldots, n = \tau_n\}$, with natural numbers as field names. A tuple pattern $(p_1, \ldots, p_n)$ expands to a rigid record pattern $\{1 = p_1, \ldots, n = p_n\}$. Positive natural numbers may be used in most places where field names would be allowed.
adamc@529	352
adamc@654	353 In general, several adjacent $\lambda$ forms may be combined into one, and kind and type annotations may be omitted, in which case they are implicitly included as wildcards. More formally, for constructor-level abstractions, we can define a new non-terminal $b ::= x \mid (x :: \kappa) \mid X$ and allow composite abstractions of the form $\lambda b^+ \Rightarrow c$, elaborating into the obvious sequence of one core $\lambda$ per element of $b^+$.
adamc@529	354
adamc@529	355 For any signature item or declaration that defines some entity to be equal to $A$ with classification annotation $B$ (e.g., $\mt{val} \; x : B = A$), $B$ and the preceding colon (or similar punctuation) may be omitted, in which case it is filled in as a wildcard.
adamc@529	356
adamc@529	357 A signature item or declaration $\mt{type} \; x$ or $\mt{type} \; x = \tau$ is elaborated into $\mt{con} \; x :: \mt{Type}$ or $\mt{con} \; x :: \mt{Type} = \tau$, respectively.
adamc@529	358
adamc@654	359 A signature item or declaration $\mt{class} \; x = \lambda y \Rightarrow c$ may be abbreviated $\mt{class} \; x \; y = c$.
adamc@529	360
adamc@654	361 Handling of implicit and explicit constructor arguments may be tweaked with some prefixes to variable references. An expression $@x$ is a version of $x$ where all implicit constructor arguments have been made explicit. An expression $@@x$ achieves the same effect, additionally halting automatic resolution of type class instances and automatic proving of disjointness constraints. The default is that any prefix of a variable's type consisting only of implicit polymorphism, type class instances, and disjointness obligations is resolved automatically, with the variable treated as having the type that starts after the last implicit element, with suitable unification variables substituted. The same syntax works for variables projected out of modules and for capitalized variables (datatype constructors).
adamc@529	362
adamc@654	363 At the expression level, an analogue is available of the composite $\lambda$ form for constructors. We define the language of binders as $b ::= x \mid (x : \tau) \mid (x \; ? \; \kappa) \mid X \mid [c \sim c]$. A lone variable $x$ as a binder stands for an expression variable of unspecified type.
adamc@529	364
adamc@529	365 A $\mt{val}$ or $\mt{val} \; \mt{rec}$ declaration may include expression binders before the equal sign, following the binder grammar from the last paragraph. Such declarations are elaborated into versions that add additional $\lambda$s to the fronts of the righthand sides, as appropriate. The keyword $\mt{fun}$ is a synonym for $\mt{val} \; \mt{rec}$.
adamc@529	366
adamc@529	367 A signature item $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2$. A declaration $\mt{functor} \; X_1 \; (X_2 : S_1) : S_2 = M$ is elaborated into $\mt{structure} \; X_1 : \mt{functor}(X_2 : S_1) : S_2 = \mt{functor}(X_2 : S_1) : S_2 = M$.
adamc@529	368
adamc@529	369 A declaration $\mt{table} \; x : \{(c = c,)^\}$ is elaborated into $\mt{table} \; x : [(c = c,)^]$
adamc@529	370
adamc@529	371 The syntax $\mt{where} \; \mt{type}$ is an alternate form of $\mt{where} \; \mt{con}$.
adamc@529	372
adamc@529	373 The syntax $\mt{if} \; e \; \mt{then} \; e_1 \; \mt{else} \; e_2$ expands to $\mt{case} \; e \; \mt{of} \; \mt{Basis}.\mt{True} \Rightarrow e_1 \mid \mt{Basis}.\mt{False} \Rightarrow e_2$.
adamc@529	374
adamc@529	375 There are infix operator syntaxes for a number of functions defined in the $\mt{Basis}$ module. There is $=$ for $\mt{eq}$, $\neq$ for $\mt{neq}$, $-$ for $\mt{neg}$ (as a prefix operator) and $\mt{minus}$, $+$ for $\mt{plus}$, $\times$ for $\mt{times}$, $/$ for $\mt{div}$, $\%$ for $\mt{mod}$, $<$ for $\mt{lt}$, $\leq$ for $\mt{le}$, $>$ for $\mt{gt}$, and $\geq$ for $\mt{ge}$.
adamc@529	376
adamc@529	377 A signature item $\mt{table} \; x : c$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{sql\_table} \; c$. $\mt{sequence} \; x$ is short for $\mt{val} \; x : \mt{Basis}.\mt{sql\_sequence}$, and $\mt{cookie} \; x : \tau$ is shorthand for $\mt{val} \; x : \mt{Basis}.\mt{http\_cookie} \; \tau$.
adamc@529	378
adamc@530	379
adamc@530	380 \section{Static Semantics}
adamc@530	381
adamc@530	382 In this section, we give a declarative presentation of Ur's typing rules and related judgments. Inference is the subject of the next section; here, we assume that an oracle has filled in all wildcards with concrete values.
adamc@530	383
adamc@530	384 Since there is significant mutual recursion among the judgments, we introduce them all before beginning to give rules. We use the same variety of contexts throughout this section, implicitly introducing new sorts of context entries as needed.
adamc@530	385 \begin{itemize}
adamc@655	386 \item $\Gamma \vdash \kappa$ expresses kind well-formedness.
adamc@530	387 \item $\Gamma \vdash c :: \kappa$ assigns a kind to a constructor in a context.
adamc@530	388 \item $\Gamma \vdash c \sim c$ proves the disjointness of two record constructors; that is, that they share no field names. We overload the judgment to apply to pairs of field names as well.
adamc@531	389 \item $\Gamma \vdash c \hookrightarrow C$ proves that record constructor $c$ decomposes into set $C$ of field names and record constructors.
adamc@530	390 \item $\Gamma \vdash c \equiv c$ proves the computational equivalence of two constructors. This is often called a \emph{definitional equality} in the world of type theory.
adamc@530	391 \item $\Gamma \vdash e : \tau$ is a standard typing judgment.
adamc@534	392 \item $\Gamma \vdash p \leadsto \Gamma; \tau$ combines typing of patterns with calculation of which new variables they bind.
adamc@537	393 \item $\Gamma \vdash d \leadsto \Gamma$ expresses how a declaration modifies a context. We overload this judgment to apply to sequences of declarations, as well as to signature items and sequences of signature items.
adamc@537	394 \item $\Gamma \vdash S \equiv S$ is the signature equivalence judgment.
adamc@536	395 \item $\Gamma \vdash S \leq S$ is the signature compatibility judgment. We write $\Gamma \vdash S$ as shorthand for $\Gamma \vdash S \leq S$.
adamc@530	396 \item $\Gamma \vdash M : S$ is the module signature checking judgment.
adamc@537	397 \item $\mt{proj}(M, \overline{s}, V)$ is a partial function for projecting a signature item from $\overline{s}$, given the module $M$ that we project from. $V$ may be $\mt{con} \; x$, $\mt{datatype} \; x$, $\mt{val} \; x$, $\mt{signature} \; X$, or $\mt{structure} \; X$. The parameter $M$ is needed because the projected signature item may refer to other items from $\overline{s}$.
adamc@539	398 \item $\mt{selfify}(M, \overline{s})$ adds information to signature items $\overline{s}$ to reflect the fact that we are concerned with the particular module $M$. This function is overloaded to work over individual signature items as well.
adamc@530	399 \end{itemize}
adamc@530	400
adamc@655	401
adamc@655	402 \subsection{Kind Well-Formedness}
adamc@655	403
adamc@655	404 $$\infer{\Gamma \vdash \mt{Type}}{}
adamc@655	405 \quad \infer{\Gamma \vdash \mt{Unit}}{}
adamc@655	406 \quad \infer{\Gamma \vdash \mt{Name}}{}
adamc@655	407 \quad \infer{\Gamma \vdash \kappa_1 \to \kappa_2}{
adamc@655	408 \Gamma \vdash \kappa_1
adamc@655	409 & \Gamma \vdash \kappa_2
adamc@655	410 }
adamc@655	411 \quad \infer{\Gamma \vdash \{\kappa\}}{
adamc@655	412 \Gamma \vdash \kappa
adamc@655	413 }
adamc@655	414 \quad \infer{\Gamma \vdash (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@655	415 \forall i: \Gamma \vdash \kappa_i
adamc@655	416 }$$
adamc@655	417
adamc@655	418 $$\infer{\Gamma \vdash X}{
adamc@655	419 X \in \Gamma
adamc@655	420 }
adamc@655	421 \quad \infer{\Gamma \vdash X \longrightarrow \kappa}{
adamc@655	422 \Gamma, X \vdash \kappa
adamc@655	423 }$$
adamc@655	424
adamc@530	425 \subsection{Kinding}
adamc@530	426
adamc@655	427 We write $[X \mapsto \kappa_1]\kappa_2$ for capture-avoiding substitution of $\kappa_1$ for $X$ in $\kappa_2$.
adamc@655	428
adamc@530	429 $$\infer{\Gamma \vdash (c) :: \kappa :: \kappa}{
adamc@530	430 \Gamma \vdash c :: \kappa
adamc@530	431 }
adamc@530	432 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530	433 x :: \kappa \in \Gamma
adamc@530	434 }
adamc@530	435 \quad \infer{\Gamma \vdash x :: \kappa}{
adamc@530	436 x :: \kappa = c \in \Gamma
adamc@530	437 }$$
adamc@530	438
adamc@530	439 $$\infer{\Gamma \vdash M.x :: \kappa}{
adamc@537	440 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	441 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = \kappa
adamc@530	442 }
adamc@530	443 \quad \infer{\Gamma \vdash M.x :: \kappa}{
adamc@537	444 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	445 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@530	446 }$$
adamc@530	447
adamc@530	448 $$\infer{\Gamma \vdash \tau_1 \to \tau_2 :: \mt{Type}}{
adamc@530	449 \Gamma \vdash \tau_1 :: \mt{Type}
adamc@530	450 & \Gamma \vdash \tau_2 :: \mt{Type}
adamc@530	451 }
adamc@530	452 \quad \infer{\Gamma \vdash x \; ? \: \kappa \to \tau :: \mt{Type}}{
adamc@530	453 \Gamma, x :: \kappa \vdash \tau :: \mt{Type}
adamc@530	454 }
adamc@655	455 \quad \infer{\Gamma \vdash X \longrightarrow \tau :: \mt{Type}}{
adamc@655	456 \Gamma, X \vdash \tau :: \mt{Type}
adamc@655	457 }
adamc@530	458 \quad \infer{\Gamma \vdash \$c :: \mt{Type}}{
adamc@530	459 \Gamma \vdash c :: \{\mt{Type}\}
adamc@530	460 }$$
adamc@530	461
adamc@530	462 $$\infer{\Gamma \vdash c_1 \; c_2 :: \kappa_2}{
adamc@530	463 \Gamma \vdash c_1 :: \kappa_1 \to \kappa_2
adamc@530	464 & \Gamma \vdash c_2 :: \kappa_1
adamc@530	465 }
adamc@530	466 \quad \infer{\Gamma \vdash \lambda x \; :: \; \kappa_1 \Rightarrow c :: \kappa_1 \to \kappa_2}{
adamc@530	467 \Gamma, x :: \kappa_1 \vdash c :: \kappa_2
adamc@530	468 }$$
adamc@530	469
adamc@655	470 $$\infer{\Gamma \vdash c[\kappa'] :: [X \mapsto \kappa']\kappa}{
adamc@655	471 \Gamma \vdash c :: X \to \kappa
adamc@655	472 & \Gamma \vdash \kappa'
adamc@655	473 }
adamc@655	474 \quad \infer{\Gamma \vdash X \Longrightarrow c :: X \to \kappa}{
adamc@655	475 \Gamma, X \vdash c :: \kappa
adamc@655	476 }$$
adamc@655	477
adamc@530	478 $$\infer{\Gamma \vdash () :: \mt{Unit}}{}
adamc@530	479 \quad \infer{\Gamma \vdash \#X :: \mt{Name}}{}$$
adamc@530	480
adamc@530	481 $$\infer{\Gamma \vdash [\overline{c_i = c'_i}] :: \{\kappa\}}{
adamc@530	482 \forall i: \Gamma \vdash c_i : \mt{Name}
adamc@530	483 & \Gamma \vdash c'_i :: \kappa
adamc@530	484 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@530	485 }
adamc@530	486 \quad \infer{\Gamma \vdash c_1 \rc c_2 :: \{\kappa\}}{
adamc@530	487 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530	488 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@530	489 & \Gamma \vdash c_1 \sim c_2
adamc@530	490 }$$
adamc@530	491
adamc@655	492 $$\infer{\Gamma \vdash \mt{map} :: (\kappa_1 \to \kappa_2) \to \{\kappa_1\} \to \{\kappa_2\}}{}$$
adamc@530	493
adamc@573	494 $$\infer{\Gamma \vdash (\overline c) :: (\kappa_1 \times \ldots \times \kappa_n)}{
adamc@573	495 \forall i: \Gamma \vdash c_i :: \kappa_i
adamc@530	496 }
adamc@573	497 \quad \infer{\Gamma \vdash c.i :: \kappa_i}{
adamc@573	498 \Gamma \vdash c :: (\kappa_1 \times \ldots \times \kappa_n)
adamc@530	499 }$$
adamc@530	500
adamc@655	501 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow \tau :: \mt{Type}}{
adamc@655	502 \Gamma \vdash c_1 :: \{\kappa\}
adamc@530	503 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@655	504 & \Gamma, c_1 \sim c_2 \vdash \tau :: \mt{Type}
adamc@530	505 }$$
adamc@530	506
adamc@531	507 \subsection{Record Disjointness}
adamc@531	508
adamc@531	509 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@558	510 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@558	511 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@558	512 & \forall c'_1 \in C_1, c'_2 \in C_2: \Gamma \vdash c'_1 \sim c'_2
adamc@531	513 }
adamc@531	514 \quad \infer{\Gamma \vdash X \sim X'}{
adamc@531	515 X \neq X'
adamc@531	516 }$$
adamc@531	517
adamc@531	518 $$\infer{\Gamma \vdash c_1 \sim c_2}{
adamc@531	519 c'_1 \sim c'_2 \in \Gamma
adamc@558	520 & \Gamma \vdash c'_1 \hookrightarrow C_1
adamc@558	521 & \Gamma \vdash c'_2 \hookrightarrow C_2
adamc@558	522 & c_1 \in C_1
adamc@558	523 & c_2 \in C_2
adamc@531	524 }$$
adamc@531	525
adamc@531	526 $$\infer{\Gamma \vdash c \hookrightarrow \{c\}}{}
adamc@531	527 \quad \infer{\Gamma \vdash [\overline{c = c'}] \hookrightarrow \{\overline{c}\}}{}
adamc@531	528 \quad \infer{\Gamma \vdash c_1 \rc c_2 \hookrightarrow C_1 \cup C_2}{
adamc@531	529 \Gamma \vdash c_1 \hookrightarrow C_1
adamc@531	530 & \Gamma \vdash c_2 \hookrightarrow C_2
adamc@531	531 }
adamc@531	532 \quad \infer{\Gamma \vdash c \hookrightarrow C}{
adamc@531	533 \Gamma \vdash c \equiv c'
adamc@531	534 & \Gamma \vdash c' \hookrightarrow C
adamc@531	535 }
adamc@531	536 \quad \infer{\Gamma \vdash \mt{map} \; f \; c \hookrightarrow C}{
adamc@531	537 \Gamma \vdash c \hookrightarrow C
adamc@531	538 }$$
adamc@531	539
adamc@541	540 \subsection{\label{definitional}Definitional Equality}
adamc@532	541
adamc@655	542 We use $\mathcal C$ to stand for a one-hole context that, when filled, yields a constructor. The notation $\mathcal C[c]$ plugs $c$ into $\mathcal C$. We omit the standard definition of one-hole contexts. We write $[x \mapsto c_1]c_2$ for capture-avoiding substitution of $c_1$ for $x$ in $c_2$, with analogous notation for substituting a kind in a constructor.
adamc@532	543
adamc@532	544 $$\infer{\Gamma \vdash c \equiv c}{}
adamc@532	545 \quad \infer{\Gamma \vdash c_1 \equiv c_2}{
adamc@532	546 \Gamma \vdash c_2 \equiv c_1
adamc@532	547 }
adamc@532	548 \quad \infer{\Gamma \vdash c_1 \equiv c_3}{
adamc@532	549 \Gamma \vdash c_1 \equiv c_2
adamc@532	550 & \Gamma \vdash c_2 \equiv c_3
adamc@532	551 }
adamc@532	552 \quad \infer{\Gamma \vdash \mathcal C[c_1] \equiv \mathcal C[c_2]}{
adamc@532	553 \Gamma \vdash c_1 \equiv c_2
adamc@532	554 }$$
adamc@532	555
adamc@532	556 $$\infer{\Gamma \vdash x \equiv c}{
adamc@532	557 x :: \kappa = c \in \Gamma
adamc@532	558 }
adamc@532	559 \quad \infer{\Gamma \vdash M.x \equiv c}{
adamc@537	560 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	561 & \mt{proj}(M, \overline{s}, \mt{con} \; x) = (\kappa, c)
adamc@532	562 }
adamc@532	563 \quad \infer{\Gamma \vdash (\overline c).i \equiv c_i}{}$$
adamc@532	564
adamc@532	565 $$\infer{\Gamma \vdash (\lambda x :: \kappa \Rightarrow c) \; c' \equiv [x \mapsto c'] c}{}
adamc@655	566 \quad \infer{\Gamma \vdash (X \Longrightarrow c) [\kappa] \equiv [X \mapsto \kappa] c}{}$$
adamc@655	567
adamc@655	568 $$\infer{\Gamma \vdash c_1 \rc c_2 \equiv c_2 \rc c_1}{}
adamc@532	569 \quad \infer{\Gamma \vdash c_1 \rc (c_2 \rc c_3) \equiv (c_1 \rc c_2) \rc c_3}{}$$
adamc@532	570
adamc@532	571 $$\infer{\Gamma \vdash [] \rc c \equiv c}{}
adamc@532	572 \quad \infer{\Gamma \vdash [\overline{c_1 = c'_1}] \rc [\overline{c_2 = c'_2}] \equiv [\overline{c_1 = c'_1}, \overline{c_2 = c'_2}]}{}$$
adamc@532	573
adamc@655	574 $$\infer{\Gamma \vdash \mt{map} \; f \; [] \equiv []}{}
adamc@655	575 \quad \infer{\Gamma \vdash \mt{map} \; f \; ([c_1 = c_2] \rc c) \equiv [c_1 = f \; c_2] \rc \mt{map} \; f \; c}{}$$
adamc@532	576
adamc@532	577 $$\infer{\Gamma \vdash \mt{map} \; (\lambda x \Rightarrow x) \; c \equiv c}{}
adamc@655	578 \quad \infer{\Gamma \vdash \mt{map} \; f \; (\mt{map} \; f' \; c)
adamc@655	579 \equiv \mt{map} \; (\lambda x \Rightarrow f \; (f' \; x)) \; c}{}$$
adamc@532	580
adamc@532	581 $$\infer{\Gamma \vdash \mt{map} \; f \; (c_1 \rc c_2) \equiv \mt{map} \; f \; c_1 \rc \mt{map} \; f \; c_2}{}$$
adamc@531	582
adamc@534	583 \subsection{Expression Typing}
adamc@533	584
adamc@533	585 We assume the existence of a function $T$ assigning types to literal constants. It maps integer constants to $\mt{Basis}.\mt{int}$, float constants to $\mt{Basis}.\mt{float}$, and string constants to $\mt{Basis}.\mt{string}$.
adamc@533	586
adamc@533	587 We also refer to a function $\mathcal I$, such that $\mathcal I(\tau)$ ``uses an oracle'' to instantiate all constructor function arguments at the beginning of $\tau$ that are marked implicit; i.e., replace $x_1 ::: \kappa_1 \to \ldots \to x_n ::: \kappa_n \to \tau$ with $[x_1 \mapsto c_1]\ldots[x_n \mapsto c_n]\tau$, where the $c_i$s are inferred and $\tau$ does not start like $x ::: \kappa \to \tau'$.
adamc@533	588
adamc@533	589 $$\infer{\Gamma \vdash e : \tau : \tau}{
adamc@533	590 \Gamma \vdash e : \tau
adamc@533	591 }
adamc@533	592 \quad \infer{\Gamma \vdash e : \tau}{
adamc@533	593 \Gamma \vdash e : \tau'
adamc@533	594 & \Gamma \vdash \tau' \equiv \tau
adamc@533	595 }
adamc@533	596 \quad \infer{\Gamma \vdash \ell : T(\ell)}{}$$
adamc@533	597
adamc@533	598 $$\infer{\Gamma \vdash x : \mathcal I(\tau)}{
adamc@533	599 x : \tau \in \Gamma
adamc@533	600 }
adamc@533	601 \quad \infer{\Gamma \vdash M.x : \mathcal I(\tau)}{
adamc@537	602 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	603 & \mt{proj}(M, \overline{s}, \mt{val} \; x) = \tau
adamc@533	604 }
adamc@533	605 \quad \infer{\Gamma \vdash X : \mathcal I(\tau)}{
adamc@533	606 X : \tau \in \Gamma
adamc@533	607 }
adamc@533	608 \quad \infer{\Gamma \vdash M.X : \mathcal I(\tau)}{
adamc@537	609 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	610 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \tau
adamc@533	611 }$$
adamc@533	612
adamc@533	613 $$\infer{\Gamma \vdash e_1 \; e_2 : \tau_2}{
adamc@533	614 \Gamma \vdash e_1 : \tau_1 \to \tau_2
adamc@533	615 & \Gamma \vdash e_2 : \tau_1
adamc@533	616 }
adamc@533	617 \quad \infer{\Gamma \vdash \lambda x : \tau_1 \Rightarrow e : \tau_1 \to \tau_2}{
adamc@533	618 \Gamma, x : \tau_1 \vdash e : \tau_2
adamc@533	619 }$$
adamc@533	620
adamc@533	621 $$\infer{\Gamma \vdash e [c] : [x \mapsto c]\tau}{
adamc@533	622 \Gamma \vdash e : x :: \kappa \to \tau
adamc@533	623 & \Gamma \vdash c :: \kappa
adamc@533	624 }
adamc@533	625 \quad \infer{\Gamma \vdash \lambda x \; ? \; \kappa \Rightarrow e : x \; ? \; \kappa \to \tau}{
adamc@533	626 \Gamma, x :: \kappa \vdash e : \tau
adamc@533	627 }$$
adamc@533	628
adamc@655	629 $$\infer{\Gamma \vdash e [\kappa] : [X \mapsto \kappa]\tau}{
adamc@655	630 \Gamma \vdash e : X \longrightarrow \tau
adamc@655	631 & \Gamma \vdash \kappa
adamc@655	632 }
adamc@655	633 \quad \infer{\Gamma \vdash X \Longrightarrow e : X \longrightarrow \tau}{
adamc@655	634 \Gamma, X \vdash e : \tau
adamc@655	635 }$$
adamc@655	636
adamc@533	637 $$\infer{\Gamma \vdash \{\overline{c = e}\} : \{\overline{c : \tau}\}}{
adamc@533	638 \forall i: \Gamma \vdash c_i :: \mt{Name}
adamc@533	639 & \Gamma \vdash e_i : \tau_i
adamc@533	640 & \forall i \neq j: \Gamma \vdash c_i \sim c_j
adamc@533	641 }
adamc@533	642 \quad \infer{\Gamma \vdash e.c : \tau}{
adamc@533	643 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533	644 }
adamc@533	645 \quad \infer{\Gamma \vdash e_1 \rc e_2 : \$(c_1 \rc c_2)}{
adamc@533	646 \Gamma \vdash e_1 : \$c_1
adamc@533	647 & \Gamma \vdash e_2 : \$c_2
adamc@573	648 & \Gamma \vdash c_1 \sim c_2
adamc@533	649 }$$
adamc@533	650
adamc@533	651 $$\infer{\Gamma \vdash e \rcut c : \$c'}{
adamc@533	652 \Gamma \vdash e : \$([c = \tau] \rc c')
adamc@533	653 }
adamc@533	654 \quad \infer{\Gamma \vdash e \rcutM c : \$c'}{
adamc@533	655 \Gamma \vdash e : \$(c \rc c')
adamc@533	656 }$$
adamc@533	657
adamc@533	658 $$\infer{\Gamma \vdash \mt{let} \; \overline{ed} \; \mt{in} \; e \; \mt{end} : \tau}{
adamc@533	659 \Gamma \vdash \overline{ed} \leadsto \Gamma'
adamc@533	660 & \Gamma' \vdash e : \tau
adamc@533	661 }
adamc@533	662 \quad \infer{\Gamma \vdash \mt{case} \; e \; \mt{of} \; \overline{p \Rightarrow e} : \tau}{
adamc@533	663 \forall i: \Gamma \vdash p_i \leadsto \Gamma_i, \tau'
adamc@533	664 & \Gamma_i \vdash e_i : \tau
adamc@533	665 }$$
adamc@533	666
adamc@573	667 $$\infer{\Gamma \vdash \lambda [c_1 \sim c_2] \Rightarrow e : \lambda [c_1 \sim c_2] \Rightarrow \tau}{
adamc@533	668 \Gamma \vdash c_1 :: \{\kappa\}
adamc@655	669 & \Gamma \vdash c_2 :: \{\kappa'\}
adamc@533	670 & \Gamma, c_1 \sim c_2 \vdash e : \tau
adamc@662	671 }
adamc@662	672 \quad \infer{\Gamma \vdash e \; ! : \tau}{
adamc@662	673 \Gamma \vdash e : [c_1 \sim c_2] \Rightarrow \tau
adamc@662	674 & \Gamma \vdash c_1 \sim c_2
adamc@533	675 }$$
adamc@533	676
adamc@534	677 \subsection{Pattern Typing}
adamc@534	678
adamc@534	679 $$\infer{\Gamma \vdash \_ \leadsto \Gamma; \tau}{}
adamc@534	680 \quad \infer{\Gamma \vdash x \leadsto \Gamma, x : \tau; \tau}{}
adamc@534	681 \quad \infer{\Gamma \vdash \ell \leadsto \Gamma; T(\ell)}{}$$
adamc@534	682
adamc@534	683 $$\infer{\Gamma \vdash X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534	684 X : \overline{x ::: \mt{Type}} \to \tau \in \Gamma
adamc@534	685 & \textrm{$\tau$ not a function type}
adamc@534	686 }
adamc@534	687 \quad \infer{\Gamma \vdash X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@534	688 X : \overline{x ::: \mt{Type}} \to \tau'' \to \tau \in \Gamma
adamc@534	689 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534	690 }$$
adamc@534	691
adamc@534	692 $$\infer{\Gamma \vdash M.X \leadsto \Gamma; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537	693 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	694 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau
adamc@534	695 & \textrm{$\tau$ not a function type}
adamc@534	696 }$$
adamc@534	697
adamc@534	698 $$\infer{\Gamma \vdash M.X \; p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau}{
adamc@537	699 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	700 & \mt{proj}(M, \overline{s}, \mt{val} \; X) = \overline{x ::: \mt{Type}} \to \tau'' \to \tau
adamc@534	701 & \Gamma \vdash p \leadsto \Gamma'; \overline{[x_i \mapsto \tau'_i]}\tau''
adamc@534	702 }$$
adamc@534	703
adamc@534	704 $$\infer{\Gamma \vdash \{\overline{x = p}\} \leadsto \Gamma_n; \{\overline{x = \tau}\}}{
adamc@534	705 \Gamma_0 = \Gamma
adamc@534	706 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534	707 }
adamc@534	708 \quad \infer{\Gamma \vdash \{\overline{x = p}, \ldots\} \leadsto \Gamma_n; \$([\overline{x = \tau}] \rc c)}{
adamc@534	709 \Gamma_0 = \Gamma
adamc@534	710 & \forall i: \Gamma_i \vdash p_i \leadsto \Gamma_{i+1}; \tau_i
adamc@534	711 }$$
adamc@534	712
adamc@535	713 \subsection{Declaration Typing}
adamc@535	714
adamc@535	715 We use an auxiliary judgment $\overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'$, expressing the enrichment of $\Gamma$ with the types of the datatype constructors $\overline{dc}$, when they are known to belong to datatype $x$ with type parameters $\overline{y}$.
adamc@535	716
adamc@655	717 This is the first judgment where we deal with constructor classes, for the $\mt{class}$ declaration form. We will omit their special handling in this formal specification. Section \ref{typeclasses} gives an informal description of how constructor classes influence type inference.
adamc@535	718
adamc@558	719 We presuppose the existence of a function $\mathcal O$, where $\mathcal O(M, \overline{s})$ implements the $\mt{open}$ declaration by producing a context with the appropriate entry for each available component of module $M$ with signature items $\overline{s}$. Where possible, $\mathcal O$ uses ``transparent'' entries (e.g., an abstract type $M.x$ is mapped to $x :: \mt{Type} = M.x$), so that the relationship with $M$ is maintained. A related function $\mathcal O_c$ builds a context containing the disjointness constraints found in $\overline s$.
adamc@537	720 We write $\kappa_1^n \to \kappa$ as a shorthand, where $\kappa_1^0 \to \kappa = \kappa$ and $\kappa_1^{n+1} \to \kappa_2 = \kappa_1 \to (\kappa_1^n \to \kappa_2)$. We write $\mt{len}(\overline{y})$ for the length of vector $\overline{y}$ of variables.
adamc@535	721
adamc@535	722 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535	723 \quad \infer{\Gamma \vdash d, \overline{d} \leadsto \Gamma''}{
adamc@535	724 \Gamma \vdash d \leadsto \Gamma'
adamc@535	725 & \Gamma' \vdash \overline{d} \leadsto \Gamma''
adamc@535	726 }$$
adamc@535	727
adamc@535	728 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@535	729 \Gamma \vdash c :: \kappa
adamc@535	730 }
adamc@535	731 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@535	732 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@535	733 }$$
adamc@535	734
adamc@535	735 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537	736 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	737 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@535	738 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@535	739 }$$
adamc@535	740
adamc@535	741 $$\infer{\Gamma \vdash \mt{val} \; x : \tau = e \leadsto \Gamma, x : \tau}{
adamc@535	742 \Gamma \vdash e : \tau
adamc@535	743 }$$
adamc@535	744
adamc@535	745 $$\infer{\Gamma \vdash \mt{val} \; \mt{rec} \; \overline{x : \tau = e} \leadsto \Gamma, \overline{x : \tau}}{
adamc@535	746 \forall i: \Gamma, \overline{x : \tau} \vdash e_i : \tau_i
adamc@535	747 & \textrm{$e_i$ starts with an expression $\lambda$, optionally preceded by constructor and disjointness $\lambda$s}
adamc@535	748 }$$
adamc@535	749
adamc@535	750 $$\infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : S}{
adamc@535	751 \Gamma \vdash M : S
adamc@558	752 & \textrm{ $M$ not a constant or application}
adamc@535	753 }
adamc@558	754 \quad \infer{\Gamma \vdash \mt{structure} \; X : S = M \leadsto \Gamma, X : \mt{selfify}(X, \overline{s})}{
adamc@558	755 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@539	756 }$$
adamc@539	757
adamc@539	758 $$\infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@535	759 \Gamma \vdash S
adamc@535	760 }$$
adamc@535	761
adamc@537	762 $$\infer{\Gamma \vdash \mt{open} \; M \leadsto \Gamma, \mathcal O(M, \overline{s})}{
adamc@537	763 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535	764 }$$
adamc@535	765
adamc@535	766 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma}{
adamc@535	767 \Gamma \vdash c_1 :: \{\kappa\}
adamc@535	768 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@535	769 & \Gamma \vdash c_1 \sim c_2
adamc@535	770 }
adamc@537	771 \quad \infer{\Gamma \vdash \mt{open} \; \mt{constraints} \; M \leadsto \Gamma, \mathcal O_c(M, \overline{s})}{
adamc@537	772 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@535	773 }$$
adamc@535	774
adamc@535	775 $$\infer{\Gamma \vdash \mt{table} \; x : c \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_table} \; c}{
adamc@535	776 \Gamma \vdash c :: \{\mt{Type}\}
adamc@535	777 }
adamc@535	778 \quad \infer{\Gamma \vdash \mt{sequence} \; x \leadsto \Gamma, x : \mt{Basis}.\mt{sql\_sequence}}{}$$
adamc@535	779
adamc@535	780 $$\infer{\Gamma \vdash \mt{cookie} \; x : \tau \leadsto \Gamma, x : \mt{Basis}.\mt{http\_cookie} \; \tau}{
adamc@535	781 \Gamma \vdash \tau :: \mt{Type}
adamc@535	782 }$$
adamc@535	783
adamc@655	784 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa \to \mt{Type} = c}{
adamc@655	785 \Gamma \vdash c :: \kappa \to \mt{Type}
adamc@535	786 }$$
adamc@535	787
adamc@535	788 $$\infer{\overline{y}; x; \Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@535	789 \quad \infer{\overline{y}; x; \Gamma \vdash X \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to x \; \overline{y}}{
adamc@535	790 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535	791 }
adamc@535	792 \quad \infer{\overline{y}; x; \Gamma \vdash X \; \mt{of} \; \tau \mid \overline{dc} \leadsto \Gamma', X : \overline{y ::: \mt{Type}} \to \tau \to x \; \overline{y}}{
adamc@535	793 \overline{y}; x; \Gamma \vdash \overline{dc} \leadsto \Gamma'
adamc@535	794 }$$
adamc@535	795
adamc@537	796 \subsection{Signature Item Typing}
adamc@537	797
adamc@537	798 We appeal to a signature item analogue of the $\mathcal O$ function from the last subsection.
adamc@537	799
adamc@537	800 $$\infer{\Gamma \vdash \cdot \leadsto \Gamma}{}
adamc@537	801 \quad \infer{\Gamma \vdash s, \overline{s} \leadsto \Gamma''}{
adamc@537	802 \Gamma \vdash s \leadsto \Gamma'
adamc@537	803 & \Gamma' \vdash \overline{s} \leadsto \Gamma''
adamc@537	804 }$$
adamc@537	805
adamc@537	806 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leadsto \Gamma, x :: \kappa}{}
adamc@537	807 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa = c}{
adamc@537	808 \Gamma \vdash c :: \kappa
adamc@537	809 }
adamc@537	810 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leadsto \Gamma'}{
adamc@537	811 \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} \vdash \overline{dc} \leadsto \Gamma'
adamc@537	812 }$$
adamc@537	813
adamc@537	814 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leadsto \Gamma'}{
adamc@537	815 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	816 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	817 & \overline{y}; x; \Gamma, x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type} = M.z \vdash \overline{dc} \leadsto \Gamma'
adamc@537	818 }$$
adamc@537	819
adamc@537	820 $$\infer{\Gamma \vdash \mt{val} \; x : \tau \leadsto \Gamma, x : \tau}{
adamc@537	821 \Gamma \vdash \tau :: \mt{Type}
adamc@537	822 }$$
adamc@537	823
adamc@537	824 $$\infer{\Gamma \vdash \mt{structure} \; X : S \leadsto \Gamma, X : S}{
adamc@537	825 \Gamma \vdash S
adamc@537	826 }
adamc@537	827 \quad \infer{\Gamma \vdash \mt{signature} \; X = S \leadsto \Gamma, X = S}{
adamc@537	828 \Gamma \vdash S
adamc@537	829 }$$
adamc@537	830
adamc@537	831 $$\infer{\Gamma \vdash \mt{include} \; S \leadsto \Gamma, \mathcal O(\overline{s})}{
adamc@537	832 \Gamma \vdash S
adamc@537	833 & \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	834 }$$
adamc@537	835
adamc@537	836 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leadsto \Gamma, c_1 \sim c_2}{
adamc@537	837 \Gamma \vdash c_1 :: \{\kappa\}
adamc@537	838 & \Gamma \vdash c_2 :: \{\kappa\}
adamc@537	839 }$$
adamc@537	840
adamc@655	841 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leadsto \Gamma, x :: \kappa \to \mt{Type} = c}{
adamc@655	842 \Gamma \vdash c :: \kappa \to \mt{Type}
adamc@537	843 }
adamc@655	844 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa \leadsto \Gamma, x :: \kappa \to \mt{Type}}{}$$
adamc@537	845
adamc@536	846 \subsection{Signature Compatibility}
adamc@536	847
adamc@558	848 To simplify the judgments in this section, we assume that all signatures are alpha-varied as necessary to avoid including multiple bindings for the same identifier. This is in addition to the usual alpha-variation of locally-bound variables.
adamc@537	849
adamc@537	850 We rely on a judgment $\Gamma \vdash \overline{s} \leq s'$, which expresses the occurrence in signature items $\overline{s}$ of an item compatible with $s'$. We also use a judgment $\Gamma \vdash \overline{dc} \leq \overline{dc}$, which expresses compatibility of datatype definitions.
adamc@537	851
adamc@536	852 $$\infer{\Gamma \vdash S \equiv S}{}
adamc@536	853 \quad \infer{\Gamma \vdash S_1 \equiv S_2}{
adamc@536	854 \Gamma \vdash S_2 \equiv S_1
adamc@536	855 }
adamc@536	856 \quad \infer{\Gamma \vdash X \equiv S}{
adamc@536	857 X = S \in \Gamma
adamc@536	858 }
adamc@536	859 \quad \infer{\Gamma \vdash M.X \equiv S}{
adamc@537	860 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	861 & \mt{proj}(M, \overline{s}, \mt{signature} \; X) = S
adamc@536	862 }$$
adamc@536	863
adamc@536	864 $$\infer{\Gamma \vdash S \; \mt{where} \; \mt{con} \; x = c \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa = c \; \overline{s_2} \; \mt{end}}{
adamc@536	865 \Gamma \vdash S \equiv \mt{sig} \; \overline{s^1} \; \mt{con} \; x :: \kappa \; \overline{s_2} \; \mt{end}
adamc@536	866 & \Gamma \vdash c :: \kappa
adamc@537	867 }
adamc@537	868 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s^1} \; \mt{include} \; S \; \overline{s^2} \; \mt{end} \equiv \mt{sig} \; \overline{s^1} \; \overline{s} \; \overline{s^2} \; \mt{end}}{
adamc@537	869 \Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}
adamc@536	870 }$$
adamc@536	871
adamc@536	872 $$\infer{\Gamma \vdash S_1 \leq S_2}{
adamc@536	873 \Gamma \vdash S_1 \equiv S_2
adamc@536	874 }
adamc@536	875 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \mt{end}}{}
adamc@537	876 \quad \infer{\Gamma \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; s' \; \overline{s'} \; \mt{end}}{
adamc@537	877 \Gamma \vdash \overline{s} \leq s'
adamc@537	878 & \Gamma \vdash s' \leadsto \Gamma'
adamc@537	879 & \Gamma' \vdash \mt{sig} \; \overline{s} \; \mt{end} \leq \mt{sig} \; \overline{s'} \; \mt{end}
adamc@537	880 }$$
adamc@537	881
adamc@537	882 $$\infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537	883 \Gamma \vdash s \leq s'
adamc@537	884 }
adamc@537	885 \quad \infer{\Gamma \vdash s \; \overline{s} \leq s'}{
adamc@537	886 \Gamma \vdash s \leadsto \Gamma'
adamc@537	887 & \Gamma' \vdash \overline{s} \leq s'
adamc@536	888 }$$
adamc@536	889
adamc@536	890 $$\infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 \leq \mt{functor} (X : S'_1) : S'_2}{
adamc@536	891 \Gamma \vdash S'_1 \leq S_1
adamc@536	892 & \Gamma, X : S'_1 \vdash S_2 \leq S'_2
adamc@536	893 }$$
adamc@536	894
adamc@537	895 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa \leq \mt{con} \; x :: \kappa}{}
adamc@537	896 \quad \infer{\Gamma \vdash \mt{con} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa}{}
adamc@558	897 \quad \infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(\overline y)} \to \mt{Type}}{}$$
adamc@537	898
adamc@537	899 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{con} \; x :: \mt{Type}^{\mt{len}(y)} \to \mt{Type}}{
adamc@537	900 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	901 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	902 }$$
adamc@537	903
adamc@655	904 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa \leq \mt{con} \; x :: \kappa \to \mt{Type}}{}
adamc@655	905 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leq \mt{con} \; x :: \kappa \to \mt{Type}}{}$$
adamc@537	906
adamc@537	907 $$\infer{\Gamma \vdash \mt{con} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \mt{\kappa} = c_2}{
adamc@537	908 \Gamma \vdash c_1 \equiv c_2
adamc@537	909 }
adamc@655	910 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c_1 \leq \mt{con} \; x :: \kappa \to \mt{Type} = c_2}{
adamc@537	911 \Gamma \vdash c_1 \equiv c_2
adamc@537	912 }$$
adamc@537	913
adamc@537	914 $$\infer{\Gamma \vdash \mt{datatype} \; x \; \overline{y} = \overline{dc} \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537	915 \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537	916 }$$
adamc@537	917
adamc@537	918 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x \; \overline{y} = \overline{dc'}}{
adamc@537	919 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@537	920 & \mt{proj}(M, \overline{s}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})
adamc@537	921 & \Gamma, \overline{y :: \mt{Type}} \vdash \overline{dc} \leq \overline{dc'}
adamc@537	922 }$$
adamc@537	923
adamc@537	924 $$\infer{\Gamma \vdash \cdot \leq \cdot}{}
adamc@537	925 \quad \infer{\Gamma \vdash X; \overline{dc} \leq X; \overline{dc'}}{
adamc@537	926 \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537	927 }
adamc@537	928 \quad \infer{\Gamma \vdash X \; \mt{of} \; \tau_1; \overline{dc} \leq X \; \mt{of} \; \tau_2; \overline{dc'}}{
adamc@537	929 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537	930 & \Gamma \vdash \overline{dc} \leq \overline{dc'}
adamc@537	931 }$$
adamc@537	932
adamc@537	933 $$\infer{\Gamma \vdash \mt{datatype} \; x = \mt{datatype} \; M.z \leq \mt{datatype} \; x = \mt{datatype} \; M'.z'}{
adamc@537	934 \Gamma \vdash M.z \equiv M'.z'
adamc@537	935 }$$
adamc@537	936
adamc@537	937 $$\infer{\Gamma \vdash \mt{val} \; x : \tau_1 \leq \mt{val} \; x : \tau_2}{
adamc@537	938 \Gamma \vdash \tau_1 \equiv \tau_2
adamc@537	939 }
adamc@537	940 \quad \infer{\Gamma \vdash \mt{structure} \; X : S_1 \leq \mt{structure} \; X : S_2}{
adamc@537	941 \Gamma \vdash S_1 \leq S_2
adamc@537	942 }
adamc@537	943 \quad \infer{\Gamma \vdash \mt{signature} \; X = S_1 \leq \mt{signature} \; X = S_2}{
adamc@537	944 \Gamma \vdash S_1 \leq S_2
adamc@537	945 & \Gamma \vdash S_2 \leq S_1
adamc@537	946 }$$
adamc@537	947
adamc@537	948 $$\infer{\Gamma \vdash \mt{constraint} \; c_1 \sim c_2 \leq \mt{constraint} \; c'_1 \sim c'_2}{
adamc@537	949 \Gamma \vdash c_1 \equiv c'_1
adamc@537	950 & \Gamma \vdash c_2 \equiv c'_2
adamc@537	951 }$$
adamc@537	952
adamc@655	953 $$\infer{\Gamma \vdash \mt{class} \; x :: \kappa \leq \mt{class} \; x :: \kappa}{}
adamc@655	954 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c \leq \mt{class} \; x :: \kappa}{}
adamc@655	955 \quad \infer{\Gamma \vdash \mt{class} \; x :: \kappa = c_1 \leq \mt{class} \; x :: \kappa = c_2}{
adamc@537	956 \Gamma \vdash c_1 \equiv c_2
adamc@537	957 }$$
adamc@537	958
adamc@538	959 \subsection{Module Typing}
adamc@538	960
adamc@538	961 We use a helper function $\mt{sigOf}$, which converts declarations and sequences of declarations into their principal signature items and sequences of signature items, respectively.
adamc@538	962
adamc@538	963 $$\infer{\Gamma \vdash M : S}{
adamc@538	964 \Gamma \vdash M : S'
adamc@538	965 & \Gamma \vdash S' \leq S
adamc@538	966 }
adamc@538	967 \quad \infer{\Gamma \vdash \mt{struct} \; \overline{d} \; \mt{end} : \mt{sig} \; \mt{sigOf}(\overline{d}) \; \mt{end}}{
adamc@538	968 \Gamma \vdash \overline{d} \leadsto \Gamma'
adamc@538	969 }
adamc@538	970 \quad \infer{\Gamma \vdash X : S}{
adamc@538	971 X : S \in \Gamma
adamc@538	972 }$$
adamc@538	973
adamc@538	974 $$\infer{\Gamma \vdash M.X : S}{
adamc@538	975 \Gamma \vdash M : \mt{sig} \; \overline{s} \; \mt{end}
adamc@538	976 & \mt{proj}(M, \overline{s}, \mt{structure} \; X) = S
adamc@538	977 }$$
adamc@538	978
adamc@538	979 $$\infer{\Gamma \vdash M_1(M_2) : [X \mapsto M_2]S_2}{
adamc@538	980 \Gamma \vdash M_1 : \mt{functor}(X : S_1) : S_2
adamc@538	981 & \Gamma \vdash M_2 : S_1
adamc@538	982 }
adamc@538	983 \quad \infer{\Gamma \vdash \mt{functor} (X : S_1) : S_2 = M : \mt{functor} (X : S_1) : S_2}{
adamc@538	984 \Gamma \vdash S_1
adamc@538	985 & \Gamma, X : S_1 \vdash S_2
adamc@538	986 & \Gamma, X : S_1 \vdash M : S_2
adamc@538	987 }$$
adamc@538	988
adamc@538	989 \begin{eqnarray*}
adamc@538	990 \mt{sigOf}(\cdot) &=& \cdot \\
adamc@538	991 \mt{sigOf}(s \; \overline{s'}) &=& \mt{sigOf}(s) \; \mt{sigOf}(\overline{s'}) \\
adamc@538	992 \\
adamc@538	993 \mt{sigOf}(\mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@538	994 \mt{sigOf}(\mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \overline{dc} \\
adamc@538	995 \mt{sigOf}(\mt{datatype} \; x = \mt{datatype} \; M.z) &=& \mt{datatype} \; x = \mt{datatype} \; M.z \\
adamc@538	996 \mt{sigOf}(\mt{val} \; x : \tau = e) &=& \mt{val} \; x : \tau \\
adamc@538	997 \mt{sigOf}(\mt{val} \; \mt{rec} \; \overline{x : \tau = e}) &=& \overline{\mt{val} \; x : \tau} \\
adamc@538	998 \mt{sigOf}(\mt{structure} \; X : S = M) &=& \mt{structure} \; X : S \\
adamc@538	999 \mt{sigOf}(\mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@538	1000 \mt{sigOf}(\mt{open} \; M) &=& \mt{include} \; S \textrm{ (where $\Gamma \vdash M : S$)} \\
adamc@538	1001 \mt{sigOf}(\mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@538	1002 \mt{sigOf}(\mt{open} \; \mt{constraints} \; M) &=& \cdot \\
adamc@538	1003 \mt{sigOf}(\mt{table} \; x : c) &=& \mt{table} \; x : c \\
adamc@538	1004 \mt{sigOf}(\mt{sequence} \; x) &=& \mt{sequence} \; x \\
adamc@538	1005 \mt{sigOf}(\mt{cookie} \; x : \tau) &=& \mt{cookie} \; x : \tau \\
adamc@655	1006 \mt{sigOf}(\mt{class} \; x :: \kappa = c) &=& \mt{class} \; x :: \kappa = c \\
adamc@538	1007 \end{eqnarray*}
adamc@539	1008 \begin{eqnarray*}
adamc@539	1009 \mt{selfify}(M, \cdot) &=& \cdot \\
adamc@558	1010 \mt{selfify}(M, s \; \overline{s'}) &=& \mt{selfify}(M, s) \; \mt{selfify}(M, \overline{s'}) \\
adamc@539	1011 \\
adamc@539	1012 \mt{selfify}(M, \mt{con} \; x :: \kappa) &=& \mt{con} \; x :: \kappa = M.x \\
adamc@539	1013 \mt{selfify}(M, \mt{con} \; x :: \kappa = c) &=& \mt{con} \; x :: \kappa = c \\
adamc@539	1014 \mt{selfify}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc}) &=& \mt{datatype} \; x \; \overline{y} = \mt{datatype} \; M.x \\
adamc@539	1015 \mt{selfify}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z) &=& \mt{datatype} \; x = \mt{datatype} \; M'.z \\
adamc@539	1016 \mt{selfify}(M, \mt{val} \; x : \tau) &=& \mt{val} \; x : \tau \\
adamc@539	1017 \mt{selfify}(M, \mt{structure} \; X : S) &=& \mt{structure} \; X : \mt{selfify}(M.X, \overline{s}) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s} \; \mt{end}$)} \\
adamc@539	1018 \mt{selfify}(M, \mt{signature} \; X = S) &=& \mt{signature} \; X = S \\
adamc@539	1019 \mt{selfify}(M, \mt{include} \; S) &=& \mt{include} \; S \\
adamc@539	1020 \mt{selfify}(M, \mt{constraint} \; c_1 \sim c_2) &=& \mt{constraint} \; c_1 \sim c_2 \\
adamc@655	1021 \mt{selfify}(M, \mt{class} \; x :: \kappa) &=& \mt{class} \; x :: \kappa = M.x \\
adamc@655	1022 \mt{selfify}(M, \mt{class} \; x :: \kappa = c) &=& \mt{class} \; x :: \kappa = c \\
adamc@539	1023 \end{eqnarray*}
adamc@539	1024
adamc@540	1025 \subsection{Module Projection}
adamc@540	1026
adamc@540	1027 \begin{eqnarray*}
adamc@540	1028 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \\
adamc@540	1029 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa, c) \\
adamc@540	1030 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{con} \; x) &=& \mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type} \\
adamc@540	1031 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& (\mt{Type}^{\mt{len}(\overline{y})} \to \mt{Type}, M'.z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540	1032 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z) = (\overline{y}, \overline{dc})$)} \\
adamc@655	1033 \mt{proj}(M, \mt{class} \; x :: \kappa \; \overline{s}, \mt{con} \; x) &=& \kappa \to \mt{Type} \\
adamc@655	1034 \mt{proj}(M, \mt{class} \; x :: \kappa = c \; \overline{s}, \mt{con} \; x) &=& (\kappa \to \mt{Type}, c) \\
adamc@540	1035 \\
adamc@540	1036 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{datatype} \; x) &=& (\overline{y}, \overline{dc}) \\
adamc@540	1037 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, \mt{con} \; x) &=& \mt{proj}(M', \overline{s'}, \mt{datatype} \; z) \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540	1038 \\
adamc@540	1039 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, \mt{val} \; x) &=& \tau \\
adamc@540	1040 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $X \in \overline{dc}$)} \\
adamc@540	1041 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540	1042 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@540	1043 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \in \overline{dc}$)} \\
adamc@540	1044 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z, \mt{val} \; X) &=& \overline{y ::: \mt{Type}} \to \tau \to M.x \; \overline y \textrm{ (where $\Gamma \vdash M' : \mt{sig} \; \overline{s'} \; \mt{end}$} \\
adamc@558	1045 && \textrm{and $\mt{proj}(M', \overline{s'}, \mt{datatype} \; z = (\overline{y}, \overline{dc})$ and $X \; \mt{of} \; \tau \in \overline{dc}$)} \\
adamc@540	1046 \\
adamc@540	1047 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, \mt{structure} \; X) &=& S \\
adamc@540	1048 \\
adamc@540	1049 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, \mt{signature} \; X) &=& S \\
adamc@540	1050 \\
adamc@540	1051 \mt{proj}(M, \mt{con} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1052 \mt{proj}(M, \mt{con} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1053 \mt{proj}(M, \mt{datatype} \; x \; \overline{y} = \overline{dc} \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1054 \mt{proj}(M, \mt{datatype} \; x = \mt{datatype} \; M'.z \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1055 \mt{proj}(M, \mt{val} \; x : \tau \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@540	1056 \mt{proj}(M, \mt{structure} \; X : S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1057 \mt{proj}(M, \mt{signature} \; X = S \; \overline{s}, V) &=& [X \mapsto M.X]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1058 \mt{proj}(M, \mt{include} \; S \; \overline{s}, V) &=& \mt{proj}(M, \overline{s'} \; \overline{s}, V) \textrm{ (where $\Gamma \vdash S \equiv \mt{sig} \; \overline{s'} \; \mt{end}$)} \\
adamc@540	1059 \mt{proj}(M, \mt{constraint} \; c_1 \sim c_2 \; \overline{s}, V) &=& \mt{proj}(M, \overline{s}, V) \\
adamc@655	1060 \mt{proj}(M, \mt{class} \; x :: \kappa \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@655	1061 \mt{proj}(M, \mt{class} \; x :: \kappa = c \; \overline{s}, V) &=& [x \mapsto M.x]\mt{proj}(M, \overline{s}, V) \\
adamc@540	1062 \end{eqnarray*}
adamc@540	1063
adamc@541	1064
adamc@541	1065 \section{Type Inference}
adamc@541	1066
adamc@541	1067 The Ur/Web compiler uses \emph{heuristic type inference}, with no claims of completeness with respect to the declarative specification of the last section. The rules in use seem to work well in practice. This section summarizes those rules, to help Ur programmers predict what will work and what won't.
adamc@541	1068
adamc@541	1069 \subsection{Basic Unification}
adamc@541	1070
adamc@560	1071 Type-checkers for languages based on the Hindley-Milner type discipline, like ML and Haskell, take advantage of \emph{principal typing} properties, making complete type inference relatively straightforward. Inference algorithms are traditionally implemented using type unification variables, at various points asserting equalities between types, in the process discovering the values of type variables. The Ur/Web compiler uses the same basic strategy, but the complexity of the type system rules out easy completeness.
adamc@541	1072
adamc@656	1073 Type-checking can require evaluating recursive functional programs, thanks to the type-level $\mt{map}$ operator. When a unification variable appears in such a type, the next step of computation can be undetermined. The value of that variable might be determined later, but this would be ``too late'' for the unification problems generated at the first occurrence. This is the essential source of incompleteness.
adamc@541	1074
adamc@541	1075 Nonetheless, the unification engine tends to do reasonably well. Unlike in ML, polymorphism is never inferred in definitions; it must be indicated explicitly by writing out constructor-level parameters. By writing these and other annotations, the programmer can generally get the type inference engine to do most of the type reconstruction work.
adamc@541	1076
adamc@541	1077 \subsection{Unifying Record Types}
adamc@541	1078
adamc@570	1079 The type inference engine tries to take advantage of the algebraic rules governing type-level records, as shown in Section \ref{definitional}. When two constructors of record kind are unified, they are reduced to normal forms, with like terms crossed off from each normal form until, hopefully, nothing remains. This cannot be complete, with the inclusion of unification variables. The type-checker can help you understand what goes wrong when the process fails, as it outputs the unmatched remainders of the two normal forms.
adamc@541	1080
adamc@656	1081 \subsection{\label{typeclasses}Constructor Classes}
adamc@541	1082
adamc@656	1083 Ur includes a constructor class facility inspired by Haskell's. The current version is very rudimentary, only supporting instances for particular constructors built up from abstract constructors and datatypes and type-level application.
adamc@541	1084
adamc@656	1085 Constructor classes are integrated with the module system. A constructor class of kind $\kappa$ is just a constructor of kind $\kappa \to \mt{Type}$. By marking such a constructor $c$ as a constructor class, the programmer instructs the type inference engine to, in each scope, record all values of types $c \; c'$ as \emph{instances}. Any function argument whose type is of such a form is treated as implicit, to be determined by examining the current instance database.
adamc@541	1086
adamc@656	1087 The ``dictionary encoding'' often used in Haskell implementations is made explicit in Ur. Constructor class instances are just properly-typed values, and they can also be considered as ``proofs'' of membership in the class. In some cases, it is useful to pass these proofs around explicitly. An underscore written where a proof is expected will also be inferred, if possible, from the current instance database.
adamc@541	1088
adamc@656	1089 Just as for constructors, constructors classes may be exported from modules, and they may be exported as concrete or abstract. Concrete constructor classes have their ``real'' definitions exposed, so that client code may add new instances freely. Abstract constructor classes are useful as ``predicates'' that can be used to enforce invariants, as we will see in some definitions of SQL syntax in the Ur/Web standard library.
adamc@541	1090
adamc@541	1091 \subsection{Reverse-Engineering Record Types}
adamc@541	1092
adamc@656	1093 It's useful to write Ur functions and functors that take record constructors as inputs, but these constructors can grow quite long, even though their values are often implied by other arguments. The compiler uses a simple heuristic to infer the values of unification variables that are mapped over, yielding known results. If the result is empty, we're done; if it's not empty, we replace a single unification variable with a new constructor formed from three new unification variables, as in $[\alpha = \beta] \rc \gamma$. This process can often be repeated to determine a unification variable fully.
adamc@541	1094
adamc@541	1095 \subsection{Implicit Arguments in Functor Applications}
adamc@541	1096
adamc@656	1097 Constructor, constraint, and constructor class witness members of structures may be omitted, when those structures are used in contexts where their assigned signatures imply how to fill in those missing members. This feature combines well with reverse-engineering to allow for uses of complicated meta-programming functors with little more code than would be necessary to invoke an untyped, ad-hoc code generator.
adamc@541	1098
adamc@541	1099
adamc@542	1100 \section{The Ur Standard Library}
adamc@542	1101
adamc@542	1102 The built-in parts of the Ur/Web standard library are described by the signature in \texttt{lib/basis.urs} in the distribution. A module $\mt{Basis}$ ascribing to that signature is available in the initial environment, and every program is implicitly prefixed by $\mt{open} \; \mt{Basis}$.
adamc@542	1103
adamc@542	1104 Additionally, other common functions that are definable within Ur are included in \texttt{lib/top.urs} and \texttt{lib/top.ur}. This $\mt{Top}$ module is also opened implicitly.
adamc@542	1105
adamc@542	1106 The idea behind Ur is to serve as the ideal host for embedded domain-specific languages. For now, however, the ``generic'' functionality is intermixed with Ur/Web-specific functionality, including in these two library modules. We hope that these generic library components have types that speak for themselves. The next section introduces the Ur/Web-specific elements. Here, we only give the type declarations from the beginning of $\mt{Basis}$.
adamc@542	1107 $$\begin{array}{l}
adamc@542	1108 \mt{type} \; \mt{int} \\
adamc@542	1109 \mt{type} \; \mt{float} \\
adamc@542	1110 \mt{type} \; \mt{string} \\
adamc@542	1111 \mt{type} \; \mt{time} \\
adamc@542	1112 \\
adamc@542	1113 \mt{type} \; \mt{unit} = \{\} \\
adamc@542	1114 \\
adamc@542	1115 \mt{datatype} \; \mt{bool} = \mt{False} \mid \mt{True} \\
adamc@542	1116 \\
adamc@542	1117 \mt{datatype} \; \mt{option} \; \mt{t} = \mt{None} \mid \mt{Some} \; \mt{of} \; \mt{t}
adamc@542	1118 \end{array}$$
adamc@542	1119
adamc@657	1120 Another important generic Ur element comes at the beginning of \texttt{top.urs}.
adamc@657	1121
adamc@657	1122 $$\begin{array}{l}
adamc@657	1123 \mt{con} \; \mt{folder} :: \mt{K} \longrightarrow \{\mt{K}\} \to \mt{Type} \\
adamc@657	1124 \\
adamc@657	1125 \mt{val} \; \mt{fold} : \mt{K} \longrightarrow \mt{tf} :: (\{\mt{K}\} \to \mt{Type}) \\
adamc@657	1126 \hspace{.1in} \to (\mt{nm} :: \mt{Name} \to \mt{v} :: \mt{K} \to \mt{r} :: \{\mt{K}\} \to [[\mt{nm}] \sim \mt{r}] \Rightarrow \\
adamc@657	1127 \hspace{.2in} \mt{tf} \; \mt{r} \to \mt{tf} \; ([\mt{nm} = \mt{v}] \rc \mt{r})) \\
adamc@657	1128 \hspace{.1in} \to \mt{tf} \; [] \\
adamc@657	1129 \hspace{.1in} \to \mt{r} :: \{\mt{K}\} \to \mt{folder} \; \mt{r} \to \mt{tf} \; \mt{r}
adamc@657	1130 \end{array}$$
adamc@657	1131
adamc@657	1132 For a type-level record $\mt{r}$, a $\mt{folder} \; \mt{r}$ encodes a permutation of $\mt{r}$'s elements. The $\mt{fold}$ function can be called on a $\mt{folder}$ to iterate over the elements of $\mt{r}$ in that order. $\mt{fold}$ is parameterized on a type-level function to be used to calculate the type of each intermediate result of folding. After processing a subset $\mt{r'}$ of $\mt{r}$'s entries, the type of the accumulator should be $\mt{tf} \; \mt{r'}$. The next two expression arguments to $\mt{fold}$ are the usual step function and initial accumulator, familiar from fold functions over lists. The final two arguments are the record to fold over and a $\mt{folder}$ for it.
adamc@657	1133
adamc@664	1134 The Ur compiler treats $\mt{folder}$ like a constructor class, using built-in rules to infer $\mt{folder}$s for records with known structure. The order in which field names are mentioned in source code is used as a hint about the permutation that the programmer would like.
adamc@657	1135
adamc@542	1136
adamc@542	1137 \section{The Ur/Web Standard Library}
adamc@542	1138
adamc@658	1139 \subsection{Monads}
adamc@658	1140
adamc@658	1141 The Ur Basis defines the monad constructor class from Haskell.
adamc@658	1142
adamc@658	1143 $$\begin{array}{l}
adamc@658	1144 \mt{class} \; \mt{monad} :: \mt{Type} \to \mt{Type} \\
adamc@658	1145 \mt{val} \; \mt{return} : \mt{m} ::: (\mt{Type} \to \mt{Type}) \to \mt{t} ::: \mt{Type} \\
adamc@658	1146 \hspace{.1in} \to \mt{monad} \; \mt{m} \\
adamc@658	1147 \hspace{.1in} \to \mt{t} \to \mt{m} \; \mt{t} \\
adamc@658	1148 \mt{val} \; \mt{bind} : \mt{m} ::: (\mt{Type} \to \mt{Type}) \to \mt{t1} ::: \mt{Type} \to \mt{t2} ::: \mt{Type} \\
adamc@658	1149 \hspace{.1in} \to \mt{monad} \; \mt{m} \\
adamc@658	1150 \hspace{.1in} \to \mt{m} \; \mt{t1} \to (\mt{t1} \to \mt{m} \; \mt{t2}) \\
adamc@658	1151 \hspace{.1in} \to \mt{m} \; \mt{t2}
adamc@658	1152 \end{array}$$
adamc@658	1153
adamc@542	1154 \subsection{Transactions}
adamc@542	1155
adamc@542	1156 Ur is a pure language; we use Haskell's trick to support controlled side effects. The standard library defines a monad $\mt{transaction}$, meant to stand for actions that may be undone cleanly. By design, no other kinds of actions are supported.
adamc@542	1157 $$\begin{array}{l}
adamc@542	1158 \mt{con} \; \mt{transaction} :: \mt{Type} \to \mt{Type} \\
adamc@658	1159 \mt{val} \; \mt{transaction\_monad} : \mt{monad} \; \mt{transaction}
adamc@542	1160 \end{array}$$
adamc@542	1161
adamc@542	1162 \subsection{HTTP}
adamc@542	1163
adamc@542	1164 There are transactions for reading an HTTP header by name and for getting and setting strongly-typed cookies. Cookies may only be created by the $\mt{cookie}$ declaration form, ensuring that they be named consistently based on module structure.
adamc@542	1165 $$\begin{array}{l}
adamc@542	1166 \mt{val} \; \mt{requestHeader} : \mt{string} \to \mt{transaction} \; (\mt{option} \; \mt{string}) \\
adamc@542	1167 \\
adamc@542	1168 \mt{con} \; \mt{http\_cookie} :: \mt{Type} \to \mt{Type} \\
adamc@542	1169 \mt{val} \; \mt{getCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{transaction} \; (\mt{option} \; \mt{t}) \\
adamc@542	1170 \mt{val} \; \mt{setCookie} : \mt{t} ::: \mt{Type} \to \mt{http\_cookie} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit}
adamc@542	1171 \end{array}$$
adamc@542	1172
adamc@543	1173 \subsection{SQL}
adamc@543	1174
adamc@543	1175 The fundamental unit of interest in the embedding of SQL is tables, described by a type family and creatable only via the $\mt{table}$ declaration form.
adamc@543	1176 $$\begin{array}{l}
adamc@543	1177 \mt{con} \; \mt{sql\_table} :: \{\mt{Type}\} \to \mt{Type}
adamc@543	1178 \end{array}$$
adamc@543	1179
adamc@543	1180 \subsubsection{Queries}
adamc@543	1181
adamc@543	1182 A final query is constructed via the $\mt{sql\_query}$ function. Constructor arguments respectively specify the table fields we select (as records mapping tables to the subsets of their fields that we choose) and the (always named) extra expressions that we select.
adamc@543	1183 $$\begin{array}{l}
adamc@543	1184 \mt{con} \; \mt{sql\_query} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543	1185 \mt{val} \; \mt{sql\_query} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543	1186 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1187 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543	1188 \hspace{.1in} \to \{\mt{Rows} : \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}, \\
adamc@543	1189 \hspace{.2in} \mt{OrderBy} : \mt{sql\_order\_by} \; \mt{tables} \; \mt{selectedExps}, \\
adamc@543	1190 \hspace{.2in} \mt{Limit} : \mt{sql\_limit}, \\
adamc@543	1191 \hspace{.2in} \mt{Offset} : \mt{sql\_offset}\} \\
adamc@543	1192 \hspace{.1in} \to \mt{sql\_query} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1193 \end{array}$$
adamc@543	1194
adamc@545	1195 Queries are used by folding over their results inside transactions.
adamc@545	1196 $$\begin{array}{l}
adamc@545	1197 \mt{val} \; \mt{query} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \lambda [\mt{tables} \sim \mt{exps}] \Rightarrow \mt{state} ::: \mt{Type} \to \mt{sql\_query} \; \mt{tables} \; \mt{exps} \\
adamc@658	1198 \hspace{.1in} \to (\$(\mt{exps} \rc \mt{map} \; (\lambda \mt{fields} :: \{\mt{Type}\} \Rightarrow \$\mt{fields}) \; \mt{tables}) \\
adamc@545	1199 \hspace{.2in} \to \mt{state} \to \mt{transaction} \; \mt{state}) \\
adamc@545	1200 \hspace{.1in} \to \mt{state} \to \mt{transaction} \; \mt{state}
adamc@545	1201 \end{array}$$
adamc@545	1202
adamc@543	1203 Most of the complexity of the query encoding is in the type $\mt{sql\_query1}$, which includes simple queries and derived queries based on relational operators. Constructor arguments respectively specify the tables we select from, the subset of fields that we keep from each table for the result rows, and the extra expressions that we select.
adamc@543	1204 $$\begin{array}{l}
adamc@543	1205 \mt{con} \; \mt{sql\_query1} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@543	1206 \\
adamc@543	1207 \mt{type} \; \mt{sql\_relop} \\
adamc@543	1208 \mt{val} \; \mt{sql\_union} : \mt{sql\_relop} \\
adamc@543	1209 \mt{val} \; \mt{sql\_intersect} : \mt{sql\_relop} \\
adamc@543	1210 \mt{val} \; \mt{sql\_except} : \mt{sql\_relop} \\
adamc@543	1211 \mt{val} \; \mt{sql\_relop} : \mt{tables1} ::: \{\{\mt{Type}\}\} \\
adamc@543	1212 \hspace{.1in} \to \mt{tables2} ::: \{\{\mt{Type}\}\} \\
adamc@543	1213 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1214 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@543	1215 \hspace{.1in} \to \mt{sql\_relop} \\
adamc@543	1216 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables1} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543	1217 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables2} \; \mt{selectedFields} \; \mt{selectedExps} \\
adamc@543	1218 \hspace{.1in} \to \mt{sql\_query1} \; \mt{selectedFields} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1219 \end{array}$$
adamc@543	1220
adamc@543	1221 $$\begin{array}{l}
adamc@543	1222 \mt{val} \; \mt{sql\_query1} : \mt{tables} ::: \{\{\mt{Type}\}\} \\
adamc@543	1223 \hspace{.1in} \to \mt{grouped} ::: \{\{\mt{Type}\}\} \\
adamc@543	1224 \hspace{.1in} \to \mt{selectedFields} ::: \{\{\mt{Type}\}\} \\
adamc@543	1225 \hspace{.1in} \to \mt{selectedExps} ::: \{\mt{Type}\} \\
adamc@658	1226 \hspace{.1in} \to \{\mt{From} : \$(\mt{map} \; \mt{sql\_table} \; \mt{tables}), \\
adamc@543	1227 \hspace{.2in} \mt{Where} : \mt{sql\_exp} \; \mt{tables} \; [] \; [] \; \mt{bool}, \\
adamc@543	1228 \hspace{.2in} \mt{GroupBy} : \mt{sql\_subset} \; \mt{tables} \; \mt{grouped}, \\
adamc@543	1229 \hspace{.2in} \mt{Having} : \mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; [] \; \mt{bool}, \\
adamc@543	1230 \hspace{.2in} \mt{SelectFields} : \mt{sql\_subset} \; \mt{grouped} \; \mt{selectedFields}, \\
adamc@658	1231 \hspace{.2in} \mt {SelectExps} : \$(\mt{map} \; (\mt{sql\_exp} \; \mt{grouped} \; \mt{tables} \; []) \; \mt{selectedExps}) \} \\
adamc@543	1232 \hspace{.1in} \to \mt{sql\_query1} \; \mt{tables} \; \mt{selectedFields} \; \mt{selectedExps}
adamc@543	1233 \end{array}$$
adamc@543	1234
adamc@543	1235 To encode projection of subsets of fields in $\mt{SELECT}$ clauses, and to encode $\mt{GROUP} \; \mt{BY}$ clauses, we rely on a type family $\mt{sql\_subset}$, capturing what it means for one record of table fields to be a subset of another. The main constructor $\mt{sql\_subset}$ ``proves subset facts'' by requiring a split of a record into kept and dropped parts. The extra constructor $\mt{sql\_subset\_all}$ is a convenience for keeping all fields of a record.
adamc@543	1236 $$\begin{array}{l}
adamc@543	1237 \mt{con} \; \mt{sql\_subset} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \mt{Type} \\
adamc@543	1238 \mt{val} \; \mt{sql\_subset} : \mt{keep\_drop} :: \{(\{\mt{Type}\} \times \{\mt{Type}\})\} \\
adamc@543	1239 \hspace{.1in} \to \mt{sql\_subset} \\
adamc@658	1240 \hspace{.2in} (\mt{map} \; (\lambda \mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\}) \Rightarrow \mt{fields}.1 \rc \mt{fields}.2)\; \mt{keep\_drop}) \\
adamc@658	1241 \hspace{.2in} (\mt{map} \; (\lambda \mt{fields} :: (\{\mt{Type}\} \times \{\mt{Type}\}) \Rightarrow \mt{fields}.1) \; \mt{keep\_drop}) \\
adamc@543	1242 \mt{val} \; \mt{sql\_subset\_all} : \mt{tables} :: \{\{\mt{Type}\}\} \to \mt{sql\_subset} \; \mt{tables} \; \mt{tables}
adamc@543	1243 \end{array}$$
adamc@543	1244
adamc@560	1245 SQL expressions are used in several places, including $\mt{SELECT}$, $\mt{WHERE}$, $\mt{HAVING}$, and $\mt{ORDER} \; \mt{BY}$ clauses. They reify a fragment of the standard SQL expression language, while making it possible to inject ``native'' Ur values in some places. The arguments to the $\mt{sql\_exp}$ type family respectively give the unrestricted-availability table fields, the table fields that may only be used in arguments to aggregate functions, the available selected expressions, and the type of the expression.
adamc@543	1246 $$\begin{array}{l}
adamc@543	1247 \mt{con} \; \mt{sql\_exp} :: \{\{\mt{Type}\}\} \to \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \to \mt{Type}
adamc@543	1248 \end{array}$$
adamc@543	1249
adamc@543	1250 Any field in scope may be converted to an expression.
adamc@543	1251 $$\begin{array}{l}
adamc@543	1252 \mt{val} \; \mt{sql\_field} : \mt{otherTabs} ::: \{\{\mt{Type}\}\} \to \mt{otherFields} ::: \{\mt{Type}\} \\
adamc@543	1253 \hspace{.1in} \to \mt{fieldType} ::: \mt{Type} \to \mt{agg} ::: \{\{\mt{Type}\}\} \\
adamc@543	1254 \hspace{.1in} \to \mt{exps} ::: \{\mt{Type}\} \\
adamc@543	1255 \hspace{.1in} \to \mt{tab} :: \mt{Name} \to \mt{field} :: \mt{Name} \\
adamc@543	1256 \hspace{.1in} \to \mt{sql\_exp} \; ([\mt{tab} = [\mt{field} = \mt{fieldType}] \rc \mt{otherFields}] \rc \mt{otherTabs}) \; \mt{agg} \; \mt{exps} \; \mt{fieldType}
adamc@543	1257 \end{array}$$
adamc@543	1258
adamc@544	1259 There is an analogous function for referencing named expressions.
adamc@544	1260 $$\begin{array}{l}
adamc@544	1261 \mt{val} \; \mt{sql\_exp} : \mt{tabs} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{t} ::: \mt{Type} \to \mt{rest} ::: \{\mt{Type}\} \to \mt{nm} :: \mt{Name} \\
adamc@544	1262 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tabs} \; \mt{agg} \; ([\mt{nm} = \mt{t}] \rc \mt{rest}) \; \mt{t}
adamc@544	1263 \end{array}$$
adamc@544	1264
adamc@544	1265 Ur values of appropriate types may be injected into SQL expressions.
adamc@544	1266 $$\begin{array}{l}
adamc@544	1267 \mt{class} \; \mt{sql\_injectable} \\
adamc@544	1268 \mt{val} \; \mt{sql\_bool} : \mt{sql\_injectable} \; \mt{bool} \\
adamc@544	1269 \mt{val} \; \mt{sql\_int} : \mt{sql\_injectable} \; \mt{int} \\
adamc@544	1270 \mt{val} \; \mt{sql\_float} : \mt{sql\_injectable} \; \mt{float} \\
adamc@544	1271 \mt{val} \; \mt{sql\_string} : \mt{sql\_injectable} \; \mt{string} \\
adamc@544	1272 \mt{val} \; \mt{sql\_time} : \mt{sql\_injectable} \; \mt{time} \\
adamc@544	1273 \mt{val} \; \mt{sql\_option\_bool} : \mt{sql\_injectable} \; (\mt{option} \; \mt{bool}) \\
adamc@544	1274 \mt{val} \; \mt{sql\_option\_int} : \mt{sql\_injectable} \; (\mt{option} \; \mt{int}) \\
adamc@544	1275 \mt{val} \; \mt{sql\_option\_float} : \mt{sql\_injectable} \; (\mt{option} \; \mt{float}) \\
adamc@544	1276 \mt{val} \; \mt{sql\_option\_string} : \mt{sql\_injectable} \; (\mt{option} \; \mt{string}) \\
adamc@544	1277 \mt{val} \; \mt{sql\_option\_time} : \mt{sql\_injectable} \; (\mt{option} \; \mt{time}) \\
adamc@544	1278 \mt{val} \; \mt{sql\_inject} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \to \mt{sql\_injectable} \; \mt{t} \\
adamc@544	1279 \hspace{.1in} \to \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544	1280 \end{array}$$
adamc@544	1281
adamc@544	1282 We have the SQL nullness test, which is necessary because of the strange SQL semantics of equality in the presence of null values.
adamc@544	1283 $$\begin{array}{l}
adamc@544	1284 \mt{val} \; \mt{sql\_is\_null} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1285 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; (\mt{option} \; \mt{t}) \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{bool}
adamc@544	1286 \end{array}$$
adamc@544	1287
adamc@559	1288 We have generic nullary, unary, and binary operators.
adamc@544	1289 $$\begin{array}{l}
adamc@544	1290 \mt{con} \; \mt{sql\_nfunc} :: \mt{Type} \to \mt{Type} \\
adamc@544	1291 \mt{val} \; \mt{sql\_current\_timestamp} : \mt{sql\_nfunc} \; \mt{time} \\
adamc@544	1292 \mt{val} \; \mt{sql\_nfunc} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1293 \hspace{.1in} \to \mt{sql\_nfunc} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t} \\\end{array}$$
adamc@544	1294
adamc@544	1295 $$\begin{array}{l}
adamc@544	1296 \mt{con} \; \mt{sql\_unary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544	1297 \mt{val} \; \mt{sql\_not} : \mt{sql\_unary} \; \mt{bool} \; \mt{bool} \\
adamc@544	1298 \mt{val} \; \mt{sql\_unary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544	1299 \hspace{.1in} \to \mt{sql\_unary} \; \mt{arg} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res} \\
adamc@544	1300 \end{array}$$
adamc@544	1301
adamc@544	1302 $$\begin{array}{l}
adamc@544	1303 \mt{con} \; \mt{sql\_binary} :: \mt{Type} \to \mt{Type} \to \mt{Type} \to \mt{Type} \\
adamc@544	1304 \mt{val} \; \mt{sql\_and} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544	1305 \mt{val} \; \mt{sql\_or} : \mt{sql\_binary} \; \mt{bool} \; \mt{bool} \; \mt{bool} \\
adamc@544	1306 \mt{val} \; \mt{sql\_binary} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{arg_1} ::: \mt{Type} \to \mt{arg_2} ::: \mt{Type} \to \mt{res} ::: \mt{Type} \\
adamc@544	1307 \hspace{.1in} \to \mt{sql\_binary} \; \mt{arg_1} \; \mt{arg_2} \; \mt{res} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_1} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{arg_2} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{res}
adamc@544	1308 \end{array}$$
adamc@544	1309
adamc@544	1310 $$\begin{array}{l}
adamc@559	1311 \mt{class} \; \mt{sql\_arith} \\
adamc@559	1312 \mt{val} \; \mt{sql\_int\_arith} : \mt{sql\_arith} \; \mt{int} \\
adamc@559	1313 \mt{val} \; \mt{sql\_float\_arith} : \mt{sql\_arith} \; \mt{float} \\
adamc@559	1314 \mt{val} \; \mt{sql\_neg} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_unary} \; \mt{t} \; \mt{t} \\
adamc@559	1315 \mt{val} \; \mt{sql\_plus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1316 \mt{val} \; \mt{sql\_minus} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1317 \mt{val} \; \mt{sql\_times} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1318 \mt{val} \; \mt{sql\_div} : \mt{t} ::: \mt{Type} \to \mt{sql\_arith} \; \mt{t} \to \mt{sql\_binary} \; \mt{t} \; \mt{t} \; \mt{t} \\
adamc@559	1319 \mt{val} \; \mt{sql\_mod} : \mt{sql\_binary} \; \mt{int} \; \mt{int} \; \mt{int}
adamc@559	1320 \end{array}$$
adamc@544	1321
adamc@656	1322 Finally, we have aggregate functions. The $\mt{COUNT(\ast)}$ syntax is handled specially, since it takes no real argument. The other aggregate functions are placed into a general type family, using constructor classes to restrict usage to properly-typed arguments. The key aspect of the $\mt{sql\_aggregate}$ function's type is the shift of aggregate-function-only fields into unrestricted fields.
adamc@544	1323
adamc@544	1324 $$\begin{array}{l}
adamc@544	1325 \mt{val} \; \mt{sql\_count} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{int}
adamc@544	1326 \end{array}$$
adamc@544	1327
adamc@544	1328 $$\begin{array}{l}
adamc@544	1329 \mt{con} \; \mt{sql\_aggregate} :: \mt{Type} \to \mt{Type} \\
adamc@544	1330 \mt{val} \; \mt{sql\_aggregate} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{agg} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1331 \hspace{.1in} \to \mt{sql\_aggregate} \; \mt{t} \to \mt{sql\_exp} \; \mt{agg} \; \mt{agg} \; \mt{exps} \; \mt{t} \to \mt{sql\_exp} \; \mt{tables} \; \mt{agg} \; \mt{exps} \; \mt{t}
adamc@544	1332 \end{array}$$
adamc@544	1333
adamc@544	1334 $$\begin{array}{l}
adamc@544	1335 \mt{class} \; \mt{sql\_summable} \\
adamc@544	1336 \mt{val} \; \mt{sql\_summable\_int} : \mt{sql\_summable} \; \mt{int} \\
adamc@544	1337 \mt{val} \; \mt{sql\_summable\_float} : \mt{sql\_summable} \; \mt{float} \\
adamc@544	1338 \mt{val} \; \mt{sql\_avg} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544	1339 \mt{val} \; \mt{sql\_sum} : \mt{t} ::: \mt{Type} \to \mt{sql\_summable} \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544	1340 \end{array}$$
adamc@544	1341
adamc@544	1342 $$\begin{array}{l}
adamc@544	1343 \mt{class} \; \mt{sql\_maxable} \\
adamc@544	1344 \mt{val} \; \mt{sql\_maxable\_int} : \mt{sql\_maxable} \; \mt{int} \\
adamc@544	1345 \mt{val} \; \mt{sql\_maxable\_float} : \mt{sql\_maxable} \; \mt{float} \\
adamc@544	1346 \mt{val} \; \mt{sql\_maxable\_string} : \mt{sql\_maxable} \; \mt{string} \\
adamc@544	1347 \mt{val} \; \mt{sql\_maxable\_time} : \mt{sql\_maxable} \; \mt{time} \\
adamc@544	1348 \mt{val} \; \mt{sql\_max} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t} \\
adamc@544	1349 \mt{val} \; \mt{sql\_min} : \mt{t} ::: \mt{Type} \to \mt{sql\_maxable} \; \mt{t} \to \mt{sql\_aggregate} \; \mt{t}
adamc@544	1350 \end{array}$$
adamc@544	1351
adamc@544	1352 We wrap up the definition of query syntax with the types used in representing $\mt{ORDER} \; \mt{BY}$, $\mt{LIMIT}$, and $\mt{OFFSET}$ clauses.
adamc@544	1353 $$\begin{array}{l}
adamc@544	1354 \mt{type} \; \mt{sql\_direction} \\
adamc@544	1355 \mt{val} \; \mt{sql\_asc} : \mt{sql\_direction} \\
adamc@544	1356 \mt{val} \; \mt{sql\_desc} : \mt{sql\_direction} \\
adamc@544	1357 \\
adamc@544	1358 \mt{con} \; \mt{sql\_order\_by} :: \{\{\mt{Type}\}\} \to \{\mt{Type}\} \to \mt{Type} \\
adamc@544	1359 \mt{val} \; \mt{sql\_order\_by\_Nil} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} :: \{\mt{Type}\} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544	1360 \mt{val} \; \mt{sql\_order\_by\_Cons} : \mt{tables} ::: \{\{\mt{Type}\}\} \to \mt{exps} ::: \{\mt{Type}\} \to \mt{t} ::: \mt{Type} \\
adamc@544	1361 \hspace{.1in} \to \mt{sql\_exp} \; \mt{tables} \; [] \; \mt{exps} \; \mt{t} \to \mt{sql\_direction} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \to \mt{sql\_order\_by} \; \mt{tables} \; \mt{exps} \\
adamc@544	1362 \\
adamc@544	1363 \mt{type} \; \mt{sql\_limit} \\
adamc@544	1364 \mt{val} \; \mt{sql\_no\_limit} : \mt{sql\_limit} \\
adamc@544	1365 \mt{val} \; \mt{sql\_limit} : \mt{int} \to \mt{sql\_limit} \\
adamc@544	1366 \\
adamc@544	1367 \mt{type} \; \mt{sql\_offset} \\
adamc@544	1368 \mt{val} \; \mt{sql\_no\_offset} : \mt{sql\_offset} \\
adamc@544	1369 \mt{val} \; \mt{sql\_offset} : \mt{int} \to \mt{sql\_offset}
adamc@544	1370 \end{array}$$
adamc@544	1371
adamc@545	1372
adamc@545	1373 \subsubsection{DML}
adamc@545	1374
adamc@545	1375 The Ur/Web library also includes an embedding of a fragment of SQL's DML, the Data Manipulation Language, for modifying database tables. Any piece of DML may be executed in a transaction.
adamc@545	1376
adamc@545	1377 $$\begin{array}{l}
adamc@545	1378 \mt{type} \; \mt{dml} \\
adamc@545	1379 \mt{val} \; \mt{dml} : \mt{dml} \to \mt{transaction} \; \mt{unit}
adamc@545	1380 \end{array}$$
adamc@545	1381
adamc@545	1382 Properly-typed records may be used to form $\mt{INSERT}$ commands.
adamc@545	1383 $$\begin{array}{l}
adamc@545	1384 \mt{val} \; \mt{insert} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \\
adamc@658	1385 \hspace{.1in} \to \$(\mt{map} \; (\mt{sql\_exp} \; [] \; [] \; []) \; \mt{fields}) \to \mt{dml}
adamc@545	1386 \end{array}$$
adamc@545	1387
adamc@545	1388 An $\mt{UPDATE}$ command is formed from a choice of which table fields to leave alone and which to change, along with an expression to use to compute the new value of each changed field and a $\mt{WHERE}$ clause.
adamc@545	1389 $$\begin{array}{l}
adamc@545	1390 \mt{val} \; \mt{update} : \mt{unchanged} ::: \{\mt{Type}\} \to \mt{changed} :: \{\mt{Type}\} \to \lambda [\mt{changed} \sim \mt{unchanged}] \\
adamc@658	1391 \hspace{.1in} \Rightarrow \$(\mt{map} \; (\mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; []) \; \mt{changed}) \\
adamc@545	1392 \hspace{.1in} \to \mt{sql\_table} \; (\mt{changed} \rc \mt{unchanged}) \to \mt{sql\_exp} \; [\mt{T} = \mt{changed} \rc \mt{unchanged}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545	1393 \end{array}$$
adamc@545	1394
adamc@545	1395 A $\mt{DELETE}$ command is formed from a table and a $\mt{WHERE}$ clause.
adamc@545	1396 $$\begin{array}{l}
adamc@545	1397 \mt{val} \; \mt{delete} : \mt{fields} ::: \{\mt{Type}\} \to \mt{sql\_table} \; \mt{fields} \to \mt{sql\_exp} \; [\mt{T} = \mt{fields}] \; [] \; [] \; \mt{bool} \to \mt{dml}
adamc@545	1398 \end{array}$$
adamc@545	1399
adamc@546	1400 \subsubsection{Sequences}
adamc@546	1401
adamc@546	1402 SQL sequences are counters with concurrency control, often used to assign unique IDs. Ur/Web supports them via a simple interface. The only way to create a sequence is with the $\mt{sequence}$ declaration form.
adamc@546	1403
adamc@546	1404 $$\begin{array}{l}
adamc@546	1405 \mt{type} \; \mt{sql\_sequence} \\
adamc@546	1406 \mt{val} \; \mt{nextval} : \mt{sql\_sequence} \to \mt{transaction} \; \mt{int}
adamc@546	1407 \end{array}$$
adamc@546	1408
adamc@546	1409
adamc@547	1410 \subsection{XML}
adamc@547	1411
adamc@547	1412 Ur/Web's library contains an encoding of XML syntax and semantic constraints. We make no effort to follow the standards governing XML schemas. Rather, XML fragments are viewed more as values of ML datatypes, and we only track which tags are allowed inside which other tags.
adamc@547	1413
adamc@547	1414 The basic XML type family has arguments respectively indicating the \emph{context} of a fragment, the fields that the fragment expects to be bound on entry (and their types), and the fields that the fragment will bind (and their types). Contexts are a record-based ``poor man's subtyping'' encoding, with each possible set of valid tags corresponding to a different context record. The arguments dealing with field binding are only relevant to HTML forms.
adamc@547	1415 $$\begin{array}{l}
adamc@547	1416 \mt{con} \; \mt{xml} :: \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547	1417 \end{array}$$
adamc@547	1418
adamc@547	1419 We also have a type family of XML tags, indexed respectively by the record of optional attributes accepted by the tag, the context in which the tag may be placed, the context required of children of the tag, which form fields the tag uses, and which fields the tag defines.
adamc@547	1420 $$\begin{array}{l}
adamc@547	1421 \mt{con} \; \mt{tag} :: \{\mt{Type}\} \to \{\mt{Unit}\} \to \{\mt{Unit}\} \to \{\mt{Type}\} \to \{\mt{Type}\} \to \mt{Type}
adamc@547	1422 \end{array}$$
adamc@547	1423
adamc@547	1424 Literal text may be injected into XML as ``CDATA.''
adamc@547	1425 $$\begin{array}{l}
adamc@547	1426 \mt{val} \; \mt{cdata} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use} ::: \{\mt{Type}\} \to \mt{string} \to \mt{xml} \; \mt{ctx} \; \mt{use} \; []
adamc@547	1427 \end{array}$$
adamc@547	1428
adamc@547	1429 There is a function for producing an XML tree with a particular tag at its root.
adamc@547	1430 $$\begin{array}{l}
adamc@547	1431 \mt{val} \; \mt{tag} : \mt{attrsGiven} ::: \{\mt{Type}\} \to \mt{attrsAbsent} ::: \{\mt{Type}\} \to \mt{ctxOuter} ::: \{\mt{Unit}\} \to \mt{ctxInner} ::: \{\mt{Unit}\} \\
adamc@547	1432 \hspace{.1in} \to \mt{useOuter} ::: \{\mt{Type}\} \to \mt{useInner} ::: \{\mt{Type}\} \to \mt{bindOuter} ::: \{\mt{Type}\} \to \mt{bindInner} ::: \{\mt{Type}\} \\
adamc@547	1433 \hspace{.1in} \to \lambda [\mt{attrsGiven} \sim \mt{attrsAbsent}] \; [\mt{useOuter} \sim \mt{useInner}] \; [\mt{bindOuter} \sim \mt{bindInner}] \Rightarrow \$\mt{attrsGiven} \\
adamc@547	1434 \hspace{.1in} \to \mt{tag} \; (\mt{attrsGiven} \rc \mt{attrsAbsent}) \; \mt{ctxOuter} \; \mt{ctxInner} \; \mt{useOuter} \; \mt{bindOuter} \\
adamc@547	1435 \hspace{.1in} \to \mt{xml} \; \mt{ctxInner} \; \mt{useInner} \; \mt{bindInner} \to \mt{xml} \; \mt{ctxOuter} \; (\mt{useOuter} \rc \mt{useInner}) \; (\mt{bindOuter} \rc \mt{bindInner})
adamc@547	1436 \end{array}$$
adamc@547	1437
adamc@547	1438 Two XML fragments may be concatenated.
adamc@547	1439 $$\begin{array}{l}
adamc@547	1440 \mt{val} \; \mt{join} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{bind_1} ::: \{\mt{Type}\} \to \mt{bind_2} ::: \{\mt{Type}\} \\
adamc@547	1441 \hspace{.1in} \to \lambda [\mt{use_1} \sim \mt{bind_1}] \; [\mt{bind_1} \sim \mt{bind_2}] \\
adamc@547	1442 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind_1} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{bind_1}) \; \mt{bind_2} \to \mt{xml} \; \mt{ctx} \; \mt{use_1} \; (\mt{bind_1} \rc \mt{bind_2})
adamc@547	1443 \end{array}$$
adamc@547	1444
adamc@547	1445 Finally, any XML fragment may be updated to ``claim'' to use more form fields than it does.
adamc@547	1446 $$\begin{array}{l}
adamc@547	1447 \mt{val} \; \mt{useMore} : \mt{ctx} ::: \{\mt{Unit}\} \to \mt{use_1} ::: \{\mt{Type}\} \to \mt{use_2} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \lambda [\mt{use_1} \sim \mt{use_2}] \\
adamc@547	1448 \hspace{.1in} \Rightarrow \mt{xml} \; \mt{ctx} \; \mt{use_1} \; \mt{bind} \to \mt{xml} \; \mt{ctx} \; (\mt{use_1} \rc \mt{use_2}) \; \mt{bind}
adamc@547	1449 \end{array}$$
adamc@547	1450
adamc@547	1451 We will not list here the different HTML tags and related functions from the standard library. They should be easy enough to understand from the code in \texttt{basis.urs}. The set of tags in the library is not yet claimed to be complete for HTML standards.
adamc@547	1452
adamc@547	1453 One last useful function is for aborting any page generation, returning some XML as an error message. This function takes the place of some uses of a general exception mechanism.
adamc@547	1454 $$\begin{array}{l}
adamc@547	1455 \mt{val} \; \mt{error} : \mt{t} ::: \mt{Type} \to \mt{xml} \; [\mt{Body}] \; [] \; [] \to \mt{t}
adamc@547	1456 \end{array}$$
adamc@547	1457
adamc@549	1458
adamc@701	1459 \subsection{Client-Side Programming}
adamc@659	1460
adamc@701	1461 Ur/Web supports running code on web browsers, via automatic compilation to JavaScript.
adamc@701	1462
adamc@701	1463 \subsubsection{The Basics}
adamc@701	1464
adamc@701	1465 Clients can open alert dialog boxes, in the usual annoying JavaScript way.
adamc@701	1466 $$\begin{array}{l}
adamc@701	1467 \mt{val} \; \mt{alert} : \mt{string} \to \mt{transaction} \; \mt{unit}
adamc@701	1468 \end{array}$$
adamc@701	1469
adamc@701	1470 Any transaction may be run in a new thread with the $\mt{spawn}$ function.
adamc@701	1471 $$\begin{array}{l}
adamc@701	1472 \mt{val} \; \mt{spawn} : \mt{transaction} \; \mt{unit} \to \mt{transaction} \; \mt{unit}
adamc@701	1473 \end{array}$$
adamc@701	1474
adamc@701	1475 The current thread can be paused for at least a specified number of milliseconds.
adamc@701	1476 $$\begin{array}{l}
adamc@701	1477 \mt{val} \; \mt{sleep} : \mt{int} \to \mt{transaction} \; \mt{unit}
adamc@701	1478 \end{array}$$
adamc@701	1479
adamc@701	1480 \subsubsection{Functional-Reactive Page Generation}
adamc@701	1481
adamc@701	1482 Most approaches to ``AJAX''-style coding involve imperative manipulation of the DOM tree representing an HTML document's structure. Ur/Web follows the \emph{functional-reactive} approach instead. Programs may allocate mutable \emph{sources} of arbitrary types, and an HTML page is effectively a pure function over the latest values of the sources. The page is not mutated directly, but rather it changes automatically as the sources are mutated.
adamc@659	1483
adamc@659	1484 $$\begin{array}{l}
adamc@659	1485 \mt{con} \; \mt{source} :: \mt{Type} \to \mt{Type} \\
adamc@659	1486 \mt{val} \; \mt{source} : \mt{t} ::: \mt{Type} \to \mt{t} \to \mt{transaction} \; (\mt{source} \; \mt{t}) \\
adamc@659	1487 \mt{val} \; \mt{set} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit} \\
adamc@659	1488 \mt{val} \; \mt{get} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{transaction} \; \mt{t}
adamc@659	1489 \end{array}$$
adamc@659	1490
adamc@659	1491 Pure functions over sources are represented in a monad of \emph{signals}.
adamc@659	1492
adamc@659	1493 $$\begin{array}{l}
adamc@659	1494 \mt{con} \; \mt{signal} :: \mt{Type} \to \mt{Type} \\
adamc@659	1495 \mt{val} \; \mt{signal\_monad} : \mt{monad} \; \mt{signal} \\
adamc@659	1496 \mt{val} \; \mt{signal} : \mt{t} ::: \mt{Type} \to \mt{source} \; \mt{t} \to \mt{signal} \; \mt{t}
adamc@659	1497 \end{array}$$
adamc@659	1498
adamc@659	1499 A reactive portion of an HTML page is injected with a $\mt{dyn}$ tag, which has a signal-valued attribute $\mt{Signal}$.
adamc@659	1500
adamc@659	1501 $$\begin{array}{l}
adamc@701	1502 \mt{val} \; \mt{dyn} : \mt{use} ::: \{\mt{Type}\} \to \mt{bind} ::: \{\mt{Type}\} \to \mt{unit} \\
adamc@701	1503 \hspace{.1in} \to \mt{tag} \; [\mt{Signal} = \mt{signal} \; (\mt{xml} \; \mt{body} \; \mt{use} \; \mt{bind})] \; \mt{body} \; [] \; \mt{use} \; \mt{bind}
adamc@659	1504 \end{array}$$
adamc@659	1505
adamc@701	1506 Transactions can be run on the client by including them in attributes like the $\mt{Onclick}$ attribute of $\mt{button}$, and GUI widgets like $\mt{ctextbox}$ have $\mt{Source}$ attributes that can be used to connect them to sources, so that their values can be read by code running because of, e.g., an $\mt{Onclick}$ event.
adamc@701	1507
adamc@701	1508 \subsubsection{Asynchronous Message-Passing}
adamc@701	1509
adamc@701	1510 To support asynchronous, ``server push'' delivery of messages to clients, any client that might need to receive an asynchronous message is assigned a unique ID. These IDs may be retrieved both on the client and on the server, during execution of code related to a client.
adamc@701	1511
adamc@701	1512 $$\begin{array}{l}
adamc@701	1513 \mt{type} \; \mt{client} \\
adamc@701	1514 \mt{val} \; \mt{self} : \mt{transaction} \; \mt{client}
adamc@701	1515 \end{array}$$
adamc@701	1516
adamc@701	1517 \emph{Channels} are the means of message-passing. Each channel is created in the context of a client and belongs to that client; no other client may receive the channel's messages. Each channel type includes the type of values that may be sent over the channel. Sending and receiving are asynchronous, in the sense that a client need not be ready to receive a message right away. Rather, sent messages may queue up, waiting to be processed.
adamc@701	1518
adamc@701	1519 $$\begin{array}{l}
adamc@701	1520 \mt{con} \; \mt{channel} :: \mt{Type} \to \mt{Type} \\
adamc@701	1521 \mt{val} \; \mt{channel} : \mt{t} ::: \mt{Type} \to \mt{transaction} \; (\mt{channel} \; \mt{t}) \\
adamc@701	1522 \mt{val} \; \mt{send} : \mt{t} ::: \mt{Type} \to \mt{channel} \; \mt{t} \to \mt{t} \to \mt{transaction} \; \mt{unit} \\
adamc@701	1523 \mt{val} \; \mt{recv} : \mt{t} ::: \mt{Type} \to \mt{channel} \; \mt{t} \to \mt{transaction} \; \mt{t}
adamc@701	1524 \end{array}$$
adamc@701	1525
adamc@701	1526 The $\mt{channel}$ and $\mt{send}$ operations may only be executed on the server, and $\mt{recv}$ may only be executed on a client. Neither clients nor channels may be passed as arguments from clients to server-side functions, so persistent channels can only be maintained by storing them in the database and looking them up using the current client ID or some application-specific value as a key.
adamc@701	1527
adamc@701	1528 Clients and channels live only as long as the web browser page views that they are associated with. When a user surfs away, his client and its channels will be garbage-collected, after that user is not heard from for the timeout period. Garbage collection deletes any database row that contains a client or channel directly. Any reference to one of these types inside an $\mt{option}$ is set to $\mt{None}$ instead. Both kinds of handling have the flavor of weak pointers, and that is a useful way to think about clients and channels in the database.
adamc@701	1529
adamc@659	1530
adamc@549	1531 \section{Ur/Web Syntax Extensions}
adamc@549	1532
adamc@549	1533 Ur/Web features some syntactic shorthands for building values using the functions from the last section. This section sketches the grammar of those extensions. We write spans of syntax inside brackets to indicate that they are optional.
adamc@549	1534
adamc@549	1535 \subsection{SQL}
adamc@549	1536
adamc@549	1537 \subsubsection{Queries}
adamc@549	1538
adamc@550	1539 Queries $Q$ are added to the rules for expressions $e$.
adamc@550	1540
adamc@549	1541 $$\begin{array}{rrcll}
adamc@550	1542 \textrm{Queries} & Q &::=& (q \; [\mt{ORDER} \; \mt{BY} \; (E \; [o],)^+] \; [\mt{LIMIT} \; N] \; [\mt{OFFSET} \; N]) \\
adamc@549	1543 \textrm{Pre-queries} & q &::=& \mt{SELECT} \; P \; \mt{FROM} \; T,^+ \; [\mt{WHERE} \; E] \; [\mt{GROUP} \; \mt{BY} \; p,^+] \; [\mt{HAVING} \; E] \\
adamc@549	1544 &&& \mid q \; R \; q \\
adamc@549	1545 \textrm{Relational operators} & R &::=& \mt{UNION} \mid \mt{INTERSECT} \mid \mt{EXCEPT}
adamc@549	1546 \end{array}$$
adamc@549	1547
adamc@549	1548 $$\begin{array}{rrcll}
adamc@549	1549 \textrm{Projections} & P &::=& \ast & \textrm{all columns} \\
adamc@549	1550 &&& p,^+ & \textrm{particular columns} \\
adamc@549	1551 \textrm{Pre-projections} & p &::=& t.f & \textrm{one column from a table} \\
adamc@558	1552 &&& t.\{\{c\}\} & \textrm{a record of columns from a table (of kind $\{\mt{Type}\}$)} \\
adamc@549	1553 \textrm{Table names} & t &::=& x & \textrm{constant table name (automatically capitalized)} \\
adamc@549	1554 &&& X & \textrm{constant table name} \\
adamc@549	1555 &&& \{\{c\}\} & \textrm{computed table name (of kind $\mt{Name}$)} \\
adamc@549	1556 \textrm{Column names} & f &::=& X & \textrm{constant column name} \\
adamc@549	1557 &&& \{c\} & \textrm{computed column name (of kind $\mt{Name}$)} \\
adamc@549	1558 \textrm{Tables} & T &::=& x & \textrm{table variable, named locally by its own capitalization} \\
adamc@549	1559 &&& x \; \mt{AS} \; t & \textrm{table variable, with local name} \\
adamc@549	1560 &&& \{\{e\}\} \; \mt{AS} \; t & \textrm{computed table expression, with local name} \\
adamc@549	1561 \textrm{SQL expressions} & E &::=& p & \textrm{column references} \\
adamc@549	1562 &&& X & \textrm{named expression references} \\
adamc@549	1563 &&& \{\{e\}\} & \textrm{injected native Ur expressions} \\
adamc@549	1564 &&& \{e\} & \textrm{computed expressions, probably using $\mt{sql\_exp}$ directly} \\
adamc@549	1565 &&& \mt{TRUE} \mid \mt{FALSE} & \textrm{boolean constants} \\
adamc@549	1566 &&& \ell & \textrm{primitive type literals} \\
adamc@549	1567 &&& \mt{NULL} & \textrm{null value (injection of $\mt{None}$)} \\
adamc@549	1568 &&& E \; \mt{IS} \; \mt{NULL} & \textrm{nullness test} \\
adamc@549	1569 &&& n & \textrm{nullary operators} \\
adamc@549	1570 &&& u \; E & \textrm{unary operators} \\
adamc@549	1571 &&& E \; b \; E & \textrm{binary operators} \\
adamc@549	1572 &&& \mt{COUNT}(\ast) & \textrm{count number of rows} \\
adamc@549	1573 &&& a(E) & \textrm{other aggregate function} \\
adamc@549	1574 &&& (E) & \textrm{explicit precedence} \\
adamc@549	1575 \textrm{Nullary operators} & n &::=& \mt{CURRENT\_TIMESTAMP} \\
adamc@549	1576 \textrm{Unary operators} & u &::=& \mt{NOT} \\
adamc@549	1577 \textrm{Binary operators} & b &::=& \mt{AND} \mid \mt{OR} \mid \neq \mid < \mid \leq \mid > \mid \geq \\
adamc@549	1578 \textrm{Aggregate functions} & a &::=& \mt{AVG} \mid \mt{SUM} \mid \mt{MIN} \mid \mt{MAX} \\
adamc@550	1579 \textrm{Directions} & o &::=& \mt{ASC} \mid \mt{DESC} \\
adamc@549	1580 \textrm{SQL integer} & N &::=& n \mid \{e\} \\
adamc@549	1581 \end{array}$$
adamc@549	1582
adamc@549	1583 Additionally, an SQL expression may be inserted into normal Ur code with the syntax $(\mt{SQL} \; E)$ or $(\mt{WHERE} \; E)$.
adamc@549	1584
adamc@550	1585 \subsubsection{DML}
adamc@550	1586
adamc@550	1587 DML commands $D$ are added to the rules for expressions $e$.
adamc@550	1588
adamc@550	1589 $$\begin{array}{rrcll}
adamc@550	1590 \textrm{Commands} & D &::=& (\mt{INSERT} \; \mt{INTO} \; T^E \; (f,^+) \; \mt{VALUES} \; (E,^+)) \\
adamc@550	1591 &&& (\mt{UPDATE} \; T^E \; \mt{SET} \; (f = E,)^+ \; \mt{WHERE} \; E) \\
adamc@550	1592 &&& (\mt{DELETE} \; \mt{FROM} \; T^E \; \mt{WHERE} \; E) \\
adamc@550	1593 \textrm{Table expressions} & T^E &::=& x \mid \{\{e\}\}
adamc@550	1594 \end{array}$$
adamc@550	1595
adamc@550	1596 Inside $\mt{UPDATE}$ and $\mt{DELETE}$ commands, lone variables $X$ are interpreted as references to columns of the implicit table $\mt{T}$, rather than to named expressions.
adamc@549	1597
adamc@551	1598 \subsection{XML}
adamc@551	1599
adamc@551	1600 XML fragments $L$ are added to the rules for expressions $e$.
adamc@551	1601
adamc@551	1602 $$\begin{array}{rrcll}
adamc@551	1603 \textrm{XML fragments} & L &::=& \texttt{<xml/>} \mid \texttt{<xml>}l^*\texttt{</xml>} \\
adamc@551	1604 \textrm{XML pieces} & l &::=& \textrm{text} & \textrm{cdata} \\
adamc@551	1605 &&& \texttt{<}g\texttt{/>} & \textrm{tag with no children} \\
adamc@551	1606 &&& \texttt{<}g\texttt{>}l^*\texttt{</}x\texttt{>} & \textrm{tag with children} \\
adamc@559	1607 &&& \{e\} & \textrm{computed XML fragment} \\
adamc@559	1608 &&& \{[e]\} & \textrm{injection of an Ur expression, via the $\mt{Top}.\mt{txt}$ function} \\
adamc@551	1609 \textrm{Tag} & g &::=& h \; (x = v)^* \\
adamc@551	1610 \textrm{Tag head} & h &::=& x & \textrm{tag name} \\
adamc@551	1611 &&& h\{c\} & \textrm{constructor parameter} \\
adamc@551	1612 \textrm{Attribute value} & v &::=& \ell & \textrm{literal value} \\
adamc@551	1613 &&& \{e\} & \textrm{computed value} \\
adamc@551	1614 \end{array}$$
adamc@551	1615
adamc@552	1616
adamc@553	1617 \section{The Structure of Web Applications}
adamc@553	1618
adamc@553	1619 A web application is built from a series of modules, with one module, the last one appearing in the \texttt{.urp} file, designated as the main module. The signature of the main module determines the URL entry points to the application. Such an entry point should have type $\mt{unit} \to \mt{transaction} \; \mt{page}$, where $\mt{page}$ is a type synonym for top-level HTML pages, defined in $\mt{Basis}$. If such a function is at the top level of main module $M$, it will be accessible at URI \texttt{/M/f}, and so on for more deeply-nested functions, as described in Section \ref{tag} below.
adamc@553	1620
adamc@553	1621 When the standalone web server receives a request for a known page, it calls the function for that page, ``running'' the resulting transaction to produce the page to return to the client. Pages link to other pages with the \texttt{link} attribute of the \texttt{a} HTML tag. A link has type $\mt{transaction} \; \mt{page}$, and the semantics of a link are that this transaction should be run to compute the result page, when the link is followed. Link targets are assigned URL names in the same way as top-level entry points.
adamc@553	1622
adamc@553	1623 HTML forms are handled in a similar way. The $\mt{action}$ attribute of a $\mt{submit}$ form tag takes a value of type $\$\mt{use} \to \mt{transaction} \; \mt{page}$, where $\mt{use}$ is a kind-$\{\mt{Type}\}$ record of the form fields used by this action handler. Action handlers are assigned URL patterns in the same way as above.
adamc@553	1624
adamc@558	1625 For both links and actions, direct arguments and local variables mentioned implicitly via closures are automatically included in serialized form in URLs, in the order in which they appear in the source code.
adamc@553	1626
adamc@660	1627 Ur/Web programs generally mix server- and client-side code in a fairly transparent way. The one important restriction is that mixed client-server code must encapsulate all server-side pieces within named functions. This is because execution of such pieces will be implemented by explicit calls to the remote web server, and it is useful to get the programmer's help in designing the interface to be used. For example, this makes it easier to allow a client running an old version of an application to continue interacting with a server that has been upgraded to a new version, if the programmer took care to keep the interfaces of all of the old remote calls the same. The functions implementing these services are assigned names in the same way as normal web entry points, by using module structure.
adamc@660	1628
adamc@553	1629
adamc@552	1630 \section{Compiler Phases}
adamc@552	1631
adamc@552	1632 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
adamc@552	1633
adamc@552	1634 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.
adamc@552	1635
adamc@552	1636 \subsection{Parse}
adamc@552	1637
adamc@552	1638 The compiler reads a \texttt{.urp} file, figures out which \texttt{.urs} and \texttt{.ur} files it references, and combines them all into what is conceptually a single sequence of declarations in the core language of Section \ref{core}.
adamc@552	1639
adamc@552	1640 \subsection{Elaborate}
adamc@552	1641
adamc@552	1642 This is where type inference takes place, translating programs into an explicit form with no more wildcards. This phase is the most likely source of compiler error messages.
adamc@552	1643
adamc@552	1644 \subsection{Unnest}
adamc@552	1645
adamc@552	1646 Named local function definitions are moved to the top level, to avoid the need to generate closures.
adamc@552	1647
adamc@552	1648 \subsection{Corify}
adamc@552	1649
adamc@552	1650 Module system features are compiled away, through inlining of functor definitions at application sites. Afterward, most abstraction boundaries are broken, facilitating optimization.
adamc@552	1651
adamc@552	1652 \subsection{Especialize}
adamc@552	1653
adamc@552	1654 Functions are specialized to particular argument patterns. This is an important trick for avoiding the need to maintain any closures at runtime.
adamc@552	1655
adamc@552	1656 \subsection{Untangle}
adamc@552	1657
adamc@552	1658 Remove unnecessary mutual recursion, splitting recursive groups into strongly-connected components.
adamc@552	1659
adamc@552	1660 \subsection{Shake}
adamc@552	1661
adamc@552	1662 Remove all definitions not needed to run the page handlers that are visible in the signature of the last module listed in the \texttt{.urp} file.
adamc@552	1663
adamc@661	1664 \subsection{Rpcify}
adamc@661	1665
adamc@661	1666 Pieces of code are determined to be client-side, server-side, neither, or both, by figuring out which standard library functions might be needed to execute them. Calls to server-side functions (e.g., $\mt{query}$) within mixed client-server code are identified and replaced with explicit remote calls. Some mixed functions may be converted to continuation-passing style to facilitate this transformation.
adamc@661	1667
adamc@661	1668 \subsection{Untangle, Shake}
adamc@661	1669
adamc@661	1670 Repeat these simplifications.
adamc@661	1671
adamc@553	1672 \subsection{\label{tag}Tag}
adamc@552	1673
adamc@552	1674 Assign a URL name to each link and form action. It is important that these links and actions are written as applications of named functions, because such names are used to generate URL patterns. A URL pattern has a name built from the full module path of the named function, followed by the function name, with all pieces separated by slashes. The path of a functor application is based on the name given to the result, rather than the path of the functor itself.
adamc@552	1675
adamc@552	1676 \subsection{Reduce}
adamc@552	1677
adamc@552	1678 Apply definitional equality rules to simplify the program as much as possible. This effectively includes inlining of every non-recursive definition.
adamc@552	1679
adamc@552	1680 \subsection{Unpoly}
adamc@552	1681
adamc@552	1682 This phase specializes polymorphic functions to the specific arguments passed to them in the program. If the program contains real polymorphic recursion, Unpoly will be insufficient to avoid later error messages about too much polymorphism.
adamc@552	1683
adamc@552	1684 \subsection{Specialize}
adamc@552	1685
adamc@558	1686 Replace uses of parameterized datatypes with versions specialized to specific parameters. As for Unpoly, this phase will not be effective enough in the presence of polymorphic recursion or other fancy uses of impredicative polymorphism.
adamc@552	1687
adamc@552	1688 \subsection{Shake}
adamc@552	1689
adamc@558	1690 Here the compiler repeats the earlier Shake phase.
adamc@552	1691
adamc@552	1692 \subsection{Monoize}
adamc@552	1693
adamc@552	1694 Programs are translated to a new intermediate language without polymorphism or non-$\mt{Type}$ constructors. Error messages may pop up here if earlier phases failed to remove such features.
adamc@552	1695
adamc@552	1696 This is the stage at which concrete names are generated for cookies, tables, and sequences. They are named following the same convention as for links and actions, based on module path information saved from earlier stages. Table and sequence names separate path elements with underscores instead of slashes, and they are prefixed by \texttt{uw\_}.
adamc@664	1697
adamc@552	1698 \subsection{MonoOpt}
adamc@552	1699
adamc@552	1700 Simple algebraic laws are applied to simplify the program, focusing especially on efficient imperative generation of HTML pages.
adamc@552	1701
adamc@552	1702 \subsection{MonoUntangle}
adamc@552	1703
adamc@552	1704 Unnecessary mutual recursion is broken up again.
adamc@552	1705
adamc@552	1706 \subsection{MonoReduce}
adamc@552	1707
adamc@552	1708 Equivalents of the definitional equality rules are applied to simplify programs, with inlining again playing a major role.
adamc@552	1709
adamc@552	1710 \subsection{MonoShake, MonoOpt}
adamc@552	1711
adamc@552	1712 Unneeded declarations are removed, and basic optimizations are repeated.
adamc@552	1713
adamc@552	1714 \subsection{Fuse}
adamc@552	1715
adamc@552	1716 The compiler tries to simplify calls to recursive functions whose results are immediately written as page output. The write action is pushed inside the function definitions to avoid allocation of intermediate results.
adamc@552	1717
adamc@552	1718 \subsection{MonoUntangle, MonoShake}
adamc@552	1719
adamc@552	1720 Fuse often creates more opportunities to remove spurious mutual recursion.
adamc@552	1721
adamc@552	1722 \subsection{Pathcheck}
adamc@552	1723
adamc@552	1724 The compiler checks that no link or action name has been used more than once.
adamc@552	1725
adamc@552	1726 \subsection{Cjrize}
adamc@552	1727
adamc@552	1728 The program is translated to what is more or less a subset of C. If any use of functions as data remains at this point, the compiler will complain.
adamc@552	1729
adamc@552	1730 \subsection{C Compilation and Linking}
adamc@552	1731
adamc@552	1732 The output of the last phase is pretty-printed as C source code and passed to GCC.
adamc@552	1733
adamc@552	1734
adamc@524	1735 \end{document}

Mercurial > urweb

annotate doc/manual.tex @ 756:8ce31c052dce