comparison doc/manual.tex @ 897:2faf558b2d05

FFI manual section
author Adam Chlipala <adamc@hcoop.net>
date Sat, 18 Jul 2009 15:08:21 -0400
parents 0ae8894d5c97
children 782f0b4eea67
comparison
equal deleted inserted replaced
896:0ae8894d5c97 897:2faf558b2d05
1938 The HTTP standard suggests that GET requests only be used in ways that generate no side effects. Side effecting operations should use POST requests instead. The Ur/Web compiler enforces this rule strictly, via a simple conservative program analysis. Any page that may have a side effect must be accessed through a form, all of which use POST requests. A page is judged to have a side effect if its code depends syntactically on any of the side-effecting, server-side FFI functions. Links, forms, and most client-side event handlers are not followed during this syntactic traversal, but \texttt{<body onload=\{...\}>} handlers \emph{are} examined, since they run right away and could just as well be considered parts of main page handlers. 1938 The HTTP standard suggests that GET requests only be used in ways that generate no side effects. Side effecting operations should use POST requests instead. The Ur/Web compiler enforces this rule strictly, via a simple conservative program analysis. Any page that may have a side effect must be accessed through a form, all of which use POST requests. A page is judged to have a side effect if its code depends syntactically on any of the side-effecting, server-side FFI functions. Links, forms, and most client-side event handlers are not followed during this syntactic traversal, but \texttt{<body onload=\{...\}>} handlers \emph{are} examined, since they run right away and could just as well be considered parts of main page handlers.
1939 1939
1940 Ur/Web includes a kind of automatic protection against cross site request forgery attacks. Whenever any page execution can have side effects and can also read at least one cookie value, all cookie values must be signed cryptographically, to ensure that the user has come to the current page by submitting a form on a real page generated by the proper server. Signing and signature checking are inserted automatically by the compiler. This prevents attacks like phishing schemes where users are directed to counterfeit pages with forms that submit to your application, where a user's cookies might be submitted without his knowledge, causing some undesired side effect. 1940 Ur/Web includes a kind of automatic protection against cross site request forgery attacks. Whenever any page execution can have side effects and can also read at least one cookie value, all cookie values must be signed cryptographically, to ensure that the user has come to the current page by submitting a form on a real page generated by the proper server. Signing and signature checking are inserted automatically by the compiler. This prevents attacks like phishing schemes where users are directed to counterfeit pages with forms that submit to your application, where a user's cookies might be submitted without his knowledge, causing some undesired side effect.
1941 1941
1942 1942
1943 \section{The Foreign Function Interface}
1944
1945 It is possible to call your own C and JavaScript code from Ur/Web applications, via the foreign function interface (FFI). The starting point for a new binding is a \texttt{.urs} signature file that presents your external library as a single Ur/Web module (with no nested modules). Compilation conventions map the types and values that you use into C and/or JavaScript types and values.
1946
1947 It is most convenient to encapsulate an FFI binding with a new \texttt{.urp} file, which applications can include with the \texttt{library} directive in their own \texttt{.urp} files. A number of directives are likely to show up in the library's project file.
1948
1949 \begin{itemize}
1950 \item \texttt{clientOnly Module.ident} registers a value as being allowed only in client-side code.
1951 \item \texttt{clientToServer Module.ident} declares a type as OK to marshal between clients and servers. By default, abstract FFI types are not allowed to be marshalled, since your library might be maintaining invariants that the simple serialization code doesn't check.
1952 \item \texttt{effectful Module.ident} registers a function that can have side effects. It is important to remember to use this directive for each such function, or else the optimizer might change program semantics.
1953 \item \texttt{ffi FILE.urs} names the file giving your library's signature. You can include multiple such files in a single \texttt{.urp} file, and each file \texttt{mod.urp} defines an FFI module \texttt{Mod}.
1954 \item \texttt{header FILE} requests inclusion of a C header file.
1955 \item \texttt{jsFunc Module.ident=name} gives a mapping from an Ur name for a value to a JavaScript name.
1956 \item \texttt{link FILE} requests that \texttt{FILE} be linked into applications. It should be a C object or library archive file, and you are responsible for generating it with your own build process.
1957 \item \texttt{script URL} requests inclusion of a JavaScript source file within application HTML.
1958 \item \texttt{serverOnly Module.ident} registers a value as being allowed only in server-side code.
1959 \end{itemize}
1960
1961 \subsection{Writing C FFI Code}
1962
1963 A server-side FFI type or value \texttt{Module.ident} must have a corresponding type or value definition \texttt{uw\_Module\_ident} in C code. With the current Ur/Web version, it's not generally possible to work with Ur records or complex datatypes in C code, but most other kinds of types are fair game.
1964
1965 \begin{itemize}
1966 \item Primitive types defined in \texttt{Basis} are themselves using the standard FFI interface, so you may refer to them like \texttt{uw\_Basis\_t}. See \texttt{include/types.h} for their definitions.
1967 \item Enumeration datatypes, which have only constructors that take no arguments, should be defined using C \texttt{enum}s. The type is named as for any other type identifier, and each constructor \texttt{c} gets an enumeration constant named \texttt{uw\_Module\_c}.
1968 \item A datatype \texttt{dt} (such as \texttt{Basis.option}) that has one non-value-carrying constructor \texttt{NC} and one value-carrying constructor \texttt{C} gets special treatment. Where \texttt{T} is the type of \texttt{C}'s argument, and where we represent \texttt{T} as \texttt{t} in C, we represent \texttt{NC} with \texttt{NULL}. The representation of \texttt{C} depends on whether we're sure that we don't need to use \texttt{NULL} to represent \texttt{t} values; this condition holds only for strings and complex datatypes. For such types, \texttt{C v} is represented with the C encoding of \texttt{v}, such that the translation of \texttt{dt} is \texttt{t}. For other types, \texttt{C v} is represented with a pointer to the C encoding of v, such that the translation of \texttt{dt} is \texttt{t*}.
1969 \end{itemize}
1970
1971 The C FFI version of a Ur function with type \texttt{T1 -> ... -> TN -> R} or \texttt{T1 -> ... -> TN -> transaction R} has a C prototype like \texttt{R uw\_Module\_ident(uw\_context, T1, ..., TN)}. Only functions with types of the second form may have side effects. \texttt{uw\_context} is the type of state that persists across handling a client request. Many functions that operate on contexts are prototyped in \texttt{include/urweb.h}. Most should only be used internally by the compiler. A few are useful in general FFI implementation:
1972 \begin{itemize}
1973 \item \begin{verbatim}
1974 void uw_error(uw_context, failure_kind, const char *fmt, ...);
1975 \end{verbatim}
1976 Abort the current request processing, giving a \texttt{printf}-style format string and arguments for generating an error message. The \texttt{failure\_kind} argument can be \texttt{FATAL}, to abort the whole execution; \texttt{BOUNDED\_RETRY}, to try processing the request again from the beginning, but failing if this happens too many times; or \texttt{UNLIMITED\_RETRY}, to repeat processing, with no cap on how many times this can recur.
1977
1978 \item \begin{verbatim}
1979 void uw_push_cleanup(uw_context, void (*func)(void *), void *arg);
1980 void uw_pop_cleanup(uw_context);
1981 \end{verbatim}
1982 Manipulate a stack of actions that should be taken if any kind of error condition arises. Calling the ``pop'' function both removes an action from the stack and executes it.
1983
1984 \item \begin{verbatim}
1985 void *uw_malloc(uw_context, size_t);
1986 \end{verbatim}
1987 A version of \texttt{malloc()} that allocates memory inside a context's heap, which is managed with region allocation. Thus, there is no \texttt{uw\_free()}, but you need to be careful not to keep ad-hoc C pointers to this area of memory.
1988
1989 For performance and correctness reasons, it is usually preferable to use \texttt{uw\_malloc()} instead of \texttt{malloc()}. The former manipulates a local heap that can be kept allocated across page requests, while the latter uses global data structures that may face contention during concurrent execution.
1990
1991 \item \begin{verbatim}
1992 typedef void (*uw_callback)(void *);
1993 void uw_register_transactional(uw_context, void *data, uw_callback commit,
1994 uw_callback rollback, uw_callback free);
1995 \end{verbatim}
1996 All side effects in Ur/Web programs need to be compatible with transactions, such that any set of actions can be undone at any time. Thus, you should not perform actions with non-local side effects directly; instead, register handlers to be called when the current transaction is committed or rolled back. The arguments here give an arbitary piece of data to be passed to callbacks, a function to call on commit, a function to call on rollback, and a function to call afterward in either case to clean up any allocated resources. A rollback handler may be called after the associated commit handler has already been called, if some later part of the commit process fails.
1997
1998 To accommodate some stubbornly non-transactional real-world actions like sending an e-mail message, Ur/Web allows the \texttt{rollback} parameter to be \texttt{NULL}. When a transaction commits, all \texttt{commit} actions that have non-\texttt{NULL} rollback actions are tried before any \texttt{commit} actions that have \texttt{NULL} rollback actions. Thus, if a single execution uses only one non-transactional action, and if that action never fails partway through its execution while still causing an observable side effect, then Ur/Web can maintain the transactional abstraction.
1999 \end{itemize}
2000
2001
2002 \subsection{Writing JavaScript FFI Code}
2003
2004 JavaScript is dynamically typed, so Ur/Web type definitions imply no JavaScript code. The JavaScript identifier for each FFI function is set with the \texttt{jsFunc} directive. Each identifier can be defined in any JavaScript file that you ask to include with the \texttt{script} directive.
2005
2006 In contrast to C FFI code, JavaScript FFI functions take no extra context argument. Their argument lists are as you would expect from their Ur types. Only functions whose ranges take the form \texttt{transaction T} should have side effects; the JavaScript ``return type'' of such a function is \texttt{T}. Here are the conventions for representing Ur values in JavaScript.
2007
2008 \begin{itemize}
2009 \item Integers, floats, strings, characters, and booleans are represented in the usual JavaScript way.
2010 \item Ur functions are represented with JavaScript functions, currying and all. Only named FFI functions are represented with multiple JavaScript arguments.
2011 \item An Ur record is represented with a JavaScript record, where Ur field name \texttt{N} translates to JavaScript field name \texttt{\_N}. An exception to this rule is that the empty record is encoded as \texttt{null}.
2012 \item \texttt{option}-like types receive special handling similar to their handling in C. The ``\texttt{None}'' constructor is \texttt{null}, and a use of the ``\texttt{Some}'' constructor on a value \texttt{v} is either \texttt{v}, if the underlying type doesn't need to use \texttt{null}; or \texttt{\{v:v\}} otherwise.
2013 \item Any other datatypes represent a non-value-carrying constructor \texttt{C} as \texttt{"\_C"} and an application of a constructor \texttt{C} to value \texttt{v} as \texttt{\{n:"\_C", v:v\}}. This rule only applies to datatypes defined in FFI module signatures; the compiler is free to optimize the representations of other, non-\texttt{option}-like datatypes in arbitrary ways.
2014 \end{itemize}
2015
2016 It is possible to write JavaScript FFI code that interacts with the functional-reactive structure of a document, but this version of the manual doesn't cover the details.
2017
2018
1943 \section{Compiler Phases} 2019 \section{Compiler Phases}
1944 2020
1945 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection. 2021 The Ur/Web compiler is unconventional in that it relies on a kind of \emph{heuristic compilation}. Not all valid programs will compile successfully. Informally, programs fail to compile when they are ``too higher order.'' Compiler phases do their best to eliminate different kinds of higher order-ness, but some programs just won't compile. This is a trade-off for producing very efficient executables. Compiled Ur/Web programs use native C representations and require no garbage collection.
1946 2022
1947 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming. 2023 In this section, we step through the main phases of compilation, noting what consequences each phase has for effective programming.