λk.(k blog): Posts tagged 'research'

What is a model?

2023-06-15T20:25:25Z

What is a model, particularly of a programming language? I’ve been struggling with this question for some time. The word “model” is used a lot in my research area, and although I have successfully (by some metrics) read papers whose topic is models, used other people’s research on models, built models, and trained others to do all of this, I don’t really understand what a model is.

Before I get into a philosophical digression on what it even means to understand something, let’s ignore all that and try to discover what a model is from first principles.

Definitions of “model”

The obvious place to start when trying to understand the meaning of a word is its definition. This is actually no help at all. There are lots of uses of the word “model”, with several definitions. Here are some.

Definition 0 In science and engineering, a model is “an abstract description of a concrete system using mathematical concepts and language”. Wikipedia provides a nice introduction to this kind of model, and the Stanford Encyclopedia of Philosophy provides a nice explanation in the context of model theory, which will be relevant later in this post.

Definition 1 A syntactic model (of a type theory) is defined by Boulier, Pédrot, and Tabareau as a translation from one type theory into another that preserves typing, the definition of false, and definitional equivalence. This syntactic model enables the source type theory to inherit properties of the target type theory—such as consistency.

Definition 2 A model (of a vocabulary, also called a language, $\sigma$) in the sense of model theory (as defined by Elements of Finite Model Theory) is a $\sigma$-structure (“also called a model”) defining a set $A$ along with 3 sets providing interpretations of that vocabulary. These sets are $Ic_A$, which interprets each constant in $\sigma$ as an element of $A$; $IP_A$, which interprets each n-ary predicate symbol or relation symbol from $\sigma$ as an n-ary (set-theoretic) relation between elements of $A$; and $If_A$, which interprets each n-ary function symbol in $\sigma$ as a (set-theoretic) function from n elements of $A$ to an element of $A$.

Definition 3 The above definition is confusing, since it conflates structure and model, which the text later distinguishes with the following separate definition. A model (of a theory (over a vocabulary $\sigma$)) is a structure (“also called a model”) of vocabulary $\sigma$ such that every sentence in the theory is interpreted in the structure to make the sentence true. (A theory is a set of sentences drawn from a vocabulary.) My rephrasing of the definition of model is intentionally confusing and difficult to parse, to make apparent the inherent confusingness created by the several layers of definitions and one definition that defines “model” using a second definition of “model”.

Definition 4 nLab hosts an article with a much clarified definition, which distinguishes language, theory, structure, and model carefully. In particular, it is careful to only call structure the interpretation of the language (called vocabulary above), and only call model an interpretation that makes true the axioms composing the theory of the language.
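To make the structure/model distinction in Definitions 2 through 4 concrete, here is a minimal sketch in JavaScript. Everything in it is an assumption made for illustration: the vocabulary (a constant `zero`, a function `succ`, a predicate `lt`), the two structures, and the two-sentence theory are hypothetical and come from none of the cited texts. As a shortcut, each sentence is encoded directly as a function that evaluates itself in a structure, rather than as a piece of syntax.

```javascript
// Hypothetical vocabulary: constant `zero`, unary function `succ`,
// binary predicate `lt`. These names are illustrative assumptions.

// A structure: a carrier set plus an interpretation of every symbol.
const wrapAround = {
  carrier: [0, 1, 2],
  constants: { zero: 0 },
  functions: { succ: (x) => (x + 1) % 3 }, // 2 wraps back to 0
  predicates: { lt: (x, y) => x < y },
};

const saturating = {
  carrier: [0, 1, 2],
  constants: { zero: 0 },
  functions: { succ: (x) => Math.min(x + 1, 2) }, // 2 stays at 2
  predicates: { lt: (x, y) => x < y },
};

// A theory: a set of sentences. Quantifiers range over the carrier.
const theory = [
  // forall x. not lt(x, x)
  (s) => s.carrier.every((x) => !s.predicates.lt(x, x)),
  // forall x. lt(zero, succ(x))
  (s) =>
    s.carrier.every((x) =>
      s.predicates.lt(s.constants.zero, s.functions.succ(x))
    ),
];

// A structure is a model of the theory when it makes every sentence true.
function isModel(structure, sentences) {
  return sentences.every((sentence) => sentence(structure));
}

console.log(isModel(wrapAround, theory)); // false: succ(2) = 0 and lt(0, 0) fails
console.log(isModel(saturating, theory)); // true
```

Both objects are structures for the vocabulary, but only the second is a model of the theory, which is the distinction Definition 4 draws.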

Definition 5 Carlo Anguli once gave me the following definition of model:

A collection of interpretation functions that interpret every syntactic category in such that the original relationship is respected.

e.g.,
- interpret every context as a set,
- interpret every (non-dependent) type as a set, and
- interpret every term-of-a-type indexed-by-a-context as an element-of-the-interpretation-of-that-type indexed-by-elements-of-the-interpretation-of-that-context.

Implicit in this definition is that the interpretations must respect equality — because if you don’t respect equality of arguments then you’re not a function!

This definition seems to be close to Definition 2, as it doesn’t mention axioms and their interpretation. However, it might be Definition 3 instead, as there could be implicit in the definition of syntax an inclusion of all judgements of the programming language, and therefore in the phrase “such that the original relationship is respected” a requirement that axioms of those judgements become true.

Also implicit in this definition of model is what it is a model of. Perhaps a programming language, but it again depends on what syntax means and thus what “every syntactic category” refers to.

I’m interested in the requirement that the interpretation is a collection of functions, which seems to be missing or only implied in some model theory definitions of “model”.

So what is a model?

One of the first things that jumps out to me after reviewing the above definitions is that to understand each definition, you have to reframe the definition of model into model [of what]. It really never makes sense to give a definition of merely “model”.

Definition 0 defines a model of a system [of the real world]. Definition 1 defines a model of a type theory. Definitions 2 and 3 give definitions of a model, in the sense of model theory, but of two different objects: model of a vocabulary (or language), which is more often called a structure, and model of a theory (which everyone seems to agree is a “model”). Definition 4 makes this distinction very clear. Definition 5 seems to use “model” in the model-theoretic sense, but has abstracted a bit away from a particular notion of theory and generalized to syntax.

What is a model of a programming language?

I’ve had two problems understanding the word “model” in the context of programming languages.

First, we use “model” in three different senses, and I have neither understood that nor understood the relationships between them.

1. Model in the sense of an abstract description of a system. This is Definition 0. This sense of “model” means something like “mathematical description”. What we want is a description in which we can work using math, so we can make predictions about the real world. Ideally, the predictions we make will be true.
2. Model in the strict sense of model theory. These are Definitions 2, 3, and 4. This sense of “model” is the closest to having a strict definition. It often carries a set-theoretic connotation, asking for a set defining the domain of values, and three interpretation functions that interpret specific parts of a theory in specific ways.
3. Model in the generalized sense, inheriting from or related to model theory. I hesitate to even call this distinct from the second sense, but I will anyway. I also hesitate to speculate about history; perhaps this sense actually predates model theory. But I distinguish it from the second sense, because it frequently generalizes away from the strict three-category “constant symbol”, “predicate symbol”, “function symbol” specification and doesn’t seem beholden to set theory. Definitions 1 and 5 use “model” in this sense.

In the second sense of “model”, the first sense of the word remains—we’re still interested in a description of some system (the theory), and of using the model to make predictions or reason. However, since the theory is also mathematical, we can be more rigid about our reasoning requirements—axioms of the theory must be true of the model, and relationships must be preserved in the model. This is rarely true of a model of the real world; e.g., the Newtonian model of gravity works pretty well, until it doesn’t, so it’s a model that doesn’t quite make all axioms true or preserve all relationships.

The third sense seems closer to the idea of semantics, in the mathematical logic sense of the word as assigning meaning or interpretation to syntax. In this sense, the word “model” frequently avoids committing to set theory as a formal foundation, generalizes away from the three interpretation functions, and focuses instead on the relationships between uninterpreted syntax being preserved by the interpretation. For example, in Definition 1, the relationships of interest are well-typedness, definitional equivalence, and falsehood, and the formal foundation is type theory. Category theory seems to come closest to a complete formalization of this sense of the word “model”, although I’ve had a hell of a time understanding that. nLab articles don’t say this explicitly, but reading between the lines in articles linked to from the nLab article on model theory for the words syntax and semantics implies that the idea of syntax, i.e., uninterpreted symbols with relationships between themselves and judgements about them, can be formalized in category theory, and then so can the idea of semantics, i.e., providing an interpretation in some other domain of those uninterpreted symbols; a domain in which one can use all the power of the other domain to reason about the judgements one wishes to make about the uninterpreted symbols.

The second problem with the word “model” is that we frequently work with two senses simultaneously.

When I write down a programming language, I’m often trying to model (in the first sense) a real programming language (or some feature of it), one actual software developers use to make real things happen in the real world. I am not merely describing a mathematical object for study. (Okay, sometimes I do that, but usually to the first end, eventually.) When I write down such a model, I may describe the abstract syntax, the typing judgement, and an abstract machine or reduction rules. These form a pretty good mathematical description of how a real language behaves. A compiler will reject syntactically invalid expressions. It may then type check the abstract syntax tree, and reject some possibly semantically invalid expressions. If judged well typed, the compiler may transform the tree into something that runs, and that run-time behaviour can be predicted using the reduction rules.

However, for much programming languages work, I’m not interested in merely predicting the behaviour of a single program. I might want to predict behaviour or properties of the entire language, or its typing judgement, etc. To reason about single programs, the model (in the first sense) may work well. But it might not work well for, say, trying to decide whether certain types can even be inhabited. To solve this, we might build a model (in the second or third sense). We interpret the abstract syntax tree and typing judgement in some other domain. That is, the AST and the typing judgement, being a model in the first sense, form a theory in the model-theoretic sense. We can then construct a model (in the second sense) of a model (in the first sense). The Stanford Encyclopedia of Philosophy article on model theory goes into this in detail in the context of model theory, which is great.

What’s more interesting is how these two senses of model interact in programming languages. If one is interested in a model, in the second sense, it may inform how one develops a model (in the first sense). If I know I will want to construct a model (in the second sense) to reason about the typing judgement, I may decide that single-step reduction rules are actually irrelevant; I only care that certain program equivalences hold, really, and any implementation that has those equivalences suffices. So rather than create a model (in the first sense) with an abstract machine or small-step operational semantics, I’ll specify an equivalence judgement. This might give less predictive power about a real world implementation, but allow the predictions I do make to apply to many implementations.

If you see these patterns, you may have some insight into how the author is approaching their work, and in what senses they are using the word “model”.

What is syntax?

2023-06-07T20:58:46Z

I’m in the middle of confronting my lack of knowledge about denotational semantics. One of the things that has confused me for so long about denotational semantics, which I didn’t even realize was confusing me, was the use of the word “syntax” (and, consequently, “semantics”).

For context, the contents of this note will be obvious to perhaps half of programming languages (PL) researchers. Perhaps half enter PL through math. That is not how I entered PL. I entered PL through software engineering. I was very interested in building beautiful software and systems; I still am. Until recently, I ran my own cloud infrastructure: mail, calendars, reminders, contacts, file syncing, remote git syncing. I still run some of it. I run secondary spam filtering over university email for people in my department, because our department’s email system is garbage. I am way better at building systems and writing software than math, but I’m interested in PL and logic and math nonetheless. Unfortunately, I lack a lot of background and constantly struggle with a huge part, perhaps half, of PL research. The most advanced math course I took was Calculus 1. (Well, I took a graduate recursion theory course too, but I think I passed that course because it was a grad course, not because I did well.)

So when I hear “syntax”, I think “oh sure. I know what that is. It’s the grammar of a programming language. The string, or more often the tree structure, used to represent the program text.” And that led me to misunderstand half of programming languages research.

The First Meaning of Syntax

Syntax has two meanings in programming languages, and both meanings can frequently be found in the same paper.

The first meaning is the one I gave above. I could give a definition of the syntax (in the first sense) of the lambda-calculus as follows.

e ::= x | (lambda (x) e) | (e e)

Ah. Beautiful syntax.

If we were following a standard text, such as Harper’s Practical Foundations for Programming Languages (2nd ed), we might next define the “semantics” of this “syntax”. We might define the “static semantics”, i.e., the type system or binding rules, then the “dynamic semantics”, i.e., the rules governing the evaluation behaviour of the syntax. For example, I might write the following small-step operational semantics.

((lambda (x) e) e') -> e[x := e']

Ah. Beautiful semantics.

Except, everything I wrote above, reduction rule included, is also syntax and not semantics.

Historical Interlude

The words “syntax” and “semantics” come from mathematical logic.

In that context, “syntax” describes sentences, statements, symbols, formulas, etc, without respect to any meaning. You can write down a logical formula, say "∀ X.P(X, A)" (where “A” is a logical constant, “X” is a variable, “P” is a predicate symbol), and it has no meaning; it’s mere syntax. It might be true, or might be false, depending on the interpretation of “P”, “A”, and "∀". I could say that it means “all leaves are green”, which would be false. A more relevant example for PL might be the syntax ((lambda (x) x+1) 2) = 3, which I would certainly like to be true, but it very much depends on what I mean. If + means string append as in JavaScript, then the statement is false since ''.concat(2, 1) = '21'. Wikipedia is a good start for trying to understand this history of the word “syntax”: https://en.wikipedia.org/wiki/Syntax_(logic)

By contrast, in that same context, “semantics” is the means by which syntax is given an interpretation. Perhaps the most widely used approach to providing an interpretation of syntax is model theory, which I never learned. In model theory, we start with a “syntax” (or “theory”). This theory is a collection of constants, function symbols, and predicate symbols. A model then is a map from the uninterpreted syntax to some interpretation that preserves relationships. I’ll say more of this in a later post, but for now, consider the following example. I might provide a model of our earlier example that interprets + as ''.concat and = as, say, ===. This preserves relationships, if all my constants are mapped to strings. Wikipedia is a good source for this history too: https://en.wikipedia.org/wiki/Semantics_of_logic.
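The point that one piece of syntax can be true or false depending on its interpretation can be sketched in JavaScript. This is an illustrative sketch, not a formal model: the tree encoding, the `interpret` function, and the two interpretation objects `numbers` and `strings` are all hypothetical names introduced here. The syntax ((lambda (x) x+1) 2) = 3 is kept as an uninterpreted tree, and each interpretation supplies its own meaning for constants, + and =.

```javascript
// Uninterpreted syntax for ((lambda (x) x+1) 2) = 3, as a plain tree.
const sentence = {
  tag: "eq",
  left: {
    tag: "app",
    fn: {
      tag: "lam",
      param: "x",
      body: {
        tag: "plus",
        left: { tag: "var", name: "x" },
        right: { tag: "lit", text: "1" },
      },
    },
    arg: { tag: "lit", text: "2" },
  },
  right: { tag: "lit", text: "3" },
};

// Walk the tree, deferring the meaning of constants, + and = to `interp`.
function interpret(term, interp, env = {}) {
  switch (term.tag) {
    case "lit":
      return interp.constant(term.text);
    case "var":
      return env[term.name];
    case "lam":
      return (v) => interpret(term.body, interp, { ...env, [term.param]: v });
    case "app":
      return interpret(term.fn, interp, env)(interpret(term.arg, interp, env));
    case "plus":
      return interp.plus(
        interpret(term.left, interp, env),
        interpret(term.right, interp, env)
      );
    case "eq":
      return interp.equals(
        interpret(term.left, interp, env),
        interpret(term.right, interp, env)
      );
    default:
      throw new Error("unknown syntax: " + term.tag);
  }
}

// Interpretation 1: constants are numbers, + is addition.
const numbers = {
  constant: (text) => Number(text),
  plus: (a, b) => a + b,
  equals: (a, b) => a === b,
};

// Interpretation 2: constants are strings, + is ''.concat, = is ===.
const strings = {
  constant: (text) => text,
  plus: (a, b) => "".concat(a, b),
  equals: (a, b) => a === b,
};

console.log(interpret(sentence, numbers)); // true
console.log(interpret(sentence, strings)); // false, since '21' !== '3'
```

The tree itself never changes; only the interpretation does, which is the sense in which the sentence is mere syntax until a semantics is chosen.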

When Semantics is the Syntax

What’s interesting about this history is how it was adopted in programming languages, and evolved in two different ways. On the one hand, a programming language grammar is syntax, in the sense of being uninterpreted statements. That syntax can be given a semantics, an interpretation, by using operational semantics (this is the sense in which operational semantics is a semantics). The operational semantics provides an interpretation to our grammar.

But, in another sense, the grammar, typing rules, and evaluation rules (the “syntax”, “static semantics”, and “dynamic semantics”) are mere syntax, in the older logical sense. They are a theory, in the model-theoretic sense. To see why, we must understand what the earlier example ((lambda (x) x+1) 2) = 3 means. Or in fact, realize that it doesn’t mean anything at all.

To write this down is to write down a proposition about the grammar: that one piece of the grammar is equal to another. Except I didn’t write a proposition that the two were equal. I wrote the uninterpreted proposition symbol =, the syntax =, next to two pieces of uninterpreted grammar, two other pieces of syntax. Every syntactic judgment about our grammar is itself syntax, in the model theoretic sense. At least, this is true if we follow the tradition of writing them down synthetically, axiomatically, about the grammar, as is done in standard programming languages textbooks such as Types and Programming Languages or Practical Foundations for Programming Languages.

In this view, the typing rules and reduction relations are syntax. This is bizarre from a software engineering perspective, but makes sense from the mathematical logic perspective.

With this perspective, it might make sense to call “operational semantics” “syntactic semantics”, or to imagine a tower of syntax and semantics where one level’s semantics become the next level’s syntax. This view finally helped me make sense of why we call “syntactic logical relations” syntactic, when they are clearly semantics. (A problem I danced around in my previous post on logical relations.)

This perspective is also useful, for two reasons. The first is that reasoning purely syntactically, while very general, prevents you from importing any other reasoning principles from any other domain. By viewing the typing system as syntax, and then building a model of it (and by necessity, the programming language terms) in, say, set theory, we can import all set-theoretic reasoning in our attempts to reason about our type system. But more than that, we can reinterpret the syntax freely, to prove general results. While I might have written a type system using syntax that looks like numbers, I could build a model that interprets that type system as over strings, and know that actually the entire system is safe for strings, too. Appropriately generalized, I wouldn’t need to do any additional proofs.

Unfortunately, this double meaning of the word syntax seems to be completely taken for granted by some. nLab is a good example of this. To quote from the introduction to the nLab model theory page:

On the one hand, there is syntax. On the other hand, there is semantics. Model theory is (roughly) about the relations between the two: model theory studies classes of models of theories, hence classes of “mathematical structures”.

What’s most interesting about this quote isn’t what it says, but what it links to. The link for “syntax” is to the page on the internal logic of a category. From the software perspective, this is not syntax, but semantics. How on earth could it be syntax? The link for “semantics” is to the page on structure, the idea of equipping a category with a particular functor. How on earth is that any more semantics than the original abstract nonsense version of syntax?

Before I understood “syntax”, I couldn’t make any sense of that, but now I’m beginning to understand. The internal logic of a category in some sense must be able to express the grammar of a language, and the judgments of a language, but in a purely syntactic way, in the same way that when I write down the grammar and typing rules of a language, I don’t refer to any interpretation of those symbols beyond the way I combine them on the page. Then the semantics or structure is the particular functor over that category, providing an interpretation, a semantics, of that original category (the syntax).

Anyway, now I think I’m ready to understand what a model is.

In What Sense is WebAssembly Memory Safe?

2023-05-19T02:35:56Z

I’ve been trying to understand the semantics of memory in WebAssembly, and realized that “memory safety” doesn’t mean what I expect in WebAssembly.

What is memory safety?

Here are some definitions.

Memory safety is a feature of programming languages that prevents certain types of memory-access bugs, such as out-of-bounds reads and writes, and use-after-free bugs. In an app that manages a list of to-do items, for example, an out-of-bounds read could involve accessing the nonexistent sixth item in a list of five, while a use-after-free bug could involve accessing one of the items on an already deleted to-do list.

https://spectrum.ieee.org/memory-safe-programming-languages

Memory safety is the state of being protected from various software bugs and security vulnerabilities when dealing with memory access, such as buffer overflows and dangling pointers. For example, Java is said to be memory-safe because its runtime error detection checks array bounds and pointer dereferences.

https://en.wikipedia.org/wiki/Memory_safety

Memory (un)safety in Wasm

WebAssembly (Wasm) is a language that guarantees “type safety … [preventing] invalid calls or illegal accesses to locals, … memory safety, and … inaccessibility of code addresses or the call stack”.

(Technically, the Wasm paper describes Wasm as a binary code format, that happens to be presented as a language.)

Formally, a whole Wasm program that type checks is guaranteed to either be a well-typed value, or take an evaluation step to a well-typed program, or evaluate to the well-known dynamic error “trap”.

This is in contrast to an unsafe language like C. A well-typed C program might take a step to a well-typed program, or it might evaluate to a value of arbitrary type or no type. For example, a well-typed program of type char that reads from a buffer might evaluate to a well-typed char, or it might evaluate to an arbitrary integer that does not correspond to any character because you were reading uninitialized memory.

For example, consider the following C program.

// unsafe.c
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
  char* buf = malloc(0);
  memcpy(buf, "Hello world\n", 12);
  write(1, buf, 12);
  return 0;
}

(compiled with clang -o unsafe.exe unsafe.c; run with ./unsafe.exe)

This program creates a buffer of size 0, writes “Hello world\n” to it, and tries to print that to standard out. The program printed “Hello world” when I ran it, but it’s undefined behaviour, so anything could happen. I tried writing a loop that malloc’d lots of memory and wrote arbitrary numbers, but never managed to crash the program. Still, it’s not memory safe.

The equivalent Wasm program is below.

;; safe.wat
(module
 (import "wasi_unstable" "fd_write" (func $fd_write (param i32 i32 i32 i32) (result i32)))

 (memory 0)
 ;;(memory 1)
 (export "memory" (memory 0))

 (data (i32.const 0) "Hello World\n")

 (func $main (export "_start")
       (i32.store (i32.const 12) (i32.const 0))
       (i32.store (i32.const 16) (i32.const 12))
       (call $fd_write (i32.const 1) (i32.const 12) (i32.const 1) (i32.const 20))
       drop))

(run with wasmtime safe.wat)

In this example, we create a string “Hello World\n” at address 0 in the module’s memory. We then create (encode) a new iovs just after it, starting at address 12, with a pointer to address 0 and length 12. Then we call fd_write, from the wasi API.

Unfortunately, we declared the memory size to be 0, so trying to allocate this string fails, traps safely, and the process exits with an error message.

So Wasm is memory safe, right?

Well, sort of, but there’s a pretty key distinction here.

In C, we are creating a new pointer with malloc. We are allocating a new data structure, then using it (unsafely).

In Wasm, there is exactly one memory for the entire module. Inside that memory, we encode 2 data structures: our string, and the iovs structure used by fd_write. All accesses to the global memory are safe. But not all accesses to the encoded data structures are.

Most applications will create data structures within the memory. That’s what our call to fd_write did. The two stores actually create an iovs structure in the global memory. We have no guarantees, within Wasm, about that data structure.

For example, here’s our Hello World program in Wasm which uses the memory safely and correctly, but creates an iovs whose length is claimed to be 100, larger than the actual string.

;; unsafe.wat
(module
 (import "wasi_unstable" "fd_write" (func $fd_write (param i32 i32 i32 i32) (result i32)))

 (memory 1)
 (export "memory" (memory 0))

 (data (i32.const 0) "Hello World\n")

 (func $main (export "_start")
       (i32.store (i32.const 12) (i32.const 0))
       (i32.store (i32.const 16) (i32.const 100))
       (call $fd_write (i32.const 1) (i32.const 12) (i32.const 1) (i32.const 20))
       drop))

(run with wasmtime unsafe.wat)

When I run this, I get “Hello World\nd” printed to stdout. I have no idea where that trailing d comes from, and it didn’t crash, suggesting it read uninitialized memory of some kind.
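One possible explanation for the trailing d, offered as a sketch rather than a confirmed diagnosis of what wasmtime does: Wasm linear memory starts out zero-filled and i32.store is little-endian, so the length field 100 written at address 16 is the single byte 0x64, which is ASCII 'd', and it sits inside the 100-byte range the iovs asks fd_write to print. The JavaScript below simulates the module’s memory with a plain ArrayBuffer standing in for one 64 KiB page. It assumes fd_write simply copies `len` bytes starting at `ptr`, and the variable names are hypothetical.

```javascript
// Stand-in for (memory 1): one 64 KiB page, zero-filled like Wasm memory.
const PAGE_SIZE = 65536;
const memory = new ArrayBuffer(PAGE_SIZE);
const bytes = new Uint8Array(memory);
const view = new DataView(memory);

// (data (i32.const 0) "Hello World\n") occupies addresses 0..11.
const text = "Hello World\n";
for (let i = 0; i < text.length; i++) {
  bytes[i] = text.charCodeAt(i);
}

// (i32.store (i32.const 12) (i32.const 0))   -> iovs pointer field
// (i32.store (i32.const 16) (i32.const 100)) -> iovs length field
view.setUint32(12, 0, true); // true = little-endian, as in Wasm
view.setUint32(16, 100, true);

// Assumed host behaviour: read the iovs at address 12, copy len bytes from ptr.
const ptr = view.getUint32(12, true);
const len = view.getUint32(16, true);
const written = bytes.slice(ptr, ptr + len);

// Drop the NUL bytes, which a terminal would typically not display.
const visible = Array.from(written)
  .filter((b) => b !== 0)
  .map((b) => String.fromCharCode(b))
  .join("");

console.log(JSON.stringify(visible)); // "Hello World\nd"
console.log(bytes[16], String.fromCharCode(bytes[16])); // 100 d
```

Under these assumptions every byte read is inside the module’s own zero-initialized memory; the read overruns the encoded string and runs into the encoded iovs.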

Arguably, this is cheating: Wasm does not and cannot make claims about external system functions, and wasi is unstable. But IMO the root of the error isn’t really about wasi.

Really, the root cause of this error is memory unsafety, but of a data structure encoded within a Wasm module. In a truly memory-safe language, if I try to access the 100th element of a 12-character long string, I get an error:

> racket
Welcome to Racket v8.9 [cs].
> (string-ref "Hello world\n" 100)
; string-ref: index is out of range
;   index: 100
;   valid range: [0, 11]
;   string: "Hello world\n"
; [,bt for context]

But that doesn’t happen in Wasm.

Wasm memory safety doesn’t apply to data structures implemented (encoded) within the memory. It only applies to the module’s memory, which is protected from other modules, even those running in the same process’s virtual address space.

This means Wasm modules are protected from each other, and so this kind of memory unsafety probably isn’t a security risk, only a cause of logic bugs.

In Wasm, data structures have to be encoded anyway, since Wasm doesn’t provide any kind of structured data primitives; you only have integers and some integers are interpreted as addresses into memory. But, when you encode such data structures in the memory and use them incorrectly, you have no guarantees about what happens. You could read some arbitrary data (from your own module), or read some uninitialized memory (from your own module). I.e., you get out-of-bounds reads and writes.

In another view of this, memory is the only data structure in Wasm, and it is memory safe. That’s all the language can be responsible for; if you go about encoding weird things inside that data structure, errors are likely. But this doesn’t seem like what people would expect when they hear “memory safe”. At least, it’s not what I expected at first.

What is logical relations?

2023-03-24T23:32:03Z

I have long struggled to understand what a logical relation is. This may come as a surprise, since I have used logical relations a bunch in my research, apparently successfully. I am not afraid to admit that despite that success, I didn’t really know what I was doing—I’m just good at pattern recognition and replication. I’m basically a machine learning algorithm.

So I finally decided to dive deep and figure it out: what is a logical relation?

As with my previous note on realizability, this is a copy of my personal notebook on the subject, which is NOT AUTHORITATIVE, but maybe it will help you.

Here’s my working definition of a logical relation:

1. A realizability semantic model,
2. built of predicates over syntax,
3. that reflects judgments and structures from semantics to syntax.

Point 1 is subtle; it implies that the logical relation is both a model, and a realizability semantics. Unfortunately, I still don’t know what a model is, so I’m going to have to work with the following probably wrong oversimplification: the logical relation must take (syntactically) equal terms to (semantically) equal terms. Which notion of syntactic equality though? I’m not sure, and I’m going to ignore it for now.

Point 2 is actually more specific than necessary. We don’t need predicates over syntax specifically, but really over some base model. It’s easier for me to think of this as “syntax”, though.

Point 3 is quite difficult to make precise without making a lot of this more precise in a mathematical framework. Jon Sterling gave me the following helpful definition:

A logical relation on a model M (viewed as a category) is then a model that is constructed in the following way:

Choose some functor R : M -> E where E is a sufficiently structured category (e.g. the category of sets, or something else!). The most basic example of a functor R is the “global sections functor” M -> Set, which sends every type in M to the set of closed elements of that type. This is exactly the usual “non-Kripke logical relations”; to get Kripke logical relations, you replace Set with a functor category (presheaf category) and choose a more interesting functor R.

Now define a new category G, as a category whose objects are pairs of an object A of M, together with a subobject of R(A). A morphism in G from (A,A’) to (B, B’) is given by a morphism (f : A -> B) that sends elements satisfying A’ to elements satisfying B’.

You have to show that the category G is actually a model of your language (e.g. show that it has function spaces, booleans, whatever). Doing so is the FTLR.

Note that there are some ways to generalize the situation above, but this is basically what logical relations are.

Point 3 is also more specific than necessary; “syntax” can be generalized to be “base model”.

Despite the complexity, we can see point 3 in action in some examples below.

What are logical relations, historically?

tait1967 - Intensional interpretations of functionals of finite type I

Logical relations are sometimes called “Tait’s Method”, dating back to Tait, as far as I can tell.

In this paper, Tait proves that System T with bar induction is a conservative extension of intuitionistic analysis U_1, which is intuitionistic arithmetic plus quantification over functions plus the axiom (schema) of choice plus bar induction. This conservative extension property is the semantic property of interest. The proof starts with a proof that T without bar induction is a conservative extension of just intuitionistic arithmetic (no choice or bar induction).

To do this, Tait develops a type-indexed predicate over System T terms (without bar induction), providing a U_0 term for all T terms of each type. These predicates M_t, C_t, and E_t are (I think) what we refer to as a logical relation. In particular, the C_t relation provides the interpretation of T values of type t, M_t seems to deal with variables, and E_t seems to be a binary relation defining semantics (“weak α-definitional equality”) of terms.

Theorem V (page 205) uses this logical relation to prove that, for all semantics values at the same type, (weak α-) definitional equality is decidable: they either are or are not related in E_t. This seems to be the key point: the definitional equality is reflected out of the semantics of terms, so it can apply to the syntax of terms.

This use of logical relation seems to also be a realizability semantics, since it assigns syntactic types to a collection of semantics terms, by induction over syntactic types, where the realizers are a subset of all possible semantics terms.

However, it seems to be more than a realizability semantics, too. What seems very important in this paper is that the semantics preserves structure, namely definitional equality. Perhaps implicitly though, other pieces are important. For example, T functions are interpreted as U functions, although it’s not clear to me that this is critical.

This is in contrast to Kleene’s (kleene1945) realizability, which did not seem concerned with structure, but only the existence of the realizers.

plotkin1973 - Lambda-definability and logical relations

Plotkin seems to be responsible for the name, and perhaps for rediscovering logical relations in the context of programming languages.

Plotkin helpfully gives us a definition of “logical”, as well, and it seems quite importantly related to part 3 of my working definition. Plotkin defines a relation R as logical if it is:

a subset of any D_k from the carrier of any D∞ model (this seems to correspond to “admissible relations” in modern logical relations parlance);
preserved by functions in D. That is, the relation holds on a function f in D_k iff for all arguments x, R(x) implies R(f x) (extended to the n-ary case for n-ary relations).

This suggests that it is important the logical relation is somehow interpreting syntactic structures as semantic structures, as in the case of Tait’s model interpreting syntactic functions as semantic functions. More generally, we likely want this property of all structures in the languages: syntactic pairs are interpreted as semantic pairs, etc. Jon’s category theoretical definition seems to generalize Plotkin’s definition nicely.
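To make the second condition concrete for myself, here is a minimal Python sketch over a finite carrier, where the “for all arguments” quantifier can be checked exhaustively. Everything named here is an assumption for illustration: the carrier `BASE`, the base relation `r_base`, and the dictionary encoding of functions are mine, the relation is unary rather than n-ary, and a four-element set is obviously not a D∞ model.

```python
from itertools import product

# Hypothetical finite carrier standing in for the base domain.
BASE = (0, 1, 2, 3)

def r_base(x):
    """Base relation (an arbitrary choice for illustration): x is even."""
    return x % 2 == 0

def all_functions(domain, codomain):
    """Enumerate every function domain -> codomain, encoded as a dict."""
    for outputs in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, outputs))

def r_arrow(f):
    """Lifted relation: f is related iff it sends related arguments to related results."""
    return all(r_base(f[x]) for x in BASE if r_base(x))

successor_mod4 = {x: (x + 1) % 4 for x in BASE}  # sends even numbers to odd numbers
double_mod4 = {x: (2 * x) % 4 for x in BASE}     # sends every number to an even number

# Only some of the functions on the carrier are in the lifted relation.
related = [f for f in all_functions(BASE, BASE) if r_arrow(f)]
```

Here `double_mod4` is in the lifted relation and `successor_mod4` is not, even though both are perfectly good functions on the carrier. The relation at function type is determined entirely by the relation at base type, which is the sense of “logical” I take from Plotkin.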

This denotational logical relation also shows us a logical relation that is not defined over syntax. Instead, it is a relation over some arbitrary non-trivial D∞ model. The author mentions that since they can interpret syntax in a D∞ model, they informally treat the logical relation as over syntax sometimes, which I suppose could be made formal easily enough.

How is “logical relations” used in PL?

ahmed2006 - Step-Indexed Syntactic Logical Relations for Recursive and Quantified Types

In this paper, Ahmed is concerned with syntactic logical relations for recursive and quantified types, in particular for reasoning about contextual equivalence. Likely due to Ahmed’s work, this kind of syntactic logical relation seems to be what most people mean or think when they say “logical relation”, although that may be changing.

The desired property of the logical relation then is that two related semantic terms should be contextually equivalent in the syntax. That is, the logical relation reflects (from semantics to syntax) equivalence.

Strangely (for a realizability model), this particular syntactic logical relation also reflects typing: semantic terms in the relation are also guaranteed to be well-typed in the syntax. In contrast, some uses of “logical relations” enable semantic terms to be syntactically ill-typed. Such logical relations might be better called realizability models, although they do reflect some structure, so perhaps reflecting typability is not a critical point of reflecting structure.

Ahmed in the introduction points out an interesting distinction: that logical relations can be either denotational or syntactic. Syntactic logical relations model syntax as sets of syntactic values such that some property holds over that syntax. They are useful for proving properties of the operational semantics directly. By contrast, denotational logical relations model syntax as denotational objects, such as, e.g., sets of set-theoretic functions over elements of a D∞ model in plotkin1973. This is useful for easily proving meta-theoretical properties by reflecting properties of the denotation into the syntax, but not necessarily for proving properties of the operational semantics directly.

For example, Tait uses a “denotational logical relation” into intuitionistic analysis to prove that definitional equality of System T is decidable—the definition of definitional equality, in the model, and its proof of decidability, are reflected back into the syntax; this requires no operational semantics at all. Plotkin uses a denotational logical relation, into domain theory, to show that certain λ-calculus constructs are or are not definable—existence of a term in the logical relation is reflected into the syntax as a definable expression. Neither of these is a syntactic logical relation; the semantic values never mention syntactic values directly.

Ahmed uses a “syntactic logical relation” to prove something about the operational semantics, namely, to prove contextual equivalence (an operational notion), indirectly. Direct proofs of contextual equivalence are difficult. So instead, a semantic proof of equivalence is reflected back into the syntax as a proof of contextual equivalence. This requires structuring the logical relation into a denotation of sets of syntactic terms that evaluate in the operational semantics, so that being in the relation tells us something about evaluation in the operational semantics, which tells us something about contextual equivalence.

abel2018 - Decidability of conversion for type theory in type theory

Abel et al. define a syntactic logical relation for typed, reducible (and equivalent) terms, to prove decidability of conversion for type theory. Here, the use of syntactic logical relation is important for proving a particular conversion algorithm over the syntax is decidable.

The interesting feature of this logical relation is the generalization from a model inductively defined over types, to inductively defined over judgments. This demonstrates a weakness in my working definition of logical relation and realizability, since I defined “realizability” in terms of models inductively defined over types.

timany2022 - A Logical Approach to Type Soundness

This paper is interesting because it uses a syntactic logical relation that intentionally does not reflect typing, as many syntactic logical relations do. Semantically valid terms are not necessarily syntactically valid. In other ways, it looks very much like a logical relation: syntactic pairs are semantic pairs, sums sums, functions functions, etc.

The key property this paper is interested in is type safety: all well-typed terms are well-defined in the operational semantics, i.e., they evaluate to values or well-defined errors or fail to terminate, but importantly, do not get stuck. “in the operational semantics” is important to understanding why this is a syntactic logical relation; it must model terms as sets of syntactic values to reason about the operational semantics given in the paper.

However, one could imagine proving a slightly different form of type safety with a denotational logical relation. Giving a logical relation into an arbitrary model with a well-defined notion of evaluation would be implicitly a proof of type safety: that there exists a model that is type safe. The ability to reflect from semantics to syntax provides a mechanism for constructing that evaluation over syntax. So while the denotational logical relation provides no direct proof about the operational semantics, it may provide a mechanism for a type-safe-by-construction operational semantics. (This reflecting evaluation out of the semantics seems very related to the idea of normalization-by-evaluation, but I’m not clear on this.)

What is realizability?

2022-10-05T21:54:39Z

I recently decided to confront the fact that I didn’t know what “realizability” meant. I see it in programming languages papers from time to time, and could see little rhyme or reason to how it was used. Any time I tried to look it up, I got some nonsense about constructive mathematics and Heyting arithmetic, which I also knew nothing about, and gave up.

This blog post is basically a copy of my personal notebook on the subject, which is NOT AUTHORITATIVE, but maybe it will help you.

My best understanding of realizability right now, in programming languages (PL) terms, is:

A technique for assigning each syntactic type to a collection of semantic terms;
By induction over syntactic types;
Where the semantic terms that are realizers—i.e., included in the collection related to some syntactic type—are a sub-collection of all possible terms in the semantic domain. That is, there are valid semantic terms not associated with any syntactic type.

I use the word “collection” rather than “set” to avoid invoking set theory.

Graphically, we can represent this as follows:

The point of the technique is that clause 2 gives us a proof technique by induction, and clause 3 means we can relate the collection of terms (or proofs) to some other well-known collection. This yields a proof technique for metatheoretic properties about the collection, such as that there are only terminating terms in the collection of realizers, or there are only recursive functions and therefore some classical things remain unprovable.
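The three clauses can be sketched in Python, with heavy caveats. This is my own toy, not any paper’s construction: the “semantic domain” is just arbitrary Python values, the types `Nat` and `Arrow` and the helpers `realizes` and `samples` are hypothetical names, and the function case only tests a finite sample of arguments, since quantifying over all realizers is not something a program can check.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nat:
    """Syntactic base type."""

@dataclass(frozen=True)
class Arrow:
    """Syntactic function type dom -> cod."""
    dom: object
    cod: object

# Assumption: a finite sample stands in for "all realizers of Nat".
NAT_SAMPLES = (0, 1, 2, 7)

def samples(ty):
    """Hypothetical helper: a few known realizers of `ty`, used to probe functions."""
    if isinstance(ty, Nat):
        return NAT_SAMPLES
    if isinstance(ty, Arrow):
        # One constant function per sample result; enough for a sketch.
        return tuple((lambda _arg, r=r: r) for r in samples(ty.cod))
    raise ValueError(f"unknown type: {ty!r}")

def realizes(ty, value):
    """Approximate whether `value` realizes `ty`, by induction on `ty` (clause 2)."""
    if isinstance(ty, Nat):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool) and value >= 0
    if isinstance(ty, Arrow):
        if not callable(value):
            return False
        # Realizers of the domain must be sent to realizers of the codomain.
        return all(realizes(ty.cod, value(arg)) for arg in samples(ty.dom))
    raise ValueError(f"unknown type: {ty!r}")

nat_to_nat = Arrow(Nat(), Nat())
print(realizes(nat_to_nat, lambda n: n + 1))   # a realizer
print(realizes(nat_to_nat, lambda n: str(n)))  # a valid value, but not a realizer
```

Clause 1 is the `realizes` relation itself, clause 2 is the recursion on `ty`, and clause 3 shows up in the second `print`: `lambda n: str(n)` is a perfectly good value in the semantic domain that does not realize `Nat -> Nat`.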

I’m not entirely sure that clause 2, induction, is necessary, and I can’t find anything explicit about clause 3, but they seem to be true historically and in many uses of the term.

Okay so how did I get to this understanding?

What is realizability, historically?

kleene1945 - On the Interpretation of Intuitionistic Number Theory

Realizability seems to come from Kleene’s paper “On the Interpretation of Intuitionistic Number Theory”. I say “seems to” as Kleene attributes the “detailed investigation of the notion of realizability” to David Nelson, attributes several of the results in the paper to Nelson, and claims that the main results of the paper are joint work with Nelson. But the paper only has Kleene’s name on it, and Kleene claims in the first footnote that they introduced the idea of realizability to Nelson in a seminar. So anyway, realizability seems to come from Kleene, and this is the canonical paper cited for the technique.

In this paper, realizability is quite specific. It’s a technique that takes an intuitionistic first-order logic formula about Peano arithmetic (Heyting arithmetic) and constructs a natural number from it, representing the (constructive) proof of that formula. Only provable formulas are realized. The point of this exercise is to prove various metatheorems about the realized language: whether it is consistent, and which intuitionistic formulae are provable or unprovable.

Intuitively, something is unprovable if there exists a formula, but there does not exist a realization of it. This can be shown by connecting the formula to the set of realizers (in this case, natural numbers), but showing that there cannot exist a related natural number (or, more often, function on natural numbers represented by its Gödel number) with the properties required of the realizability interpretation. The simplest example: since “false” is unprovable (it has no realization, by construction), the intuitionistic logic is consistent.

This also lets us prove something about the class of all provable statements. Since we have a method for constructing something from any provable (or true) statement, we can say something about the set of all provable statements in relation to the realizers. Kleene mentions one consequence is that the intuitionistic calculus cannot prove the existence of any function other than a general recursive function, since those are the only functions constructed in the realizability interpretation. This tells us, for example, that the intuitionistic calculus is different from classical set theory, which contains other functions.

An important detail in this paper that clarifies the distinction between the intuitionistic and the classical happens in Clause 6, on page 113. This is the definition of the realizability interpretation for existential quantification ∃x.A(x). This has a realization if, for some x, A(x) has a realization. It’s important to notice that this second “for some x” quantification happens in the metalanguage, namely, classical set theory, and therefore could be chosen by Choice. Kleene discusses this on page 118, where he uses the word “classically” as a modifier on various quantifiers to remind us that, when working with the quantification and realizers directly, we are working in a classical system in which intuitionistic proofs also exist.

What seems to be going on here is that the realizers are something like the intuitionistic subset of classical set theory. I think that statement isn’t exactly true; Kleene uses classical choice when working with the realizers to show there are unprovable theorems. For example, a realizer parameterized over (classically) all variables may not correspond to an intuitionistic formula. So it’s not that the realizers are only intuitionistic, I think. But any particular realizer is (must be)? The important point may be the realizers are a subset of the whole system, and thus we can prove interesting metatheorems that rely on distinguishing the realizers (and therefore, the formulae they realize) from all the things in the full system.

amadio1998 - Domains and Lambda-Calculi, Chapter 15

Chapter 15 of Amadio and Curien’s book “Domains and Lambda-Calculi” introduces realizability in its historical context. The introduction formalizes Kleene’s work as an example, and discusses its use.

They emphasize two things, which seem to confirm some of my understanding:

The realizability relation is defined inductively over formulas, and relates formulas to proofs.
The use lets us reason about all proofs in the system.

This is the best definition of realizability I’ve seen, and applies both to Kleene’s original and to uses in PL.

The authors point out that Kleene’s original goal was to prove consistency. They then confirm my above intuitions, that the realizability interpretation also lets us prove metatheorems about what is provable/unprovable in the realized system. However, they note that one application of this is to find unprovable true statements, which can be consistently axiomatized back into the original system. There are proofs in the set of realizers, i.e., true statements, that are never constructed by the realizability interpretation. These could be added back to the original system to enrich it.

This latter use seems to confirm one feature of realizability that isn’t explicitly stated anywhere, but seems to be true of all realizability interpretations I’ve seen: that the realizers are a strict subsystem of some larger formal system.

How is “realizability” used in PL?

In programming languages, we’re not often concerned with intuitionistic vs classical logic; we’re working constructively by default. In fact, many of the uses of “realizability” in PL don’t seem to be related to logic at all, but to modeling well-typed programs. And while, sure, these are related by Curry-Howard, the difference seems important to me. So what does realizability mean in this context?

In most uses in PL, the important feature seems to be clause 3 in my definition above: the collection of all values is larger than the set of realizers. In PL, this suggests that we’re ascribing types to “untyped” terms, and the realizers are those that are semantically well typed, but not necessarily syntactically well typed. The full collection contains also untyped terms, and we can therefore prove through realizability that the type system rules out ill-typed terms.

There do seem to be some examples in PL that are explicitly relating classical and intuitionistic ideas, namely those trying to import constructive interpretations of classical logic. I’m not really interested in those, and I think the connection to realizability is much more clear in those applications, so I’ll ignore that area.

Let’s look at some examples.

benton2010 - Realizability and Compositional Compiler Correctness for a Polymorphic Language

In “Realizability and Compositional Compiler Correctness for a Polymorphic Language”, Benton and Hur define a “realizability” interpretation of System F types realized by terms in a low-level language, for proving some compiler correctness properties. The terms realize the types, and this lets us talk about which low-level programs are valid to link with, without restricting the set of linkable programs to only those generated by the compiler.

This has lost all connection to intuitionistic vs classical logic, but I suppose it keeps the key features of the technique: types (formulas) of one language are realized by terms in another, and there is some concern that the realizers should be a subset of all terms. Not all low-level programs should be valid, but some set of them should be.

nakano2000 - A Modality for Recursion

“A Modality for Recursion” was actually the start of my realizability journey. This paper starts by defining a collection of models (β-models) of the untyped λ-calculus. It then defines the class of realizability models, in terms of β-models, for an extrinsically typed λ-calculus with equi-recursive types. A realizability model is parameterized by a β-model, and is a relation inductively defined over types to their realizers, which are values drawn from the β-model.

So why is this realizability? Well, I don’t see anything to do with intuitionistic vs classical. But, the set of all values is larger than the set of realizers, which seems to be important to all uses of “realizability”, and important for this result in particular. In this paper, this is used to show that the dot modality rules out some valid β-model terms, namely those that would correspond to non-terminating λ terms.

Later in the paper, they define a “realizability interpretation”. This seems to be distinct from the collection of all realizability models in that they pick a particular set of realizers? So, it ought to be a realizability model, I guess? But they don’t say so explicitly. The interpretation is still quite heavily parameterized, but it does seem to fix or restrict the set of realizers. Anyway, this interpretation includes all the features of my definition above: it’s inductively defined over types, relating types to (a semantic model of) untyped λ terms, for the purposes of proving something about the collection of realizers as they relate to the collection of all untyped λ terms.

The A Means A

2022-06-30T17:25:55Z

I have argued about the definition of “ANF” many times. I have looked at the history and origins, and studied the translation, and spoken to the authors. And yet people insist I’m “quacking” because I insist that “ANF” means “A-normal form”, where the “A” only means “A”.

Here, I write down the best version of my perspective so far, so I can just point people to it.

I want to answer three questions: what does the A mean, why does the A matter, and where does the A come from.

What does the A mean?

The “A” in “A-normal form” refers to a particular formal object, named “A” (not “administrative”), with respect to which there is a normal form with certain useful properties. This form is “A normal”—none of the A reductions apply to terms in this form—hence, A-normal form.

While it’s true that the history of ANF is concerned with “administrative reductions” in CPS, this is an informal concept, modeled by the formal object “A”.

In truth, “A” is several formal objects, defined somewhat differently in at least 3 different papers. Only one of these is arguably called “administrative”, but is about CPS, and not what we now call ANF.

“A” appears in “The Essence of Compiling with Continuations”, page 5. Under the discussion of the CPS, optimization, and un-CPS diagram, the authors observe that this diagram begs for a completion, some direct process, “A”, that simply normalizes a term within the same language. This diagram is reproduced below: $\begin{array}{ccc} e & \overset{CPS}{\to} & e' \\ \overset{A}{\downarrow} && \overset{\beta}{\downarrow} \\ e_A & \overset{unCPS}{\leftarrow} & e_O \end{array}$ They ask, what are some set of reductions, call this set A, such that normalizing with respect to A would produce a normal form, A-normal form, that characterizes the use of CPS in practice.

The same pattern appears in “Reasoning about Programs in Continuation-Passing Style”, page 1:

Thus, we refine this question as follows: Is there a set of axioms, A, that extend the call-by-value λ-calculus such that: …

The authors go on to define the set A, never naming it by some administrative reductions, but deriving A instead from the inverse CPS translation.

We could argue that Sabry’s thesis, Chapter 3, Section 1, “Administrative Source Reduction: The A-Reductions”, names the A-reductions “administrative”. He goes on to analyse those reductions considered to be the administrative ones, defining βlift and βflat in terms of CPS. He then defines in Definition 3.1, the administrative source reduction (A-reductions). However, these refer to reductions over CPS terms, and are distinct from the reductions considered for ANF. While they are the origin of ANF, they do not produce terms in what we now call ANF. A term in “administrative normal form” with respect to that set of reductions would actually be in CPS. That’s not what we mean when we say ANF; we mean normal with respect to the set A defined in “The Essence of Compiling with Continuations”.

Maintaining this distinction between the formal object A and the informal notion of administrative reductions is important for two reasons. First, it helps remind us that ANF is a form ultimately about normalizing a specific set of reductions, not the output of a particular translation, which is important in practice. Implementations often relax ANF until code generation, by omitting some of the A reductions, typically, A2 in “The Essence of Compiling with Continuations”—even that paper relaxes A2 in their implementation in the appendix, because A2 leads to exponential code duplication or requires object-language continuations (“join points”). It’s hard to even formally discuss this relaxation if we do not have the set of normalized reductions in mind. Second, the idea that ANF is free of “administrative” redexes is absurd, since the idea of the administrative redex is an informal concept: a reduction that isn’t really necessary but merely an artifact of the translation. It is easy to introduce such administrative redexes in ANF; e.g., let x = y in x contains an extra unnecessary ζ redex, but it is in ANF. It is, however, free of A reductions.

Why does the A matter?

I don’t actually care what the “A” means, or what the authors intended it to mean. I care that we think about ANF as a normal form, normal with respect to a specific set of reductions.

This most recent rant was triggered by a conversation with a reviewer, who, after observing that the “A” actually stood for “administrative”, asked whether our ANF translation could be decomposed into two translations, one that did everything but normalize the ifs (handling if is annoying in ANF, as it either requires being clever or causes code duplication), and then separately handle if.

The answer is completely obvious… if you think about ANF in terms of a normal form with respect to a set of reductions, and not as merely the output of some translation process, nor “CPS but like without administrative redexes”. Since ANF is a normal form with respect to “A”, we can easily decompose it into multiple normal forms, thus deriving several decomposed translations: remove the A reduction that normalizes if, and you get another normal form. Remove the rules that normalize if and nested let, and you get monadic form.

But all of this is much more complicated to explain if you think of ANF as a particular translation or particular syntactic form, and not a normal form with respect to the set A. And this seems to be very likely how you will think if you think A means administrative.
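As a rough illustration of “remove rules, get a weaker normal form”, here is a Python sketch of two predicates over a tiny AST. This is my simplification, not the grammar from “The Essence of Compiling with Continuations”: application is unary, there are no primitive operators, and the names `is_anf` and `is_monadic` are hypothetical. I characterize each form by the shape of terms that are normal, rather than implementing the reductions themselves.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

@dataclass(frozen=True)
class Let:
    name: str
    rhs: object
    body: object

@dataclass(frozen=True)
class If0:
    test: object
    then: object
    orelse: object

def is_value(e, body_ok):
    """Values are variables, constants, and lambdas whose bodies are normal."""
    if isinstance(e, (Var, Const)):
        return True
    return isinstance(e, Lam) and body_ok(e.body)

def is_simple_app(e, body_ok):
    """An application whose operator and operand are both values."""
    return isinstance(e, App) and is_value(e.fn, body_ok) and is_value(e.arg, body_ok)

def is_anf(e):
    """Full A-normal form: let right-hand sides are values or simple applications."""
    if is_value(e, is_anf) or is_simple_app(e, is_anf):
        return True
    if isinstance(e, Let):
        rhs_ok = is_value(e.rhs, is_anf) or is_simple_app(e.rhs, is_anf)
        return rhs_ok and is_anf(e.body)
    if isinstance(e, If0):
        return is_value(e.test, is_anf) and is_anf(e.then) and is_anf(e.orelse)
    return False

def is_monadic(e):
    """Relaxed form: nested let and if are allowed on a let right-hand side."""
    if is_value(e, is_monadic) or is_simple_app(e, is_monadic):
        return True
    if isinstance(e, Let):
        return is_monadic(e.rhs) and is_monadic(e.body)
    if isinstance(e, If0):
        return is_value(e.test, is_monadic) and is_monadic(e.then) and is_monadic(e.orelse)
    return False

redundant = Let("x", Var("y"), Var("x"))                      # let x = y in x
nested = Let("x", Let("y", Const(1), Var("y")), Var("x"))     # let x = (let y = 1 in y) in x
if_rhs = Let("x", If0(Var("c"), Const(1), Const(2)), Var("x"))
nested_app = App(Var("f"), App(Var("g"), Const(1)))           # f (g 1)
```

In this sketch `redundant` is in ANF despite the unnecessary binding, `nested` and `if_rhs` are monadic but not ANF, and `nested_app` is neither. The only difference between the two predicates is the `Let` case, which is the point: the forms differ by which rules we choose to normalize.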

Where does the A come from?

Incidentally, I spoke with Amr after he read this blog post. The origin of the “A” comes from a result by Curry, who proves some theorems about any combinatory logic extended by a set A of ground equations: https://staff.fnwi.uva.nl/p.h.rodenburg/Varia/RelCLlam.pdf

This led Matthias to ask Amr to create a set A, such that bla bla bla.

Amr admits he may have intended a pun between A and administrative, but doesn’t remember.

How I Redex---Experimenting with Languages in Redex

2019-10-06T19:45:13Z

Recently, I asked my research assistant, Paulette, to create a Redex model. She had never used Redex, so I pointed her to the usual tutorials:

While she was able to create the model from the tutorials, she was left with the question “what next?”. I realized that the existing tutorials and documentation for Redex do a good job of explaining how to implement a Redex model, but fail to communicate why and what one does with a Redex model.

I decided to write a tutorial that introduces Redex from the perspective I approach Redex while doing work on language models—a tool to experiment with language models. The tutorial was originally going to be a blog post, but it ended up quite a bit longer than is reasonable to see in a single page, so I’ve published it as a document here:

Experimenting with Languages in Redex

Untyped Programs Don't Exist

2018-01-19T23:37:01Z

Lately, I’ve been thinking about various (false) dichotomies, such as typed vs untyped programming and type systems vs program logics. In this blog post, I will argue that untyped programs don’t exist (although the statement will turn out to be trivial).

TLDR

All languages are typed, but may use different enforcement mechanisms (static checking, dynamic checking, no checking, or some combination). We should talk about how to use types in programming—e.g. tools for writing and enforcing invariants about programs—instead of talking about types and type checking as properties of languages.

1. TLDR
2. Some Context
3. Definitions
4. Is X a Typed Language?
5. But I Don't Get Type Errors in X!
6. Untyped Programs Don't Exist.
7. Conclusion
8. Related Reading

Some Context

In most of my academic work, I work with “typed” languages. These languages have some nice properties for the metatheorist and compiler writer. Types lend themselves to strong automated reasoning, automatically eliminate large classes of errors, and simplify the job of whoever is reasoning about the programs. The downside is that the programmer must essentially statically prove properties of their program in such a way that a machine can understand the theorem and check the proof.

When I’m hacking, I write in “untyped” languages. I write programs in Racket, scripts in bash, plugins and tools in JavaScript, papers in latex, build systems in Makefile, and so on. These languages lend themselves to experimentation and avoid the overhead of necessarily proving properties of the programs. The downside is that the computer cannot help the programmer, since the programmer has not communicated the invariants about the program in a way the computer can understand.

“But surely”, a type evangelist says, “the very same benefits of types for metatheory help one develop the program in the first place? Why do you hobble yourself by omitting types? Come join us in the land of light! Use types from the start!”

“Dear friend, I couldn’t agree more!”, I reply, "Types are invaluable to developing my programs, but your ‘typed’ languages prevent me from writing down my types!"

"Well, certainly there are some limitations of typed languages," the type evangelist concedes, "but we could also choose to ignore the type system, create the uni-type, and program in the error monad. Now we have the benefits of both worlds!"

"Don’t you see," I say excitedly, "that is just what I’m doing! My ‘untyped’ languages are, in fact, well-typed. My programs run implicitly in the error monad. What’s more, I am not required to prove it, for it is simply true."

A grave look comes over my interlocutor’s face. "But you forfeit all benefits of static typing. Your errors are reported later, and there are performance implications, and …"

"Exactly.", I interrupt. "We are not arguing about typing, for all programs are well typed. We are arguing about pragmatics."

Definitions

Before I can argue that untyped programs don’t exist, I need some near-formal definitions to work with. I posit the following definitions are reasonable, intuitive definitions about types and programs.

Definition. An expression is a symbol or sequence of symbols given some interpretation.

Example.

5 is an expression, whose interpretation is the number five.
e₁ + e₂ is an expression, whose interpretation is the mathematical addition function applied to expressions e₁ and e₂.
function(): return 5; is an expression, whose interpretation is a mathematical function that when applied to any number of arguments returns the expression 5.

Definition. A type is a statement of the invariants of some expressions.

Example.

a register word is a type describing the kinds of values that fit in an x86 register, such as a collection of 32 bits. A register word supports operations such as:
- move a value of type register word into a register
- move a value of type register word from one register into another
a pointer is a type describing a memory address. It is either uninitialized or a valid memory address. A pointer supports operations such as:
- initialization, giving an uninitialized pointer a value
- dereference, reading the value of the memory address of an initialized pointer
a Nat is a type describing an element of the set of natural numbers. A Nat supports operations such as:
- addition
- multiplication
- subtraction, but only when subtracting a smaller natural number from a larger natural number
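To make “a type is a statement of invariants” concrete, here is a small Python sketch of the Nat example with its invariant enforced dynamically. The class and its error messages are my own illustration, and the choice to raise `TypeError` is an assumption about how one might report a violated invariant, not a claim about any particular language.

```python
class Nat:
    """A natural number whose invariant (an integer >= 0) is checked dynamically."""

    def __init__(self, value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value < 0:
            raise TypeError(f"{value!r} is not a natural number")
        self.value = value

    def __add__(self, other):
        return Nat(self.value + other.value)

    def __mul__(self, other):
        return Nat(self.value * other.value)

    def __sub__(self, other):
        # Subtraction is only defined when the result stays in the naturals.
        if other.value > self.value:
            raise TypeError(f"cannot subtract {other.value} from {self.value} in Nat")
        return Nat(self.value - other.value)

    def __eq__(self, other):
        return isinstance(other, Nat) and self.value == other.value

    def __repr__(self):
        return f"Nat({self.value})"
```

With this, `Nat(5) - Nat(3)` evaluates to `Nat(2)`, while `Nat(3) - Nat(5)` raises a type error at run time. Nothing about the invariant is checked before the program runs, yet the invariant is still part of what it means to be a Nat.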

Definition. A language is a collection of expressions.

Example.

arith is a language containing the following expressions e:
- 0,1,2 …, etc and -1,-2,-3, …, etc, each representing an integer
- e₁ + e₂, where e₁ and e₂ are integers
- e₁ - e₂, where e₁ and e₂ are integers
JavaScript is a language, defined by the ECMAScript standard, and extended by various implementations.

Definition. A program is a collection of expressions from some language.

Example.

5 + 5 is an arith program.
5 is a JavaScript program.

Is X a Typed Language?

Is x86 assembly a typed language?

I say yes.

First, x86 assembly is a language. The language x86 assembly meets our definition of a language: it defines a collection of symbols or sequences of symbols given some interpretation. For example, mov ax, bx is an x86 assembly program that moves the contents of register bx to register ax.

Second, x86 is typed. Each expression in x86 assembly has invariants stated about the expression. For example, x86 defines the type “little endian”, which describes the particular encoding of binary data such as numbers over which operations like addition are defined. The division operation is well typed: division is only defined when the denominator is non-zero. Attempting to divide by zero causes a type error (a dynamic exception).

I would make the same argument for every other language. C is a typed language. So is JavaScript. And Racket. And Haskell.

But I Don’t Get Type Errors in X!

First, you probably do. Second, when you don’t, that’s a major problem.

Let’s visit x86 for a moment, to see dynamically enforced type errors. Recall that division is not defined when the denominator is zero. The result of division by zero in x86 is defined to be a divide-error exception (#DE), interrupt vector 0. That is a type error. It’s a type error describing that you attempted to divide by zero, and that this is ill-typed. It is a dynamically enforced type error.

Let’s move to C, in which we can easily see two different kinds of type errors: static and unenforced. The language C includes expressions like x=e, where x is a declared name and e is an expression. The expression x=e raises a static error when x is undeclared; this is a static type error. It is a statically enforced invariant that names must be declared before they are used. Other invariants are not enforced at all, such as notorious undefined behavior. For example, bool b; if(b) { ... }; violates a C invariant, namely that uninitialized scalars are never used. However, C does not attempt to enforce this invariant, either statically or dynamically. The result of this sequence of symbols is undefined in C.
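The difference between static and dynamic enforcement can be sketched with a toy language in Python. The tuple encoding and the functions `static_check`, `evaluate`, and `run` are hypothetical; they model neither x86 nor C, only the idea that one invariant (names are declared) is checked before running and another (denominators are non-zero) is checked while running. Unenforced invariants are not modeled, since the sketch checks everything it knows about.

```python
def static_check(expr, declared):
    """Statically enforce that every name is declared. Runs before evaluation."""
    kind = expr[0]
    if kind == "num":
        return
    if kind == "var":
        if expr[1] not in declared:
            raise NameError(f"static type error: {expr[1]!r} is undeclared")
        return
    if kind in ("add", "div"):
        static_check(expr[1], declared)
        static_check(expr[2], declared)
        return
    raise ValueError(f"not an expression: {expr!r}")

def evaluate(expr, env):
    """Evaluate, dynamically enforcing that denominators are non-zero."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if kind == "div":
        denominator = evaluate(expr[2], env)
        if denominator == 0:
            raise ZeroDivisionError("dynamic type error: division by zero")
        return evaluate(expr[1], env) // denominator
    raise ValueError(f"not an expression: {expr!r}")

def run(expr, env):
    """Check the static invariant for the whole program, then evaluate it."""
    static_check(expr, set(env))
    return evaluate(expr, env)

ok = ("div", ("num", 6), ("var", "x"))
both_wrong = ("add", ("div", ("num", 1), ("num", 0)), ("var", "y"))
```

Running `ok` with `x` bound to 3 gives 2. Running `both_wrong` in an empty environment reports the undeclared name and never reaches the division, because the static invariant is enforced for the whole program before any of it runs. Both errors are type errors in the sense of this post; they differ only in when they are enforced.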

Untyped Programs Don’t Exist

First, a few more definitions based on the above arguments about the languages x86 and C.

Definition. A type error is an error raised during the enforcement of a type, i.e., during the enforcement of an invariant about an expression.

Definition. Undefined behavior is the result of interpreting a non-expression, i.e., a sequence of symbols that have no meaning because some invariant has been violated.

Theorem. Untyped Programs Don’t Exist.

Proof. Recall that programs consist of expressions from a language. Expressions are sequences of symbols that have meaning. But undefined behavior only results from non-expressions. As programs are composed of expressions, a program cannot have undefined behavior. Therefore, all programs obey the invariants required by the expressions in the language. That is, all programs are well typed, and untyped programs don’t exist. QED.

I warned you it was a trivial theorem.

Conclusion

The theorem is trivial, but still useful because it helps us reframe our discussion.

Really, the statement is just a rephrasing of type safety: “well typed programs don’t go wrong”. For type safety, what we show is that programs exhibit only defined behavior. The difference is that type safety is typically thought of as a property of a language, and in particular, of statically typed languages. We should think about type safety differently: it is a property we must enforce of programs. Enforcing it via static typing of every program in the language is one useful way, but it is not the only way, and we cannot always hope to have type safety of a language.

Instead of arguing about untyped vs typed, a non-existent distinction, we should accept that all programs have invariants that must be obeyed, i.e., all programs are typed. The argument we must have is about the pragmatics of types and type checking.

how can we express types about complex languages like x86 and C
under what situations should we enforce types, i.e., check types
is type checking useful
should we check types statically or dynamically
should we allow the programmer to circumvent type checking
is type checking decidable
should it be

The Meaning of Types - From Intrinsic to Extrinsic Semantics (Reynolds 2000)

This paper proves the equivalence of an intrinsically typed language, in which meaning is only assigned to well-typed programs, and an extrinsically typed language, in which programs are first given meaning and can separately be ascribed types and proved to inhabit those types. In the extrinsic semantics, Reynolds treats all programs as existing in the universal domain, and uses embedding-projection pairs essentially as contracts at run-time, since, e.g., only a function can be called. In my mind, this work essentially proves the same theorem as this blog post: even when the semantics of programs considers typing as happening “after” semantics, the semantics still requires types.

Dynamic Languages are Static Languages (Harper 2011)

This blog post argues that dynamic languages are just straitjacketed versions of static languages, and therefore they aren’t really a separate class of languages. In many ways, I agree with this blog post. Because “dynamic” languages lack any static enforcement, they can be a hindrance when you do know how to encode the types you want, and they can lead to weird type-confusing programming patterns. My favorite example pattern is from Racket, where the value #f is sometimes used at type bool and sometimes at type Maybe A. This can lead to annoying problems with functions like findf over a list of bools. However, I think it ignores some of the pragmatics. For example, while sum types give you incredible expressive power, tagged sums are very annoying to use in many languages that enforce static typing, while very simple to use when you are not required to statically prove a term inhabits a sum.
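The findf problem is easy to reproduce. Below is a sketch in Python of a function following the same convention (return the first matching element, or a false value when there is none). It is my own stand-in and not Racket’s implementation; the name is_false is hypothetical.

```python
def findf(pred, items):
    """Return the first item satisfying pred, or False if there is none."""
    for item in items:
        if pred(item):
            return item
    return False


def is_false(b):
    return b is False


found = findf(is_false, [True, False])   # the element False was found
missing = findf(is_false, [True, True])  # no element satisfies the predicate

print(found, missing)
```

Both calls return False, so the caller cannot tell "found the boolean false" apart from "found nothing". That is the confusion between type bool and type Maybe A.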

On Typed, Untyped, and Uni-typed Languages (Tobin-Hochstadt 2014)

This blog post begins to get at some of the same criticisms of Harper’s view, and starts to talk about pragmatics.

What to Know Before Debating Type Systems (Smith 2010)

This blog post, reproduced in 2010 on a perl blog, does a great job of breaking down some false dichotomies and fallacies in discussions about type systems. It goes into more depth than this article about some distinctions in type systems, when they are meaningful and when they are not, and I pretty much agree with it.

The Calculi of Lambda-v-CS Conversion: A Syntactic Theory of Control and State in Imperative Higher-order Programming Languages (Felleisen 1987)

The abstract and chapter 1 of this dissertation have something to say about syntax and semantics, which I think are very related to the topic of this blog post. In particular, the thoughts on symbolic-syntactic reasoning I think are vital to understanding the trade-offs in different enforcements of typing.

What is Gradual Typing (Siek 2014)

This blog post discusses some trade-offs in static vs dynamic typing, in the context of gradual typing. To me, advancement in gradual typing is crucial in making typing enforcement more pragmatic. However, I disagree with some of the “good points” in this blog post. For example, the point “Dynamic type checking doesn’t get in your way” is a bad point to me; it’s also an argument in favor of no enforcement and undefined behavior. I also find some examples of gradual typing to be great evidence of what is wrong with gradual typing. For example, the program add1(true) at the end of the post should be refuted by a gradual type system, but passes current “plausibility checkers”, even when add1 has static type annotations requiring that its argument be a number.

The reviewers were right to reject my paper

2017-10-09T03:22:35Z

I submitted two papers to POPL 2018. The first, “Type-Preserving CPS Translation of Σ and Π Types is Not Not Possible”, was accepted. The second, “Correctly Closure-Converting Coq and Keeping the Types, Too” (draft unavailable), was rejected.

Initially, I was annoyed about the reviews. I’ve since reconsidered the reviews and my work, and think the reviewers were right: this paper needs more work.

In short and in my own words, the reviews criticized my work as follows:

The translation requires ad-hoc additions to the target language.
There is no proof of type soundness of the target language.
The work ignores the issue of computational relevance, compiling irrelevant things like functions in Prop.
The key insight is poorly explained, lost in the details of the Calculus of Inductive Constructions (CIC).

Initially, I thought that the reviews were unfair. I had worked out type-preserving closure conversion for much of CIC! We have an argument for why the target ought to be sound, but a formal proof would be too much. It took many dissertations to work out the soundness of CIC! As for computational relevance, well sure, we’re compiling too much, but we’re preserving all the information! Computational relevance is hard; one dissertation has been written on the subject and another is in the works. Figuring out computational relevance is important, but will be a separate project in itself! As for ad-hoc, well, I disagree, but maybe I communicated badly; that’s on me.

And that’s, essentially, what I wrote in my rebuttal. However, now I’m reconsidering my position.

But first, some context.

In this POPL submission, I developed a type-preserving closure conversion for CIC. An early version of this work was presented as a student research competition poster at POPL 2017, which you can find here. In this paper, I scaled that work from the Calculus of Constructions to CIC; I added inductive types, guarded recursion, the universe hierarchy, and Set vs Prop. To do that, I made some compromises. I decided not to formally prove soundness, but give an argument thusly: use types that can be encoded in CIC, give a syntactic guard condition that seems plausible, but might have minor bugs that need to be repaired (which is the pragmatic approach to termination taken by Coq). As mentioned before, proving CIC sound was quite a challenge, and I felt it unrealistic to try to prove this target language sound. I also ignored computational relevance for two reasons. First, I couldn’t find a great formal description of how to treat Type; there seems to be some kind of static analysis involved in giving it semantics via extraction. Second, after reading a lot about relevance, I think Set vs Prop is sort of the wrong way to encode it anyway, so I’d want to compile those into distinct concepts in the long run. So I decided to do CIC since it’s a more realistic source, but treat soundness of the target and relevance as future work.

To judge this work, we have to look at the type-preserving compilation literature. Since the reviews came out, I’ve been rereading the literature as I work on my thesis proposal, and talking to my committee; this helped put the reviews in a new context for me. The de-facto standard by which we judge type-preserving compilation work is “System F to Typed Assembly Language”. That work does not compile a realistic programming language; it compiles System F. Essentially it shows how to preserve one feature—parametric polymorphism—into a statically typed assembly language. And it took four of the best in our field to do that and do it “right”. While they do not handle a practical source language, they do handle a complicated type theoretic feature, preserve it through a realistic compiler to an assembly like language, and prove type soundness of that target language.

Judged by this standard, I can see the reviewers’ criticism as this: this paper was focusing on the wrong things. I am an academic, not an engineer. Instead of trying to handle all of CIC so that I have a practical source language, I should focus on compiling the new type theoretic feature—full spectrum dependent types—and doing that right. I should carve off the subset that I know how to do well, how to explain well, and how to prove correct. I should leave scaling to all the pragmatic features of CIC as future work, so that I have time to figure out how to do those features right.

So, thank you POPL anonymous reviewers for evaluating my work. You’ve given me a new perspective on my work and I think I know how to improve it.

What even is compiler correctness?

2017-03-24T21:41:13Z

In this post I precisely define common compiler correctness properties. Compiler correctness properties are often referred to by vague terms such as “correctness”, “compositional correctness”, “separate compilation”, “secure compilation”, and others. I make these definitions precise and discuss the key differences. I give examples of research papers and projects that develop compilers that satisfy each of these properties.

What is a Language

Our goal is to give a generic definition to compiler correctness properties without respect to a particular compiler, language, or class of languages. We first give a generic definition of a Language over which a generic Compiler can be defined.

A Language $\mathcal{L}$ is defined as follows. $\newcommand{\peqvsym}{\overset{P}{\simeq}} \newcommand{\ceqvsym}{\overset{C}{\simeq}} \newcommand{\leqvsym}{\overset{\gamma}{\simeq}} \newcommand{\ctxeqvsym}{\overset{ctx}{\simeq}} \newcommand{\neweqv}[3]{#2 \mathrel{#1} #3} \newcommand{\peqv}{\neweqv{\peqvsym}} \newcommand{\ceqv}{\neweqv{\ceqvsym}} \newcommand{\leqv}{\neweqv{\leqvsym}} \newcommand{\ctxeqv}{\neweqv{\ctxeqvsym}} \begin{array}{llcl} \text{Programs} & P \\ \text{Components} & C \\ \text{Linking Contexts} & \gamma \\ \text{Link Operation} & \gamma(C) & : & P \\ \text{Program Equivalence} & \peqvsym & : & P \to P \to Prop \\ \text{Linking Equivalence} & \leqvsym & : & \gamma \to \gamma \to Prop \\ \text{Component Equivalence} & \ceqvsym & : & C \to C \to Prop \\ \text{Observational Equivalence} & \ctxeqvsym & : & C \to C \to Prop \\ \end{array}$ where $\ctxeqvsym$ is the greatest compatible and adequate equivalence on Components.

A Language $\mathcal{L}$ has a notion of Programs $P$ . Programs can be evaluated to produce observations. Program Equivalence $\peqv{P_1}{P_2}$ defines when two Programs produce the same observations. A Language also has a notion of Components $C$ . Unlike Programs, Components cannot be evaluated, although they do have a notion of equivalence. However, we can produce a Program from a Component by linking. We Link by applying a Linking Context $\gamma$ to a Component $C$ , written $\gamma(C)$ . Linking Contexts can also be compared for equivalence using Linking Equivalence $\leqv{\gamma_1}{\gamma_2}$ . Observational Equivalence is a “best” notion of when two Components are related. Note, however, that a Language’s Observational Equivalence is completely determined by other aspects of the language. We are not free to pick this relation.

C is a Language; its definition is as follows. Let $P$ be any well-defined whole C program that defines a function main; such a program would produce a valid executable when compiled. Let $C$ be any well-defined C program that defines a function main, but requires external libraries to be linked either dynamically or statically. Such a component would produce a valid object file when compiled, but would not run without first being linked. Let $\gamma$ be directed graphs of C libraries with a C header file. Define the Link Operation by statically linking libraries at the C level. Define two Programs to be Program Equivalent when the programs both diverge, both raise the same error, or both terminate leaving the machine in the same state. Define two Linking Contexts to be Equivalent when they are exactly the same. Define two Components to be Component Equivalent when both are Program Equivalent after Linking with Linking Equivalent Linking Contexts.

Coq (or, CIC) is a Language; its definition is as follows. Let $P$ be any closed, well-typed Coq expression. Let $C$ be any open, well-typed Coq expression. Let $\gamma$ be maps from free variables to Programs of the right type. Define the Link Operation as substitution. Define Component Equivalence and Program Equivalence as definitional equality. Define Linking Equivalence by applying Program Equivalence pointwise to the co-domain of the maps.
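For readers who prefer code to tables, here is a minimal Python sketch of a Language in this style, instantiated with a toy language of my own invention (integer expressions with free variables, much smaller than C or Coq). As in the Coq example, linking is substitution. The class name Language, its fields, and the tuple encoding of expressions are all assumptions made for illustration; Component Equivalence is only approximated by testing against a finite sample of Linking Contexts.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Language:
    """A toy rendering of the Language record; field names are my own."""
    link: Callable            # (Linking Context, Component) -> Program
    program_equiv: Callable   # (Program, Program) -> bool

    def component_equiv(self, c1, c2, contexts):
        # Approximates Component Equivalence: link both Components with each
        # sampled Linking Context and compare the resulting Programs.
        return all(
            self.program_equiv(self.link(g, c1), self.link(g, c2))
            for g in contexts
        )


def substitute(gamma, expr):
    """Link Operation: replace each free variable with the Program gamma gives it."""
    if expr[0] == "num":
        return expr
    if expr[0] == "var":
        return gamma[expr[1]]
    return ("add", substitute(gamma, expr[1]), substitute(gamma, expr[2]))


def evaluate(program):
    """Evaluate a closed expression to its observation, an integer."""
    if program[0] == "num":
        return program[1]
    if program[0] == "add":
        return evaluate(program[1]) + evaluate(program[2])
    raise ValueError(f"free variable {program[1]!r}: not a Program")


toy = Language(
    link=substitute,
    program_equiv=lambda p, q: evaluate(p) == evaluate(q),
)

c1 = ("add", ("var", "x"), ("num", 1))   # Component: x + 1
c2 = ("add", ("num", 1), ("var", "x"))   # Component: 1 + x
gamma = {"x": ("num", 41)}               # Linking Context: x := 41

print(evaluate(toy.link(gamma, c1)))
print(toy.component_equiv(c1, c2, [gamma]))
```

Note that evaluate refuses Components: an expression with a free variable has no observation until it is linked, which matches the distinction between Programs and Components above.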

What is a Compiler

Using our generic definition of Language, we define a generic Compiler as follows.

$\newcommand{\newsteqvsym}[1]{_S\!\!#1_T} \newcommand{\psteqvsym}{\newsteqvsym{\peqvsym}} \newcommand{\lsteqvsym}{\newsteqvsym{\leqvsym}} \newcommand{\csteqvsym}{\newsteqvsym{\ceqvsym}} \newcommand{\psteqv}{\neweqv{\psteqvsym}} \newcommand{\csteqv}{\neweqv{\csteqvsym}} \newcommand{\lsteqv}{\neweqv{\lsteqvsym}} \begin{array}{llcl} \text{Source Language} & \mathcal{L}_S \\ \text{Target Language} & \mathcal{L}_T \\ \text{Program Translation} & \leadsto & : & P_S \to P_T \\ \text{Component Translation} & \leadsto & : & C_S \to C_T \\ \text{Cross-Language (S/T) Program Equivalence} & \psteqvsym \\ \text{S/T Linking Equivalence} & \lsteqvsym \\ \text{S/T Component Equivalence} & \csteqvsym \\ \end{array}$

Every Compiler has a source Language $\mathcal{L}_S$ and target Language $\mathcal{L}_T$ . We use the subscript $_S$ when referring to definitions from $\mathcal{L}_S$ and $_T$ when referring to definitions from $\mathcal{L}_T$ . Every Compiler defines a translation from $\mathcal{L}_S$ Programs to $\mathcal{L}_T$ Programs, and similarly a translation on Components. A Compiler also defines cross-language relations on Programs, Components, and Linking Contexts.

We can define a Compiler from C to x86 as follows. Let $\mathcal{L}_S$ be the Language for C defined earlier. Define a Language for x86 similarly. Let gcc be both the Program and Component Translation. Define S/T Program Equivalence as compiling the Source Language Program to x86, and comparing the machine states after running the x86 programs. Define S/T Linking Equivalence similarly to the definition given for the C Language. Define S/T Component Equivalence by linking with S/T Equivalent Linking Contexts and referring to S/T Program Equivalence.

We can define a Compiler from Coq to ML as follows. Let $\mathcal{L}_S$ be the Language for Coq defined earlier. Define a Language for ML similarly. Let the Coq-to-ML extractor be both the Program and Component Translation. Define Program Equivalence via a closed cross-language logical relation indexed by source types. Define Component Equivalence by picking related substitutions, closing the Components, and referring to the Program Equivalence. Define Linking Equivalence by applying Program Equivalence pointwise to the co-domain of the map.
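A much smaller Compiler can be sketched in Python. Everything here is a toy assumption of mine: the source Language is closed integer expressions, the target Language is code for a tiny stack machine, and S/T Program Equivalence simply compares the integer each side produces (far simpler than comparing machine states or using a logical relation). Only the Program Translation is shown.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Compiler:
    """A toy rendering of the Compiler record; field names are my own."""
    translate: Callable          # source Program -> target Program
    program_equiv_st: Callable   # (source Program, target Program) -> bool


def evaluate(expr):
    """Source semantics: evaluate a closed expression to an integer."""
    if expr[0] == "num":
        return expr[1]
    return evaluate(expr[1]) + evaluate(expr[2])


def compile_expr(expr):
    """Program Translation: expressions to stack-machine instructions."""
    if expr[0] == "num":
        return [("push", expr[1])]
    return compile_expr(expr[1]) + compile_expr(expr[2]) + [("add",)]


def run_stack(code):
    """Target semantics: run the instructions and return the top of the stack."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:  # ("add",)
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()


toy_compiler = Compiler(
    translate=compile_expr,
    program_equiv_st=lambda p_s, p_t: evaluate(p_s) == run_stack(p_t),
)

source = ("add", ("num", 40), ("num", 2))
target = toy_compiler.translate(source)
print(target)
print(toy_compiler.program_equiv_st(source, target))
```

The cross-language relation is part of the Compiler, not of either Language: it is the thing that tells us what "the same behavior" means across two different semantics.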

What Even is Compiler Correctness

Type Preservation

The simplest definition of compiler correctness is that we compile Programs to Programs, i.e., our compiler never produces garbage. A slightly less trivial definition is that we compile Components to Components. In the literature, these theorems are called “Type Preservation”. Typically, Type Preservation also connotes that the target language has a non-trivial type system and the compiler is obviously non-trivial.

Type Preservation (Programs)
$P_S \leadsto P_T$

Type Preservation (Components)
$C_S \leadsto C_T$

Type Preservation is only interesting when the source and target languages provide sound type systems that enforce sophisticated high-level abstractions. Even then, it still requires other properties or tests to ensure the compiler is non-trivial. For the user to understand the guarantees of Type Preservation, it is still necessary to understand the target language type system and the compiler.

A compiler that compiles every Program to 42 is type-preserving, in the trivial sense. By definition, the source Programs are Programs and 42 is a valid Program in many Languages. However, if you were to call such a compiler “Type Preserving”, the academic community may laugh at you.

A C-to-x86 compiler is type-preserving, in the trivial sense. Neither C nor x86 provide static guarantees worth mentioning. If you were to call such a compiler “Type Preserving”, the academic community may laugh at you.

The CompCert C-to-Mach compiler is type-preserving, in a weak but non-trivial sense. CompCert enforces a particular memory model and notion of memory safety, and preserves this specification through the compiler to a low-level machine independent language called Mach. The assembler is type-preserving in a trivial sense, since x86 provides no static guarantees to speak of.

The Coq-to-ML extractor is type-preserving, in a pretty trivial sense. As ML has a less expressive type system than Coq, and the extractor often makes use of casts, Type Preservation provides few guarantees for Components. For example, it is possible to cause a segfault by linking an extracted ML program with a stateful ML program.

The System F-to-TAL compiler is type-preserving in a strong sense. System F provides strong data hiding and security guarantees via parametric polymorphism. TAL provides parametric polymorphism and memory safety, allowing all of System F’s types to be preserved. Even so, type preservation could hold if we compile everything to a trivial TAL program, such as halt[int]. However, a quick look at the definition of the compiler or a small test suite is sufficient to convince us that the compiler is non-trivial, and thus Type Preservation is meaningful in this context.

Whole-Program Correctness

The next definition is what I would intuitively expect of all compilers (that are bug free). A source Program should be compiled to a “related” target Program. In the literature, this theorem is referred to as “Whole-Program Correctness” or “Semantics Preservation”. Note that any Whole-Program or Semantics Preserving Compiler is also trivially Type Preserving. Such a Compiler may also be Type Preserving in a non-trivial sense.

Whole-Program Correctness
If $P_S \leadsto P_T$ then $\psteqv{P_S}{P_T}$

A whole-program compiler provides no guarantees if we attempt to compile a Component and then link. Since many, arguably all, Programs are actually Components, Whole-Program Correctness is of limited use. A notable exception is in the domain of embedded systems. In this domain, writing a whole source program may be practical.
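Here is a small Python sketch of what checking Whole-Program Correctness looks like on a toy compiler. The languages (integer expressions and a tiny stack machine), the function names, and the sample programs are all my own assumptions. The check runs over a handful of samples, so it is a test and not a proof. It also contrasts an honest compiler with the compile-everything-to-42 compiler from the Type Preservation discussion.

```python
def evaluate(expr):
    """Source semantics: evaluate a closed expression to an integer."""
    if expr[0] == "num":
        return expr[1]
    return evaluate(expr[1]) + evaluate(expr[2])


def compile_expr(expr):
    """An honest translation to stack-machine instructions."""
    if expr[0] == "num":
        return [("push", expr[1])]
    return compile_expr(expr[1]) + compile_expr(expr[2]) + [("add",)]


def compile_to_42(expr):
    """The trivial compiler: every Program becomes the Program 42."""
    return [("push", 42)]


def run_stack(code):
    """Target semantics: run the instructions and return the top of the stack."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:  # ("add",)
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()


def whole_program_correct(compiler, programs):
    # If P_S compiles to P_T, then P_S and P_T must be cross-language
    # equivalent; here that means they produce the same integer.
    return all(evaluate(p) == run_stack(compiler(p)) for p in programs)


samples = [("num", 42), ("add", ("num", 1), ("num", 2))]
print(whole_program_correct(compile_expr, samples))
print(whole_program_correct(compile_to_42, samples))
```

Both compilers always emit a valid target Program, so both are type-preserving in the trivial sense, but only the first relates each source Program to its output.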

The CompCert C-to-Asm compiler is proven correct with respect to Whole-Program Correctness, with machine checked proofs. CompCert refers to this guarantee as “semantics preservation”. Prior versions of CompCert pointed out that, while it is possible to Link after compilation, “the formal guarantees of semantic preservation apply only to whole programs that have been compiled as a whole by CompCert C.” More recent versions lift this restriction, as we discuss shortly.

The CakeML CakeML-to-Asm compiler is proven correct with respect to Whole-Program Correctness, with machine checked proofs. CakeML is “a substantial subset of SML”. Asm is one of several machine languages: ARMv6, ARMv8, x86–64, MIPS–64, and RISC-V. The assemblers here, unlike in CompCert, are proven correct.

Compositional Correctness

Intuitively, Compositional Correctness is the next step from Whole-Program Correctness. Compositional Correctness should give us guarantees when we can compile a Component, then Link with a valid Linking Context in the target Language.

Compositional Correctness
If $C_S \leadsto C_T$ and $\lsteqv{\gamma_S}{\gamma_T}$ then $\psteqv{\gamma_S(C_S)}{\gamma_T(C_T)}$

To understand the guarantees of this theorem, it is necessary to understand how Linking Contexts are related between the source and target Languages. For instance, some compilers may allow linking with arbitrary target Linking Contexts. Some compilers may restrict linking to only Linking Contexts produced by the compiler.

The phrase “Compositional Correctness” usually connotes that the relation $\lsteqvsym$ is defined independently of the compiler—that is, it is used to mean there is a specification separate from the compiler for cross-language equivalent Linking Contexts. This supports more interoperability, since linking is permitted even with Linking Contexts produced from other compilers, or handwritten in the target language, as long as they can be related to source Linking Contexts.

Compositional Compiler Correctness
If $C_S \leadsto C_T$ and $\lsteqv{\gamma_S}{\gamma_T}$ then $\psteqv{\gamma_S(C_S)}{\gamma_T(C_T)}$ (where $\lsteqvsym$ is independent of $\leadsto$ )

The phrase “Separate Compilation” usually connotes that linking is only defined with Linking Contexts produced by the same compiler. That is, when $\lsteqv{\gamma_S}{\gamma_T} \iff \gamma_S \leadsto \gamma_T$ .

Correctness of Separate Compilation
If $C_S \leadsto C_T$ and $\gamma_S \leadsto \gamma_T$ then $\psteqv{\gamma_S(C_S)}{\gamma_T(C_T)}$
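The statement above can be exercised on a toy compiler. In this Python sketch (all names and encodings are my own assumptions) source Components are integer expressions with free variables, target Components are stack code with load instructions for those variables, and the target Linking Context is obtained by compiling the source Linking Context with the same compiler. The final comparison is a single test case, not a proof.

```python
def compile_expr(expr):
    """Component Translation: free variables become ("load", name) instructions."""
    if expr[0] == "num":
        return [("push", expr[1])]
    if expr[0] == "var":
        return [("load", expr[1])]
    return compile_expr(expr[1]) + compile_expr(expr[2]) + [("add",)]


def compile_context(gamma_s):
    """Translate a source Linking Context pointwise with the same compiler."""
    return {name: compile_expr(prog) for name, prog in gamma_s.items()}


def link_source(gamma_s, expr):
    """Source linking: substitute Programs for free variables."""
    if expr[0] == "num":
        return expr
    if expr[0] == "var":
        return gamma_s[expr[1]]
    return ("add", link_source(gamma_s, expr[1]), link_source(gamma_s, expr[2]))


def link_target(gamma_t, code):
    """Target linking: splice compiled code in place of each load instruction."""
    linked = []
    for instr in code:
        if instr[0] == "load":
            linked.extend(gamma_t[instr[1]])
        else:
            linked.append(instr)
    return linked


def evaluate(expr):
    if expr[0] == "num":
        return expr[1]
    return evaluate(expr[1]) + evaluate(expr[2])


def run_stack(code):
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:  # ("add",)
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()


component_s = ("add", ("var", "x"), ("num", 1))
gamma_s = {"x": ("add", ("num", 20), ("num", 21))}

source_result = evaluate(link_source(gamma_s, component_s))
target_result = run_stack(link_target(compile_context(gamma_s), compile_expr(component_s)))
print(source_result, target_result)
```

Compositional Compiler Correctness would ask for more: the target Linking Context could be any stack code related to gamma_s by a specification, including code that compile_expr would never produce.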

Some papers present a variant of “Semantics Preservation” stated of Components instead of Programs. Usually, this theorem implies Compositional Correctness. This requires a cross-language equivalence on Components, which is usually defined in terms of linking with S/T Equivalent Linking Contexts and observing S/T Equivalent Programs.

Semantics Preservation
If $C_S \leadsto C_T$ then $\csteqv{C_S}{C_T}$

Some papers define interoperability between source Components and target Linking Contexts, and between target Components and source Linking Contexts. This supports a broader notion of linking and thus a more widely applicable Compositional Correctness guarantee. However, it requires understanding the source/target interoperability semantics to understand the guarantee. There is no way to relate the resulting behaviors back to the source language, in general.

Open Compiler Correctness

If $C_S \leadsto C_T$ , then for all $\gamma_T$ , $\psteqv{\gamma_T(C_S)}{\gamma_T(C_T)}$
If $C_S \leadsto C_T$ , then for all $\gamma_S$ , $\psteqv{\gamma_S(C_S)}{\gamma_S(C_T)}$

Note that to satisfy Open Compiler Correctness, two new definitions of linking must be defined: one that links target Linking Contexts with source Components, and one that links source Linking Contexts with target Components.

Most compilers that generate machine code aim to be Compositional Compilers, since we should be able to link the output with any assembly, even that produced by another compiler. Compilers that target .NET VM, JVM, and LLVM are similar.

Some languages, like Coq and Racket, target special purpose VMs and aim only to be Separate Compilers.

Languages with FFIs can be thought to aim for a limited form of Open Compiler Correctness. For instance, we can link Java and C code for certain limited definitions of Linking Contexts. The full spirit of the theorem is limited to research projects, for now.

The Compositional CompCert compiler extends CompCert and its correctness proofs to guarantee Compositional Compiler Correctness. Linking is defined for any target Linking Context whose interaction semantics are related to a source Linking Context. The paper’s Corollary 2 titled “Compositional Compiler Correctness” is a generalized version of our theorem by the same name. They allow for compiling an arbitrary number of Components, then linking those with a Linking Context.

The SepCompCert compiler extends CompCert and its correctness proofs to guarantee Separate Compiler Correctness. This work notes that Separate Compiler Correctness is significantly easier to prove, increasing the proof size by 2%, compared to the 200% required in Compositional CompCert. This work was merged into CompCert as of version 2.7.

The Pilsner compiler is an MLish-to-Assemblyish compiler that guarantees Compositional Compiler Correctness. Linking is defined for any target Linking Context that is related to an MLish Component by a PILS.

Perconti and Ahmed develop a compiler from System F to a low-level typed IR that guarantees Open Compiler Correctness. In every Language, Linking is defined for any Linking Context in the source, intermediate, or target languages that has a compatible type. This paper defined a multi-language semantics in which all languages can interoperate.

Full Abstraction/Secure Compilation

Some properties of a program cannot be stated in terms of a single run of the program. We require more sophisticated compiler correctness theorems to show these properties are preserved. For instance, security properties, such as indistinguishability of a ciphertext from a random string, are relational properties. These can only be stated as a property between two Programs in the same Language.

Fully abstract compilers seek to prove compilers preserve these relational properties. Since security properties are often relational, these are sometimes called “Secure Compilers”. Oftentimes, we also want to reflect equivalence, which usually follows from Compositional Correctness. Full abstraction refers specifically to preserving and reflecting Observational Equivalence. Papers on this topic often focus on equivalence preservation, since equivalence reflection by itself usually follows from compiler correctness, and preservation is the direction of interest for stating security properties.

Compilers that guarantee these properties are limited to research projects, as there are many open problems to be solved. The key difficulty lies in the proofs of Equivalence Preservation, which essentially requires “decompiling” a target Program into a source Program.

Equivalence Preservation
If $\ceqv{C_S}{C'_S}$ and $C_S \leadsto C_T$ and $C'_S \leadsto C'_T$ then $\ceqv{C_T}{C'_T}$

Equivalence Reflection
If $\ceqv{C_T}{C'_T}$ and $C_S \leadsto C_T$ and $C'_S \leadsto C'_T$ then $\ceqv{C_S}{C'_S}$

Full Abstraction
Let $C_S \leadsto C_T$ and $C'_S \leadsto C'_T$ . $\ctxeqv{C_S}{C'_S}$ iff $\ctxeqv{C_T}{C'_T}$
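A toy example helps show how Equivalence Preservation, and so Full Abstraction, can fail. In this Python sketch (entirely my own construction) source Components are functions on booleans, and source contexts can only apply them to True or False. The compiler encodes booleans as integers, and target contexts may pass any integer. Checking both boolean inputs stands in for Observational Equivalence in this tiny source language; that is an assumption that only works because the language is so small.

```python
IDENTITY = ("var",)                       # source Component: x
DOUBLE_NEG = ("not", ("not", ("var",)))   # source Component: not (not x)


def eval_source(expr, x: bool) -> bool:
    """Source semantics of a Component applied to a boolean."""
    if expr[0] == "var":
        return x
    return not eval_source(expr[1], x)


def compile_component(expr):
    """Translate to a function on integers; booleans become 0 and 1."""
    if expr[0] == "var":
        return lambda n: n
    inner = compile_component(expr[1])
    return lambda n: 1 if inner(n) == 0 else 0


def source_equiv(c1, c2):
    # Source contexts can only supply the two booleans.
    return all(eval_source(c1, b) == eval_source(c2, b) for b in (False, True))


def distinguishing_inputs(t1, t2, inputs):
    # Target contexts can supply any integer, not just 0 and 1.
    return [n for n in inputs if t1(n) != t2(n)]


identity_t = compile_component(IDENTITY)
double_neg_t = compile_component(DOUBLE_NEG)

print(source_equiv(IDENTITY, DOUBLE_NEG))
print(distinguishing_inputs(identity_t, double_neg_t, [0, 1, 2]))
```

The two Components are equivalent in the source, but a target context that passes 2 tells their translations apart: one returns 2 and the other returns 1. Repairing this requires the compiler to defend against inputs no source context could produce, for example by checking that the argument is 0 or 1.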

Bowman and Ahmed develop an Equivalence Preserving and Reflecting compiler from The Core Calculus of Dependency (DCC) to System F. DCC guarantees certain security properties, which are preserved by encoding using parametric polymorphism. This compiler also satisfies Compositional Compiler Correctness, using a cross-language logical relation to define relatedness of Components between languages. This compiler is not Fully Abstract, as it does not define Contextual Equivalence. Instead, the compiler Preserves and Reflects the security property of interest.

Fournet et al. develop a Fully Abstract compiler from a language similar to a monomorphic subset of ML with exceptions to JavaScript. This paper demonstrates a key difficulty in Fully Abstract compilers. Often, the source language must be artificially constrained (in this case, by eliminating polymorphism and adding exceptions) in order to support back-translation.

Devriese et al. develop a Fully Abstract compiler from STLC to the untyped lambda calculus. This paper developed a key innovation in back-translation techniques, allowing a more expressive and less typed target language to be back-translated to a less expressive and more typed source language.

Toward Type-Preserving Compilation of Coq, at POPL17 SRC

2017-01-03T21:41:11Z

Almost two months ago, my colleagues in the Northeastern PRL wrote about three of our POPL 2017 Student Research Competition submissions. There was a fourth submission, but because I was hard at work completing proofs, it wasn’t announced.

Toward Type-Preserving Compilation of Coq
William J. Bowman
2016

A type-preserving compiler guarantees that a well-typed source program is compiled to a well-typed target program. Type-preserving compilation can support correctness guarantees about compilers, and optimizations in compiler intermediate languages (ILs). For instance, Morrisett et al. (1998) use type-preserving compilation from System F to a Typed Assembly Language (TAL) to guarantee absence of stuckness, even when linking with arbitrary (well-typed) TAL code. Tarditi et al. (1996) develop a compiler for ML that uses a typed IL for optimizations.

We develop type-preserving closure conversion for the Calculus of Constructions (CC). Typed closure conversion has been studied for simply-typed languages (Minamide1996, Ahmed2008, New2016) and polymorphic languages (Minamide1996, Morrisett1998). Dependent types introduce new challenges to both typed closure conversion in particular and to type preservation proofs in general.

ICFP 2016

2016-10-15T21:15:00Z

Full disclosure: This blog post is sponsored in part by ACM SIGPLAN. ACM SIGPLAN! Pushing the envelope of language abstractions for making programs better, faster, correcter, stronger.

TLDR

I went to ICFP again this year. I’m a frequent attendee. Last year I had a paper and gave a talk. This year I had a paper, but someone else gave the talk. But I also gave a talk at HOPE 2016. I met some people and saw some talks and pet a deer.

I’m a fifth year Ph.D. candidate studying compiler correctness, dependent types, and (functional) programming language abstractions. ICFP is my second home.

This year, I met some cool new researchers, several of whose names I’ve already forgotten (sorry new friends). I met Zoe Paraskevopoulou, who works on the CertiCoq project, a combination of my two favorite things: compiler correctness and dependent types. We talked a bit about this because I too have been looking at correctly compiling dependent types. I also met Éric Tanter, who works on, among many things, gradual typing and dependent types. He gave a talk on a method for verified interoperability with dependent types, which is related to certain kinds of compiler correctness problems that interest me, such as compositional compiler correctness and full abstraction. He’s also interested in Racket, so we spent some time discussing Cur.

I got some new ideas for Cur. David Christiansen’s talk on Elaborator Reflection: Extending Idris in Idris did a great job of motivating the problem and comparing meta-programming styles in proof assistants. The elaborator monad looks like a good abstraction for reasoning about certain kinds of extensions, and I need to figure out how to make it good for reasoning about the complex extensions possible in Cur. Jesper Cockx’s talk on Unifiers as Equivalences demonstrated ideas that might let me implement unification as a user defined extension in a proof-relevant way.

I saw most of the other talks, and a bunch of talks at the workshops. I have pages and pages of notes, and dozens of items in my TODO list to go and review papers and talks that I didn’t properly digest the first time. I hope I finish those by next year.

Post-ECOOP

2016-08-10T19:46:50Z

I returned from ECOOP a few weeks ago, and have been trying to figure out what I got out of the experience. I’ll focus on two big things.

For a long time I have been debating what I should do after I graduate, which I usually phrase as “industry vs academia”. I’m coming to understand this is a false dichotomy, as most dichotomies are. (It helps that a friend spelled it out for me.) Dave Herman’s talk, on starting and running a research lab doing academic-style work (e.g., developing a principled, safe programming language) in industry, helped me see that. Shriram’s summer school lectures were equally helpful, and sort of the dual of this: taking objects from industry—scripting languages—and applying academic rigor to them. ECOOP, more than any other conference I’ve been to, brought together industry and academia in a smooth spectrum. I wish I had attended as a younger student.

The other big thing was a crystallized version of thoughts I had on programming languages. Matthias Felleisen on Racket and Larry Wall on Perl 6 helped me see this: anything you might want to do to or in a program should be expressible in your programming language (Matthias said it better). This is what annoys me about languages like C, Java, and Coq. C has the preprocessor and make and the dynamic linker, etc. Java has Eclipse. Coq has OCaml plugins. All of these languages require doing “more” than writing programs, but have no way to express it in the language. Racket (and, apparently, Perl 6) pulls those things into the language so that those too become just writing programs: extend the reader, dynamically load a library, muck about with the top level, add new syntax.

I got a handful of smaller things: insights about what objects are best at, what a long-term (~25 year) research agenda looks like, an appreciation for the 99 different designs for any given program.

ECOOP was a great experience. If I go again, though, I hope the summer school won’t conflict with the entire research track.

ECOOP 2016

2016-07-15T20:18:06Z

Full disclosure: This blog post is sponsored and required by the National Science Foundation (NSF): The NSF! Funding SCIENCE! since 1683 or whenever.

TLDR

I’m going to ECOOP to see a part of the PL community I wouldn’t normally see, talk to people that I wouldn’t normally talk to, attend the co-located summer school, and figure out what I want to do with my (academic) life. If you want to know why I might do those things, read a little about me.

The long story

On Sunday, I am heading to ECOOP. I have never been to ECOOP, the conference is a little outside of my specialty, I do not know anyone there, and I do not even have a paper or talk at one of the workshops. However, a few weeks ago I ignored an email from one of the mailing lists that said there was some NSF funding that students should apply for. Then I saw an email from Jan Vitek on a local mailing list saying students should really apply for this funding and get to go to Rome.

“Huh”, I thought to myself, “I wonder what’s interesting in Rome”. I went to the ECOOP program and started looking around.

The Curry On program looks interesting. This co-located conference should help me understand how PL applies to industry problems. Unfortunately, I’m going to miss most or all of the first day. But the talk I’m most interested in is the final keynote, “Building an Open Source Research Lab”; hopefully this will give me some insights on this industry vs academia problem I have been struggling with.

There is also a summer school. While the history of typed and untyped languages looks fascinating, I’m going to have to skip part of it to learn about type specialization of JavaScript programs; I prove things on type-preserving compilation and I want to see more work that uses types for optimizations. Next up, the lecture on “Building a Research Program for Scripting Languages” should help me better understand what an academic career will look like, and give me some idea of how to be a good academic. Then I’m going to learn how to build a JIT compiler for free, because despite being a compilers expert, I don’t know anything about JIT compilers. Finally, I’m going to learn a little about experimental evaluation; I normally do theory and proofs, but I imagine one day I might need to measure something.

Unfortunately, the summer school is in parallel with most of the conference talks, so it’s going to be tough to decide how much of the summer school to miss in order to see new research.

“Yeah”, I thought after much consideration, “I guess there are some interesting things to see in Rome”. I’m a little concerned about the accommodations and venue though; I understand that a lot of the architecture in Rome is very old.

Conference talks reconsidered

2015-08-30T04:18:36Z

A couple weeks ago, I wrote that I was beginning to hate conference talks. The next morning, I woke up with 50+ Twitter notifications caused by people debating that point. I have reconsidered my views.

In my earlier post, I point out that the typical advice I hear is “The talk should be an ad for the paper”. After several discussions, I think this is bad advice. Lindsey Kuper and Chris Martens encouraged me to ignore it and instead make my talk a performance.

At first, I was unsure what this meant. In fact, I am still not quite sure what this means. What does it mean to perform a paper? But I followed it anyway.

Essentially I tried to communicate, at a high level, why I think this work is cool, and what parts of the work are most interesting. I tried to tell a story about what inspired this work, why I care about it, and what came out of it. I did not try to show many technical details; I showed only those necessary to tell the story of this work. I did not try to explain the particulars of all this work; I showed only those necessary to fit the work into the context of the story I wanted to tell.

I think the end result is actually an effective ad for the paper. However, by approaching the talk differently, I produced a much better talk (IMHO). And thankfully, I am not alone in that opinion. For example, I was very excited after my initial practice talk when Matthias called the talk “90% perfect”, in defiance of a NU PRL tradition of not dwelling on positive aspects and only giving constructive criticism after a practice talk.

A video of this talk is online here.

Conference talks

2015-08-08T22:24:06Z

I am beginning to hate conference talks. I am in the midst of writing a conference talk for my recently accepted paper. Although I have only given one conference talk thus far, I have attended several conferences and listened to many talks. These experiences have convinced me that conference talks are largely pointless.

I do not find conferences to be pointless. The papers are usually well written, if dense. The conferences themselves always lead to interesting conversations with clever people. I always return from a conference filled with creative energy. And, I admit, I like the excuse to travel to interesting locales.

However, the talks themselves are pointless. Most talks I have attended are terrible. Those that are not terrible I do not remember much of anyway, except that I should go read that paper. Of those talks, I would have made the same decision after reading the abstract for the paper. The talks add nothing because the talk slots are too short to communicate any technical material.

It is not entirely the fault of the speakers. For one, there is little incentive to give a good talk. If you give a good talk, then maybe you convince someone to read your paper, and maybe people remember who you are. This might be important if you are on the job market, but it does not matter for everyone else. Besides, most people will forget the talk in a month, good or bad.

Even if you are a perfectionist so incentive does not matter, it is not easy to craft a good talk. Conference papers are often complex and dense pieces of work. Frequently, the papers omit many details due to space, so completely understanding the work requires not only the paper but a technical appendix or code artifact published separately. Authors (usually (maybe only sometimes)) spend a great deal of time polishing these papers and supplementary materials to effectively communicate a complex and dense piece of work. The slot for the conference talk is 15 to 20 minutes, in which a speaker must fit a 12-page paper plus supplementary material?

“No! Obviously as a speaker you must not do that. The talk should be an advertisement for the paper. It should be an overview of the paper. It should communicate the key technical ideas and convince people to read the paper.”

What silly advice. I hate advertisements. Why should I sit through sessions and sessions of advertisements?

“No! Obviously as an audience member you must not do that. Just go read the abstracts and find the talks you want to attend. Skip the rest to have conversations with colleagues and authors.”

Okay, so the audience is going to read the abstract to convince them to see a talk that convinces them to read the paper of which they just read the abstract to convince them to see the talk that convinces them to read the paper? This is circular reasoning that wastes the time of both the speaker and the audience.

As a speaker and writer, I have already spent a lot of time and effort on the paper. I have crafted the abstract and introduction to communicate the key technical ideas and give an overview of the paper as precisely and concisely as possible. Shortly thereafter, I have carefully written the rest of the paper to effectively communicate the technical contributions in as much detail yet as concisely as page limits allow. Besides, I had to write them anyway to effectively communicate my research. Why should I reproduce these efforts in a short talk that must communicate less due to the nature of the talk and the audience?

As an audience member, if I want an overview of the paper, the abstract and introduction section provide this. The author already spent a great deal of time writing these sections, which communicate more thoughts in less time than the talk will. If I want more details, these sections are conveniently located with the rest of the paper. Besides, I need to read the abstract anyway to figure out which talks to attend and which papers to read. Why should I then sit through a talk that advertises a paper that I have already decided whether or not to read?

“Well the talks give an excuse and talking points around which we can organize a conference.”

Well why can’t we find a better excuse or better talking points? Why not give longer highly-technical talks that supplement the paper, or question-and-answer style talks for those who have read the paper and want more? Or why not make the papers more open ended so talks can be more speculative?

I do not know what should go in place of the current conference talks, but the current system seems utterly pointless and results in completely wasted effort.

Notes on "Ur: Statically-Typed Metaprogramming ..."

2015-02-14T17:22:11Z

Today I read Ur: Statically-Typed Metaprogramming with Type-level Record Computation. This paper presents the Ur language, a functional programming language based on an extension of System Fω. The novel idea is to use type-level functions as a form of type-safe meta-programming. The paper claims this novel idea enables safe heterogeneous and homogeneous meta-programming in Ur.

The interesting insight is that type-level computation may be valuable outside of dependently typed languages. The paper quickly and easily makes this case. The type-level computations reduce type annotations by enabling the programmer to compute types rather than manually write them everywhere. This could be a useful form of meta-programming in any typed language.
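To make "compute types rather than manually write them" concrete, here is a minimal sketch in TypeScript, whose mapped types are only a loose analogy to Ur's type-level record computation and not the paper's mechanism. The names `Renderers`, `renderRow`, and `Person` are hypothetical, made up for illustration.

```typescript
// A type-level function over records: for every field K of the row type R,
// compute the type of a renderer for that field. Nothing is written per field.
type Renderers<R> = { [K in keyof R]: (value: R[K]) => string };

// A generic function whose parameter types are computed from R.
function renderRow<R extends object>(renderers: Renderers<R>, row: R): string {
  const keys = Object.keys(row) as Array<keyof R>;
  const cells = keys.map((key) => `<td>${renderers[key](row[key])}</td>`);
  return `<tr>${cells.join("")}</tr>`;
}

// Hypothetical row type, used only for this example.
type Person = { name: string; age: number };

// The annotation Renderers<Person> is computed as
// { name: (value: string) => string; age: (value: number) => string },
// so the parameter types of the two lambdas below are inferred.
const personRenderers: Renderers<Person> = {
  name: (s) => s,
  age: (n) => n.toFixed(0),
};

const html = renderRow<Person>(personRenderers, { name: "Ada", age: 36 });
console.log(html);
```

Adding a field to `Person` makes `personRenderers` fail to type-check until a matching renderer is supplied, which is the kind of annotation-saving, type-directed checking the paragraph above describes.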

The claims about heterogeneous and homogeneous meta-programming seem overstated. Ignoring the novel ability to compute type annotations, type-safe heterogeneous programming could be as easily accomplished in any other type-safe language. I could just as easily (or more easily) write a program in Coq, ML, Haskell, or Typed Racket that generates HTML and SQL queries as I could in Ur. As for homogeneous meta-programming, restricting the meta-programs to record computations at the type-level seems to severely restrict the ability to generate code at compile-time and abstract over syntax, features which are provided by general-purpose meta-programming systems such as Racket’s macros or Template Haskell.

Beluga and explicit contexts

2014-09-11T00:56:44Z

In my recent work, I found it useful to pair a term and its context in order to more easily reason about weakening the context. At the prompting of a colleague, I’ve been reading about Beluga, [1] [2], and their support for programming with explicit contexts. The idea seems neat, but I’m not quite sure I understand the motivations or implications.

So it seems Beluga has support for describing what a context contains (schemas), describing in which context a type/term is valid, and referring to the variables in a context by name without explicitly worrying about alpha-renaming. This technique supports reasoning about binders with HOAS in more settings, such as in the presence of open data and dependent types. Since HOAS simplifies reasoning about binders by taking advantage of the underlying language’s implementation of substitutions, this can greatly simplify formalized meta-theory in the presence of advanced features which previously required formalizing binders using more complicated techniques like De Bruijn indices. By including weakening, meta-variables, and parameter variables, Beluga enables meta-theory proofs involving binders to be much more natural, i.e., closer to pen-and-paper proofs.
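As a rough illustration of why HOAS simplifies reasoning about binders compared to De Bruijn indices, here is a sketch in TypeScript. It shows only the general idea (substitution borrowed from the host language); it does not model Beluga's contexts, schemas, or meta-variables, and all names in it are hypothetical.

```typescript
// De Bruijn terms: variables are indices, so substitution must shift them by hand.
type DB =
  | { tag: "var"; index: number }
  | { tag: "lam"; body: DB }
  | { tag: "app"; fn: DB; arg: DB };

// Add d to every variable whose index is at least cutoff (the free ones).
function shift(d: number, cutoff: number, t: DB): DB {
  switch (t.tag) {
    case "var":
      return t.index >= cutoff ? { tag: "var", index: t.index + d } : t;
    case "lam":
      return { tag: "lam", body: shift(d, cutoff + 1, t.body) };
    case "app":
      return { tag: "app", fn: shift(d, cutoff, t.fn), arg: shift(d, cutoff, t.arg) };
  }
}

// Replace variable j in t with s, shifting s each time we go under a binder.
function subst(j: number, s: DB, t: DB): DB {
  switch (t.tag) {
    case "var":
      return t.index === j ? s : t;
    case "lam":
      return { tag: "lam", body: subst(j + 1, shift(1, 0, s), t.body) };
    case "app":
      return { tag: "app", fn: subst(j, s, t.fn), arg: subst(j, s, t.arg) };
  }
}

// Beta-reduce (lam body) applied to arg.
function betaDB(body: DB, arg: DB): DB {
  return shift(-1, 0, subst(0, shift(1, 0, arg), body));
}

// HOAS terms: a binder is a host-language function, so substitution is application.
type H =
  | { tag: "free"; name: string }
  | { tag: "lam"; body: (x: H) => H }
  | { tag: "app"; fn: H; arg: H };

function betaH(fn: H, arg: H): H {
  return fn.tag === "lam" ? fn.body(arg) : { tag: "app", fn, arg };
}

// (\x. \y. x) applied to a free variable, in both representations.
const dbResult = betaDB(
  { tag: "lam", body: { tag: "var", index: 1 } }, // body of \x, i.e. \y. x
  { tag: "var", index: 0 }                        // a free variable of the context
);
const constFn: H = { tag: "lam", body: (x) => ({ tag: "lam", body: (_y) => x }) };
const hoasResult = betaH(constFn, { tag: "free", name: "z" });
```

The De Bruijn half needs `shift` and `subst`, plus lemmas about how they interact in any formalization; the HOAS half needs neither, since capture-avoidance is inherited from the host language's functions.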

Obviously this is great for formalized meta-theory. While I have seen how HOAS can simplify life for the meta-theorist, and seen how it fails, I don’t fully understand the strengths and weaknesses of this work, or how it compares to techniques such as the locally nameless representation. I’m also not sure if there is more to this work than a better way to handle formalization of binding (which is a fine, useful accomplishment by itself).

If anyone can elaborate on or correct my understanding, please do.

FASTR

2013-03-13T07:05:00Z

FASTR is a bill to ensure all publicly funded research is open access. I urge you all to contact your congresspeople and demand they support this bill.

If you need help, the EFF has a page from which you can contact your congresspeople. You can use their template, or my template below that has been customized for researchers.

As your constituent, and as a university researcher, I am urging you to support the Fair Access to Science & Technology Research Act (FASTR is S. 350 in the Senate and H.R. 708 in the House).

As a researcher, I want my research distributed widely, to anyone who is willing to read it! We in the scientific community are often held to the whims of for-profit journals and publishing agents (agents we must publish through to advance our careers and to get our work seen in the field, due to the monopoly-like grip they have on what constitutes a high-quality publishing venue) who seek to maximize profit at the expense of taxpayer dollars and the advancement of knowledge.

This research is developed, written, reviewed, digitally typeset, and presented AT NO COST to these publishers, BY US RESEARCHERS, who are often funded with taxpayer dollars through public universities and government agencies like the National Science Foundation. Some venues, through obscene application of copyright, do not allow authors to provide digital copies of THEIR OWN WORK via their personal websites or other means of distribution.

As a result, students, researchers at less well-funded institutions, and citizens have difficulty accessing information they need; professors have a harder time reviewing and teaching the state of the art; cutting-edge research remains hidden.

FASTR helps fix this. The bill makes government agencies design and implement a plan to facilitate public access to the results of their investments. Any researcher who receives federal funding must submit a copy of resulting journal articles to the funding agency, which will then make that research widely available within six months.

Please secure our rights as taxpayers, and our rights as scientists, and promote the progress of science by supporting FASTR.
