λk.(k blog): Posts tagged 'notes'

What is a model?

2023-06-15T20:25:25Z

What is a model, particularly of a programming language? I’ve been struggling with this question a bit for some time. The word “model” is used a lot in my research area, and although I have successfully (by some metrics) read papers whose topic is models, used other people’s research on models, built models, and trained others to do all of this, I don’t really understand what a model is.

Before I get into a philosophical digression on what it even means to understand something, let’s ignore all that and try to discover what a model is from first principles.

Definitions of “model”

The obvious place to start to understand the meaning of a word is to read its definition. This is actually no help at all. There are lots of uses of the word “model”, with several definitions. Here are some.

Definition 0 In science and engineering, a model is “an abstract description of a concrete system using mathematical concepts and language”. Wikipedia provides a nice introduction to this kind of model, and the Stanford Encyclopedia of Philosophy provides a nice explanation in the context of model theory, which will be relevant later in this post.

Definition 1 A syntactic model (of a type theory) is defined by Boulier, Pédrot, and Tabareau as a translation from one type theory into another that preserves typing, the definition of false, and definitional equivalence. This syntactic model enables the source type theory to inherit properties of the target type theory—such as consistency.

Definition 2 A model (of a vocabulary, also called a language, $\sigma$) in the sense of model theory (as defined by Elements of Finite Model Theory) is a $\sigma$-structure (“also called a model”) defining a set $A$ along with three sets providing interpretations of that vocabulary. These sets are $Ic_A$, which interprets each constant in $\sigma$ as an element of $A$; $IP_A$, which interprets each n-ary predicate symbol or relation symbol from $\sigma$ as an n-ary (set-theoretic) relation between elements of $A$; and $If_A$, which interprets each n-ary function symbol in $\sigma$ as a (set-theoretic) function from n elements of $A$ to an element of $A$.

Definition 3 The above definition is confusing, since it conflates structure and model, which the text later distinguishes with the following separate definition. A model (of a theory (over a vocabulary $\sigma$)) is a structure (“also called a model”) of vocabulary $\sigma$ such that every sentence in the theory is interpreted in the structure to make the sentence true. (A theory is a set of sentences drawn from a vocabulary.) My rephrasing of the definition of model is intentionally confusing and difficult to parse, to make apparent the inherent confusingness created by the several layers of definitions and one definition that defines “model” using a second definition of “model”.
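The difference between Definitions 2 and 3 can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the formulation used in Elements of Finite Model Theory: the names `Structure`, `is_model`, the vocabulary (`zero`, `succ`, `le`), and the encoding of sentences as Python predicates over a structure are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass
class Structure:
    """A structure for a vocabulary: a carrier set plus three interpretations."""
    carrier: FrozenSet[int]
    constants: Dict[str, int]                          # constant symbol -> element
    predicates: Dict[str, FrozenSet[Tuple[int, ...]]]  # predicate symbol -> relation
    functions: Dict[str, Callable[..., int]]           # function symbol -> function


# Assumption: a sentence is encoded as a check that runs against a structure.
Sentence = Callable[[Structure], bool]


def zero_is_least(s: Structure) -> bool:
    # forall x. le(zero, x)
    zero = s.constants["zero"]
    return all((zero, x) in s.predicates["le"] for x in s.carrier)


def succ_is_above(s: Structure) -> bool:
    # forall x. le(x, succ(x))
    succ = s.functions["succ"]
    return all((x, succ(x)) in s.predicates["le"] for x in s.carrier)


def is_model(s: Structure, theory: List[Sentence]) -> bool:
    """A structure is a model of a theory if it makes every sentence true."""
    return all(sentence(s) for sentence in theory)


THEORY = [zero_is_least, succ_is_above]
CARRIER = frozenset({0, 1, 2})


def saturating_succ(x: int) -> int:
    # Stay inside the finite carrier so the function is total on it.
    return min(x + 1, 2)


# Interprets le as the usual ordering: a structure that is also a model.
standard = Structure(
    carrier=CARRIER,
    constants={"zero": 0},
    predicates={"le": frozenset((a, b) for a in CARRIER for b in CARRIER if a <= b)},
    functions={"succ": saturating_succ},
)

# Interprets le as the reversed ordering: still a structure, but not a model.
flipped = Structure(
    carrier=CARRIER,
    constants={"zero": 0},
    predicates={"le": frozenset((a, b) for a in CARRIER for b in CARRIER if a >= b)},
    functions={"succ": saturating_succ},
)
```

Both `standard` and `flipped` interpret every symbol of the vocabulary, so both are structures in the sense of Definition 2. Only `standard` makes both sentences of `THEORY` true, so only it is a model in the sense of Definition 3.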

Definition 4 Nlab hosts an article with a much clarified definition, which distinguishes language, theory, structure, and model carefully. In particular, it is careful to only call structure the interpretation of the language (called vocabulary above), and only call model an interpretation that makes true the axioms composing the theory of the language.

Definition 5 Carlo Anguli once gave me the following definition of model:

A collection of interpretation functions that interpret every syntactic category such that the original relationship is respected.

e.g.,
- interpret every context as a set,
- interpret every (non-dependent) type as a set, and
- interpret every term-of-a-type indexed-by-a-context as an element-of-the-interpretation-of-that-type indexed-by-elements-of-the-interpretation-of-that-context.

Implicit in this definition is that the interpretations must respect equality — because if you don’t respect equality of arguments then you’re not a function!

This definition seems to be close to Definition 2, as it doesn’t mention axioms and their interpretation. However, it might be Definition 3 instead, as there could be implicit in the definition of syntax an inclusion of all judgements of the programming language, and therefore in the phrase “such that the original relationship is respected” a requirement that axioms of those judgements become true.

Also implicit in this definition of model is what it is a model of. Perhaps a programming language, but it again depends on what syntax means and thus what “every syntactic category” refers to.

I’m interested in the requirement that the interpretation is a collection of functions, which seems to be missing or only implied in some model theory definitions of “model”.

So what is a model?

One of the first things that jumps out to me after reviewing the above definitions is that to understand each definition, you have to reframe the definition of model into model [of what]. It really never makes sense to give a definition of merely “model”.

Definition 0 defines a model of a system [of the real world]. Definition 1 defines a model of a type theory. Definitions 2 and 3 give definitions of a model, in the sense of model theory, but of two different objects: model of a vocabulary (or language), which is more often called a structure, and model of a theory (which everyone seems to agree is a “model”). Definition 4 makes this distinction very clear. Definition 5 seems to use “model” in the model-theoretic sense, but has abstracted a bit away from a particular notion of theory and generalized to syntax.

What is a model of a programming language?

I’ve had two problems understanding the word “model” in the context of programming languages.

First, we use “model” in three different senses, and I have neither understood that nor understood the relationships between them.

1. Model in the sense of an abstract description of a system. This is Definition 0. This sense of “model” means something like “mathematical description”. What we want is a description in which we can work using math, so we can make predictions about the real world. Ideally, the predictions we make will be true.
2. Model in the strict sense of model theory. These are Definitions 2, 3, and 4. This sense of “model” is the closest to having a strict definition. It often carries a set-theoretic connotation, asking for a set defining the domain of values, and three interpretation functions that interpret specific parts of a theory in specific ways.
3. Model in the generalized sense, inheriting from or related to model theory. I hesitate to even call this distinct from the second sense, but I will anyway. I also hesitate to speculate about history; perhaps this sense actually predates model theory. But I distinguish it from the second sense, because it frequently generalizes away from the strict three-category “constant symbol”, “predicate symbol”, “function symbol” specification and doesn’t seem beholden to set theory. Definitions 1 and 5 use “model” in this sense.

In the second sense of “model”, the first sense of the word remains—we’re still interested in a description of some system (the theory), and of using the model to make predictions or reason. However, since the theory is also mathematical, we can be more rigid about our reasoning requirements—axioms of the theory must be true of the model, and relationships must be preserved in the model. This is rarely true of a model of the real world; e.g., the Newtonian model of gravity works pretty well, until it doesn’t, so it’s a model that doesn’t quite make all axioms true or preserve all relationships.

The third sense seems closer to the idea of semantics, in the mathematical logic sense of the word as assigning meaning or interpretation to syntax. In this sense, the word “model” frequently avoids committing to set theory as a formal foundation, generalizes away from the three interpretation functions, and focuses instead on the relationships between uninterpreted syntax being preserved by the interpretation. For example, in Definition 1, the relationships of interest are well-typedness, definitional equivalence, and falsehood, and the formal foundation is type theory. Category theory seems to come closest to a complete formalization of this sense of the word “model”, although I’ve had a hell of a time understanding that. Nlab articles don’t say this explicitly, but reading between the lines in articles linked to from the Nlab article on model theory for the words syntax and semantics implies that the idea of syntax, i.e., uninterpreted symbols with relationships between themselves and judgements about them, can be formalized in category theory, and then so can the idea of semantics, i.e., providing an interpretation in some other domain of those uninterpreted symbols; a domain in which one can use all the power of the other domain to reason about the judgements one wishes to make about the uninterpreted symbols.

The second problem with the word “model” is that we frequently work with two senses simultaneously.

When I write down a programming language, I’m often trying to model (in the first sense) a real programming language (or some feature of it), one actual software developers use to make real things happen in the real world. I am not merely describing a mathematical object for study. (Okay, sometimes I do that, but usually to the first end, eventually.) When I write down such a model, I may describe the abstract syntax, the typing judgement, and an abstract machine or reduction rules. These form a pretty good mathematical description of how a real language behaves. A compiler will reject syntactically invalid expressions. It may then type check the abstract syntax tree, and reject some possibly semantically invalid expressions. If judged well typed, the compiler may transform the tree into something that runs, and that run-time behaviour can be predicted using the reduction rules.

However, for much programming languages work, I’m not interested in merely predicting the behaviour of a single program. I might want to predict behaviour or properties of the entire language, or its typing judgement, etc. To reason about single programs, the model (in the first sense) may work well. But it might not work well for, say, trying to decide whether certain types can even be inhabited. To solve this, we might build a model (in the second or third sense). We interpret the abstract syntax tree and typing judgement in some other domain. That is, the AST and the typing judgement, being a model in the first sense, form a theory in the model-theoretic sense. We can then construct a model (in the second sense) of a model (in the first sense). The Stanford Encyclopedia of Philosophy article on model theory goes into this in detail in the context of model theory, which is great.

What’s more interesting is how these two senses of model interact in programming languages. If one is interested in a model, in the second sense, it may inform how one develops a model (in the first sense). If I know I will want to construct a model (in the second sense) to reason about the typing judgement, I may decide that single-step reduction rules are actually irrelevant; I only care that certain program equivalences hold, really, and any implementation that has those equivalences suffices. So rather than create a model (in the first sense) with an abstract machine or small-step operational semantics, I’ll specify an equivalence judgement. This might give less predictive power about a real world implementation, but allow the predictions I do make to apply to many implementations.

If you see these patterns, you may have some insight into how the author is approaching their work, and in what senses they are using the word “model”.

What is syntax?

2023-06-07T20:58:46Z

I’m in the middle of confronting my lack of knowledge about denotational semantics. One of the things that has confused me for so long about denotational semantics, which I didn’t even realize was confusing me, was the use of the word “syntax” (and, consequently, “semantics”).

For context, the contents of this note will be obvious to perhaps half of programming languages (PL) researchers. Perhaps half enter PL through math. That is not how I entered PL. I entered PL through software engineering. I was very interested in building beautiful software and systems; I still am. Until recently, I ran my own cloud infrastructure: mail, calendars, reminders, contacts, file syncing, remote git syncing. I still run some of it. I run secondary spam filtering over university email for people in my department, because our department’s email system is garbage. I am way better at building systems and writing software than math, but I’m interested in PL and logic and math nonetheless. Unfortunately, I lack a lot of background and constantly struggle with a huge part, perhaps half, of PL research. The most advanced math course I took was Calculus 1. (Well, I took a graduate recursion theory course too, but I think I passed that course because it was a grad course, not because I did well.)

So when I hear “syntax”, I think “oh sure. I know what that is. It’s the grammar of a programming language. The string, or more often the tree structure, used to represent the program text.” And that led me to misunderstand half of programming languages research.

The First Meaning of Syntax

Syntax has two meanings in programming languages, and both meanings can frequently be found in the same paper.

The first meaning is the one I gave above. I could give a definition of the syntax (in the first sense) of the lambda-calculus as follows.

e ::= x | (lambda (x) e) | (e e)

Ah. Beautiful syntax.

If we were following a standard text, such as Harper’s Practical Foundations for Programming Languages (2nd ed), we might next define the “semantics” of this “syntax”. We might define the “static semantics”, i.e., the type system or binding rules, then the “dynamic semantics”, i.e., the rules governing the evaluation behaviour of the syntax. For example, I might write the following small-step operational semantics.

((lambda (x) e) e') -> e[x := e']

Ah. Beautiful semantics.

Except, everything I wrote above, reduction rule included, is also syntax and not semantics.
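To see how little is going on in the grammar and the reduction rule above, here is a minimal Python sketch of both. It is an illustration only: the class names `Var`, `Lam`, `App` and the helpers `subst` and `step` are hypothetical, the substitution assumes the argument is a closed term (so variable capture cannot occur), and `step` applies only the single rule shown above at the root of a term, with no congruence rules.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


# e ::= x | (lambda (x) e) | (e e)
Term = Union[Var, Lam, App]


def subst(e: Term, x: str, replacement: Term) -> Term:
    """Compute e[x := replacement], assuming replacement is closed."""
    if isinstance(e, Var):
        return replacement if e.name == x else e
    if isinstance(e, Lam):
        if e.param == x:
            # The binder shadows x, so there is nothing to replace below it.
            return e
        return Lam(e.param, subst(e.body, x, replacement))
    return App(subst(e.fn, x, replacement), subst(e.arg, x, replacement))


def step(e: Term) -> Optional[Term]:
    """((lambda (x) e) e') -> e[x := e'] at the root, or None if no match."""
    if isinstance(e, App) and isinstance(e.fn, Lam):
        return subst(e.fn.body, e.fn.param, e.arg)
    return None


identity = Lam("x", Var("x"))
example = App(identity, identity)
```

Note that `step` only rearranges trees of symbols. Nothing in it says what a `Lam` means, which is the sense in which the reduction rule is still syntax.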

Historical Interlude

The words “syntax” and “semantics” come from mathematical logic.

In that context, “syntax” describes sentences, statements, symbols, formulas, etc, without respect to any meaning. You can write down a logical formula, say, "∀ X.P(X, A)" (where “A” is a logical constant, “X” is a variable, “P” is a proposition), and it has no meaning; it’s mere syntax. It might be true, or might be false, depending on the interpretation of “P”, “A”, and "∀". I could say that it means “all leaves are green”, which would be false. A more relevant example for PL might be the syntax ((lambda (x) x+1) 2) = 3, which I would certainly like to be true, but it very much depends on what I mean. If + means string append as in JavaScript, then the statement is false since ''.concat(1, 2) = '12'. Wikipedia is a good start for trying to understand this history of the word “syntax”: https://en.wikipedia.org/wiki/Syntax_(logic)

By contrast, in that same context, “semantics” is the means by which syntax is given an interpretation. Perhaps the most widely used approach to providing an interpretation of syntax is model theory, which I never learned. In model theory, we start with a “syntax” (or “theory”). This theory is a collection of constants, function symbols, and predicate symbols. A model then is a map from the uninterpreted syntax to some interpretation that preserves relationships. I’ll say more of this in a later post, but for now, consider the following example. I might provide a model of our earlier example that interprets + as ''.concat, and = is mapped to, say ===. This preserves relationships, if all my constants are mapped to strings. Wikipedia is a good source for this history too: https://en.wikipedia.org/wiki/Semantics_of_logic.
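The idea that one piece of syntax can be true under one interpretation and false under another can be sketched in a few lines of Python. This is a toy, not a definition from model theory: the tuple encoding of the syntax, the `Interpretation` record, and the `interpret` function are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Interpretation:
    """Meanings assigned to the otherwise uninterpreted symbols."""
    const: Callable[[str], Any]      # meaning of constant symbols such as 1, 2, 3
    plus: Callable[[Any, Any], Any]  # meaning of the symbol +
    eq: Callable[[Any, Any], bool]   # meaning of the symbol =


def interpret(term, interp, env=None):
    """Map a syntax tree to a value, using interp for every symbol."""
    env = {} if env is None else env
    tag = term[0]
    if tag == "const":
        return interp.const(term[1])
    if tag == "var":
        return env[term[1]]
    if tag == "lam":
        _, param, body = term
        return lambda value: interpret(body, interp, {**env, param: value})
    if tag == "app":
        return interpret(term[1], interp, env)(interpret(term[2], interp, env))
    if tag in ("plus", "eq"):
        left = interpret(term[1], interp, env)
        right = interpret(term[2], interp, env)
        return interp.plus(left, right) if tag == "plus" else interp.eq(left, right)
    raise ValueError(f"unknown syntax: {tag}")


# The uninterpreted syntax ((lambda (x) x+1) 2) = 3
SENTENCE = (
    "eq",
    ("app", ("lam", "x", ("plus", ("var", "x"), ("const", "1"))), ("const", "2")),
    ("const", "3"),
)

# Constants are numbers, + is addition, = is numeric equality.
arithmetic = Interpretation(const=int, plus=lambda a, b: a + b, eq=lambda a, b: a == b)

# Constants are strings, + is string append, = is string equality.
strings = Interpretation(const=str, plus=lambda a, b: a + b, eq=lambda a, b: a == b)
```

Under `arithmetic` the sentence comes out true, and under `strings` it comes out false, even though `SENTENCE` itself never changes.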

When Semantics is the Syntax

What’s interesting about this history is how it was adopted in programming languages, and evolved in two different ways. On the one hand, a programming language grammar is syntax, in the sense of being uninterpreted statements. That syntax can be given a semantics, an interpretation, by using operational semantics (this is the sense in which operational semantics is a semantics). The operational semantics provides an interpretation to our grammar.

But, in another sense, the grammar, typing rules, and evaluation rules (the “syntax”, “static semantics”, and “dynamic semantics”) are mere syntax, in the older logical sense. They are a theory, in the model-theoretic sense. To see why, we must understand what the earlier example ((lambda (x) x+1) 2) = 3 means. Or in fact, realize that it doesn’t mean anything at all.

To write this down is to write down a proposition about the grammar: that one piece of the grammar is equal to another. Except I didn’t write a proposition that the two were equal. I wrote the uninterpreted proposition symbol =, the syntax =, next to two pieces of uninterpreted grammar, two other pieces of syntax. Every syntactic judgment about our grammar is itself syntax, in the model theoretic sense. At least, this is true if we follow the tradition of writing them down synthetically, axiomatically, about the grammar, as is done in standard programming languages textbooks such as Types and Programming Languages or Practical Foundations for Programming Languages.

In this view, the typing rules and reduction relations are syntax. This is a bizarre view from a software engineering perspective, but makes sense from the mathematical logic perspective.

With this perspective, it might make sense to call “operational semantics” “syntactic semantics”, or to imagine a tower of syntax and semantics where one level’s semantics become the next level’s syntax. This view finally helped me make sense of why we call “syntactic logical relations” syntactic, when they are clearly semantics. (A problem I danced around in my previous post on logical relations.)

This perspective is also useful, for two reasons. The first is that reasoning purely syntactically, while very general, prevents you from importing any other reasoning principles from any other domain. By viewing the typing system as syntax, and then building a model of it (and by necessity, the programming language terms) in, say, set theory, we can import all set-theoretic reasoning in our attempts to reason about our type system. But more than that, we can reinterpret the syntax freely, to prove general results. While I might have written a type system using syntax that looks like numbers, I could build a model that interprets that type system as over strings, and know that actually the entire system is safe for strings, too. Appropriately generalized, I wouldn’t need to do any additional proofs.

Unfortunately, this double meaning of the word syntax seems to be completely taken for granted by some. nLab is a good example of this. To quote from the introduction to the nLab model theory page:

On the one hand, there is syntax. On the other hand, there is semantics. Model theory is (roughly) about the relations between the two: model theory studies classes of models of theories, hence classes of “mathematical structures”.

What’s most interesting about this quote isn’t what it says, but what it links to. The link for “syntax” is to the page on the internal logic of a category. From the software perspective, this is not syntax, but semantics. How on earth could it be syntax? The link for “semantics” is to the page on structure, the idea of equipping a category with a particular functor. How on earth is that any more semantics than the original abstract nonsense version of syntax?

Before I understood “syntax”, I couldn’t make any sense of that, but now I’m beginning to understand. The internal logic of a category in some sense must be able to express the grammar of a language, and the judgments of a language, but in a purely syntactic way, in the same way that when I write down the grammar and typing rules of a language, I don’t refer to any interpretation of those symbols beyond the way I combine them on the page. Then the semantics or structure is the particular functor over that category, providing an interpretation, a semantics, of that original category (the syntax).

Anyway, now I think I’m ready to understand what a model is.

What is logical relations?

2023-03-24T23:32:03Z

I have long struggled to understand what a logical relation is. This may come as a surprise, since I have used logical relations a bunch in my research, apparently successfully. I am not afraid to admit that despite that success, I didn’t really know what I was doing—I’m just good at pattern recognition and replication. I’m basically a machine learning algorithm.

So I finally decided to dive deep and figure it out: what is a logical relation?

As with my previous note on realizability, this is a copy of my personal notebook on the subject, which is NOT AUTHORITATIVE, but maybe it will help you.

Here’s my working definition of a logical relation:

1. A realizability semantic model,
2. built of predicates over syntax,
3. that reflects judgments and structures from semantics to syntax.

Point 1 is subtle; it implies that the logical relation is both a model, and a realizability semantics. Unfortunately, I still don’t know what a model is, so I’m going to have to work with the following probably wrong oversimplification: the logical relation must take (syntactically) equal terms to (semantically) equal terms. Which notion of syntactic equality though? I’m not sure, and I’m going to ignore it for now.

Point 2 is actually more specific than necessary. We don’t need predicates over syntax specifically, but really over some base model. It’s easier for me to think of this as “syntax”, though.

Point 3 is quite difficult to make precise without making a lot of this more precise in a mathematical framework. Jon Sterling gave me the following helpful definition:

A logical relation on a model M (viewed as a category) is then a model that is constructed in the following way:

Choose some functor R : M -> E where E is a sufficiently structured category (e.g. the category of sets, or something else!). The most basic example of a functor R is the “global sections functor” M -> Set, which sends every type in M to the set of closed elements of that type. This is exactly the usual “non-Kripke logical relations”; to get Kripke logical relations, you replace Set with a functor category (presheaf category) and choose a more interesting functor R.

Now define a new category G, as a category whose objects are pairs of an object A of M, together with a subobject of R(A). A morphism in G from (A,A’) to (B, B’) is given by a morphism (f : A -> B) that sends elements satisfying A’ to elements satisfying B’.

You have to show that the category G is actually a model of your language (e.g. show that it has function spaces, booleans, whatever). Doing so is the FTLR.

Note that there are some ways to generalize the situation above, but this is basically what logical relations are.

Point 3 is also more specific than necessary; “syntax” can be generalized to be “base model”.

Despite the complexity, we can see point 3 in action in some examples below.

What are logical relations, historically?

tait1967 - Intensional interpretations of functionals of finite type I

Logical relations are sometimes called “Tait’s Method”, dating back to Tait, as far as I can tell.

In this paper, Tait proves that System T with bar induction is a conservative extension of intuitionistic analysis U_1, which is intuitionistic arithmetic plus quantification over functions plus the axiom (schema) of choice plus bar induction. This conservative extension property is the semantic property of interest. The proof starts with a proof that T without bar induction is a conservative extension of just intuitionistic arithmetic (no choice or bar induction).

To do this, Tait develops a type-indexed predicate over System T terms (without bar induction), providing a U_0 term for all T terms of each type. These predicates M_t, C_t, and E_t are (I think) what we refer to as a logical relation. In particular, the C_t relation provides the interpretation of T values of type t, M_t seems to deal with variables, and E_t seems to be a binary relation defining semantics (“weak α-definitional equality”) of terms.

Theorem V (page 205) uses this logical relation to prove that, for all semantic values at the same type, (weak α-) definitional equality is decidable: they either are or are not related in E_t. This seems to be the key point: the definitional equality is reflected out of the semantics of terms, so it can apply to the syntax of terms.

This use of logical relation seems to also be a realizability semantics, since it assigns syntactic types to a collection of semantic terms, by induction over syntactic types, where the realizers are a subset of all possible semantic terms.

However, it seems to be more than a realizability semantics, too. What seems very important in this paper is that the semantics preserves structure, namely definitional equality. Perhaps implicitly though, other pieces are important. For example, T functions are interpreted as U functions, although it’s not clear to me that this is critical.

This is in contrast to Kleene’s (kleene1945) realizability, which did not seem concerned with structure, but only the existence of the realizers.

plotkin1973 - Lambda-definability and logical relations

Plotkin seems to be responsible for the name, and perhaps rediscovering logical relations in the context of programming languages.

Plotkin helpfully gives us a definition of “logical”, as well, and it seems quite importantly related to part 3 of my working definition. Plotkin defines a relation R as logical if it is:

- a subset of any D_k from the carrier of any D∞ model (this seems to correspond to “admissible relations” in modern logical relations parlance);
- preserved by functions in D. That is, the relation holds on a function f in D_k iff for all arguments x, R(x) implies R (f x) (extended to the n-ary case for n-ary relations).

This suggests that it is important the logical relation is somehow interpreting syntactic structures as semantic structures, as in the case of Tait’s model interpreting syntactic functions as semantic functions. More generally, we likely want this property of all structures in the languages: syntactic pairs are interpreted as semantic pairs, etc. Jon’s category theoretical definition seems to generalize Plotkin’s definition nicely.
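The "preserved by functions" condition is easy to check mechanically when everything is finite. The following Python sketch is only an analogy and makes strong assumptions that are not in plotkin1973: a two-element base set stands in for the D∞ model, function spaces are the full finite set-theoretic ones, the relation is unary, and the names `domain`, `apply`, and `related` are hypothetical.

```python
from itertools import product

# Assumption: a tiny finite base domain instead of a D-infinity model.
BASE = (0, 1)


def domain(ty):
    """All elements of a type. Types are "o" or ("->", argument, result)."""
    if ty == "o":
        return list(BASE)
    _, arg_ty, res_ty = ty
    args, results = domain(arg_ty), domain(res_ty)
    # A function is represented by its graph: a tuple of (argument, result) pairs.
    return [tuple(zip(args, outs)) for outs in product(results, repeat=len(args))]


def apply(f, x):
    """Look up x in the graph of f."""
    return dict(f)[x]


def related(ty, value, base_relation):
    """Lift a unary relation on the base type to every type.

    At a function type, f is related iff for every argument x,
    R(x) implies R(f x).
    """
    if ty == "o":
        return value in base_relation
    _, arg_ty, res_ty = ty
    return all(
        related(res_ty, apply(value, x), base_relation)
        for x in domain(arg_ty)
        if related(arg_ty, x, base_relation)
    )


ARROW = ("->", "o", "o")
identity = ((0, 0), (1, 1))
const_one = ((0, 1), (1, 1))
```

With the base relation `{0}`, `identity` is related at `ARROW` because it sends the related argument 0 to the related result 0, while `const_one` is not, because it sends 0 to 1. The same function handles higher types such as `("->", ARROW, "o")` by recursion on the type.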

This denotational logical relation also shows us a logical relation that is not defined over syntax. Instead, it is a relation over some arbitrary non-trivial D∞ model. The author mentions that since they can interpret syntax in a D∞ model, they informally treat the logical relation as over syntax sometimes, which I suppose could be made formal easily enough.

How is “logical relations” used in PL?

ahmed2006 - Step-Indexed Syntactic Logical Relations for Recursive and Quantified Types

In this paper, Ahmed is concerned with syntactic logical relations for recursive and quantified types, in particular for reasoning about contextual equivalence. Likely due to Ahmed’s work, this kind of syntactic logical relation seems to be what most people mean or think when they say “logical relation”, although that may be changing.

The desired property of the logical relation then is that two related semantic terms should be contextually equivalent in the syntax. That is, the logical relation reflects (from semantics to syntax) equivalence.

Strangely (for a realizability model), this particular syntactic logical relation also reflects typing: semantic terms in the relation are also guaranteed to be well-typed in the syntax. In contrast, some uses of “logical relations” enable semantic terms to be syntactically ill-typed. Such logical relations might be better called realizability models, although they do reflect some structure, so perhaps reflecting typability is not a critical point of reflecting structure.

Ahmed in the introduction points out an interesting distinction: that logical relations can be either denotational, or syntactic. Syntactic logical relations model syntax as sets of syntactic values such that some property holds over that syntax. They are useful for proving properties of the operational semantics directly. By contrast, denotational logical relations model syntax as denotational objects, such as, e.g., sets of set-theoretic functions over elements of a D∞ model in plotkin1973. This is useful for easily proving meta-theoretical properties by reflecting properties of the denotation into the syntax, but not necessarily about the operational semantics directly.

For example, Tait uses a “denotational logical relation” into intuitionistic analysis to prove that definitional equality of System T is decidable—the definition of definitional equality, in the model, and its proof of decidability, are reflected back into the syntax; this requires no operational semantics at all. Plotkin uses a denotational logical relation, into domain theory, to show that certain λ-calculus constructs are or are not definable—existence of a term in the logical relation is reflected into the syntax as a definable expression. Neither of these is a syntactic logical relation; the semantic values never mention syntactic values directly.

Ahmed uses a “syntactic logical relation” to prove something about the operational semantics, namely, to prove contextual equivalence (an operational notion), indirectly. Direct proofs of contextual equivalence are difficult. So instead, a semantic proof of equivalence is reflected back into the syntax as a proof of contextual equivalence. This requires structuring the logical relation into a denotation of sets of syntactic terms that evaluate in the operational semantics, so that being in the relation tells us something about evaluation in the operational semantics, which tells us something about contextual equivalence.

abel2018 - Decidability of conversion for type theory in type theory

Abel et al. define a syntactic logical relation for typed, reducible (and equivalent) terms, to prove decidability of conversion for type theory. Here, the use of syntactic logical relation is important for proving a particular conversion algorithm over the syntax is decidable.

The interesting feature of this logical relation is the generalization from a model inductively defined over types, to inductively defined over judgments. This demonstrates a weakness in my working definition of logical relation and realizability, since I defined “realizability” in terms of models inductively defined over types.

timany2022 - A Logical Approach to Type Soundness

This paper is interesting because it uses a syntactic logical relation that intentionally does not reflect typing, as many syntactic logical relations do. Semantically valid terms are not necessarily syntactically valid. In other ways, it looks very much like a logical relation: syntactic pairs are semantic pairs, sums sums, functions functions, etc.

The key property this paper is interested in is type safety: all well-typed terms are well-defined in the operational semantics, i.e., they evaluate to values or well-defined errors or fail to terminate, but importantly, do not get stuck. “in the operational semantics” is important to understanding why this is a syntactic logical relation; it must model terms as sets of syntactic values to reason about the operational semantics given in the paper.

However, one could imagine proving a slightly different form of type safety with a denotational logical relation. Giving a logical relation into an arbitrary model with a well-defined notion of evaluation would be implicitly a proof of type safety: that there exists a model that is type safe. The ability to reflect from semantics to syntax provides a mechanism for constructing that evaluation over syntax. So while the denotational logical relation provides no direct proof about the operational semantics, it may provide a mechanism for a type-safe-by-construction operational semantics. (This reflecting evaluation out of the semantics seems very related to the idea of normalization-by-evaluation, but I’m not clear on this.)

What is realizability?

2022-10-05T21:54:39Z

I recently decided to confront the fact that I didn’t know what “realizability” meant. I see it in programming languages papers from time to time, and could see little rhyme or reason to how it was used. Any time I tried to look it up, I got some nonsense about constructive mathematics and Heyting arithmetic, which I also knew nothing about, and gave up.

This blog post is basically a copy of my personal notebook on the subject, which is NOT AUTHORITATIVE, but maybe it will help you.

My best understanding of realizability right now, in programming languages (PL) terms, is:

1. A technique for assigning to each syntactic type a collection of semantic terms;
2. By induction over syntactic types;
3. Where the semantic terms that are realizers—i.e., included in the collection related to some syntactic type—are a sub-collection of all possible terms in the semantic domain. That is, there are valid semantic terms not associated with any syntactic type.

I use the word “collection” rather than “set” to avoid invoking set theory.

Graphically, we can represent this as follows:

The point of the technique is that clause 2 gives us a proof technique by induction, and clause 3 means we can relate the collection of terms (or proofs) to some other well-known collection. This yields a proof technique for metatheoretic properties about the collection, such as that there are only terminating terms in the collection of realizers, or there are only recursive functions and therefore some classical things remain unprovable.
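The three clauses can be sketched in a few lines of Python. Everything here is my own assumption for illustration: the tuple encoding of types, the names `realizes` and `inhabitants`, and the choice of arbitrary Python values as the semantic domain. Function types are only checked over domains whose realizers can be enumerated, which is a simplification and not part of any definition I've seen.

```python
from itertools import product

# Hypothetical encoding: syntactic types are tuples.
BOOL = ("bool",)

def pair(a, b):
    return ("pair", a, b)

def arrow(a, b):
    return ("arrow", a, b)

def inhabitants(ty):
    """Enumerate the realizers of a first-order type (no arrows)."""
    if ty == BOOL:
        return [True, False]
    if ty[0] == "pair":
        return list(product(inhabitants(ty[1]), inhabitants(ty[2])))
    raise ValueError("cannot enumerate realizers of " + repr(ty))

def realizes(value, ty):
    """Clauses 1 and 2: a collection per type, by induction on the type."""
    if ty == BOOL:
        return isinstance(value, bool)
    if ty[0] == "pair":
        return (isinstance(value, tuple) and len(value) == 2
                and realizes(value[0], ty[1]) and realizes(value[1], ty[2]))
    if ty[0] == "arrow":
        if not callable(value):
            return False
        for arg in inhabitants(ty[1]):
            try:
                result = value(arg)
            except Exception:
                return False  # crashing on a realizer of the domain disqualifies it
            if not realizes(result, ty[2]):
                return False
        return True
    return False
```

Clause 3 shows up as the values that `realizes` rejects at every type, such as `42`. Note also that `lambda b: True if b else (not "oops")` realizes `arrow(BOOL, BOOL)` in this sketch, even though its body would not type check in a simply typed language: membership is decided by behaviour, not by syntax.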

I’m not entirely sure that clause 2, induction, is necessary, and I can’t find anything explicit about clause 3, but they seem to be true historically and in many uses of the term.

Okay so how did I get to this understanding?

What is realizability, historically?

kleene1945 - On the Interpretation of Intuitionistic Number Theory

Realizability seems to come from Kleene’s paper “On the Interpretation of Intuitionistic Number Theory”. I say “seems to” as Kleene attributes the “detailed investigation of the notion of realizability” to David Nelson, attributes several of the results in the paper to Nelson, and claims that the main results of the paper are joint work with Nelson. But the paper only has Kleene’s name on it, and Kleene claims in the first footnote that they introduced the idea of realizability to Nelson in a seminar. So anyway, realizability seems to come from Kleene, and this is the canonical paper cited for the technique.

In this paper, realizability is quite specific. It’s a technique that takes an intuitionistic first-order logic formula about Peano arithmetic (Heyting arithmetic) and constructs a natural number from it, representing the (constructive) proof of that formula. Only provable formulas are realized. The point of this exercise is to prove various metatheorems about the realized language: whether it is consistent, and which intuitionistic formulae are provable or unprovable.

Intuitively, something is unprovable if there exists a formula, but there does not exist a realization of it. This can be shown by connecting the formula to the set of realizers (in this case, natural numbers), and then showing that there cannot exist a related natural number (or, more often, function on natural numbers represented by its Gödel number) with the properties required of the realizability interpretation. The simplest example: since “false” is unprovable (it has no realization, by construction), the intuitionistic logic is consistent.

This also lets us prove something about the class of all provable statements. Since we have a method for constructing something from any provable (or true) statement, we can say something about the set of all provable statements in relation to the realizers. Kleene mentions one consequence is that the intuitionistic calculus cannot prove the existence of any function other than a general recursive function, since those are the only functions constructed in the realizability interpretation. This tells us, for example, that the intuitionistic calculus is different from classical set theory, which contains other functions.

An important detail in this paper that clarifies the distinction between the intuitionistic and the classical happens in Clause 6, on page 113. This is the definition of the realizability interpretation for existential quantification ∃x.A(x). This has a realization if, for some x, A(x) has a realization. It’s important to notice that this second “for some x” quantification happens in the metalanguage, namely, classical set theory, and therefore could be chosen by Choice. Kleene discusses this on page 118, where he uses the word “classically” as a modifier on various quantifiers to remind us that, when working with the quantification and realizers directly, we are working in a classical system in which intuitionistic proofs also exist.
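A rough sketch of the clauses discussed so far, for a decidable fragment only. This is my own simplification, not Kleene's definition: realizers are Python tuples rather than numbers encoding pairs, the formula encoding and the name `realizes` are hypothetical, and implication and universal quantification are left out because their realizers are codes of recursive functions, which cannot be checked by simply running a program.

```python
# Hypothetical encoding of closed formulas:
#   ("eq", m, n)       the equation m = n between numerals
#   ("and", A, B)      conjunction
#   ("or", A, B)       disjunction
#   ("exists", body)   body is a Python function from a number to a formula
FALSE = ("eq", 1, 0)

def realizes(e, formula):
    """Decide whether e realizes formula, by induction on the formula."""
    tag = formula[0]
    if tag == "eq":
        # In this sketch a true equation is realized by anything,
        # and a false equation by nothing.
        return formula[1] == formula[2]
    if tag == "and":
        return (isinstance(e, tuple) and len(e) == 2
                and realizes(e[0], formula[1]) and realizes(e[1], formula[2]))
    if tag == "or":
        # e = (0, realizer of the left side) or (1, realizer of the right side)
        if not (isinstance(e, tuple) and len(e) == 2 and e[0] in (0, 1)):
            return False
        return realizes(e[1], formula[1 + e[0]])
    if tag == "exists":
        # e = (witness, realizer of the body at that witness)
        if not (isinstance(e, tuple) and len(e) == 2
                and isinstance(e[0], int) and e[0] >= 0):
            return False
        return realizes(e[1], formula[1](e[0]))
    return False
```

The existential case is where the witness lives: the realizer itself carries the x, so whoever constructs the realizer must actually produce it. In this sketch `(2, 0)` realizes `("exists", lambda x: ("eq", x + x, 4))` while `(3, 0)` does not, and no candidate realizes `FALSE`.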

What seems to be going on here is that the realizers are something like the intuitionistic subset of classical set theory. I think that statement isn’t exactly true; Kleene uses classical choice when working with the realizers to show there are unprovable theorems. For example, a realizer parameterized over (classically) all variables may not correspond to an intuitionistic formula. So it’s not that the realizers are only intuitionistic, I think. But any particular realizer is (must be)? The important point may be the realizers are a subset of the whole system, and thus we can prove interesting metatheorems that rely on distinguishing the realizers (and therefore, the formulae they realize) from all the things in the full system.

amadio1998 - Domains and Lambda-Calculi, Chapter 15

Chapter 15 of Amadio and Curien’s book “Domains and Lambda-Calculi” introduces realizability in its historical context. The introduction formalizes Kleene’s work as an example, and discusses its use.

They emphasize two things, which seem to confirm some of my understanding:

The realizability relation is defined inductively over formulas, and relates formulas to proofs.
The use lets us reason about all proofs in the system.

This is the best definition of realizability I’ve seen, and it applies both to Kleene’s original and to uses in PL.

The authors point out that Kleene’s original goal was to prove consistency. They then confirm my above intuitions, that the realizability interpretation also lets us prove metatheorems about what is provable/unprovable in the realized system. However, they note that one application of this is to find unprovable true statements, which can be consistently axiomatized back into the original system. There are proofs in the set of realizers, i.e., true statements, that are never constructed by the realizability interpretation. These could be added back to the original system to enrich it.

This latter use seems to confirm one feature of realizability that isn’t explicitly stated anywhere, but seems to be true of all realizability interpretations I’ve seen: that the realizers are a strict subsystem of some larger formal system.

How is “realizability” used in PL?

In programming languages, we’re not often concerned with intuitionistic vs classical logic; we’re working constructively by default. In fact, many of the uses of “realizability” in PL don’t seem to be related to logic at all, but to modeling well-typed programs. And while, sure, these are related by Curry-Howard, the difference seems important to me. So what does realizability mean in this context?

In most uses in PL, the important feature seems to be clause 3 in my definition above: the collection of all values is larger than the set of realizers. In PL, this suggests that we’re ascribing types to “untyped” terms, and the realizers are those that are semantically well typed, but not necessarily syntactically well typed. The full collection contains also untyped terms, and we can therefore prove through realizability that the type system rules out ill-typed terms.

There do seem to be some examples in PL that are explicitly relating classical and intuitionistic ideas, namely those trying to import constructive interpretations of classical logic. I’m not really interested in those, and I think the connection to realizability is much more clear in those applications, so I’ll ignore that area.

Let’s look at some examples.

benton2010 - Realizability and Compositional Compiler Correctness for a Polymorphic Language

In “Realizability and Compositional Compiler Correctness for a Polymorphic Language”, Benton and Hur define a “realizability” interpretation of System F types realized by terms in a low-level language, for proving some compiler correctness properties. The terms realize the types, and this lets us talk about which low-level programs are valid to link with, without restricting the set of linkable programs to only those generated by the compiler.

This has lost all connection to intuitionistic vs classical logic, but I suppose it keeps the key features of the technique: types (formulae) of one language are realized by terms in another, and there is some concern that the realizers should be a subset of all terms. Not all low-level programs should be valid, but some set of them should be.

nakano2000 - A Modality for Recursion

“A Modality for Recursion” was actually the start of my realizability journey. This paper starts by defining a collection of models (β-models) of the untyped λ-calculus. It then defines the class of realizability models, in terms of β-models, for an extrinsically typed λ-calculus with equi-recursive types. A realizability model is parameterized by a β-model, and is a relation inductively defined over types to their realizers, which are values drawn from the β-model.

So why is this realizability? Well, I don’t see anything to do with intuitionistic vs classical. But, the set of all values is larger than the set of realizers, which seems to be important to all uses of “realizability”, and important for this result in particular. In this paper, this is used to show that the dot modality rules out some valid β-model terms, namely those that would correspond to non-terminating λ terms.

Later in the paper, they define a “realizability interpretation”. This seems to be distinct from the collection of all realizability models in that they pick a particular set of realizers? So, it ought to be a realizability model, I guess? But they don’t say so explicitly. The interpretation is still quite heavily parameterized, but it does seem to fix or restrict the set of realizers. Anyway, this interpretation includes all the features of my definition above: it’s inductively defined over types, relating types to (a semantic model of) untyped λ terms, for the purposes of proving something about the collection of realizers as they relate to the collection of all untyped λ terms.

Notes on "Ur: Statically-Typed Metaprogramming ..."

2015-02-14T17:22:11Z

Today I read Ur: Statically-Typed Metaprogramming with Type-level Record Computation. This paper presents the Ur language, a functional programming language based on an extension of System Fω. The novel idea is to use type-level functions as a form of type-safe meta-programming. The paper claims this novel idea enables safe heterogeneous and homogeneous meta-programming in Ur.

The interesting insight is that type-level computation may be valuable outside of dependently typed languages. The paper quickly and easily makes this case. The type-level computations reduce type annotations by enabling the programmer to compute types rather than manually write them everywhere. This could be a useful form of meta-programming in any typed language.

The claims about heterogeneous and homogeneous meta-programming seem overstated. Ignoring the novel ability to compute type annotations, type-safe heterogeneous programming could be as easily accomplished in any other type-safe language. I could just as easily (or more easily) write a program in Coq, ML, Haskell, or Typed Racket that generates HTML and SQL queries as I could in Ur. As for homogeneous meta-programming, restricting the meta-programs to record computations at the type-level seems to severely restrict the ability to generate code at compile-time and abstract over syntax, features which are provided by general-purpose meta-programming systems such as Racket’s macros or Template Haskell.

Beluga and explicit contexts

2014-09-11T00:56:44Z

In my recent work, I found it useful to pair a term and its context in order to more easily reason about weakening the context. At the prompting of a colleague, I’ve been reading about Beluga, [1] [2], and their support for programming with explicit contexts. The idea seems neat, but I’m not quite sure I understand the motivations or implications.

So it seems Beluga has support for describing what a context contains (schemas), describing in which context a type/term is valid, and referring to the variables in a context by name without explicitly worrying about alpha-renaming. This technique supports reasoning about binders with HOAS in more settings, such as in the presence of open data and dependent types. Since HOAS simplifies reasoning about binders by taking advantage of the underlying language’s implementation of substitutions, this can greatly simplify formalized meta-theory in the presence of advanced features which previously required formalizing binders using more complicated techniques like De Bruijn indices. By including weakening, meta-variables, and parameter variables, Beluga enables meta-theory proofs involving binders to be much more natural, i.e., closer to pen-and-paper proofs.
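For readers who haven't seen HOAS, here is a minimal sketch of the basic idea in Python. This is only an illustration of binders represented as host-language functions; the encoding and the names `lam`, `app`, `whnf`, and `show` are my own assumptions, and it says nothing about Beluga's contexts, schemas, or its actual syntax.

```python
import itertools

# Hypothetical HOAS encoding of untyped lambda terms:
#   ("lam", f)      f is a Python function from a term to a term
#   ("app", t, u)   application
#   ("var", name)   used only for free variables and for printing

def lam(f):
    return ("lam", f)

def app(t, u):
    return ("app", t, u)

def whnf(term):
    """Weak head normal form. Substitution is just calling the Python function."""
    while term[0] == "app":
        head = whnf(term[1])
        if head[0] != "lam":
            return ("app", head, term[2])
        term = head[1](term[2])
    return term

def show(term, fresh=None):
    """Print a term, inventing a name for each binder as it is opened."""
    fresh = fresh if fresh is not None else itertools.count()
    if term[0] == "var":
        return term[1]
    if term[0] == "app":
        return "(" + show(term[1], fresh) + " " + show(term[2], fresh) + ")"
    name = "x" + str(next(fresh))
    return "(\\" + name + ". " + show(term[1](("var", name)), fresh) + ")"
```

There is no substitution function and no index shifting anywhere in this sketch: `whnf` gets capture-avoiding substitution for free from Python's own function application. The cost is visible in `show`, which has to invent a variable just to look under a binder (and would clash with a free variable that happens to be named `x0`). As far as I understand it, looking under binders in this way is where explicit contexts come in.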

Obviously this is great for formalized meta-theory. While I have seen how HOAS can simplify life for the meta-theorist, and seen how it fails, I don’t fully understand the strengths and weaknesses of this work, or how it compares to techniques such as the locally nameless representation. I’m also not sure if there is more to this work than a better way to handle formalization of binding (which is a fine, useful accomplishment by itself).

If anyone can elaborate on or correct my understanding, please do.