Elixir Types
Elixir Types
1 Introduction
Elixir is a functional programming language that runs on the Erlang Virtual Ma-
chine [17]. The language has been gaining adoption over the last years in areas such
as web applications, embedded systems, data processing, and distributed systems,
and used by companies like Discord and PepsiCo. The success of Elixir is primarily
attributed to the underlying Erlang VM, developed by Ericsson in the eighties, and
considered to be a great feat of engineering for concurrency, distribution, and fault
tolerance. A limitation of both Erlang and Elixir is that they are dynamically typed,
meaning they do not enjoy the safety features of a static type system that ensure at
compile-time the absence of a given class of run-time errors.
Developing a static type system suitable for Erlang has been an open research
problem for almost two decades. The earliest effort was attempted by Marlow and
Wadler [29], which typed a subset of Erlang using subtyping unification constraints.
However, their system was not adopted as type inference was slow, and the inferred
types were large and complex. Ever since then, several attempts—either practical,
theoretical, or both—have followed [32, 27, 41, 35, 18, 25, 38].
We present a gradual type system for Elixir, based on the framework of semantic
subtyping [10, 19]. This framework, developed for and implemented by the CDuce
programming language [3, 15], provides a type system centered on the use of set-
theoretic types (unions, intersections, negations) that satisfy the commutativity and
distributivity properties of the corresponding set-theoretic operations [19]. The system
is a polymorphic type system with local type inference, that is, functions are explicitly
annotated with types that may contain type variables, but their applications do not
require explicit instantiations: the system deduces the right instantiations of every
type variable. It also features precise typing of pattern matching combined with type
narrowing: the types of the capture variables of the pattern and of some variables of
the matched expression are refined in the branches to take into account the results of
pattern matching. With respect to the system implemented for the CDuce language,
the system we define for Elixir brings several novelties and new features. Its main
contributions can be summarized as follows:
Semantic subtyping. An extension of the semantic subtyping framework to fit
Elixir/Erlang, in particular, the definition of new function domains to account for
the tight connection between Elixir/Erlang functions and their arity.
Guards. A precise type system for analyzing guards in pattern matching.
Records and dictionaries. A new typing discipline unifying records and dictionaries.
Dynamic type. The integration of the dynamic type in the type system, which is
used to describe untyped parts of the code, and how they interact with statically
typed parts. This uses techniques of the gradual typing literature.
Strong arrows. A new gradual typing technique for typing functions that takes into
account runtime type tests performed by the virtual machine or inserted by the
programmer. This makes it possible to guarantee the soundness of the gradual
typing system with more precise static types without modifying the compilation of
the source code.
Outline In Section 2, we provide an overview of the specific typing issues that arise
in Elixir and the reasons why set-theoretic types are a good fit to type it. In Section 3,
we demonstrate the various typing techniques we developed specifically for Elixir. We
outline the formal approach to typing programs in Elixir in Section 4. Section 5 covers
our design principles for integrating the type system into the Elixir compiler and the
impact on programmers. Section 6 discusses related work, while Section 7 concludes
our presentation and outlines some features planned as future work.
Static typing seems nowadays to be the biggest need for the Elixir community. Today,
Elixir supports "Typespecs", a mechanism for annotating functions with types. However,
Typespecs are not verified by the compiler. Instead a tool called Dialyzer, which ships
with the Erlang standard library, can be used to find discrepancies in your source code
2
Giuseppe Castagna, Guillaume Duboc, and José Valim
and type annotations. Dialyzer is based on success typing [27], which guarantees no
false positives, but may leave several bugs uncaught.
As the Elixir community grows, the general feedback is that, while Dialyzer is helpful
and provides developers with some guarantees, its ergonomics and functionality do not
fully match the community expectations. Based on our experience with the language
and its ecosystem, we speculate developers would accept more false positives from
the compiler in exchange of catching more bugs. Hence the interest of the authors in
fully baking static typing into the Elixir compiler.
The benefits we expect are essentially twofold. The first benefit of types is to aid
documentation (emphasis on the word “aid” since we don’t believe types can replace
textual documentation). Elixir already reaps similar benefits from Typespecs and we
expect an integrated type system to be even more valuable in this area.
The second benefit of static types revolves around contracts. If function caller(arg)
calls a function named callee(arg), we want to guarantee that, as these functions
change over time, the caller passes valid arguments into the callee and correctly
handles the return types from the callee.
This may seem like a simple guarantee, but we can run into tricky scenarios even
on small code samples. For example, imagine that we define a negate function, that
negates numbers. One may implement it like this:
The negate function receives an integer() and returns an integer().1 Type specifi-
cations are prefixed by $ and each specification applies to the definition it precedes.
With our custom negation in hand, we can implement a custom subtraction:
This would all work and typecheck as expected, as we are only working with integers.
Now, imagine in the future someone decides to make negate polymorphic (here,
ad hoc polymorphic), by including an additional clause so it also negates booleans:
The specification at issue uses integer() or boolean() stating that both the argu-
ment and the result are either an integer or a boolean. This is a union type which has
become common place in many programming languages.
The type specified for negate is not precise enough for the type system to deduce
that when negate is applied to an integer the result is also an integer.
1
We follow Erlang convention that basic types are suffixed by “ () ”, for instance, string() .
3
The Design Principles of the Elixir Type System
10 Type warning:
11 | def subtract(a, b) when is_integer(a) and is_integer(b) do
12 | a + negate(b)
13 ^ the operator + expects integer(), integer() as arguments,
14 but the second argument can be integer() or boolean()
Such a type system would not be enough to capture many of Elixir idioms and it would
probably lead to too many false positives. Therefore, in order to evolve contracts over
time, we need more expressive types. In particular, to solve this issue we need an
intersection type, which specifies that negate has both type integer()->integer()
(i.e., it is a function that maps integers to integers) and type boolean()->boolean()
(i.e., it is a function that maps booleans to booleans). This type is more precise than
the previous one and is written as:
With this type, the type checker can infer that applying negate to an integer will
return an integer. Therefore, in the definition of subtract, the application negate(b)
has type integer(), and the function subtract is well-typed.
We have seen that we can specify two different types for negate, that is:
(1) (integer() or boolean()) -> (integer() or boolean())
(2) (integer() -> integer()) and (boolean() -> boolean())
and we said that the latter type is “more precise” than the former. Formally, we state
that the latter is a subtype of the former, meaning that every value of the latter is also
a value of the former. In the case of the two types above, the subtyping relation is also
strict: every function that maps integers to integers and booleans to booleans, is also
a function that maps an integer or a boolean to an integer or a boolean, but not vice
4
Giuseppe Castagna, Guillaume Duboc, and José Valim
versa (e.g. the constant function fn x -> 42 end maps both integers and booleans
to integers and thus to integer() or boolean(); thus it does not map booleans to
booleans and, therefore, it is not in the intersection type). When two types are one
subtype of each other they are said to be equivalent, since they denote the same set of
values (e.g., (integer()->integer()) and (boolean()->integer()) is equivalent
to (integer or boolean())->integer() )
The type of negate or negate_alt can also be expressed without intersections, by
using parametric bounded quantification,2 but this is seldom the case. For instance,
Elixir provides a negation operator named !, which is defined for all values. The values
nil and false return true, while all other values return false. With set-theoretic
types, we can give to this operator the following intersection type:
18 $ (false or nil -> true) and (not (false or nil) -> false)
This type introduces two further ingredients of our type syntax: singleton types
and negation types. Namely, the atoms true, false, and nil,3 are also types, called
singleton types, because they contain only the constant/atom of the same name. The
connective not denotes the negation of a type, that is, the type that contains all the
well-typed values that are not in the negated type, whence the interpretation of the
functional type above.⁴
The advantage of interpreting types as the set of their values is that, thus, types
satisfy the distributivity and commutativity laws of their set-theoretic counterparts.
For instance, a well-known property of products is that unions of products with a
same projection factorize, that is, {s1,t} or {s2,t} is equivalent to {s1 or s2, t}
(Elixir uses curly brackets for products). This is reflected by the behavior of our
type-checker that accepts the following definitions:
19 $ type t() = {integer() or string(), boolean()}
20 $ type s() = {integer(), boolean()} or {string(), boolean()}
21 $ (( t() -> t()), s() ) -> s()
22 def apply(f,x) do: f.(x)
The first two lines define the types t() and s() while lines 21-22 define a function
whose typing demonstrates that the type-checker considers t() and s() to be equiva-
lent. This is because it allows an expression of type t() to be used where an expression
of type s() is expected (i.e., f which expects an argument of type t() is given an
argument x , which is of type s()) and an expression of type of type s() where an
expression of type t() is expected (i.e., the type specification declares that apply
returns a result of type s(), but the body returns f(x)) which is of type t()). In
contrast, languages that use a syntactic definition of subtyping, such as Typed Racket,
Flow, or TypeScript, accept the application f(x) but reject the typing of apply: they
cannot deduce that t() is a subtype of s().
2
Precisely as $ a -> a when a: integer() or boolean() : see Section 2.3.
3
In Elixir, atoms are user-defined constants obtained by prefixing an identifier by colon, as
in :ok , :error , and so on. The atoms true , false , and nil are supported without
colon for convenience.
4
The precedence of and and or is higher than type constructors (arrows, tuples, records,
lists), and the negation not has the highest precedence of them all.
5
The Design Principles of the Elixir Type System
Finally, we use term() to represent the top type (i.e., the type of all values) and
none() to denote the empty type, that is, the type that has no value and which is
equivalent to not term() (likewise, term() is equivalent to not none()).
The existing set-theoretic types literature enables our type system to represent several
Elixir idioms. We outline some examples in this section.
meaning that for all types a and b (i.e., for all a and b subtypes of term())
map is a binary function that takes a list of elements of type a (notation [a]), a
function from a to b and returns a list of elements of type b;
reduce is a ternary function that takes a list of a elements, an initial value of type
b, a binary function that maps a’s and b’s into b’s, and returns a b result.
Local type inference infers that for map([1, 4], fn x -> negate(x) end) both type
variables must be instantiated by integer(), deducing the type [integer()] for it.
Intersection can also be used to define the type specification of reduce for the case
of empty lists (in which case the third argument can be of any type):
5
We did not specify lower bounds since they are not frequently used and they can be encoded
by union types, e.g., ∀(s ≤ α).α → α = ∀(α).(s ∨ α → s ∨ α); upper bounds can be encoded,
de f
6
Giuseppe Castagna, Guillaume Duboc, and José Valim
Polymorphic types make inference more precise for other functions. For instance, if
we add a default case to the negate example (lines 60-61) we obtain the code
for which we can deduce—or at least check—the type (notice the use of bounded
quantification in line 36)
the type atom() -> atom(), since an atom is neither an integer nor a Boolean.
It is possible to define polymorphic types with type parameters. For instance, we
can define the type tree(a), the type of nested lists whose elements are of type a, as
and then use it to type the polymorphic function flatten that flattens a tree(a)
returning a list of a elements:
If the argument is not a list, then a is instantiated to the type of the argument. If it
is a list, then a is instantiated to the union of the types of the non-list elements of this
nested list. For instance, the type statically deduced for the application
7
The Design Principles of the Elixir Type System
the Enumerable protocol which is represented by the type Enumerable.t(a), that is,
the enumerables whose elements are of type a (i.e., a-lists, a-sets, and a-ranges).
Similarly, the Collectable protocol specifies data types that can collect elements,
one by one. The latter is implemented for lists and sets, but not ranges, as ranges
cannot represent—and therefore cannot collect—elements in arbitrary order. Thanks
to set-theoretic types, we can precisely type protocol functions parametric in the type
of the elements. For instance, the [Link]/2 function takes two parameters, an
Enumerable and a Collectable collection and puts the elements of the former into
the latter. Its type is
There is also a ternary version of into which takes as third parameter a function that
transforms the elements of the enumerable before inserting them in the collectable:
This type can then be used to type the following echo function that uses the above
mentioned [Link]/3 function
which we can then apply to the [Link]() and a string transformation function so
that we have an echo of the standard input that is transformed by the function:
Above we used Elixir interactive toplevel to transform the input in upper cases by the
function [Link]/1. The type statically deduced for this application is the one
of the first argument, that is, [Link]().t().
All features presented so far adapt to Elixir what is already possible in the type
system of CDuce, defined via the set-theoretic interpretation of types of semantic
subtyping [19]. There are however several key specific characteristics of Elixir that
require the semantic subtyping framework to be modified, improved, and/or extended.
8
Giuseppe Castagna, Guillaume Duboc, and José Valim
A first such characteristic is the arity of functions which plays an important role in Elixir.
While it is possible to test the arity of a function using the expression is_function
(e.g., is_function(foo, 2) tests whether foo is a binary function), it is not possible
in semantic subtyping to express the type of exactly all functions with a specific
arity. ⁶ This is because, in CDuce, all functions are unary, with a function that takes
two arguments being considered a unary function that expects a pair. Although it is
possible to define a type for all functions as none() -> term(),⁷ it is not possible to
give a type specifically for, say, binary functions using {none(),none()} -> term() ,
for the simple reason that a product of the empty set is equivalent to the empty set,
and thus the latter type is equivalent to the former. To address this issue, we introduce
a special syntax for function types, written as (t1 , · · · ,tn ) -> t which outlines the
arity of the functions and that we already used in the previous examples. This allows
the type of all binary functions to be written as (none(),none()) -> term() , but
it requires a non-trivial modification to the set-theoretic interpretation of function
spaces—detailed in Appendix A.1—and, ergo, of the subtyping decision algorithm.
However, our type system is capable to infer the types of functions as the above
even in the absence of their type declaration, by considering guards as explicit type
annotations for the respective parameters. This not only applies to simple type tests
of the parameters, but also to more complex tests. For example, for
our system deduces from the guard that person must be a record with at least the
field age defined and containing a value of type integer(), that is, an expression of
type %{age: integer(), ...} . This is a record type: records are prefixed by % to
distinguish them from tuples; the three dots indicate that the record type is open, that
is, it types records where other fields may be defined (cf. Section 3.3). Since the dot in
[Link] denotes field selection, then our system deduces for the function get_age
6
In our system, to be able to express arity tests in terms of types is crucial for the precise
typing of guards and, thus, of functions and pattern matching: cf. Section 3.2.
7
The top type of functions of arity one is not term()->term() . In our system, every function
of this type can be safely applied to any argument of type term() , that is, every well-typed
argument. But of course not every function satisfies this property: only the total ones.
9
The Design Principles of the Elixir Type System
the type %{age: integer(), ...} -> integer() . One novelty of our system is that
it can precisely express (most) guards in terms of types, in the sense that the set of
values that satisfy a guard (e.g., is_integer([Link])) is the set of values that
belong to a given type (i.e., %{age: integer(), ...} ).
Note that, in the absence of such guards, it is the task of the programmer to
explicitly provide the type of the whole function by preceding its definition by a
type specification. It is also possible to elide parts of the return type of non-recursive
functions by using the underscore symbol “_ ”:
57 $ integer() -> _
58 def negate(x) when is_integer(x), do: -x
or
59 $ (integer() -> _) and (boolean() -> _)
60 def negate(x) when is_integer(x), do: -x
61 def negate(x) when is_boolean(x), do: not x
leaving to the type system the task of deducing the best possible types to replace for
each occurrence of the underscore.
We define the type result() as the union of two record types: the first maps the
atom :output to the (atom) singleton type :ok and the atom :socket to the type
socket(); the second maps :output to :error and maps :message to a union type
formed by an atom and a tuple. Next consider the definition of handle: values of
type %{output: error, message: {:delay, integer()}} are going to escape every
pattern used by handle, triggering a type warning:
69 Type warning:
70 | def handle(r) do
71 ^^^^^^^^^
72 this function definition is not exhaustive.
73 there is no implementation for values of type:
74 %{output: :error, message: {:delay, integer()}}
Note that the type checker is able to compute the exact type whose implementation
is missing, which enables fast refactoring since, as the type of result() or the imple-
mentation of handle are modified, the type checker will issue precise new warnings
to point out the places where code changes are required.
10
Giuseppe Castagna, Guillaume Duboc, and José Valim
then since the specified input type is result() (which is a subtype of maps), the third
branch will never match (its pattern matches only pairs) and can be deleted.
This will remove useless code, detect unused function definitions, or reveal more
complex problems as these hints can indicate areas where the programmer’s expecta-
tions and the actual logic of the program do not match.
Narrowing Narrowing is the typing technique that consists in taking into account
the result of a (type-related) test to refine (i.e., to narrow) the type of variables
in the different branches of the test. In Section 2.2 we have already presented a
simple example in which narrowing is used, namely, in the function negate_alt
(code in line 17) the type-checker uses the test to narrow the type of x , which is
(integer() or boolean()) , to integer() in the “do” branch and to boolean() in
the “else” branch. This is a simple application of narrowing, where the narrowing is
performed on the type of a variable whose type is directly tested. However, our system
is also able to narrow the type of the variables that occur in the expression tested by a
“case” or a “if ”, even if this expression is not a single variable (some exceptions apply
though: see future works). Here is a more complete example where we test the field
selection on a variable
79 $ result() -> _
80 def handle(r) when [Link] == :ok, do: {:accepted, [Link]}
81 def handle(r) when is_atom([Link]), do: [Link]
82 def handle(r), do: {:retry, elem([Link], 1)}
In the example the type of r which initially is result() is narrowed in the first branch
to %{output: :ok, socket: socket()} , to %{output: :error, message: :timeout} in
the second branch, and to %{output: :error, message: {:delay, integer())}} in the
last one. This precision is shown by the fact that handle type-checks the following
type specification too:
11
The Design Principles of the Elixir Type System
values. In such cases, we use both lower and upper approximations to ensure that
narrowing and exhaustivity/redundancy checking still work. As an example, consider
the following simplistic function:
We are unable to express by a type the exact domain of this function, which is the set
of “all maps of size 2”. However, when the guard succeeds, it is clear that x is a map,
and this assumption is enough to deduce by narrowing that the body of the function
is well-typed. Using the type of all maps to approximate the set of all maps of size
2 is an over-approximation. We call such a type the potentially accepted type of the
pattern/guard since it contains all the values that may match it. Conversely, consider
the following example:
The first clause matches both the maps of size 2 (but no other map) and any lists.
Although we cannot characterize by a type all the values matched by the first clause, we
do know that all lists are captured by it and, therefore, the second clause is redundant
(length being defined only for lists). The type of all lists is an under-approximation
(i.e., a subset) of the set of all values that satisfy the guard in the first clause. We refer
to this under-approximation as the surely accepted type of the pattern/guard since it
contains only values that do match it. Our system makes a distinction between guards
that require approximation and those that do not, as further described in Section 4.
Complex Guards The analysis of guards is more sophisticated than it appears. First
of all, guards are examined left to right by incrementally generating environments
during their analysis. An example is the guard in line 87: if we compare it with line 86
we see that we added an is_map(x) test. Without it the guard in line 87 would be
equivalent to the one in line 86, since when x is a list, then map_size(x) == 2 fails
(rather than return false), and so does the whole guard: the is_list(x) would
never be evaluated. To account for this, our analysis examines is_list(x) only if
the preceding clause may not fail, which is always the case—though, it can return
false— and map_size(x) is examined only in the environments in which is_map(x)
succeeds.
Another stumbling block is that the analysis may need to generate for a single guard
different type environments under which the continuation of the program is checked,
as the following definition shows:
The definition above type-checks, but this is possible only because the analysis of the
guard is_boolean(y) or is_integer(z) in line 90 generates two distinct environ-
ments (i.e., one where z has type integer() and y type term(), and a second one
12
Giuseppe Castagna, Guillaume Duboc, and José Valim
where their types are inversed) which are both used to deduce two types for {z,y}
which are then united in the result. By the same technique, in the absence of a type
specification our system deduces for the definition of baz in line 90 the type
92 ((term(),integer()) -> {integer(),term()}) and
93 ((boolean(),term()) -> {term(),boolean()})
and the analysis of the code defined in line 91 adds to this intersection the following
arrow: (not(boolean()),not(integer())) -> nil .
Finally, our type system can analyze arbitrarily nested Boolean combinations of
guards which are type tests of complex selections primitives, as the following definition
of a parametric guard is_data(d) shows:
In Elixir, maps are a key-value data structure that serves as the primary means of
storing data. There are two distinct use cases for maps: as records, where a fixed set
of keys is defined, and as dictionaries, where keys are not known in advance and can
be dynamically generated. A map type should unify both, allowing the type-checker
to sensibly choose when it needs to ensure that some expected keys are present while
enforcing type specifications for queried values.
Maps as Records When used as records, Elixir provides the [Link] syntax, where
:key is an atom(). If the map returned by the expression map does not contain said
key at runtime, an error is raised. In Section 3.2, we saw the following definition:
98 $ %{age: integer(), ...} -> integer()
99 def get_age(person) when is_integer([Link]), do: [Link]
Each “key type” in a map type is either required or optional. Singleton keys are
assumed to be required, unless otherwise noted.⁸ The triple dot notation means the
type also types records values that define more keys than those specified in the type
and corresponds to optional(term()) => term(). We refer to those as open maps.
A program similar to the above but with optional keys
8
The compiler will reject required(term()) , as that would require a map with an infinite
amount of keys. However, because a finite type such as boolean() can be either required
or optional, we require all non-singleton types to be accordingly tagged to avoid ambiguity.
13
The Design Principles of the Elixir Type System
Hence typing gets rid of all KeyError exceptions for every use of [Link], by restricting
the use of dot-selection to maps that are known to have the key.
Maps as Dictionaries When working with maps as dictionaries, we use the m[e]
syntax to access fields, where m and e are expressions returning a map and a key,
respectively. In this notation, the field may not exist, in which case nil is returned.
For the given function
the system infers integer() or nil as return type. If the type-checker can infer that
the keys are present, then it will omit the nil return. For instance, the following
function is well typed since both fields are required and, thus, cannot return nil:
The type-checker distinguishes when a key is present, may be present, or is not present.
To summarize, for the following function (where %{} is the empty map),
Dictionary Keys We have shown how to interact with maps as records and dictionaries
where the keys were restricted to singleton types. But e [ e′ ] (and other map operations)
allows for generic use of maps as dictionaries, where the key is the result of an arbitrary
expression e′ . To model this behavior, map types can specify the type of values expected
by querying over certain fixed key domains, e.g.,
14
Giuseppe Castagna, Guillaume Duboc, and José Valim
115 $ type t() = %{ foo: atom(), bar: atom(), optional(atom()) => integer() }
represents maps with two fields :foo and :bar set to atoms, and where any other
key is an atom associated to an integer.
If e is an expression of type the t() above, then the type of the selection e [ e′ ] will
be computed according to the type of the expression e′ : if e′ has type :foo or :bar,
then the selection has type atom(); if the type of e′ intersects atom() but it is not a
subtype of it, then the selection will be typed by atom() or integer() or nil. If e′
has type not atom() then the selection will have type nil and will issue a warning.
If the fetch! operation is used, it will raise if e′ is of type not atom().
There is an important base of existing code for Elixir. If we want to migrate this code
to a typed setting, the ability to blend statically typed and dynamically typed code is
crucial. Some code that is working fine may not pass type-checking, therefore a gradual
migration approach is preferred to converting the entire codebase to comply with
static typing at once. This is the goal of gradual typing [40]. For that, we introduce
the type dynamic(), which essentially puts the type-checker in dynamic typing mode.
In practice, the programmer can think of dynamic() as a type that can become at
run-time (technically, that materializes into: see [11]) any other type: an expression of
type dynamic() can be used wherever any other type is expected, and an expression
of any type can be used where a dynamic() type is expected since, in both cases,
dynamic() may become at run-time that [Link] simplest use case is to declare that
a parameter of a function has type dynamic():
meaning that in the body of foo1 the parameter x can be given any type—possibly a
different type for each occurrence of x —and that foo1 can be applied to arguments
of any type. The type dynamic() is a new basic type, that can occur in other type
expressions. This can be used to constrain the types arguments can have. For instance
requires the argument of foo2 to have a function type. This means that in the body of
foo2 the parameter f can be applied to arguments of any type and its result can be
15
The Design Principles of the Elixir Type System
used in any possible context, but a use of f other than as a function—e.g., f + 42—will
be rejected. Likewise an application of foo2 to an argument not having a functional
type—e.g., foo2({7,42})—will be statically rejected, as well.
Gradual typing guarantees. Using dynamic() does not mean that type-checking be-
comes useless. Even in the presence of dynamic() type annotations, our type system
guarantees that if an expression is given type, say, integer(), then it will either
diverge, or return an integer value, or fail on a run-time type-check verification. This
safety guarantee characterizes the approach known as sound gradual typing [40]. This
approach was developed for set-theoretic types in Lanvin’s PhD thesis [26] whose
results we use here to define subtyping and precision relations using the subtyping
relation on non-gradual types (i.e., types in which dynamic() does not occur).
There is however a fundamental difference between our approach and the one of
sound gradual typing: the latter uses the gradual type annotations present in the
source code to insert into the compiled code the run-time type-checks necessary to
ensure the above type safety guarantee. Instead, one of our requirements is that the
addition of types must not modify the compilation of Elixir. Therefore, we have to
design our gradual type system so that it ensures the type safety guarantee by taking
into account both the dynamic type-checks performed by the Erlang VM and those
inserted by the programmer. The goal, of course, is to deduce for every well-typed
expression a type which is gradual as little as possible, the best deduction being that
of a non-gradual type (the less gradual the type, the more the errors captured at
compile time). To that end we introduce the notion of strong function types.
Strong Function Types. We have seen that our system can deduce the type of a function
also in the absence of an explicit annotation when there are guards on the parameters.
So from a typing point of view the following two definitions are equivalent since they
both define the identity function of type integer() -> integer():
120 $ integer() -> integer() $ integer() -> integer()
121 def id_weak(x), do: x def id_strong(x) when is_integer(x), do: x
However, from a runtime perspective, the two definitions above differ as the latter
checks that its argument is of type integer(), while the former does not. Therefore,
if these functions are applied to an argument that may not be an integer (e.g., of type
dynamic()), then we can only be certain that the resulting output is an integer for the
function id_strong. This distinction appears when we type the following function:
which is accepted by our system since it deduces that when id_weak is applied to an
argument of an unknown dynamic() type, then it cannot give to the result a type more
precise than dynamic(), while for the same application with id_strong it deduces
that whenever the application returns a result, this result will be of type integer.
We refer to the function id_strong as having a “strong” function type, since it
guarantees that when applied to an argument that is not within its domain, it will
16
Giuseppe Castagna, Guillaume Duboc, and José Valim
either return a result within its codomain or fail on dynamic type check (or diverge).
Likewise, the function id_weak has a “weak” function type, since it may return (and
actually, does return) a result not of type integer() when applied to an argument
that is not of type integer().
Actually, our system deduces for foo3 a type more precise than the one given in
line 124. If we omit the type annotation and, like for foo3, there is no explicit guard,
then our system assumes that the parameter x has type dynamic() and deduces
for foo3 the type dynamic() -> {dynamic(), (integer() and dynamic)} which is
a subtype of the type in line 124 (a classic sound gradual typing approach such
as those by [40] or [11, 26] would have deduced for this function the return type
{integer(), integer()} , but also modified its standard compilation by inserting two
run-time integer type-checks, one for each occurrence of x in the body of foo3). The
reason why the second projection of the result is intersected with dynamic() is that
this improves the typability of existing code via gradual typing. An expression of type
t () and dynamic() can be used not only in all contexts where an expression of type
t () is expected, but also in all contexts where a strict subtype of t () is expected (in
which case the use of dynamic() will be propagated). This is useful especially for
(strong) functions whose codomain is a union type. For instance, consider again the
function negate as defined in lines 60–61. This is a strong function whose codomain is
integer() or boolean(). If this code is coming from some existing base—i.e., with-
out any annotation—then the system deduces that this function takes a dynamic()
input and returns a result of type (integer() or boolean()) and dynamic() : thanks
to the intersection with dynamic() in the result type, it is then possible to use the
result of negate not only where an expression of type integer() or boolean() is
expected, but also where just an integer() or just a boolean() is expected, then prop-
agating the dynamic type. Concretely, in this dynamic setting, the type of subtract
as defined in lines 4–6 would still be well typed since the type of negate(b) in
line 5 could materialize into integer() and in the absence of an explicit annota-
tion, by the propagation of dynamic() the type deduced for the result of subtract
would be integer() and dynamic() which thus in turn could be passed to a func-
tion expecting a subtype of integer (e.g., a modulo function expecting an input
of type integer() and not 0). Finally, notice that if we had explicitly defined the
type of negate to be dynamic() -> integer() or boolean() (as we did above in
line 124 for foo3), then the result of negate(b) in line 5 would have been typed as
integer() or boolean() thus precluding the materialization and the consequent
typing of subtract. Therefore, as a good programming practice it is better to leave
the system to deduce the return types of all functions whenever gradual typing is used,
by systematically using the underscore _ for return types, since an explicit return type
may hinder the propagation of dynamic().
We have extended the semantic subtyping framework to add strong function types,
which are inhabited by functions satisfying the property we described above (cf.
Appendix ??). Strong function types are the key feature that allows the type-system
to take into account the run-time typechecks, either performed by the VM or inserted
by the programmer. Built-in operations, such as field selection, tuple projections, etc,
are, by implementation, strong: the virtual machine dynamically checks that, say, if
17
The Design Principles of the Elixir Type System
the field a of a value is selected, then the value is a record and the field a is defined in
it. Our theory extends this kind of checks to user-defined operations—i.e., functions
definitions—by fine-grainedly analyzing their bodies to check that all the necessary
dynamic checks are performed. The functions for which this holds have a strong type,
and the system can safely deduce that when they are applied to an argument that may
be not in their domain (e.g., an argument of type dynamic()), then the application
will return a value in their codomain (rather than a result of type dynamic()), and as
explained above, to maximize typability of existing code this codomain is intersected
with dynamic().
All this is currently transparent to the programmer since strong types are only
used internally by the type checker to deduce the type of functions such as foo3. A
possible extension of our system would be to allow the programmer to specify whether
higher-order parameters require a strong type or not.
4 A Pinch of Formalization
Elixir is, in essence, a minimalist language, with most of its constructs being syntactic
sugar for the language’s core expressions: functions and pattern matching. In this
section we just outline the formalization of this core (in which a significant part of
Elixir can be encoded), its typing and its extension to gradual typing. We omit the
formalization of maps, the theoretical properties of the type system and its algorithmic
aspects since they are fully detailed in two companion papers: [8] which formalizes
the record and maps we presented in Section 3.3 and [9] which covers the aspects of
function arity, guard analysis, and gradual typing.
The syntax and types of Core Elixir are illustrated in Figure 1 where we use c to
range over constants (i.e., atoms or integers), x to range over expression variables, α
to range over type variables, and the notation u to denote the sequence u1 , . . . , un .
Types are polymorphic, set-theoretic, and can be recursively defined (technically,
they are the contractive regular terms coinductively generated by the grammar:
18
Giuseppe Castagna, Guillaume Duboc, and José Valim
see [7]). They are built from basic types (the types of all integers, all atoms, all
functions, and all tuples, respectively), type variables, value constants (to represent
singleton types), and by applying two constructors for function types ( t → t ) and
tuple types ({ t }) of given arity, and two connectives union and negation (∨, ¬), with
intersection ∧ encoded as t 1 ∧ t 2 = ¬ (¬t 1 ∨ ¬t 2 ). We also encode the top type 1, the
type of all values, as 1 = int ∨ atom ∨ 1fun ∨ 1tup , and the bottom type O as O = ¬1:
they correspond to Elixir’s term() and none(), respectively.
Expressions include constants and variables, functions and applications, tuples
and their projections, annotated let-expressions to model type annotations, and case
expressions. The latter are composed by branches that are guarded by a pattern p
followed by a guard g . Patterns are either variables or constants or tuples thereof,
while guards are Boolean combinations of tests of basic types and of relations on
selector expressions.
The expressions of Core Elixir are translated into an intermediate calculus (defined
in Figure 2 in Appendix B) in which all the negations in guards are eliminated (by
pushing them to the leaves) and where all specific type tests are replaced by a generic
one. This intermediate language can be typed by a type system (Figure 3 also in
Appendix B) which is essentially a merge of the type system of polymorphic CDuce [14,
13] and of the type system for occurrence typing with set-theoretic types [12]. A sound
algorithm to check whether an expression is well-typed in this type system is easily
obtained simply by reusing the algorithmic techniques developed in the cited papers
and embedding them in a bidirectional type system that uses the information of the
type annotations in the let-expressions. There are however two exceptions which
differentiate this system from the one in the cited works. First, in Elixir and, thus, in
our core calculus, the index of a tuple projection can be the result of an expression,
thus the typing rules for projections have to be reworked to take into account this
aspect and our type system cannot statically ensure that the projection of a tuple will
be in the bound of the tuple size (this happens only if any rule (projΩ ) in Figure 3 is
used, in which case the compiler emits a warning). Second, and more importantly,
the typing rule for case expressions is new since it must perform the analysis of Elixir
L
guards. Let us look at it in detail (where < is the strict lexicographical order on pairs):
Γ ⊢e:t (∀i≤n) (∀ j≤mi ) (t i j ≰ O ⇒ Γ , t i j / pi ⊢ ei : s) _ I H
(case) t≤ pi g i
Γ ⊢ case e do (pi g i → ei )i<n : s i≤n
where Γ ; t ⊢ (pi g i )i≤n ⇝ (si j , bi j )i≤n, j≤mi and t i j = (t ∧ si j ) \
W
L shk
{(h,k) | (h,k)<(i, j) and bhk }
This rule types a case expression with n branches, where the i -th branch is guarded
by the pattern pi and guard g i . To type the expression ei of the corresponding branch,
the system must estimate which values may reach the pair pi g i : these are the values
that can be returned by e (i.e., those of type t ) minus those that are surely matched
in one of the branches preceding i , that is, those in the surely accepted type of a
I H
pair p j g j for some j<i . We note by H p g I and p g the surely accepted type and the
potentially accepted type of the pair p g . Then the values that may reach pi g i are those
in (t \ j<i H p j g j I). But not all these values are then matched by the rule against the
W
pattern pi , only those that may be accepted by g i . To compute the set of these values,
19
The Design Principles of the Elixir Type System
as we anticipated in Section 3.2, our system computes for each pair pi g i a list of types
t i1 , ..., t imi whose union contains these values. Then the rule types ei mi -times, by
generating each type environment t i j / pi , which assigns the type for the capture
variables of the pattern pi under the hypothesis that the pattern is matched against a
value in t i j . The various t i j ’s are computed thanks to an auxiliary deduction system
that computes them for all the branches of the case: Γ ; t ⊢ (pi g i )i≤n ⇝ (si j , bi j )i≤n, j≤mi .
This auxiliary judgment, whose definition is given in [9] essentially scans each g i
from left to right looking for OR-clauses, and for each clause, it generates a pair
(s, b) where s is the type of the values for which the clause may be true and b is a
Boolean value that indicates whether s is exact or not. For instance, for the guard
(is_map(x) and map_size(x) == 2) or is_list(x) , which we used in line 87, it
will generate two pairs: (map(), false) since the first clause may be true for some
maps but not all of them, and (list(), true) since the second clause is true for all
lists. Guards are parsed from left to right to take into account Elixir evaluation order
and possible failures (a clause is typed only if the preceding clauses may not fail). The
computation for a guard g i needs both Γ and pi since g i can use variables that are
in the environment or are introduced by pi . Given Γ ; t ⊢ (pi g i )i≤n ⇝ (si j , bi j )i≤n, j≤mi ,
then the potentially accepted type of pi g i is the union of all t i j ’s, while the surely
I H
accepted typeWof pi g i is the union of all t i j ’s for which bi j is true: pi g i = j≤mi si j
W
Gradual Typing and Strong Arrows We extend the previous system by adding gradual
typing. This consists of adding a new base type ? (corresponding to the dynamic()
type of Section 3.4) and defining strong arrows and ? propagation. To determine
whether a function of type s → t is strong, we check whether the body e of the function
strongly ensures that it will return a result in t (noted e ⦂⦂ t ) under the hypothesis
that the parameter is not in the domain of the function (i.e., x : ¬s). This uses an
auxiliary deduction system where, for instance, built-in operations are strong
Γ, x : s ⊢ e : t Γ , x : ¬ s ⊢ e ⦂⦂ t Γ ⊢ e1 ⦂⦂ 1 Γ ⊢ e2 ⦂⦂ 1
(λ⋆ ) ⋆ (add)
Γ ⊢ λ ( x. e) : (s → t) Γ ⊢ e1 + e2 ⦂⦂ int
and where the rule for case expressions does not check exhaustiveness (since if no
branch match, then the case fails and the expression is strong).
20
Giuseppe Castagna, Guillaume Duboc, and José Valim
The addition of “?” is then handled by defining a precision relation ≼. The intuition
is that a type t is more precise than a type s, written s ≼ t if t can be obtained by
replacing in s some occurrences of ? by other types.⁹
Whenever we need to use the precision relation to type an application, we propagate
?. If the function is weak, then the application can only be given type ?, while for
strong functions we use the result type of the function intersected with ?, in order to
improve the typability of gradually-typed programs:
Design Principles. A type system tailored for Elixir needs to carefully balance the
need to bring static error detection to existing users of the language without changing
their experience, while still appealing to programmers coming from statically-typed
languages. To fulfill these requirements we established few basic design principles:
the introduction of typing must not require any modification to the syntax of Elixir
expressions (not even the addition of type annotations for function parameters)
gradual typing should suffice to ensure that programmers are not forced to annotate
existing code (apart from some corner cases);
the system must extract a maximum of type information from patterns and guards;
programmers who prefer a fully statically typed environment should be able to
reduce (or remove altogether) the reliance on gradual types within their code by
emitting warnings when dynamic() is used;
priority is given to type most (if not all) Elixir idioms before turning our atten-
tion to more advanced (and computationally expensive) features such as type
reconstruction.
Type syntax. Elixir is a language defined by minimal syntax with direct translation to
Abstract Syntax Tree, similar to M-expressions introduced by McCarthy for LISP[30]
A large part of Elixir is written in itself through macros, and therefore it does not
provide special syntax for defining modules, functions, conditionals, etc.
From the typing perspective, this means Elixir shall not provide special syntax
for types, and all the operators and notation found in types must match their uses
outside of types, including associativity and precedence. While new operators can be
9
Actually, we use a semantic definition of ≼, due to [26], which takes into account type
equivalences: e.g., ({?, int} ∨ {1fun , ?}) ≼ {1fun , int∨atom} or ¬? ≼ ?
21
The Design Principles of the Elixir Type System
introduced (as long as they consistently apply everywhere in the language), there is
also a concern from the Elixir team about relying too much on punctuation and its
impact on the language adoption.
With this in mind, we choose to use the operators or, and, and not to represent
our fundamental set-theoretic operations. All of those operators exist in the lan-
guage today and are extensively used in guards. Furthermore, we hope the Elixir
community will find Enumerable.t(a) and Collectable.t(a) more readable than
Enumerable.t(a) & Collectable.t(a), and atom() and not(:foo or :bar) to be
clearer than atom() \ (:foo | :bar).
6 Related work
With this design, a typed Elixir would join the family of languages that make use of
semantic subtyping, which includes CDuce [15] (for which it was originally designed),
and newer additions such as Ballerina [2], Luau [28, 24], and (partially) Julia [4].
The work most related to ours is the addition to set-theoretic types to Erlang
by Schimpf et al. [38] which transposes and adapts the current polymorphic type
system of CDuce to Erlang. Their work provides a nice formalization and a throughout
comparison with other current type analysis and verification systems for Erlang.
Essentially, [38] ends where we start from, namely the content of Section 2, whereas
most of the features we presented in Section 3 are by [38] either left as future work
(e.g., the typing of records and gradual typing) or ignored (e.g., the typing of function
arity). A notable exception is the typing of patterns and guards: as in our case, and
independently of us, the authors of [38] propose over and under approximations for
types accepted by patterns combined with guards, but their analysis, which is similar
to [22], stops to very simple guards formed by conjunctions of type tests on single
variables, thus avoiding the complex guard analysis we outlined in Section 4.
Another related work is eqWAlizer [18], an open-source Erlang type-checker (with
an extremely succinct documentation) developed by Meta and used to check the
code of WhatsApp. It consumes the spec and type alias of Erlang with few exceptions.
In particular, and contrary to what we do, they have distinct types for records and
dictionaries, and empty lists are subsumed to lists. As in our system, they use generics
(constrained by the same when keyword we use) with local type inference, type
narrowing, and gradual typing. In particular, eqWAlizer uses the same subtyping
and precision relations for gradual types as we do, since both approaches are based
on [26, 11]. However, eqWAlizer techniques to gradually type Erlang expressions
22
Giuseppe Castagna, Guillaume Duboc, and José Valim
are quite different from ours (no dynamic propagation or strong arrows). Another
important difference is that when typing overloaded functions with overloaded specs
(i.e., our intersections of arrow types) eqWAlizer does not take into account the order
of the clauses of the functions while their applications require the argument to be
compatible with a unique clause. Thanks to negation types our approach takes into
account the order of clauses, and applications are correctly typed even if the argument
is compatible with several clauses: it thus implements a more precise type inference.
Besides eqWAlizer and our approach, the theory of [11] is used also by Cassola et
al. [6] to add gradual typing to Elixir. The work focuses on the gradual aspects, which
is why the typing of the functional core is quite basic (no guards, no polymorphism,
no set-theoretic types). For the gradual aspects, they use the sound gradual typing
approach of [11] but they do not couple it neither with a cast-inserting compilation (to
preserve Elixir semantics) nor with advanced techniques like ours that can take into
account existing checks, and this may hinder the satisfaction of gradual guarantees.
The initial effort to type Erlang was by Marlow and Wadler [29] who define a type
system based on solving subtyping constraints. This type system supports disjoint
unions, a limited form of complement, and recursive types, but not general unions,
intersections, or negations, as we do. The formalization lacks proofs for first-class
function types, which is a solved problem in semantic subtyping. One issue with this
work is that they infer constrained types which are quite large, which leads to the use
of a heuristics-based simplification algorithm to make them more readable.
Dialyzer [27], which serves as the current default to provide type inference in Erlang,
is a static analysis tool that detects errors with a discipline of no false positive, while
our static type system ensures soundness, that is, no false negative. Dialyzer lacks
support for conditional types or intersection types to capture the relation between
input and output types for functions, and record types are parsed but not used.
An actively developed alternative to type Erlang is Gradualizer [25], which also
supports Elixir programs through a translation frontend. Their approach looks simi-
lar to ours, though it lacks subtyping polymorphism, with gradual typing inspired
from [26]. But a comparison is difficult since it lacks a formalization or a detailed
description.
The literature on Erlang also includes Hindley-Milner type systems [41] and bidi-
rectional type systems (without set-theoretic types) [35].
Numerous statically-typed languages constructed for the Erlang VM have emerged
over time. Two examples, Hamler and purerl [21, 33], derived from [34], incorporate
a type system akin to Haskell’s, including type classes. Notably, in Hamler, type
classes are used to model OTP behaviors. Another language, Caramel [5], features
a type system inspired from OCaml. Sesterl [39] extends the trend by offering a
module system [37], utilizing functors to type OTP behaviors (a high priority in our
future work list). Lastly, Gleam [20] is a functional language utilizing well-proven
static typing methodologies from the ML community dating back to the early 90s: a
Hindley-Milner type system [23, 31], supplemented with a rudimentary form of row
polymorphism [42, 36].
23
The Design Principles of the Elixir Type System
We presented the type system that we plan to incorporate in the Elixir compiler. The
system is a transposition to languages of the Erlang family of the polymorphic type
system of CDuce. To do that we had to improve and extend the latter to account for
several characteristics of Elixir: the arity of functions, the use of guards, a uniform
treatment of records and dictionaries, the need for a new sound gradual typing
discipline that does not rely on the insertion at compile time of specific run-time
type-tests but, rather, takes into account both the type tests performed by the virtual
machine and those explicitly added by the programmer. The design of our system
was guided by the principles and goals we briefly exposed in Section 5. Whether it
achieves these goals will have to be checked on an actual implementation.
Incorporating a type system into a language used at scale can be a daunting task.
Our concerns range from how the community will interact and use the type system
to how it will perform on large code bases, . Therefore, our plan is to introduce very
gradually our gradual (pun intended) type system into the Elixir compiler.
In the first release types will be used just internally by the compiler. The type system
will extract type information from patterns and guards to find the most obvious of
mistakes, such as typos in field names or type mismatches from attempting to add an
integer to a string, without using any $-prefixed type specification: developers will
not be allowed to write them. The goal is to assess the performance impact of the type
system and the quality of the reports we can generate in case of typing violations,
without tying the language to a specific type syntax.
The second milestone is to introduce type annotations only in structs, which are
named and statically-defined closed record types. Elixir programs frequently pattern
match on structs, which reveals information about the struct fields, but it knows
nothing about their respective types. By propagating types from structs and their
fields throughout the program, we will increase the type system’s ability to find errors
while further straining our type system implementation.
The third milestone is to introduce the $-prefixed type annotations for functions,
with no or very limited type reconstruction: users can annotate their code with types,
but any untyped parameter will be assumed to be of the dynamic() type.
The development of the type system will happen in parallel with further research
into set-theoretic types and their application to other Elixir idioms, according to the
lines we briefly describe next.
Type Reconstruction and Occurrence Typing In the current system we can define and
type JavaScript’s “logical or” as follows:
The type is very precise: it states that when the first argument is of a type a that is nei-
ther false nor nil (as not(false or nil) is equivalent to not false and not nil),
then the result is (of the type of) the first argument, otherwise it is (of the type of)
24
Giuseppe Castagna, Guillaume Duboc, and José Valim
the second argument. It reflects Elixir’s semantics of if which executes the else:
part if and only if the tested value is either false or nil. By extending the techniques
of [12] to polymorphic types, it will become possible not only to check but also to
reconstruct—i.e., to infer in the absence of an explicit type specification—this same
type for l_or. Using these same techniques we should be able to give more precise
types to some common functions. For example, consider:
This function type-checks in our system, and the given type is as good a type as we
can specify for it. However, by extending the techniques of [12] to polymorphic types,
it will become possible to check (and probably also to reconstruct) for filter the
following far more precise type:
134 $ ((a and b -> true) and (a and not b -> false), [a]) -> [a and b]
135 when a: term(), b: term()
In our current system, checking this type fails because its verification requires to
narrow the type of h in the branches of the if-expression by using the fact that, in
the test, h is the argument of a function, fun, that has an intersection type; such a
deduction requires the powerful occurrence typing techniques developed by [12] which
can probably be decoupled from type reconstruction. Enabling type reconstruction in
Elixir would make it easier for programmers to annotate their programs, since the type
reconstructed for a function can be suggested as a starting point for the annotation
(the type reconstructed for l_or would probably not be the first type a programmer
would think of). Having a powerful type reconstruction system would also open
the possibility for a strict type-checker mode that, instead of using dynamic() for
non-annotated parameters, tries to infer their type. However, these advantages are
counterbalanced by the computational price of this kind of reconstruction which
makes several passes on the code each pass requiring the resolution of several type
constraint problems. Therefore, a careful analysis of the costs and benefits of the
approach must be performed before implementing it.
25
The Design Principles of the Elixir Type System
relies on pattern matching and guards to match messages sitting in the process inbox.
Typing the concurrency constructs and the actor model of Elixir is an obvious next
step. Our type system is already capable of augmenting the code in receive with
type information from guards, with narrowing and approximations. The potentially
accepted type (cf. Section 3.2) of the patterns and guards in receive operations can be
used to define interfaces (i.e., types) for processes and thus type higher-order commu-
nications. A longer-term research project is to type processes with their behavioral
types, in the sense of [1].
Acknowledgements This work was partially supported by Supabase and Fresha. The
second author was supported by a CIFRE grant agreement between CNRS and Remote
Technology. The work benefited from the constant feedback from all the members of
the Elixir compiler core development team.
References
26
Giuseppe Castagna, Guillaume Duboc, and José Valim
[10] Giuseppe Castagna and Alain Frisch. A gentle introduction to semantic subtyping.
In Proceedings of the 7th ACM SIGPLAN international conference on Principles and
practice of declarative programming, pages 198–199, 2005.
[11] Giuseppe Castagna, Victor Lanvin, Tommaso Petrucciani, and Jeremy G Siek.
Gradual typing: a new perspective. Proceedings of the ACM on Programming
Languages, 3(POPL):1–32, 2019.
[12] Giuseppe Castagna, Mickaël Laurent, Kim Nguyen, and Matthew Lutze. On
type-cases, union elimination, and occurrence typing. Proceedings of the ACM on
Programming Languages, 6(POPL):75, 2022.
[13] Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, and Pietro Abate. Polymorphic
functions with set-theoretic types. Part 2: local type inference and type recon-
struction. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium
on Principles of Programming Languages, POPL ’15, pages 289–302, January 2015.
doi:10.1145/2676726.2676991.
[14] Giuseppe Castagna, Kim Nguyen, Zhiwu Xu, Hyeonseung Im, Sergueï Lenglet,
and Luca Padovani. Polymorphic functions with set-theoretic types. Part 1: Syntax,
semantics, and evaluation. In Proceedings of the 41st Annual ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages, POPL ’14, pages
5–17, January 2014. doi:10.1145/2676726.2676991.
[15] CDuce. [Link]
[16] CDuce git repository. [Link]
[17] Elixir. [Link]
[18] eqWAlizer. [Link]
[19] A. Frisch, G. Castagna, and V. Benzaken. Semantic subtyping: dealing set-
theoretically with function, union, intersection, and negation types. Journal of
the ACM, 55(4):1–64, 2008.
[20] Gleam. [Link]
[21] Hamler. [Link]
[22] Joseph Richard Harrison. Robust Communications in Erlang. University of Kent
(United Kingdom), 2020.
[23] J. Roger Hindley. The principal type-scheme of an object in combinatory logic.
Transactions of the American Mathematical Society, 146:29–60, 1969.
[24] Alan Jeffrey. Semantic subtyping in Luau. Blog post, November 2022. Accessed
on May 6th 2023. URL: [Link]
[25] Svenningsson Josef. Gradualizer. [Link] 2019.
[26] Victor Lanvin. A semantic foundation for gradual set-theoretic types. PhD thesis,
Université Paris Cité, 2021.
[27] Tobias Lindahl and Konstantinos Sagonas. Practical type inference based on
success typings. In ACM-SIGPLAN International Conference on Principles and
Practice of Declarative Programming, 2006.
[28] Luau. [Link]
27
The Design Principles of the Elixir Type System
[29] Simon Marlow and Philip Wadler. A practical subtyping system for Erlang. ACM
SIGPLAN Notices, 32(8):136–149, 1997.
[30] John McCarthy. Recursive functions of symbolic expressions and their com-
putation by machine, part i. Commun. ACM, 3(4):184–195, apr 1960. doi:
10.1145/367177.367199.
[31] Robin Milner. A theory of type polymorphism in programming. Journal
of Computer and System Sciences, 17(3):348–375, 1978. URL: [Link]
[Link]/science/article/pii/0022000078900144, doi:[Link]
1016/0022-0000(78)90014-4.
[32] Sven-Olof Nyström. A soft-typing system for Erlang. In Erlang Workshop, 2003.
[33] Pure Erlang. [Link]
[34] Purescript. [Link]
[35] Nithin Vadukkumchery Rajendrakumar and Annette Bieniusa. Bidirectional
typing for erlang. In Proceedings of the 20th ACM SIGPLAN International Workshop
on Erlang, pages 54–63, 2021.
[36] Didier Rémy. Type checking records and variants in a natural extension of
ML. In Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles
of Programming Languages, POPL ’89, page 77–88, New York, NY, USA, 1989.
Association for Computing Machinery. doi:10.1145/75277.75284.
[37] Andreas Rossberg, Claudio Russo, and Derek Dreyer. F-ing modules. Journal of
functional programming, 24(5):529–607, 2014.
[38] Albert Schimpf, Stefan Wehr, and Annette Bieniusa. Set-theoretic types for
Erlang, 2023. URL: [Link] doi:10.48550/ARXIV.2302.12783.
[39] Sesterl. [Link]
[40] J. G. Siek and W. Taha. Gradual typing for functional languages. In Scheme and
Functional Programming Workshop, volume 6, pages 81–92, 2006.
[41] Nachiappan Valliappan and John Hughes. Typing the wild in Erlang. In Proceed-
ings of the 17th ACM SIGPLAN International Workshop on Erlang, pages 49–60,
2018.
[42] Mitch Wand. Type inference for record concatenation and multiple inheritance.
In Proceedings of the Fourth Annual Symposium on Logic in Computer Science,
pages 92–97, 1989. doi:10.1109/LICS.1989.39162.
A Types
[ Beppe: This is not enough, you have to deduce for it the decomposition rule for
the algorithm or at least show how subtyping can be encoded]
28
Giuseppe Castagna, Guillaume Duboc, and José Valim
Using the usual semantic subtyping relation for unary functions, our system encodes
multi-arity function types using a CDuce record type:
Subtyping Properties.
∀(t 1 , .. , t n , t) Fun ((t 1 , .. , t n ) , t) ≤ Fun1
there is a distinct top function for functions of all arity, which is for all n,
Funn = {fn = O → 1; ar = n}
Note that this encoding makes intersections of function types of different arity be
the empty type, since n ∧ m is O if n ̸= m.
29
The Design Principles of the Elixir Type System
B Language
B.1 Syntax
Constants c ::= i | k
Expressions e, f ::= c | x | λ ( x. e) | f ( e ) | {e} | πe (e)
|let x : t = e in e | e + e
| case e do p g → e
Base types b ::= int | atom | 1fun | 1tup
Types t, s ::= b | c | α | t → t | t ∨ t | ¬t | t
Singletons ℓ ::= c | {ℓ}
Patterns p ::= ℓ | x | p & p | {p}
Guard atoms a ::= ℓ | x | πa (a) | size (a)
Guards g ::= a ? t | a = a | g and g | g or g
Type environments Γ ::= • | Γ , x: t
30
Giuseppe Castagna, Guillaume Duboc, and José Valim
Γ ⊢e:t e:t t ≤s e : t1 e : t2
(inst) dom (ϱ) ∩ Γ = ∅ (≤) (∧)
Γ ⊢e: tϱ e:s e : t1 ∧ t2
f :s x :s∧u⊢ e : t x : s ∧ ¬u ⊢ e : t
(∨) fv(u) = ∅
e[ f /x] : t
The system of Figure 3 uses a presentation in which only the relevant part of the type
environments is presented (i.e., the part Γ ⊢ is often omitted). In that system the
rules marked by a “Ω” correspond to cases in which the type-checker emits a warning
since it cannot ensure type safety. More precisely, whenever a rule (projΩ ) is used
the type-checker warns that the expression may generate an "index out of range"
exception; when the rule (caseΩ ) is used the type-checker warns that the case may
not be exhaustive.
31
The Design Principles of the Elixir Type System
32