hn-classics/_stories/2010/13911027.md

383 lines
18 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
created_at: '2017-03-20T01:32:48.000Z'
title: Bootstrapping a Type System (2010)
url: http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system/
author: lemming
points: 101
story_text:
comment_text:
num_comments: 22
story_id:
story_title:
story_url:
parent_id:
created_at_i: 1489973568
_tags:
- story
- author_lemming
- story_13911027
objectID: '13911027'
year: 2010
---
[Source](http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system/ "Permalink to
Bootstrapping a Type System
– journal.stuffwithstuff.com
")
#
Bootstrapping a Type System
– journal.stuffwithstuff.com
# [ Bootstrapping a Type System ][1]
↩ ↪
#### [October 29, 2010][2] [code][3] [java][4] [language][5] [magpie][6]
Magpie has reached a strange but neat little milestone on the road from “weird experiment” to “real language”: the type system is now bootstrapped. Im not aware of any other languages where a statement like that is even _meaningful_ (maybe [Typed Scheme][7]?), much less _true_, so Ill try to break that down.
Even if youre not interested in Magpie, my hope is that you can at least get a better picture of how type systems work in general from this. But, as a warning, this ventures into an echelon of programming language nerdery that few care to explore. Ill try not to be too boring, but no promises.
## What is a Type?
“Type” like a lot of other simple words (“object” anybody?) ends up with a lot of fuzzy overlapping definitions in programming: classes, runtime types, prototypes, type objects. In dynamically-typed languages youll hear “values have type” while others say those languages have no types.
Magpie makes this distinction _very_ blurry, but for our purposes what I care about is the static definition of a type: its the stuff that the type-checker works with. For example, say you create a variable:
var i = 123
The type-checker will track that `i` is an integer. It does so by associating an object— a _thing_— with that name. That object is a type. If you think of types as objects that the type-checker works with, that leads to the natural question, “What operations on types do we need to support?”
For Magpie, there are two important ones:
### Assignment Compatibility
The first operation is a test for assignment compatibility (i.e. the subtype relation). Put plainly, if we have a variable of type `A`, and were trying to assign to it a value of type `B`, is that kosher? This sounds like its limited to variables, but that core operation covers almost everything you need to do with a type. Its equivalent to asking, “Is `B` a subtype of `A`?” and covers things like, “Can I pass an argument of type `A` to a function declared to expect type `B`?”
So, with this operation, we can track how types are related to each other, and make sure we dont pass or assign values of the wrong type.
### Member Type
The other operation, which is a little more OOP-specific, is determining if a type has a member with a given name, and, if so, what its type is. This helps us ensure we arent calling unsupported methods or accessing undefined fields on a class. For example, if we have a chunk of code like:
list add("item")
The type-checker needs to get the type of the `list` variable and ask it what the type of its `add` member is. If the answer to that is nothing (i.e. it doesnt have that member) or not a function that accepts a string, then that method call is an error.
## An Interface for Types
If you have an OOP mindset, you can look at those two operations as defining an interface. If you were implementing a type-checker in Java, you could define a type to be any class that implements:
interface Type {
boolean canAssignFrom(Type other);
Type getMemberType(String name);
}
The concrete classes that represented kinds of types (say “Class”, “Interface”, “Array”, etc.) would then implement that interface and youre good to go. Magpie does exactly that.
## First-Class Types
Its just that _where_ it does that is kind of unusual. In Magpie, all types are first-class, like Ruby or Python. For example:
var monkey = Monkey new("Bobo")
In this code, `Monkey` is just a global variable whose value is an object representing the `Monkey` class. So classes are first-class objects.
But there are _types_ that are not _classes_. For example, “array of ints” is a type, but its not a class. Magpie also has [ad-hoc unions][8], so “int or string” is a type, but its not a class.
It would be perfectly valid for _classes_ to be first-class, without having all types be first-class, except for one important thing: generics. Magpie has generics because Im firmly of the opinion that a type system without generics is about as useful as a language with functions but no parameters. Do you really want a language whose type system is as expressive as `GOSUB` in BASIC?
Given generics, you should be able to instantiate them with any _type_, not just any _class_. That way, you can create things like “collection of array of ints”. Its also important for type arguments in generics to be [reified][9] (unlike [Javas short bus generics][10]). If you have a “list of ints” it should _know_ that its a list of ints and be able to act on that fact.
Those last two points: that any type can be a type argument in a generic, and that type arguments should be reified means that all types really do need to be first-class objects in Magpie.
## Piercing the Veil
So, going down this path, we find ourselves in a weird place. Over in the Java type-checking code, we have some `Type` interface and a bunch of classes that implement it. The type system is fully fleshed out in Java. Meanwhile, on the other side of the veil we have these first-class objects in Magpie that also fully implement the type system.
Thats redundant, kind of ugly, and hard to manage. To get rid of that redundancy, well kill one side. Which one? We know we need types to be first- class in Magpie, so that really only leaves one option: yank the type system out of Java.
## What A Type-Checker Does
A type-checker is actually a pretty simple piece of software. It basically walks through a chunk of code tracking the types of the operations and checking them for validity. For example, consider:
print((1 + 2) string)
The parser converts that into an [AST][11], a little tree like:
(print)
|
(string)
|
(+)
/
(1) (2)
The type-checker walks that tree from the bottom to top. Something like this:
1. Get the type of `1` (`Int`).
2. Get the type of `2` (`Int`).
3. Look up the type of `+` on the receivers type (`Int`, from _1_).
4. See if the parameter type for `+` (`Int`, from the declaration of `+`) matches the argument type (`Int`, from _2_). (It does.)
5. Get the return type of that method (`Int`).
6. Look up the type of `string` on the receivers type (`Int`, from _5_).
7. Get the return type of that method (`String`).
8. Look up the type of `print`.
9. See if the parameter type for `print` (`String`, from its declaration) matches the argument type (`String`, from _7_). (It does.)
And were done. Steps 3 6, and 8 are where we do `getMemberType`. Steps 4 and 9 and where we use `canAssignFrom`. Nothing too crazy. In Magpie, this is basically [one file][12].
## Re-piercing the Veil
Heres where it gets weird. We need to get this working with first-class types. That means that `getMemberType` and `canAssignFrom` will be methods written in Magpie.
To do this, the type-checker will do something unusual. In the middle of _statically_ checking a chunk of code, it will periodically switch to _dynamically_ evaluating some Magpie code. Whenever it needs to compare two types or look up a method, it will hand off to the interpreter which will call the appropriate method and and pass the result back to the type-checker.
This is slightly less crazy than it seems because the code being dynamically evaluated (methods on the type objects) is generally not the code being type- checked (any random piece of Magpie code like `1 + 2`). Keeping track of which context the interpreter is in is a little bit confusing, but it works.
## Bootstrapping
Magpie has worked this way successfully for several months. This is possible because at its core, Magpie is dynamic: it can execute a chunk of Magpie code without having done any compilation or type-checking on it. So I could write the classes that defined the type system in Magpie without using any type annotations and theyd run happily along like any dynamic language.
It worked that way, with the type system itself being dynamically-typed until I had enough type system features in place to actually be able to express all of the types I needed.
In the last week or so, I finally reached that point. The most important part was interfaces. I needed the ability to define an interface for `Type` that the different type classes (`ArrayType`, `Class`, `Tuple`, etc.) could implement. Interface types are defined entirely within Magpie, there isnt a single line of Java code related to them.
Once I had that working, I could use that to define a `Type` interface (an instance of the Magpie class `Interface`). Then I could turn around and use that type to annotate the types of the different methods that the type system requires. Where before, methods like `canAssignFrom(other)` were dynamically typed, now I could do `canAssignFrom(other Type)`.
The end result was, by far, the most mind-crushingly meta thing Ive ever had to wade through. Behold:
interface Type
canAssignFrom(other Type -> Bool)
getMemberType(name String -> Type | Nothing)
// some other stuff...
end
That chunk of Magpie code defines an interface `Type` that the type classes in Magpie must implement. Youll note that its recursive: most of its methods take arguments of type `Type`.
Every class that defines a type in Magpie implements that. Interfaces are a type, so [`Interface`][13] implements it, of course:
class Interface
canAssignFrom(other Type -> Bool)
// Check that other type has every member of this one.
for member = members do
var type = member memberType()
let otherMem = other getMemberType(member name) then
// Must be assignable.
if type canAssignFrom(otherMem) not then
return false
end
else return false // Must have members
end
// If we got here, every method was found and matched.
true
end
getMemberType(name String -> Type | Nothing)
let member = members first(
fn (m Member -> Bool) m name == name) then
member memberType()
end
end
// other stuff...
end
So we have a Magpie class, `Interface` that implements an interface, `Type`, that is in turn an instance of that same `Interface` class.
Now that all of the methods in the classes defining the type system have type annotations, this means the type-checker will check them too. So when a Magpie program is type-checked, it will also type-check the type system itself.
## A Side-Effect: One Less Language
With all types being actual live Magpie objects, that raises an odd question: “What is a type annotation?” In most languages, there are actually two languages mixed together: the language of expressions and the language of types. So, in Java, there are value expressions like `1 + 2` but also type expressions like `List<Int>`, and these are two totally separate languages.
But in Magpie, even type expressions need to yield Magpie objects, so there is no meaningful distinction between expressions and type annotations. Type annotations are just regular Magpie expressions, evaluated dynamically during type-checking. If you have a function like:
parseNumber(value String -> Int | Nothing)
// ...
end
That `String` is just a regular Magpie expression, as is `Int | Nothing`. In the case of the latter, `|` is just an operator on the `Class` class.
This means that not only is Magpies type system extensible, even its type _annotations_ are. If you wanted to, you could define new operators or functions and use them in type annotations:
what(arg doSomething(Very * Strange))
I honestly dont know if thats a useful feature, but I do like the idea of not having the type system welded into the language. If you like prototypes, or algebraic data types, wouldnt it be nice if you could add them to the language yourself? Or maybe thats just crazy talk.
## Why?
You may be asking why the hell I went so far down the rabbit hole here. I didnt intend to, honest. I really dont want Magpie to be some weird esoteric language. All I wanted was:
1. The ability to imperatively modify classes like you can in Python and Ruby.
2. Static type-checking (after that imperative modification has happened).
This just seemed like the most straight-forward way to pull that off, but if you got any better ideas, Im all ears.
[ __ ][14] [ __ ][15] [ __ ][16] [ __ ][17] [ __ ][18] [ __ ][19]
Please enable JavaScript to view the [comments powered by Disqus.][20]
![][21]
Hi! I'm **Bob Nystrom**, the one on the left.
I wrote a book called [**Game Programming Patterns**][22].
You can email me at `robert` at this site or follow me on twitter at [`@munificentbob`][23].
## Elsewhere
* Code at [github][24]
* Code at [bitbucket][25]
* Tweets at [twitter][26]
* Photos at [flickr][27]
* Video at [vimeo][28]
* Posts at [google+][29]
## Categories
* [code][3] 67
* [language][5] 41
* [magpie][6] 24
* [c-sharp][30] 13
* [dart][31] 13
* [game-dev][32] 12
* [java][4] 10
* [cpp][33] 8
* [game-patterns][34] 6
* [parsing][35] 6
* [roguelike][36] 6
* [design][37] 5
* [go][38] 5
* [js][39] 4
* [book][40] 3
* [c][41] 3
* [finch][42] 3
* [python][43] 3
* [ruby][44] 3
* [blog][45] 2
* [f-sharp][46] 2
* [lua][47] 2
* [ai][48] 1
* [beta][49] 1
* [blogofile][50] 1
* [game][51] 1
* [jasic][52] 1
* [javascript][53] 1
* [music][54] 1
* [oop][55] 1
* [optimization][56] 1
* [oscon][57] 1
* [politics][58] 1
* [scheme][59] 1
* [typescript][60] 1
* [visualization][61] 1
All [73 articles][2]…
This blog is built using [jekyll][62]. The source repo for it is [here][63].
© 2008-2014 Robert Nystrom
[1]: http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system "Permanent Link to Bootstrapping a Type System"
[2]: http://journal.stuffwithstuff.com/archive
[3]: http://journal.stuffwithstuff.com/category/code
[4]: http://journal.stuffwithstuff.com/category/java
[5]: http://journal.stuffwithstuff.com/category/language
[6]: http://journal.stuffwithstuff.com/category/magpie
[7]: http://www.ccs.neu.edu/home/samth/typed-scheme/
[8]: http://journal.stuffwithstuff.com/2010/08/23/void-null-maybe-and-nothing/
[9]: http://stackoverflow.com/questions/557340/c-generic-list-t-how-to-get-the-type-of-t/557355#557355
[10]: http://en.wikipedia.org/wiki/Generics_in_Java#Type_erasure
[11]: http://en.wikipedia.org/wiki/Abstract_syntax_tree
[12]: http://bitbucket.org/munificent/magpie/src/tip/src/com/stuffwithstuff/magpie/interpreter/ExprChecker.java
[13]: http://bitbucket.org/munificent/magpie/src/8865c82a958d/base/Interface.mag
[14]: http://journal.stuffwithstuff.com//www.reddit.com/submit?url=http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system/
[15]: http://journal.stuffwithstuff.com//news.ycombinator.com/submitlink?u=http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system/&t=Bootstrapping a Type System
[16]: http://twitter.com/share?url=http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system&text=%22Bootstrapping a Type System%22%20%40munificentbob
[17]: http://www.facebook.com/share.php?u=http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system
[18]: https://plus.google.com/share?url=http://journal.stuffwithstuff.com/2010/10/29/bootstrapping-a-type-system
[19]: http://journal.stuffwithstuff.com/rss.xml
[20]: http://disqus.com/?ref_noscript
[21]: http://journal.stuffwithstuff.com/image/dogshot_square.jpg
[22]: http://gameprogrammingpatterns.com/
[23]: https://twitter.com/intent/user?screen_name=munificentbob
[24]: http://github.com/munificent
[25]: http://bitbucket.org/munificent
[26]: http://twitter.com/munificentbob
[27]: http://www.flickr.com/photos/bobisbob/
[28]: http://vimeo.com/bobisbob
[29]: http://plus.google.com/100798142896685420545
[30]: http://journal.stuffwithstuff.com/category/c-sharp
[31]: http://journal.stuffwithstuff.com/category/dart
[32]: http://journal.stuffwithstuff.com/category/game-dev
[33]: http://journal.stuffwithstuff.com/category/cpp
[34]: http://journal.stuffwithstuff.com/category/game-patterns
[35]: http://journal.stuffwithstuff.com/category/parsing
[36]: http://journal.stuffwithstuff.com/category/roguelike
[37]: http://journal.stuffwithstuff.com/category/design
[38]: http://journal.stuffwithstuff.com/category/go
[39]: http://journal.stuffwithstuff.com/category/js
[40]: http://journal.stuffwithstuff.com/category/book
[41]: http://journal.stuffwithstuff.com/category/c
[42]: http://journal.stuffwithstuff.com/category/finch
[43]: http://journal.stuffwithstuff.com/category/python
[44]: http://journal.stuffwithstuff.com/category/ruby
[45]: http://journal.stuffwithstuff.com/category/blog
[46]: http://journal.stuffwithstuff.com/category/f-sharp
[47]: http://journal.stuffwithstuff.com/category/lua
[48]: http://journal.stuffwithstuff.com/category/ai
[49]: http://journal.stuffwithstuff.com/category/beta
[50]: http://journal.stuffwithstuff.com/category/blogofile
[51]: http://journal.stuffwithstuff.com/category/game
[52]: http://journal.stuffwithstuff.com/category/jasic
[53]: http://journal.stuffwithstuff.com/category/javascript
[54]: http://journal.stuffwithstuff.com/category/music
[55]: http://journal.stuffwithstuff.com/category/oop
[56]: http://journal.stuffwithstuff.com/category/optimization
[57]: http://journal.stuffwithstuff.com/category/oscon
[58]: http://journal.stuffwithstuff.com/category/politics
[59]: http://journal.stuffwithstuff.com/category/scheme
[60]: http://journal.stuffwithstuff.com/category/typescript
[61]: http://journal.stuffwithstuff.com/category/visualization
[62]: http://jekyllrb.com/
[63]: https://github.com/munificent/journal