[Source](http://www.codecommit.com/blog/ruby/monads-are-not-metaphors "Permalink to Monads Are Not Metaphors - Code Commit")
# Monads Are Not Metaphors - Code Commit
![][1]
![][2]
[Skip to content][3]
![Print][4]
## [Monads Are Not Metaphors][5]
27
Dec
2010
_This article is also [available in Japanese][6]._
I am about to break a promise. Almost three years ago, I promised myself that I would _never_ write an article about monads. There are too many such articles already; so many, in fact, that people are often confused by the sheer proliferation. Everyone seems to have a different take on the subject, meaning that those attempting to learn the concept for the first time are stuck trying to reason out the commonalities between burritos, space suits, elephants and desert Bedouins.
I’m not going to add to this menagerie of confusing analogies. The fact is that none of these parallels are entirely accurate. None of them convey the whole picture, and some of them are blatantly misleading in important respects. You will _never_ come to understand monads by pondering Mexican food and the Final Frontier. The only way to understand monads is to see them for what they are: a mathematical construct.
### Math (or not)
Here’s the thing about monads which is hard to grasp: monads are a pattern, not a specific type. Monads are a _shape_, they are an abstract interface (not in the Java sense) more than they are a concrete data structure. As a result, _any_ example-driven tutorial is doomed to incompleteness and failure. The only way to really understand is to take a step back and look at what monads mean in the _abstract_ rather than the concrete. Take a look at the following Ruby snippet:
| ----- |
| |
def foo(bar)
puts bar
bar.size
end
|
Just as a quick Ruby refresher, we can rewrite this code in the following way:
| ----- |
| |
def foo(bar)
puts bar; bar.size
end
|
Ruby has this neat convention (which is shared by most modern languages) which causes the final expression in a method to be turned into the implicit return statement. Thus, the `foo` method will take a parameter, print it to standard out and then return its `size`. Fairly simple, right?
Here’s the puzzler: what is the semicolon (`;`) doing? It’s tempting to say that it’s just a separator, but theoretically speaking, there’s something much more interesting going on here. Let’s switch to Scala and add some Christmas trimmings:
| ----- |
| |
def foo(bar: String) = {
({ () => println(bar) })()
({ () => bar.length })()
}
|
Just in case you’re unfamiliar with Scala, I’d like to make it clear that we are _not_ required to enclose every statement inside its own lambda (anonymous function). I’m just doing that to make a point.
This function does exactly the same thing as the Ruby version. Well, the parameter is a bit more constrained since we require a `String` rather than accepting anything defines `size`, but moving past that… The major difference from what we had previously is that each statement is wrapped inside its own anonymous function, which we immediately call. We can again play the same semicolon trick that we used in Ruby. However, because the statements are actually functions, we can go a step further:
(note: the `andThen` method isn’t defined for functions of 0-arity, but we’re going to pretend that it is and that it works the same as it does for functions of one argument. If it makes you feel better, you can pretend that these are both one-argument functions taking `Unit` as a parameter, the theoretical implications are the same, it just requires more syntax)
Notice that we haven’t actually used the semicolon (although we could have). Instead, we’re _combining_ two functions together and invoking them at the very end. The semantics we’re using to do the combination are such that the first function will be evaluated, then its result (`()`) discarded and the second function evaluated with its result returned. For those following along at home, we could easily define `andThen` in the following way:
| ----- |
| |
def funcSyntax[A](f1: () => A) = new {
def andThen[B](f2: () => B) = f1(); f2()
}
|
In a way, we have defined a method which literally encapsulates the effect of the semicolon “operator”, allowing us to apply it directly to functions, rather than dealing with it indirectly at the level of statements. That’s kind of a cool thought, but the important point is that we are first executing the first function, discarding its result and then executing the second function, returning its result.
It should be clear that we could extend this to any number of functions. For example:
| ----- |
| |
def foo(bar: String) = {
({ () => println("Executing foo") } andThen
{ () => println(bar) } andThen
{ () => bar.length })()
}
|
Still with me? Congratulations, you’ve seen your first monad.
### [You Could Have Invented Monads! (and maybe you already have)][7]
This certainly isn’t a monad in the traditional sense, but if we worked at it, we could show that the monadic axioms do hold. The significant point here is what this monad is doing: combining one thing together with another in sequence. In fact, this is what _all_ monads do, deep down. You start out with Thing One, and you have a function which will (given One) will give you Thing Two. Monads let you combine Thing One and your function together, producing a final resultant Thing. Let’s look at some more code:
| ----- |
| |
case class Thing[+A](value: A)
|
This is about the simplest container imaginable (in fact, it is _precisely_ the simplest container imaginable, but that’s not relevant now). We can wrap up values inside of `Thing`, but that’s about it:
| ----- |
| |
val a = Thing(1)
val b = Thing(2)
|
Now, let’s switch into design mode for a moment. Imagine that we find ourselves writing a lot of code which looks like this:
| ----- |
| |
def foo(i: Int) = Thing(i + 1)
val a = Thing(1)
val b = foo(a.value) // => Thing(2)
|
We’re starting with a `Thing`, and then we’re using the value inside of that `Thing` to call a function which gives us a new `Thing`. If you think about it, this is actually a very common pattern. We have a value, and then we use that value to compute a new value. Mathematically, this is pretty much the same as the following:
| ----- |
| |
def foo(i: Int) = i + 1
val a = 1
val b = foo(a) // => 2
|
The only difference between these is that the first version wraps everything in `Thing`, while the second version is using “bare” values.
Now, let’s extend our imagine just a bit and assume that we have a good reason for wrapping everything inside of `Thing`. There could of course be any number of reasons for this, but basically it boils down to the notion that `Thing` might have some extra logic which does interesting things with its value. Here’s the question: can we come up with a nicer way of going from `a` to `b`? Basically, we want to encapsulate this pattern as a more general tool.
What we want is a function which pulls the value out of `Thing` and then calls another function with that value, returning the result of that function call (which will be a new `Thing`). Since we’re good object-oriented programmers, we will define this as a method on class `Thing`:
| ----- |
| |
case class Thing[+A](value: A) {
def bind[B](f: A => Thing[B]) = f(value)
}
|
So if we have a `Thing`, we can pull its value out and use it to compute a new `Thing`, all in one convenient step:
| ----- |
| |
def foo(i: Int) = Thing(i + 1)
val a = Thing(1)
val b = a bind foo // => Thing(2)
|
Notice that this is a lot cleaner that our original version, while still performing exactly the same function. `Thing` is a monad.
#### The Monad Pattern
Any time you start with something which you pull apart and use to compute a new something of that same type, you have a monad. It’s really as simple as that. If it sounds like I’m describing almost all of your code, then good, that means you’re starting to catch on. Monads are everywhere. And by “everywhere”, I do mean _everywhere_.
To understand why this is, let’s look at what it is that makes `Thing` a monad:
| ----- |
| |
val a = Thing(1)
|
The first thing is that I can wrap up a value inside of a new `Thing`. Object-oriented developers might call this a “constructor”. Monads call it “the `unit` function”. Haskell calls it “`return`” (maybe we shouldn’t try to figure out that one just yet). Whatever you call it though, it comes to the same thing. We have a function of type `A => Thing`; a function which takes some value and wraps it up inside a new `Thing`.
| ----- |
| |
a bind { i => Thing(i + 1) }
|
We also have this fancy `bind` function, which digs inside our `Thing` and allows a function which we supply to use that value to create a new `Thing`. Scala calls this function “`flatMap`“. Haskell calls it “`>>=`“. Again, the name doesn’t matter. What’s interesting here is the fact that `bind` is how you combine two things together in sequence. **We start with one thing and use its value to compute a new thing.**
It’s as simple as that! If you’re like me, then you’re likely asking the following question: if it’s so simple, what’s all the fuss about? Why not just call this the “using one thing to compute another” pattern? Well, for one thing, that’s far too verbose. For another, monads were first defined by mathematicians, and mathematicians _love_ to name things. Mathematics is all about finding patterns, and it’s hard to find patterns if you can’t affix labels once you’ve found them.
### More Examples
I said that monads are everywhere (they are!), but we’ve only looked at two examples so far. Let’s see a few more.
#### Option
This might be the most famous monad of all, quite possibly because it’s one of the easiest to understand and by far the easiest to motivate. Consider the following code:
| ----- |
| |
def firstName(id: Int): String = ... // fetch from database
def lastName(id: Int): String = ...
def fullName(id: Int): String = {
val fname = firstName(id)
if (fname != null) {
val lname = lastName(id)
if (lname != null)
fname + " " + lname
else
null
} else {
null
}
}
|
Here again, we have a fairly common pattern. We have two functions (`firstName` and `lastName`) which are responsible for producing some data which may or may not be available. If the data is available, then it is returned. Otherwise, the result of these functions will be `null`. We then use these functions to do something interest (in this case, compute a full name). Unfortunately, the fact that `firstName` and `lastName` may or may not produce a useful value needs to be handled explicitly with a set of nested `if`s.
At first blush, it seems this is the best that we can do. However, if you look very closely, you can find the monad pattern buried in this code. It’s a little more complicated than last time, but it’s still there. Let’s try wrapping everything in `Thing` to make it clear:
| ----- |
| |
def firstName(id: Int): Thing[String] = ... // fetch from database
def lastName(id: Int): Thing[String] = ...
def fullName(id: Int): Thing[String] = {
firstName(id) bind { fname =>
if (fname != null) {
lastName(id) bind { lname =>
if (lname != null)
Thing(fname + " " + lname)
else
Thing(null)
}
} else {
Thing(null)
}
}
}
|
See it now? As I said, monads are everywhere. Here’s the really useful bit though: every time we `bind`, the very first thing we do _inside_ the function is test the value to see if it is `null`, why not move that logic into `bind`? Of course, we can’t really do that without changing `Thing` into something different, so we will define a new monad called `Option`:
| ----- |
| |
sealed trait Option[+A] {
def bind[B](f: A => Option[B]): Option[B]
}
case class Some[+A](value: A) extends Option[A] {
def bind[B](f: A => Option[B]) = f(value)
}
case object None extends Option[Nothing] {
def bind[B](f: Nothing => Option[B]) = None
}
|
If you block out everything except `Some`, this looks a _lot_ like our old friend, `Thing`. The main difference is that `Option` has two different instantiations: `Some`, which contains a value, and `None`, which doesn’t contain a value. Think of `None` as being just an easier way of writing `Thing(null)`.
What’s interesting is that `Some` and `None` need to have two different definitions of `bind`. The definition of `bind` in `Some` looks a lot like the definition in `Thing`, which makes sense as `Some` and `Thing` are almost identical. However, `None` defines `bind` to always return `None`, ignoring the specified function. How does this help us? Well, let’s return to our `fullName` example:
| ----- |
| |
def firstName(id: Int): Option[String] = ... // fetch from database
def lastName(id: Int): Option[String] = ...
def fullName(id: Int): Option[String] = {
firstName(id) bind { fname =>
lastName(id) bind { lname =>
Some(fname + " " + lname)
}
}
}
|
All of those nasty `if` statements have disappeared. This works because `firstName` and `lastName` now return `None` when they fail to fetch the database record, rather than `Thing(null)`. Of course, if we try to `bind` on `None`, the result will always be `None`. Thus, the `fullName` function returns the combination of `firstName` and `lastName` inside of a `Some` instance only if neither `firstName` nor `lastName` return `None`. If either one returns `None`, then the result of the whole thing will be `None`.
For those keeping score at home, we have “accidentally” stumbled upon Groovy’s [safe-dereference operator][8] (`?.`), [Raganwald’s `andand`][9] for Ruby and many, many more. See? Monads are everywhere.
### IO
Anyone trying to understand monads will inevitably run into Haskell’s `IO` monad, and the results are almost always the same: bewilderment, confusion, anger, and ultimately Perl. The fact is that `IO` is a rather odd monad. It is fundamentally in the same category as `Thing` and `Option`, but it solves a very different problem.
Here’s the deal: Haskell doesn’t allow side-effects. At all. Functions take parameters and return values; you can’t just “change” something outside the function. As an example of what this means, let’s return to our earlier Ruby code:
| ----- |
| |
def foo(bar)
puts bar
bar.size
end
|
This function takes a value, calls its `size` method and returns the result. However, it also _changes_ the standard output stream. This is pretty much the same as if we had a global array hanging around which we just mutated in-place:
| ----- |
| |
STDOUT = []
def foo(bar)
STDOUT += [bar]
bar.size
end
|
Haskell doesn’t have variables of any sort (imagine if Scala _didn’t_ have `var`, or if every variable in Java were `final`). Since we don’t have any variables, we can’t change anything in-place. Since we can’t change anything in-place, there’s no way we can define a `puts` function. At least, not the `puts` we’re used to.
Let’s switch back to Scala. Our goal here is to define a `println` function which _doesn’t_ rely on any sort of mutable state (ignoring for the moment the fact that the standard output stream is always going to be mutable, since there’s no way we can “change” the user’s screen by cloning the physical display). One thing we could do is wrap up our standard output stream as a `Vector` which we carry along with our functions:
Theoretically, we could write all of our `println`-enabled functions in this way, passing in the current `stdout` and receiving the new state as a result. At the end of the day, the entire program would produce a result in addition to the final state of `stdout`, which could be printed to the screen by the language runtime.
This works (at least for `println`), but it’s ridiculously ugly. I would hate it if I had to write code like this, and early Haskell adopters felt exactly the same way. Unfortunately, things only get worse from there. Our trick with `Vector[String]` may work for the standard output stream, but what about standard in? At first blush, it _seems_ as if the `readLine` function wouldn’t be as bad as `println`, after all, we aren’t changing anything! Unfortunately, something is clearly being changed at some level, otherwise repeated calls to `readLine` would yield the same result (clearly not the case).
Graphics updates, networking, the list goes on. It turns out that any _useful_ program is going to need to have side-effects, otherwise there’s no way for all of that usefulness to get _outside_ the program where we can see it. So, Haskell’s designers (more specifically, Phillip Wadler) needed to come up with a way to solve this problem not only for standard out, but for _all_ side-effects.
The solution is actually quite simple. To solve the standard out problem with `println`, we just passed a `Vector[String]` around, “modifying” it by returning the new state along with our regular return value. Here’s the inspiration: what if we did that for the _entire universe?_ Instead of passing around just a plain `Vector[String]`, we pass around `Universe`:
As long as the language runtime is able to somehow give us an instance of `Universe` which behaves the way we would expect, then this code will work just fine. Obviously, the runtime can’t _really_ package up the entire cosmos and allow us to get new versions of it, but it can cheat a bit and _pretend_ to give us the entire universe. The language runtime is allowed to implement the `println` function on the `Universe` object in whatever way it deems best (hopefully by actually appending to standard out). Thus, we let the runtime perform whatever magic is necessary while we remain blissfully ignorant of any and all side-effects.
This solves our problems alright, but it’s horrible in almost every other respect. We have this manual threading of the `Universe`, which is both verbose and painful. Even worse, this sort of thing is very error prone (i.e. what happens if we “modify” the universe and then go back and change an older version of it? do we get two, parallel universes?). The heart of our problem now is that we are using the old version of the universe to compute a new version of the universe, and we’re doing that manually. We’re taking something (in this case, the universe) and using its value to compute a new something (the new version of the universe). Sound familiar?
Phillip Wadler’s inspiration was to take advantage of the monad pattern to solve this problem. The result is the `IO` monad:
| ----- |
| |
def foo(bar: String): IO[Int] = {
println(bar) bind { _ => IO(bar.length) }
}
def println(str: String): IO[Unit] = {
// TODO insert magic incantations
}
|
Of course, we can’t _actually_ implement `println`, since this fake language isn’t allowed to involve itself in side-effects. However, the language runtime could provide a native (read: cheat code) implementation of `println` which performs its side-effects and then conjures a new version of `IO` (i.e. the modified universe).
The only catch with this design is that we can never get anything _out_ of `IO`. Once you start down the dark path, forever will it dominate your destiny. The reason for this is one of language purity. Haskell wants to disallow side-effects, but if it were to allow us to pull values out of `IO`, then we could very easily bypass its safe-guards:
| ----- |
| |
def readLine(): IO[String] = {
// TODO insert magic incantations
}
def fakeReadLine(str: String): String = {
val back: IO[String] = readLine()
back.get // whew! doesn't work
}
|
As you can see, if we could pull values _out_ of `IO`, then the whole exercise would become a waste of time, since it would be trivially easy to hide side-effects within wrapper functions.
Of course, none of this is particularly relevant to Scala or Ruby, much less Java! Scala doesn’t restrict side-effects. It allows you to call `println` whenever you feel like it, and it provides a way of declaring mutable variables (`var`). In a sense, Scala is hiding a hypothetical `IO` monad. It’s as if _all_ Scala code is implicitly inside of `IO`, and thus implicitly threading the state of the universe from one moment to the next. In light of this fact, why should we care at all about how `IO` works? Mostly because it’s a monad which is very different from `Thing` and `Option`. I briefly considered using `State`, but that’s too much ancillary complexity when we’re trying to focus on the core idea of using the value from one thing to compute another thing.
### Now What?
So, we’ve identified the monad pattern. We can spot it in code and employ it in an impressive range of situations, but why are we bothering with all the ceremony? If we’re already using monads all over the place without realizing it (e.g. semicolon), then why do we have to bother futzing with all of this nomenclature? Simply put: why do we care that `Option` is a monad so long as it just “does the right thing”?
Well, the first answer to that lies in the nature of mathematics, and by extension, computer programming. As I said, before, mathematics is all about identifying patterns (well, you also play with those patterns to generate larger, more intricate patterns, but that’s really just a means to an end). Once you identify a pattern, you name it. That’s just what mathematicians do. Imagine if Newton hadn’t named the “derivative”, and he simply insisted on calling it “an expression for the line tangent to a given line relative to a particular value of _x_, where _x_ is some variable in the expression for the given line.” For one thing, calculus textbooks everywhere would be about 50 times longer. For another, we probably never would have been able to see “the derivative” as an abstract entity. Partial derivation would have never been devised. Integrals, differential equations, infinite series, and almost all of physics might never have happened. None of these consequences have anything to do with the _name_, that’s just a label. Rather, it was the fact that Newton was able to see (and represent) the derivative in the abstract, as a mathematical _shape_ to be manipulated and applied in novel ways.
If you understand the `Option` monad, then you can use the `Option` monad. You can see places in your code where it can be applied, and you will reap enormous benefits as a result. However, if you understand monads as an abstraction, then you will not only understand `Option`, but also `IO`, `State`, `Parser`, `STM`, the list goes on. Or at least, you will understand the fundamental properties of these constructs (the rest is details). You will begin to see places where you are doing monadic things even when those things don’t fit exactly into the restricted mold of `Option` or `State`. This is where the true utility can be found.
Besides the (vast) improvements in your thought process, there’s also a more immediate practical upshot. It’s possible to define functions which work on monads in the generic sense, rather than specializing in one or the other. Just as Swing programming would be impossible if you had to rewrite every function for each specific instance of `Component`, there are many things which would be impossible (or at least, very very impractical) if you had to rewrite them for each specific monad. One such function is `sequence`:
| ----- |
| |
trait Monad[+M[_]] {
def unit[A](a: A): M[A]
def bind[A, B](m: M[A])(f: A => M[B]): M[B]
}
implicit object ThingMonad extends Monad[Thing] {
def unit[A](a: A) = Thing(a)
def bind[A, B](thing: Thing[A])(f: A => Thing[B]) = thing bind f
There are a lot of ways to pretty this up, but I wanted to be as explicit as possible for the sake of illustration. The general function of `sequence` is to take a `List` of monad instances and return a monad instance of a `List` of those elements. For example:
| ----- |
| |
val nums = List(Some(1), Some(2), Some(3))
val nums2 = sequence(nums) // Some(List(1, 2, 3))
|
The magic is that this function works on _any_ monad:
| ----- |
| |
val nums = List(Thing(1), Thing(2), Thing(3))
val nums2 = sequence(nums) // Thing(List(1, 2, 3))
|
In this case, `Monad` (the trait) is an example of a _typeclass_. Basically, we’re saying that there is this general idea of a monad, and any monad will define two functions: `unit` and `bind`. In this way, define functions which operate on monads without knowing which _specific_ monad we’re manipulating. Think of it like the adapter pattern on steroids (sprinkled with a goodly portion of Scala’s implicit magic).
This is reason number two for understanding the _general_ concept of a monad, rather than just the specific: you suddenly become able to define tons of nifty utility functions. For more examples of this, you need look no further than [Haskell’s standard library][10].
### Those Pesky Axioms
Congratulations! You just made it through an entire monad tutorial without ever having to worry about the monadic axioms or what they mean. Now that you’ve got the concept down (hopefully), you can graduate to the axioms themselves.
As it turns out, the monadic axioms are really quite intuitive. We’ve actually been assuming them all along without ever really saying so. The axioms define how the `unit` (constructor) and `bind` (composition) functions are supposed to behave under certain situations. Think of them a bit like the laws governing integer addition (commutativity, associativity, etc). They don’t tell you _everything_ about monads (actually, they don’t tell you much at all from an intuitive standpoint), but they do give you the basic, mathematical underpinnings.
Excited yet? No, I didn’t think so. Well, here they are anyway, defined in terms of the `Monad` typeclass we used earlier:
def associativity[A, B, C](m: M[A], f: A => M[B], g: B => M[C]) {
val mf: M[B] = tc.bind(m)(f)
val mg: M[C] = tc.bind(mf)(g)
val mg2: M[C] = tc.bind(m) { a =>
tc.bind(f(a))(g)
}
assert(mg == mg2)
}
forAll { (m, f, g) => associativity(m, f, g) }
}
|
The first two axioms (the “identity” ones) are basically saying that the `unit` function is a simple constructor with respect to the `bind` function. Thus, when `bind` “pulls apart” the monad and passes the value to its function parameter, that value will be precisely the value that `unit` puts _into_ the monad. Likewise, if the function parameter given to `bind` simply takes the value and wraps it back up inside the monad, then the final result is exactly the same as if we had left the monad alone.
The third axiom is the most complicated to express, but I think it’s actually the most intuitive. Basically, this axiom is saying that if you first `bind` with one function, then you `bind` the result against another function, that’s the same as applying the first function to the value _inside_ the monad and then calling `bind` on the result of that application. This isn’t exactly associativity in the classical sense, but you can sort of think about it in that way.
One of the useful consequences of the second law comes up (quite frequently) when you have one `bind` nested inside the another. Whenever you find yourself in that situation, you can move the nested `bind` outside of the outer `bind`, flattening your code slightly. Like so:
The rewritten code has a much more “sequential” feel (the essence of monads!) and is almost always shorter than the nested form.
As I said before, the axioms are very, very intuitive once you understand the abstract concept of a monad. They may not be intuitive to _express_, but their consequences are very easy to understand and quite natural in practice. So, don’t spend a lot of time trying to memorize the axioms. Your time will be better spent contemplating the way that semicolon works.
### Conclusion
Monads are _not_ scary. They are not complex, academic or esoteric. Monads are an abstract, mathematical label affixed to a pattern found in almost all code. [We all use monads every day][11]. The hardest part in understanding monads is recognizing that the hardest part isn’t so hard after all.
I sincerely hope that this latest, dubious venture down the well-trod path of monad exposition has proven a fruitful one. I can say without hesitation that the understanding and perspective which comes from a solid grasp of monads is invaluable in very practical, down-to-earth coding (even in staid languages like Java!). Monads impart an understanding of the very fabric of sequential computation and composability, and if that isn’t sufficient motivation to learn, I don’t know what is.
[Add to DZone][12] [Digg it][13] [Bookmark with Del.icio.us][14]
### Comments
1. Great tutorial (and good writing). I enjoyed it very much. Thank you.
Ran.
Ran Biron Monday, December 27, 2010 at 4:05 am
2. Marvellous! Thank you very much for the explanation from the pragmatic programmer’s point of view.
[thSoft][15] Monday, December 27, 2010 at 5:51 am
3. Your article fails to explain what monads are, and why they are useful.
[Anonymous][16] Monday, December 27, 2010 at 10:10 am
4. IMO, the only way to understand monads is by writing a bunch of combinator libraries, noticing the resulting duplication, and then discovering for yourself that monads let you factor out this duplication. In discovering this, everyone builds some intuition for what a monad is… but this intuition isn’t the sort of thing that you can communicate to someone else directly – it seems everyone has to go through the same experience of generalizing to monads from some concrete examples of combinator libraries. Without this grounding, monads (or any other typeclass) are just going to seem like abstract nonsense – yeah, maybe you can sort of ape what they are, but there won’t be much real understanding. ![:\)][17]
[Paul Chiusano][18] Monday, December 27, 2010 at 10:45 am
5. A good point well made. When I’ve tutored maths in the past, I’ve found that it’s very easy for people to over-complicate things. Often part of the lesson is that there is less to learn than you think.
[Rich Dougherty][19] Monday, December 27, 2010 at 2:38 pm
6.@Anonymous
I’m not sure I understand what you mean. The article provides both a precise mathematical and an informal, intuitive explanation of *what* monads are. I tried to provide the *why* as well, though I certainly may have fallen short in that regard. What would you suggest?
@Paul Chiusano
Big +1. Combinators were the breakthrough for me as well. When I implemented my first parser combinator library, that was when the light dawned and I saw the connection between Parser, IO, Option, List and State. ‘Twas quite exciting. ![:-\)][17]
[Daniel Spiewak][20] Monday, December 27, 2010 at 6:16 pm
7. You have misplaced the variance annotation on Monad.
It should be trait Monad[M[+_]].
(or missing entirely in the event you want to support monads over mutable data types)
[Edward Kmett][21] Tuesday, December 28, 2010 at 12:10 am
8. Thank you! Most of the articles on monads seem to fall into two categories: “I’ve just had a Monad Epiphany and I want to share how magical they are!” and “Monads are obvious and I’m not sure why its necessary for me to write this article anyways, but here you go, little ones…”
Kudos to you for making it all seem down-to-earth, practical and accessible.
Frank Davis Tuesday, December 28, 2010 at 6:37 am
9.@Edward
There is no reason to apply the variance annotation to M’s parameter. We never see a use of M[A] <:M[B]whereA<:BsimplybecauseweneverhaveanyinformationaboutAandBinsidethetypeclassusagepoints(they’realwaysuniversallyquantified).Assuch,avarianceannotationherewouldn’tbenefitus.
Also, omitting that variance annotation doesn’t provide any more safety in the face of mutable containers. Scala’s type system has a bit of a flaw in this regard: a variance annotation on a higher-kinded parameter restricts the instantiation to types which exhibit exactly that variance, but the lack of a variance annotation allows *any* types, regardless of variance. This is a very serious problem for containers like HOMap (see: my “high wizardry” talk).
Finally, the variance annotation must be on the monad (and not its parameter) to allow for things like this:
val nums = List(Some(1), Some(2), Some(3))
sequence(nums) // this right here!
Without the variance annotation, the call to sequence will fail as there is no typeclass for type Some[Int]. The only way to make this succeed (other than ascribing a type to nums) is to ensure that the typeclass for Option also applies to its subtypes.
[Daniel Spiewak][20] Tuesday, December 28, 2010 at 8:09 am
10. Nicely done! Thanks for an exposition that gets to the heart of it.
[Joel Neely][22] Tuesday, December 28, 2010 at 10:51 pm
11. Dude thanks a bunch,
Been reading for days about monads trying to figure out what are they but to no avail…Internet seems to be filled with academic talk about monads that are too hard for someone who has only a basic understanding of functional programming, or irritating examples that are not good in explaining the abstraction that is this high.
This is by far the best tutorial on monads that i have ever seen.
Thanks again
Miklos Wednesday, December 29, 2010 at 8:22 am
12. “Your article fails to explain what monads are, and why they are useful.”
Monads is a mathematical concept that is preventing Scala from achieving a more popular status. As long as we continue to have these pseudo academic articles, people will be turned off.
Chad Wednesday, December 29, 2010 at 10:18 am
13.@Chad
I’m not sure how you can support that claim. Java implements a large number of concepts which were purely mathematical only a few decades ago. Programming itself only existed in the minds of computational theorists a mere half-century ago. In short: people learn, adapt and grow. Those who don’t are doomed to maintain systems written in FORTRAN (and its successors). Academia has always (and will continue to) develop concepts and ideas which eventually find their way into widespread adoption. Mathematics is the tool by which such concepts are initially conceived and explored, and “pseudo academic articles” such as this are the tools by which these concepts are brought to wider attention.
[Daniel Spiewak][20] Wednesday, December 29, 2010 at 10:27 am
14. First of all, thank you for creating the first article about monads I could read from start to finish.
About the conclusion:
“Monads are not scary. They are not complex, academic or esoteric. Monads are an abstract, mathematical label affixed to a pattern found in almost all code. We all use monads every day. The hardest part in understanding monads is recognizing that the hardest part isn't so hard after all.”
I found paradoxical the fact all the other articles failed at some point to explain the concept of monads. Or they didn’t understood it, or they are not so “no complex”.
![:\)][17]
OscarRyz Wednesday, December 29, 2010 at 4:17 pm
15. Excellent article. I’ve read so many horrible explanations of Monads before. There are the vague analogies but there are also the overly technical/jargonized explanations that are just as bad. Many people have been looking for practical explanations like this, but they are sorely lacking. So why stop now? What are you going to tackle next?
david Wednesday, December 29, 2010 at 9:01 pm
16.@Daniel: I’m glad you broke your promise. There are, indeed, far too many tutorials on Monads already, and I have read about 1/3 of them myself. But yours is organized, clear, readable, and well-illustrated with examples and motivation. You have a good talent for writing up issues in a readable fashion, and I will be bookmarking this as one of the best Monad tutorials I have read.
[Michael Chermside][23] Thursday, December 30, 2010 at 10:21 am
17. Great tutorial. I realize that this syntax is pseduo-code, but I think that the variables “a” and “x” should match.
forAll { (x, f) => identity1(a, f) }
[Craig P. Motlin][24] Thursday, December 30, 2010 at 12:09 pm
18.@Craig
Nice catch! I’ve updated the snippet.
[Daniel Spiewak][20] Thursday, December 30, 2010 at 10:40 pm
19. This is definitely one of the best articles I’ve read on monads, in terms of explaining the underpinnings without being overly jargon-y. That being said, I will agree with some of the earlier commenters; while you have said what a monad is, the article should really start with a, “what is this, and why is it useful to you?” kind of paragraph. It’s really not clear how monads can benefit your code, where it might be useful, etc. from this article. If it helps any, no monad article I’ve ever read really explains it either. Again, great article, but it definitely could use a “why this matters to you” intro.
J.Ja
[Justin James][25] Friday, December 31, 2010 at 10:29 pm
20. Daniel,
Great article, the first that (hopefully) made me understand Monads. I’d only hoped for a recommendation on where to read on about Haskell’s use of Monads for IO & co. – conceptually, without learning Haskell itself (I know I should but hey, time is short) – without needing to resort to one of those articles not as easy to understand as yours.
Tonis Morris wrote in the comment: “Java has a monad built right in – the throws keyword (this is the disjoint union monad aka (Either a)).”
What does that mean? Is it really the throws keyword or more the throw keyword (which holds the current stacktrace and attaches it to the Exception following the throw keyword; effectively the bind function is to provide/create an Exception and the parameter to this function is the current stacktrace)?
My google search for “disjoint union monad” turned out be be even less effective than finding a good explanation what a monad is. ![:-\)][17]
Horst Makitta Friday, January 7, 2011 at 1:51 pm
22. Hello Daniel,
I’m new to Scala and try understand Your article, but have difficulties: following code from your article does not compile:
Daniel Michalik Friday, January 7, 2011 at 11:09 pm
24. Just a quick thing,
While you TECHNICALLY can’t pull values out of ReadLine, Haskell uses some syntactic sugar to make the nested form easy to write, so you can use closures.
// Scala
object MonadDemo extends MonadicApplication { //Pretend this defines a Main method to invoke iomain.
def iomain : IO[Unit] = print(“What is your name? “) bind { _ =>
readLine bind { name =>
printLine(“Hello, ” \+ name + “!”) } } }
}
; haskell
main :: IO ()
let main = do
print “What is your name?”
name <\- readLine
printLine "Hello, " ++ name ++ "!"
// Scala AGAIN (defines a similar syntax, but it works for flatMap, not bind.
object MonadDemo extends MonadicApplication { //Pretend this defines a Main method to invoke iomain.
def iomain : IO[Unit] = for {
_ <\- print("Whatisyourname?")
name <\- readLine
_ <\- printLine("Hello," \+ name+"!")
}
anonymous Saturday, January 8, 2011 at 9:32 pm
25. val mg2: M[C] = tc.bind(m) { a =>
tc.bind(f(a))(g)
}
should probably be
val mg2: M[C] = tc.bind(m) { a =>
tc.bind( tc.unit( f ( a ) ) )( g )
}
Marius Monday, January 31, 2011 at 4:45 am
26. Daniel,
Very nice article on monads. I remember having difficulty comprehending Monads when I was implementing my parser in Haskell for my senior thesis. The idea did eventually click, but the manifold articles on monads failed to sufficiently aid in that regard. Your article was easy for me to digest and enabled me to re-connect to the monadic way of thinking.
[Daniel Knight][26] Thursday, February 3, 2011 at 4:24 pm
27. Perhaps I am missing something, but off top of my head, I am thinking the following is semantically incorrect, because it makes all None equivalent. (Also doesn’t this force Nothing to a supertype of every possible A, so that bind can always be called with the same function whether it is Some or a None?)
case object None extends Option[Nothing] {
def bind[B](f: Nothing => Option[B]) = None
}
Is that the way it is implemented in Scala standard library? Seems to me that None should be parametrized too, so that a None for one type (e.g. String) isn’t equal to a None for another which is not covariant (e.g. Int).
case class None[+A] extends Option[A] {
def bind[B](f: A => Option[B]) = None
}
What is my mistake?
[Shelby Moore III][27] Monday, February 21, 2011 at 9:37 am
28. Correct typo in my prior comment:
case class None[+A] extends Option[A] {
def bind[B](f: A => Option[B]) = None[B]
}
[Shelby Moore III][27] Monday, February 21, 2011 at 10:32 pm
29. Caveat: none of the following code is tested and I am new to Scala and have never installed the Scala (nor the Java) compiler.
Daniel’s “_typeclass_” is a fully generalized convention for declaring static methods of an interface. Imagine you could declare static methods in a trait with this pseudo-code.
trait Monad[+X] {
static def unit[Y] : Y => Monad[Y]
def bind[M M) => M
}
sealed trait Option[+X] extends Monad[X] {
static def unit[Y]( y : Y ) : Monad[Y] = Some( y )
}
To get legal Scala, this is translated as follows, noting the +, -, or no variance annotation on M depend on where Monad appears in the static methods of Monad.
Before we can add the cases for Option, note that Monad requires “unit” to be invertible, i.e. bijective, but None has no inverse, so we need an injective monad.
Note my version of Daniels sequence will work with both bijective Monad and injective InjMonad, because the call to bind is a method of the instance; whereas, Daniel’s version assumed the injective monad and I see no possible way to fix it using his convention of implicit duck typing of non-static methods. His is an example of how duck typing breaks composability.
==================
**** Monad Theory ****
==================
The best layman’s explanation I have found so far is “_Comprehending Monads_” by Philip Wadler, 1992. Google for the PDF.
Conceptually a Monad has three functions:
unit : X -> M[X]
map : (X -> Y) -> M[X] -> M[Y]
join: M[M[X]] -> M[X]
The map function might be curried two ways:
map : (X -> Y) -> (M[X] -> M[Y])
map : M[X] -> ((X -> Y) -> M[Y]) // Will use this for trait below
We must overload the map function, if M is not same type as N, because otherwise map will not know which “unit” to call (in order to lift Y => M[Y]), because overloading on return type is [ambiguous due to covariance][28]:
map : (Y -> M[Y]) -> (X -> Y) -> N[X] -> M[Y]
bind : (X -> M[Y]) -> N[X] -> M[Y]
map a b = bind x -> a b x
The reason I rephrased the abstracted monad as a inherited trait with static methods, is so far in my research, I don’t agree with a general “implicit” keyword for a language design, because the general use of duck typing can violate the localized single-point-of-truth (SPOT) and can make semantic assumptions that were not intended, because duck typing forces all traits and classes to share the same member namespace, and thus essentially bypasses the [behavioral conditions][29] of the Liskov Substitution Principle contract of OOP. Also, since duck typing does not explicitly state which interfaces are required at the SPOT of the trait or class declaration, there is no way to know which interfaces are available by looking in one place. Localization (separation) of concerns is a critical attribute of reusable/scalable software design. Again the following is pseudo-code for the translation of static methods to implicit, but now fully generalized to monad theory.
trait Monad[+X] {
static def unit[Y] : Y => Monad[Y]
def bind[M M) => M
def map[M M, b : X => Y ) : M = bind x => a b x // bind( x => a( b( x ) ) )
static def join[M M[Y]
}
But the above trait won’t work for monads whose “unit” is not bijective, i.e. where the inverse of “unit” is lossy, e.g. None option has no inverse. The injective monads thus know which “unit” to call, thus we could add a map to our prior injective monad, which does not input a “unit”.
def bind[Y]( f : X => Option[Y] ) : Option[Y] = f( value )
def map[Y]( f : X => Y ) : Option[Y] = ObjectStaticMethod.unit( f( value ) ) // Some( f( value ) )
}
case class None[+X] extends Option[X] {
def bind[Y]( f : X => Option[Y] ) : Option[Y] = None[Y]
def map[Y]( f : X => Y ) : Option[Y] = None[Y]
}
[Shelby Moore III][27] Tuesday, February 22, 2011 at 9:15 am
30. The pre tags in my prior comment did not indent as expected (at least not in my browser). If you want to read a version of my prior comment with the code properly indented, [click here][30].
[Shelby Moore III][27] Tuesday, February 22, 2011 at 9:19 am
31.@Shelby Moore III:
The None object works because of covariance rules.
None has no value. As such, get never returns (it throws an exception), which can also be though of as having the return type Nothing. (It returns Nothing, get it?)
Now, in Option, the return type of get is the type parameter. So, Option[X]#get returns an X.
But, None.get returns Nothing, by throwing an exception.
Thus, None is an Option[Nothing].
But wait, there’s more! Because of covarience, Option[Nothing] is a subtype of Option[X] forall X. Thus, None is a subtype of Option[X] forall x. Thus, you only need one None!
anonymous Tuesday, February 22, 2011 at 5:07 pm
32.@anonymous
Thanks for the explanation, I didn’t realize that.
However, I stand by my improvement, because [exceptions are a design error][31].
Read the link for detailed logic on why exceptions are bad, but suffice it to say that programs should be designed such that they force conditionals in the code due to type cases, instead of exceptions which put the program in a random state and thus are not referentially transparent. All exceptions can be converted into pre- and/or post-conditions, which can be modeled with case classes. The only exceptions that should be allowed are those that necessary due to performance justification, e.g. integer overflow which happens so rarely that it is not usually justified to force an Option wrapper on every integer.
[Shelby Moore III][27] Tuesday, February 22, 2011 at 8:06 pm
33.@Shelby
The problem is that your improvement has changed the entire shape of the Option ADT. It’s no longer Option, it’s something subtly different. In fact, it’s something which I’m not sure is even an ADT anymore, since you have an unused type parameter on `None`, which most algebras would not allow.
You don’t need to resort to a parameterized None to avoid exceptions: just don’t have a #get method! One of the major achievements of Option is that it allows you to handle emptiness (what Java would call “nullability”) without ever introducing the possibility of an exception.
Why does _unit_ need to be bijective? If that were indeed a requirement, then no monad could have a zero (Option isn’t the only case where this arises).
[Daniel Spiewak][20] Tuesday, February 22, 2011 at 8:34 pm
34.@Daniel
Hi Daniel, btw thanks so much for explaining to me in this article how to handle statics in a “_typeclass_“, so far that seems to be exactly what I was pondering around for recently.
The type parameter +X of None is used when it extends Option[X].
It is still an ADT, as None is just another state of Option, and we can (no matter what you do) and should be able to read that state, just as we can read any of the other state(s) of an ADT. The states of an ADT are the either compile-time or runtime reflection IsInstanceOf match on subclass, not the member data of each subclass of the supertype of the ADT. Conceptually an ADT is just a plurality of classes that inherit from a supertype, the rest is just syntatical sugar. A match on case classes is just a runtime test against InInstanceOf, with some syntactical sugar to extract the members of the subclass.
Thus any “get” is really just a match on subclass type, and thus the existence of class None can not be denied. For example, if any code does an equality test (==) on two instances of Option which happen to be None, then it is effectively doing a get, even if there is no get method. The existence of class None forces that it is exists and can be tested for. There is no way to hide it, it will _leak_ out some where.
(btw in reference to your question about STM at stackoverflow, this phenomenon of leaking, a/k/a aliasing error, is why I concluded in 2008 at LtU that STM would never work, and I now have an idea for scalable/composable solution in mind for concurrency, perhaps even better than Actors).
I was not saying that all monads need to have a bijective _unit_, nor have I yet taken the time to verify if they pass the necessary axioms. I was just allowing in my tentative design to be fully generalized to either type of monad, because in my design _bind_ is called on an instance, not through an implicit static. In my design, only the static functions are called through the implicit static of your clever typeclass (btw, thanks for that!). So far I have not found a use for bijective unit, maybe that is why Haskell does not have one, even though Wadler’s _Comprehending Monads_ paper seems to allow for them, but I need to go look more closely at this issue.
Apparently the improvement of my design (to inherit each instance from Monad interface and) to call bind without static implicit (compare the sequence function), is an improvement not just because of potential bijective generality, but also so any future code in the universe does not accidentally get interpreted as a Monad if it happens to collide in a global namespace (yuk! like Javascript globals) and also an implicit static parameter won’t be needed in many cases, so it reduces the boilerplate.
Note that the apparent reason for a bind, is it has the advantage over map, in that it can be chained more easily, as demonstrated in your sequence example.
I am not saying I am 100% sure I am correct (as I am in the deep design phase with many ideas fluid), I am putting these ideas out there for scrutiny and to help me with my design of Copute (which will output to Scala), and to help others including Scala, Java, etc.
[Shelby Moore III][27] Wednesday, February 23, 2011 at 1:26 am
35. Correction: I forgot that the following does not occur in your implicit statics (objects) design, because an object has to be declared for each class you want to lift to a Monad[+M[_]]. The comment below is perhaps applicable to use of duck typing and/or implicit in other ways.
“also so any future code in the universe does not accidentally get interpreted as a Monad if it happens to collide in a global namespace”
[Shelby Moore III][27] Wednesday, February 23, 2011 at 1:35 am
36. The other advantage of my design was to keep the single-point-of-truth localized for the _bind_ method, because it inherits from a monad trait, not implied via implicit objects boilerplate. Your typeclass implicit objects solution is absolutely necessarily for the static methods _unit_ and _join_ of monads.
[Shelby Moore III][27] Wednesday, February 23, 2011 at 1:39 am
37. I forgot to mention that my improvement to make None distinct for different type parameters, did not diminish its ability to be a state for “nothing” or emptiness (it can not be a stateless condition and exist simultaneously). Agreed the Exception monad (a/k/a Option or Maybe) is [necessary for eliminating exceptions][32].
[Shelby Moore III][27] Wednesday, February 23, 2011 at 1:55 am
38.@anonymous
Please excuse my ignorance, why is Option[Nothing] a subtype of Option[+X] for all X? Scala’s [variance annotation docs][33] say that for +T then Stack[T] is a subtype of Stack[S] if T is a subtype of S. How and where was Nothing declared to be a subtype of all X?
@Daniel
I mean to say that “emptiness” can not be a stateless condition and also exist in a program. Whether None.bind returns itself parametrized, or not, to represent that state of emptiness, does not make the existence of emptiness stateless. Rather returning None instead of None[X] makes all None share the same state for all X. So this means when I compare two None, I get equivalence, even if they represent an option with different type parameters. Hiding state behind an abstraction of privacy or barrier (e.g. an attempt to not provide a get method, or attempt to hide lack of concurrency behind an STM transaction abstraction) as being nothing but fooling oneself. Coase’s Theorem insures that any barrier will fail (as does Godel, Halting, 2nd Law of Thermo, etc).
[Shelby Moore III][27] Wednesday, February 23, 2011 at 2:21 am
39.@Shelby Moore III
Isn’t your argument against exceptions actually an argument for checked exceptions?
The problem you describe only holds for unchecked exceptions.
Horst Makitta Wednesday, February 23, 2011 at 4:40 am
40.@anonymous
I located the [source code][34] for the Scala Option, and the [semantics of Nothing][35]. So indeed you are correct, _get_ method throws an exception and Scala does a trick with Nothing, it makes it a subtype of everything. There is no way to declare such a semantics for a user declared class in the language, so this is a type built-in to the compiler, that is why it doesn’t follow normal rules. Another one of those special case “gotchas” that can make Scala frustrating for novices to learn. Maybe it was a necessary semantic in order to tightly (bytecode level?) interopt with Java and null? My first guiding principle for language and library design is to K.I.S.S., while maintaining **necessary** generality. Some forms of generality are not necessary and undesirable in language and library design, [click here for a thoughtful blog about that for API case written by John De Goes][36].
@Daniel
That Option supports a _get_ method is I guess some requirement of the integration of the overall standard library, because it is a semantic error to have such a method on an Exception monad type. The Exception monad should only be available by match case or otherwise testing the case class as a door to accessing the class member for the value. I’m sure you agree that in a strict type-safe static typing language, one must not be allowed to access members on a class unless they first have a reference to that class (should never get a class member does not exist exception as in dynamically typed languages), thus the _get_ method is a semantic error because there is not always value in an Option.
The _get_ method is orthogonal to the other semantic error– that None is not parametrized.
#Horst
As far as what I know up to now, it depends what we mean by “exception”. If we mean forcing the checking (i.e. unboxing) of an Exception monad by mapping all pre or post-conditions as I described at the [second link I had provided (click here)][32], then my logic-up-to-now says yes. If we mean throwing an exception and check it with catch, then no, for the reasons I stated in the [first link I had provided (click here)][31], because a function that throws an exception is not referentially transparent and thus not composable.
[Shelby Moore III][27] Wednesday, February 23, 2011 at 6:24 am
41.@Shelby
You made a lot of points, and I can’t really go through and address them all. So, I’ll just pick two. First of all, Nothing is not for Java interop (that would be Null). Nothing is defined as the subtype of all types (partially for that reason, it has no class, no members and cannot be instantiated). Theoretically speaking, it is exactly the bottom type, just as Any is the top type. Thus, Nothing is no more magical than Any, which is also a type that users cannot declare (since it doesn’t inherit from anything). Bottom and top types are _both_ extremely important in type systems which have subtyping, and doubly so in subtyping systems which have variance. The fact that Java gets away without a bottom type is something of an academic punch-line. Trust me, Java’s lack of bottom causes far more user confusion and corner cases than if it had bottom (as Scala does), it’s just that most people are accustomed to subconsciously working around those issues, so we don’t really notice it anymore.
Again, Option doesn’t need a #get method. None’s lack of parameterization is not a semantic error nor does it make an exception in #get unavoidable. As you said, you can certainly just not define #get and rely on pattern matching all over the place, but that would basically be the same as inlining the #get expression all over one’s code, so there really hasn’t been an improvement.
The trick to avoiding #get (and it’s exceptional semantics) altogether is a method like the following:
def getOrElse[A, B >: A](opt: Option[A])(b: =>B) = opt match {
case Some(a) => a
case None => b
}
No exceptions. No corner cases. Very simple, very composable.
[Daniel Spiewak][20] Wednesday, February 23, 2011 at 8:05 am
42. Wikipedia explains [map and join][37] succinctly.
[Shelby Moore III][27] Wednesday, February 23, 2011 at 10:46 am
43.@Daniel
1\. If the lack of parametrization on None is a problem, it is regardless of #get (#get is an orthogonal issue that has nothing to do with the type parametrization). I envisioned one potential problem with None’s lack of parametrization in that I assume r1 and r2 are true below.
def isEqual( a : Option[_], b : Option[_] ) = a == b
Now I think there is also a problem due to JVM’s overzealous type erasure:
ol.isInstanceOf[None] == os.isInstanceOf[None]
Now I also realize that the correct semantic is None#equals should always return false, and that would solve one specific problem without type parametrization of None. So maybe that is a bug report that needs to be filed.
I am trying to think if there is any other possible problem with None erasing its supertype parametrization via the magic of Nothing. I guess given that JVM can only store the head class (overzealous type erasure), then there is no extra danger in this:
var v : Option[Any] = Option[Int]
v = None()
Tangent: type erasure is apparently more efficient than reification (and without significant tradeoffs) because only need one compiled version of each generic function, but I read that <ahref='[JVM’soverzealouserasingofthetypesignatures][38]causesunnecessarycasting,whichhasaperformancepenaltyaswellascausesamyraidofotherissuessuchastheabove.ItisquitepossiblethatIstilldon’tunderstandwellthenuancesofJVM’stypeerasure.
2\. I am pondering that None would be more correct if it was parametrized, even if #equals is made correct, in case Scala was ever ported one day to a VM without overzealous type erasure. Somewhere I read that Odersky had stated that Scala on .Net would not fully interopt because Scala is implemented with type erasure on .Net.
3\. I am acknowledging that bottom type Nothing is apparently important somehow, since I have not seen a case yet, I will reserve judgment until I do.
4\. Agreed that a replacement for #get could be implemented in a type-safe manner, but it is not now, so for now there is a semantic error in the Scala library. I wrote in my prior comment to use “match cases”, which is what you have done by code example in your reply. Thanks.
[Shelby Moore III][27] Wednesday, February 23, 2011 at 12:19 pm
44.@Daniel
Here are two examples where a universal top-root supertype Any and bottom-tips subtype Nothing, are necessary.
A. Scala’s [example immutable Stack][33], will allow adding a type that has no user-defined common root, because every type inherits from Any and contravariant type parameters inherit from Nothing.
B. To support dynamic typing in Scala, one would cast to Any and then cast back to the expected duck type on use. Ditto Nothing would be used to cast contravariant type parameters.
[Shelby Moore III][27] Wednesday, February 23, 2011 at 12:44 pm
45. One Div Zero: Bottom types (and why they are useful) explained.
So long as co- and contravariance were supported at runtime, there would be no problem with Nothing not being parametrized, as far as I can tell.
Remember, covariance means that subtyped parameters imply subtyped types, i.e.,
case class C[+A](a : A); // +A means covariant annotation.
val c = new C(List(1, 2, 3))
c match {
case C(tr : Transversable[Int]) => // Transversable[Int] is a supertype of List[Int].
for (it ” \+ it) // This executes, because C[List[Int]] is substitutable for C[Transversable[Int]].
case _ =>
println(“WTF?”) // This does NOT excecute.
}
OK, so let’s assume that parameterized Nothing, is correct and necessary. I’ll ignore the map and bind functions for now. (Incidentally, defining map, flatMap, and foreach allows Scala to use your object in a for comprehension, which is basically a recursive lisp macro baked into the language.)
Let’s remember to annotate for covariance.
sealed trait Option[+T] {
def get : T
def getOrElse[U >: T](fn : => U) : U
def map[R](op : T => R) : Option[R]
def flatMap[R](op : T => Option[R]) : Option[R]
}
case class Some[+T](value : T) extends Option[T] {
// Don’t care about implementations.
}
case class None[+T] extends Option[T] {
// Don’t care about implementations.
}
This is all well and good. However, note that there is no useful information in a None object except for the fact that it is None (it isn’t mutable, and the constructor takes no arguments), and its parametrized type (which gets erased at runtime anyway).
Now, of course, all those None objects could very well eat up space on the heap. So, we could make some sort of pool for them. We could allocate one for each type, but that would be unnecessary–because of Liskov substitution, we only need to allocate one None object for every lowest common subtype of a type hierarchy. That is, you wouldn’t need to allocate a None[Number] if you already had a None[Int], for example.
Of course, thanks to bottom type, you already have *the* lowest common type, of all types. Thus, our pool shrinks to one object which is Liskov substitutable for any of your None[X] types.
Hope this helps. If it doesn’t, maybe Daniel will write a post on it.
anonymous Wednesday, February 23, 2011 at 7:38 pm
46. For some reason the blog is not accepting my post under my former name. This is very important.
Any and Nothing
Iry’s article predicates that List[Nothing], a/k/a Nil, is necessary because List#head would otherwise throw an exception for an empty list. But there is another way, List[T]#head could return an Option[T]. In the way Option is currently structured, that would be an Option[Nothing], but I proposed instead it could be a None[T]. So seems Nothing is not required in this use case? Is there any other compelling reason for Nil? My K.I.S.S. design instincts tell me to avoid unnecessary special case idioms, so I would toss Nil if does not have any other compelling reason to exist. Note the “stupid” in KISS is not the designer, it meanings do not reduce abstraction with more refined knowledge (more narrow subtypes) than necessary.
Maybe I am mistaken, but in the referentially transparent calculus, a function can never return Nothing (a/k/a Void), because a function must not have side-effects, thus it has no reason to exist (or be called) if does not return a value. Thus an instance of Nothing can never be created in such calculus, and can only be referenced by cast from a contravariant type parameter. In short, Nothing can never be subtituted for another type, except as a contravariant type parameter.
If we reason about a type as analogous to an overloaded function where covariance is defined by overloads that have contravariant parameter, and covariant return, types (where all types are just functions), then the Nothing type is the infinite set of functions that take parameters of type Any and return Nothing, and the Any type is the infinite set of functions that take parameters of type Nothing and return Any.
Thus Any and Nothing are only ever strictly necessary as references to a concrete covariant or contravariant instance respectively, because any program can be restructured into a referentially transparent one (e.g. by using a State monad).
The **key insight** is that a type system is not substitutable in the covariant and contravariant directions.
Thus Option[Nothing] or List[Nothing] when used as the return type where there is a contravariant type parameter, expresses that the type parameter can be anything. But this violates the contract that Liskov Substitution Principle [depends on][39] that a supertype has a greater set of possible subtypes, than any of its potential subtypes. A covariant type system is (injective) one-to-many in the covariant direction, but this is not invertible (bijective) in contravariant direction. [Generality does not increase in the contravariant direction, but generality does decrease in the covariant direction.][28]. In short, Nothing must never be on the right-hand side (rhs) of a substitution in a covariant type system (except as a contravariant type parameter). Violation of this rule creates aliasing error, as I explained in my prior comments– None (a/k/a Option[Nothing]) erases the concrete type parameter of an instance and means literally “I forgot and I am in unknown random state” (no many-to-one mapping allowed in the type system). The Scala compiler should be giving an error when Nothing is substituted for a covariant type parameter in Option[Nothing].
Unlike Any, Nothing should be only a reference type, mean you can substitute any contravariant subtype to it, and cast back to any contravariant subtype from Nothing. You must never substitute Nothing for covariant type. Maybe Scala gets away with this for now due to type erasure (or maybe we can find an edge case where it fails), but it is on shaky ground.
Unless someone can find a hole in my analysis, I am thinking to email this to Odersky or file a bug report for Scala on this. But certainly I must be wrong? This is already well entrenched in Scala. I hope I am wrong.
@anonymous
Hopefully I have explained why the cost of a single None for all T, is semantically erroneous and will break. If I am mistaken, I appreciate anyone that can elucidate.
III erooM yblehS Thursday, February 24, 2011 at 3:47 am
47. Correction, make that:
The [b]key insight[/b] is that a type system is not substitutable in the contravariant direction.
III erooM yblehS Thursday, February 24, 2011 at 4:12 am
48. Correction #2:
Thus Option[Nothing] or List[Nothing] when used as the return type where there is a _covariant_ type parameter…
III erooM yblehS Thursday, February 24, 2011 at 4:18 am
49. Okay after further thought, although (so far) I maintain my stance against Nil and Option[None], because Option[T] could suffice for an empty container…
I realize that, None extends **Option[Nothing], is _not_ a problem**, because it can not cause a reference to an instance of Nothing. Even if there are references to Option[T] for different T, that point to instances of Option[Nothing], there can never be different references pointing to Nothing. Even if we cast the Option[T] references to Option[Nothing] references, then we’ve lost the ability to cast back to Option[T], but this is not a problem per se– it is the None designers choice. It is oddity that is not available in the contravariant direction (i.e. with Any), because contravariance only occurs on type parameters. The designer is implying that None are either always equal or never equal, regardless of supertype. Thus, the only potential problem with None, is if #equals does not always return false if either parameter is None (a/k/a Option[Nothing]). Also the orthogonal problem that #get throws an exception.
[Click here for a summary][40] that provides context. I excerpt the key portions.
Any (covariant) type, except Nothing, may be substituted for (i.e. assigned to) the abstract type Any (the most general type), i.e. all (covariant) types implicitly inherit from Any. Whereas, an Any must not be substituted for (i.e. assigned to) another (covariant) type, without a runtime cast that has been suitably conditioned to avoid throwing an exception. A Nothing will never be substituted for (i.e. assigned to) another type, because a reference to a Nothing can never be created.
Any contravariant type parameter (only type parameters have the possibility to be contravariant), including Any, may be substituted for (i.e. assigned to) the abstract type Nothing (the least general type, which means “nothing”), i.e. all contravariant types implicitly inherit from Nothing. When a Nothing only occurs in the context of inheritance from a covariant, or substitution by a contravariant, type parameter, it will never create a reference to a Nothing because due to substitutability rules a contravariant type parameter can never be returned nor read, and in the covariant inheritance case, an instance of Nothing is never allowed to be constructed, because by definition Nothing does not know which covariance tree branch its instance may be on.
[Shelby Moore III][27] Thursday, February 24, 2011 at 8:42 am
50.@Shelby Moore III:
For another reason your last three blog entries did not show up in the correct order, your entry from 8:42am was visible before (!) the posts signed as “III erooM yblehS” (also it appears the dates on that posts are wrong).
Maybe a problem with the system time on the server?
You state:
“Thus Option[Nothing] or List[Nothing] when used as the return type where there is a contravariant type parameter, expresses that the type parameter can be anything. But this violates the contract that Liskov Substitution Principle depends on that a supertype has a greater set of possible subtypes, than any of its potential subtypes.”
However the link given does not state/prove this, it only states “Covariant means that a derived class’s overriding function must have equal or more restrictive return types than the base class’s overridden function. Contravariant means that a derived class’s overriding function can have equal or less restrictive arguments than the base class’s overridden function.”
Nothing being at the bottom of the type system means that List[Nothing] is accepted as a parameter for all methods that accept List[T] for any T. As an empty list simply allows for this, it does not violate LSP.
“Iry's article predicates that List[Nothing], a/k/a Nil, is necessary because List#head would otherwise throw an exception for an empty list.”
Actually James Iry’s article gives a number of reasons why a bottom type is required for any sound and complete type system that supports sub-types and this is only an example for one of these reasons.
“But there is another way, List[T]#head could return an Option[T].”
But it doesn’t.
“Violation of this rule creates aliasing error”
I don’t see the aliasing error. Nothing equals Nothing for any instance of Nothing in Scala (actually the number of instances of Nothing are limited; the number is 0 ![;-\)][41]
Also None equals None, but this time we actually have one instance!
“(no many-to-one mapping allowed in the type system)”
Why not, we are at the [b]bottom of all types[/b], and there are no instances of Nothing anyway.
If i buy a lot in the lottery and i draw a blank, i don’t need to care what currency the price is! ![;-\)][41]
Horst Makitta Thursday, February 24, 2011 at 11:13 am
51.@Horst
Hello,
I later discovered via trial&error that my posts were not being accepted because I was posting from an IP address that is apparently spam-blocked by the blog service this site uses (WordPress?). So I changed IP addresses (via a VPN I have in UK or USA, whereas I am in Asia), then I posted under the mirrored name, and the new name caused my posts to go into moderator queue (hence they’ve been released from queue to the site, I assume by Daniel).
Once I realized that I had some lapses in logic in my posts under the mirrored name, I posted the corrected logic under my unmirrored name, on the unblocked IP address, thus my last post was appeared on the site immediately (before the ones in queue).
Some of your reply apparently does not reflect the corrections in logic I made in my last post, i.e. appears you may be replying to the post I made under the mirrored name without acknowledging that I already corrected the logical error you are pointing out.
For example, I think maybe (not sure) you missed the point I was making about Liskov Substitution Principle. First of all, in addition to method parameter and return types, there are [several more requirements (click here)][42]. Thus, LSP requires that every subtype has fewer possible states than its supertype, and this is exactly what gives rise to the requirements on methods. This will become more clear if you study [this link (click here)][39], which illustrates that LSP really applies to the set of possible states.
Thus I was pointing out that Nothing is a subtype of every possible class, thus it has more possibilities than any supertype, except Any. Thus my first post asserted that it violated LSP if assigned in a covariant position (and actually that is still true, if the assigned not as a type parameter and if assigned as instance, but this can never occur). But then in my last point, I made the point that if Nothing can never be created as instance (as you say a “blank”), then LSP does not apply in that respect, because Nothing is never the type of a method parameter, nor an assignable return value. We thus see Nothing only ever applies for type parameters or return type that is never assigned. Thus the problem of Nothing is avoided. Agreed, thus there is no aliasing error because the number of instances of Nothing created is exactly 0, which is what I explained in my last post.
**Disagree**, None should not equal None, because if I have two Option[T] references for different T, both of which point at a None, then they should not report they are equal.
My point about Iry’s article was not that Nothing is unnecessary in every use case, only to state that if List#head returns Option[T] then we have to unbox it with “match-case” to get at the T value. Thus whether None is an Option[Nothing] or None[T] is arbitrary design choice. Either way, we have to do a match-case and handle the possibility of an empty List.
List may or may not return an Option[T] (I haven’t checked), yet if not, I can code a Scala library where List#head returns an Option[T]. We are not forced to use the standard library. We are free to explore all possibilities and choose. It helps to make the standard library better, if we question everything, and accept only what passes our best analysis. I think Linus Torvalds stated, as paraphrased by Eric Raymond, “Given enough eyeballs, all suboptimal things (‘bugs’) are shallow”. I have not yet formed a final opinion, discussion is part of discovery and analysis.
Nothing (a/k/a void) is only needed for return types that never return (which means they don’t apply in a strictly referentially transparent system like Haskell if an error monad is employed, and they never create instances in a system with side effects like Scala, Java, etc), or for type parameters, which also means an instance is never created. So reasoning about bottom types are not really needed, except as a supertype for contravariant type parameters. All other cases were handled without thinking and teaching void to be a “bottom type” (although that is what is was). This is important for me, because I am putting much thought into how to simplify Scala and make it more palatable for the masses. That is why you are seeing the genre of analysis from me that you are– I am questioning everything. It is my job to do so.
Friendly reminder, your use of _argument_ and _parameter_ is reversed from their definitions. The _parameter_ is in the declaration of the function, the _argument_ is in the function call (apply).
[Shelby Moore III][27] Thursday, February 24, 2011 at 1:36 pm
52.@Shelby,
Indeed i was replying to your older posts as i they showed up later and i thought they represent the current state of the discussion.
You state “Thus, LSP requires that every subtype has fewer possible states than its supertype” which in the link example is shown as restriction on possible values of an Integer.
However for OO-programming a subclass can have more fields than it’s superclass and while i agree it should have fewer possible states or the same number of possible states in the fields of the superclass, adding new fields in the subclass can lead to an explosion of possible states.
Again for Nothing having no instance it means it has 0 possible states – the lowest of all types.
I am not exactly convinced that Option[T] instances pointing at Nothing should be different when the type is different. This would only make sense if you want to maintain type information, e.g. for a method getOrNew() which would create a new instance if it is called on None (or more exactly a new implementation called TypedNone).
This can be implemented by the usual Java hack of passing the class as an argument to the constructor and storing the class in the TypedNone instance.
(This hack fails when the type to create is itself a generified type, but does not provide means to store the additional type information).
However the way None is implemented now (with full type erasure), there only needs to be one instance of None and this by nature equals to itself.
Horst Makitta Thursday, February 24, 2011 at 2:03 pm
53. Addition:
“Nothing (a/k/a void) is only needed for return types that never return (which means they don't apply in a strictly referentially transparent system like Haskell if an error monad is employed,”
In the cited article James Iry writes: “Haskell and languages in the ML family don’t have an explicit bottom type. Their type systems don’t have subtyping so adding bottom as a subtype would confuse things.”
So it appears adding a bottom type even for non-returning methods would still not be preferable.
Horst Makitta Thursday, February 24, 2011 at 2:14 pm
54.@Horst
Hi again,
Adding members to a subtype does not increase the # of potential LSP states, as the states that LSP is referring to are types in the inheritance hierarchy, not data in the types.
Nothing has infinite possible LSP states, just as Any does. The only reason we can use Nothing without breaking LSP, is because there can never exist a reference to Nothing. Nothing is not a referencable type. Try creating a reference with the type of Nothing in Scala, I expect it won’t compile.
val any : Nothing = any instance you like
And that is why I find it misleading to use Nothing in None, because it causes confusion for anyone who thinks of LSP fundamentally (not that many people do, but the concept of a bottom type is foreign to most OOP programmers). But on further thought, one eventually realizes that Nothing can be used as a type parameter without ever creating a reference to Nothing. Imagine trying to teach this to Java programmers? I am thinking I will rather hide Nothing as Void (ah everbody is comfortable with void as “nothing” for return type and functions that take no parameters), and avoid ever mentioning it for use as a type parameter, except for contravariant type parameters, which are rare because they can never be read. Generality is nice and dandy, except when the Java community might have to expend 10,000 man-hours to deal with all the confusion about “what is Nothing?” for a concept that only is necessary for 0.001% of the use-cases.
I am not in my last 2 posts (or include this one, so 3) asserting that Option[T] can not point to None, instead of None[T]. I am saying that None should never report true for #equals. Lets not conflate orthogonal issues.
Please check my logic here. None should never equal itself. Let me repeat as an example:
val os : Option[String] = None()
val ol : Option[List] = None()
if( os == ol ) // Do I have the same list or string?
That assumes that Some[T] implements #equals by comparing the value it stores.
Apparently Haskell does have a bottom type, but you would only use it when your monad called some system function that could never return:
<http://www.haskell.org/haskellwiki/Bottom>
That is why I said you wouldn’t encounter it if you used an error monad that would instead spool (chain) all errors back out to the main return, i.e. that an error call would simply unwind the function stack, and not cause a non-returning function.
[Shelby Moore III][27] Thursday, February 24, 2011 at 3:19 pm
55. Shelby,
I almost guessed that i misinterpreted what state in the context of LSP should mean.
However the argument still is how many types (“LSP states”) Nothing has. I would argue it has exactly one type, therefore no conflict with LSP.
“Trying to teach this to Java programmers”. Personally i have to say i prefer sound structures, and Java has too many corner cases (i programmed in Smalltalk before and found it has a more sound structure).
Example: Where is the method that constructs a new instance of a class. Smalltalk has a class object which lends itself to an inheritence hierarchy while Java has special constructor methods and a special new operator to invoke them.
So having a sound and valid language structure / type system can help in avoiding special / unorthogonal plumbing constructs.
Your code:
val os : Option[String] = None()
val ol : Option[List] = None()
does not compile in my Scala (still using 2.7, but i guess the issue is the same in 2.8).
This compiles:
val os : Option[String] = None
val oI : Option[Int] = None
os == oI
res0: Boolean = true
And it makes the same sense as:
val s : String = null
val l : List[Int] = null
s == l
res1: Boolean = true
Horst Makitta Thursday, February 24, 2011 at 3:48 pm
56. I summarized it more concisely now:
_Void is not a referencable type, it is only used to express inheritance (covariance) relationships, or the absence of function parameters and/or a return value._
Replace Void with Nothing for Scala.
Off topic: Copute separates _trait_ into _interface_ and _mixin_, where mixin is not a referencable type, so the concept of non-referencable type is not so esoteric.
[Shelby Moore III][27] Thursday, February 24, 2011 at 3:57 pm
57.@Shelby
I’m not sure we’re getting anywhere with this argument, but I do feel compelled to point out that Nothing and Void are *not* synonymous. Please see the following Wikipedia articles (or better yet, the corresponding chapters in TAPL):
\- <http://en.wikipedia.org/wiki/Unit_type>
\- <http://en.wikipedia.org/wiki/Bottom_type>
James Iry also summarized a lot of this. To summarize even further, Unit is a type which has exactly one member. It is _precisely_ equivalent to Java’s notion of `void`, except that Java failed to provide a member for this type (an oversight which causes problems in a number of areas, such as parametric polymorphism). Bottom (which Scala calls “Nothing”) is a subtype of all types which has exactly _zero_ members. The mere fact that Unit is not a subtype of all types should immediately convince you that it is not equivalent to Bottom.
If it doesn’t, I suggest you read up on the Curry-Howard isomorphism and the respective meanings of Unit and Bottom in the context of logic systems. The differences are somewhat clearer in that context.
[Daniel Spiewak][20] Thursday, February 24, 2011 at 4:04 pm
58. Hello Shelby,
Regarding Copute:
I really agree in Scala has a definite flaw in compiler not enforcing pure functions and in conflating interfaces and mixins.
Entirely (un)related, in a recent post:
<http://alarmingdevelopment.org/?p=562>
one adaptor of Scala plans to switch back to Java as tool / IDE support for Scala for him is not good enough.
This probably is more important for average developers than minor details of the type system, and Scala seems to have some issues in that regard so far.
Horst Makitta Thursday, February 24, 2011 at 4:13 pm
59.@Horst
Thanks for running the test, because I have never run a Java, Scala, or Haskell compiler or interpreter in my life (I do it all in my head for now, because it forces me to think more deeply).
So that looks like a bug to me, as I expected. It should report false, but it reports true. The null bug doesn’t make semantic sense to me either. All of those look like bugs to me. I vaguely remember having seen that null bug mentioned as a bug in general language design circles.
Nothing has infinite potential subtypes in the contravariant type parameter case. In the covariant case, it has infinite potential supertypes. But this does not cause a problem because it can never appear as an assignable reference in any covariant or contravariant case, and LSP is all about _reference substitutability_ (you can refer that link I gave before about LSP as sets of possible states). Thus LSP does not apply to Nothing, because we are never substituting any reference with or from it. Nothing is only used for expressing inheritance relationships, or for indicating no value, never for actual substitutions.
Agreed on having a sound type system with no edge cases that fail.
[Shelby Moore III][27] Thursday, February 24, 2011 at 4:18 pm
60. My intention was not to suggest that Nothing is equivalent to Void.
Only that None does not hold the type information, therefore there is no inequality of variables defined as Option[T] with different T but both being None.
Horst Makitta Thursday, February 24, 2011 at 4:19 pm
61. But null == null evaluates to true in almost all programming languages that have the concept of a null reference (C, C++, Smalltalk, Java, …).
The only markable exception are NULL values in SQL where comparing null values in a column against null with = or != never will return true, and special operators is null / is not null must be used.
Horst Makitta Thursday, February 24, 2011 at 4:28 pm
62.@Daniel
In the case of general programming and logic constructions, of course you are correct, that we need to distinguish for functions between:
a) never returns, does not exist (Nothing, undefined in JavaScript)
b) returns but no value, exists but has no value (Void, Unit, null in JavaScript)
a) non-callable/unreachable (Nothing, undefined method in JavaScript)
b) called with no parameter (Void, Unit parameter)
Thanks for raising the issue.
But remember, I made the subtle (and probably buried) point that in a 100% referentially transparent, statically typed language, we never have functions that don’t return, nor functions that don’t return a value, nor functions that are non-callable, although we may have functions which are called with no parameter– they are the functional equivalent of constants. So in that genre of language where there is no overlap between them, there is no need to have separate Void and Nothing, so one could decide call them by the same name (what I meant by “hiding Nothing in Void” since the inheritance uses are so esoteric and rare), and then use Void semantics for functions and Nothing semantics for inheritance, but always with one keyword Void.
We can relax the requirements for non-overlap. We just require to never have functions that don’t return (notice that Copute disallows thrown Exceptions, use an error monad if you want to unwind the function state to terminate), nor functions that are non-callable.
In the Curry-Howard isomorphism, Bottom corresponds to a false formula (and Void a true formula), which means any program that contains a Bottom with respect to a function, is not a true formula in the corresponding logic mapping. Thus eliminating Bottom in function context, makes our programs true proofs of themselves. A false formula introduces undecidability.
If I say that Void is unreferencable, then it is consistent with a unit type for use in functions. It is also consistent with a nothing (bottom) type in inheritance.
I hope I didn’t make a egregious error, but that seems correct to me at this groggy moment 18 hours into my work day.
[Shelby Moore III][27] Thursday, February 24, 2011 at 6:17 pm
63.@Horst
If I remember correctly, the legacy treatment of null is due to implicit conversion to boolean false, so that if( object ) tests are less verbose than if( object != null ). So false == false is true.
One problem in many languages is that null is hard-wired to every type, so we are forced to test for null every where. Modern languages supply the Option type (Exception monad) instead, which we can them employ selectively on only the types we need exceptions on.
If the semantics of null is “does not exist”, then why should (“does not exist” : Option[String]) == (“does not exist” : Option[List]) be true? For that matter, hy should (“does not exist” : Option[String]) == (“does not exist” : Option[String]) be true?
The programmer is asking of equals in that context, “are these pointing to the identical pair of Some instances?”. Not asking “are these pointing to the identical pair of either both Some or both None/null instances, but not one of Some and one of None/null?”.
Shouldn’t the answer should agree with the question.
Very sleepy, hope this was coherent. Probably I can not add anything more of significance on this issue of None#equals.
[Shelby Moore III][27] Thursday, February 24, 2011 at 7:00 pm
64. I meant, “…identical pair of Some instance _value_ instances…”
[Shelby Moore III][27] Thursday, February 24, 2011 at 7:38 pm
65.@Horst and @anonymous
Even though I originally (in my early comments) noticed that None is an object, somehow it escaped me, and then I didn’t correlate your point a few comments ago, that the key advantage of,
case object None extends Option[Nothing]
versus,
case class None[+T] extends Option[T]
is due to the _object_ versus _class_, the former only needs one instance of None for the entire program, which will be much more efficient when passing these Option types everywhere we would need a null in other languages. I of course originally thought the single object was a problem, until I realized the distinction between Object[Nothing] and the inability to create a reference to Nothing, and the orthogonal issue being the equality comparison result.
So I fully capitulate that the first way is superior. Until someone can elucidate otherwise, I think equality comparison for None should always return false, even when the other operand is None.
Thanks to both of you and Daniel for all the help. I hope this has been stimulating or otherwise helpful in some way. Apologies if we veered from the monad theme of this article.
My original main point was to show that the bind operator portion of a monad can be an inherited trait instead of an implicit typeclass. I am happy the discussion forked, because I gained numerous language design insights.
[Shelby Moore III][27] Friday, February 25, 2011 at 2:28 am
66. I remembered where I read [_null_ was a “Billion Dollar Mistake”][43] from the creator of null, Tony Hoare.
[Shelby Moore III][27] Friday, February 25, 2011 at 3:00 am
67. As you are designing a new language, why repeat the “Biilion Dollar Mistake”, supporting a null value / reference?
Instead enforce defining NullObjects for every type in the language
This way you get the desired equality semantics for free:
A.NULL != B.NULL if A != B and
A.NULL == B.NULL if A == B
Also the pattern is easy to explain to an average programmer and NullPointerExceptions are avoided (which the Option monad in Scala does not avoid by allowing Some(null)).
The only disadvantage is interfacing with existing Java / Scala code requires defining Null-Objects for every foreign type one has to deal with.
To me the NullObject seems to be the OOP-solution to the problem while the Option-Monad is the FP-solution and unfortunately Scala has decided to use the FP solution in a way that does not solve the problem at all:
scala> val problem : Option[String] = Some(null)
problem: Option[String] = Some(null)
scala> problem.getOrElse(“123″).length
java.lang.NullPointerException
Horst Makitta Friday, February 25, 2011 at 5:47 am
68.@Horst
My thought is to use the Exception monad (e.g. Option or Maybe) and never null. Exception monad is superior because the value can only be accessed via a match-case (or equivalent syntactic sugar), so there is never a random (i.e. non-deterministic) state introduced.
The NullObject silently introduces noop, which I think is the semantic equivalent of random state. This aliasing error will leak out some where. This is analogous in a way to using STM to try to pretend an algorithm and data structure is concurrent, when it is in not. Such “magic” fails.
Rather I prefer to model all instances of null with the Exception monad. I have not yet studied how to do this and interopt with Java. But the idea is Some(null) will never occur, instead None in that case.
[Shelby Moore III][27] Friday, February 25, 2011 at 2:05 pm
69. I will offer two improvements to [my prior comment][44]– the prior comment wherein I had proposed a conceptual mapping of pseudo-code “static” interface members to legal Scala syntax.
Note that the StaticMonad trait (in my prior comment) is necessary to enable accessing “statics” on types (e.g. M[_]) that are otherwise unknown due to type erasure (e.g. Daniel’s Sequence function example), but StaticMonad is not used for direct invocation of statics, e.g. Option.unit( value ). Thus a necessary improvement is to name _object_ OptionStaticMonad to _object_ Option, which makes _trait_ Option its companion class (or does Scala only allow this if Option is a class?):
Also, to give functionality similar to what we expect for “static” in Java, (some macro or other language to Scala compiler) could automatically generate the statics for each derived class in pseudo-code that did not override them, e.g. as follows although this example seems superfluous, it is not harmful and the generality is needed in other examples.
implicit object Some extends StaticMonad[Some] {
def unit[Y]( y ) = Object.unit( y )
}
implicit object None extends StaticMonad[None] {
def unit[Y]( y ) = Object.unit( y )
}
[Shelby Moore III][27] Monday, February 28, 2011 at 6:31 am
70. Expounding on my prior comment, the situations where StaticMonad[M] is employed (versus directly accessing the singleton), type M is unknown, and thus is a more composable abstraction which inverts the control of the access to the singleton statics, and gives that control to the caller:
Type erasure is an orthogonal issue, that forces the use of an implicit as function parameter, versus the compilation of a separate function body for each possible M in reified languages. Even if Scala was reified, trait StaticMonad is still necessary to abstract the inversion-of-control on singletons. Thus the declaration of implicit instances and parameter is justified by type erasure, but they (along with StaticMonad) could just as well be hidden to make a non-reified language appear to be reified. Which is what I was illustrating with pseudo-code examples.
[Shelby Moore III][27] Monday, February 28, 2011 at 9:33 am
71. Scala’s higher-kinded types enabled the paradigm of Haskell’s type classes, as Daniel did with Monad. See section _7.2 Encoding Haskell's type classes with implicits_ of [Generics of a Higher Kind][45].
Notably section _6.1 Implementation_ explains why it may be difficult (if not impractical) to implement higher-kinded types for languages without type erasure, such as .Net languages, e.g. C#.
[Shelby Moore III][27] Wednesday, March 2, 2011 at 3:57 pm
72. Per our prior discussion on this page, I stumbled on a case where “None[T] extends Option[T]” might be superior to “None extends Option[Nothing]“:
Note I offered the following suggestion for retaining the Option[Nothing], but that blogger (Scalaz author) deleted my comments:
> Can't you use (None : Option[Int]) in place of none? Seems I read somewhere that Scala has that compile-time cast for aiding type inference.
I link to the [comments that were deleted][46], because I raised the relevant point that the way I proposed inheriting “bind” in my comments on this page, thus enables variance annotations. I was hoping to explore idea.
[Shelby Moore III][27] Thursday, March 3, 2011 at 10:56 am
73. There is a fascinating discussion about the [practical utility of monads for mainstream programming][47]. That discussion morphs from an initial hypothesis that monads vs. actors can be used to interopt imperative and functional programming, to one of recognition that monads provide abstraction of widespread programming paradigms, e.g.
It might not be easy for some to grasp Applicative from that [research paper][48], and I have a layman’s concise tutorial, but I will wait to provide a link once my implementation is complete, because the tutorial uses my syntax. I promise it won’t be [incomprehensible for Java laymen][49].
Applicative is a neat pattern, because it allows the application of a function distributed (curried) over the wrapped elements of a tuple of Applicative, and unlike Monad#bind, the function is always called and can not be short-circuited by the computation and/or state of the wrapping.
However, and the main reason I make this post here, is that an empty container is incompatible with Applicative, if the emptiness is not a type parameter of the container, because the empty container can not provide an empty element to the function application, if the function is not expecting it (i.e. does not accept an empty type). This means that Nil for an empty List type is incompatible with Applicative. Only List[T] extends Applicative[Option[T]] will suffice. In my prior comments on this page, I had mentioned I expected problems with Nil and prefer to use Option[T] return type on element access to model an empty container.
The prior linked research paper seems to conflict with my assertion, wherein it shows how to implement the Applicative#ap (apply) for monads on page 2, and the statements that every monad is an Applicative in section 5 and conclusion section 8. However, what I see is that although the Monad#ap can return the same value as the Applicative#ap, the side-effects are not the same. The Applicative#ap will always call the input function (with empty elements as necessary) and the Monad#ap will return an empty value and short-circuit calling the function. The research paper alludes to this in discussion of the differences in evaluation tree in section 5. The research paper I think is coming from the mindset of Haskell ad-hoc polymorphism (i.e. tagged structural typing) where there is no subtyping, thus Monad#ap is not expected to honor the semantic conditions of the Liskov Substitution Principle, i.e. Monad is not a subtype of Applicative in Haskell. This is real-world an example of why I am against ad-hoc polymorphism and structural typing. Of course, side-effects are not supposed to matter in a 100% pure language such as Haskell.
Any way, back to the point that Nil is incompatible with Applicative side-effects semantics, perhaps this might be one example of Tony Morris’s (author of Scalaz) [criticism of Scala’s libraries][50], given that in Scalaz apparently Monad is a subtype of Applicative. I am not taking a stance yet on the general criticism, because my library implementation is not complete, so I do not yet know what I will learn.
[Shelby Moore III][27] Friday, March 4, 2011 at 10:50 am
74. “It might not be easy for some to grasp Applicative from that research paper”
Well it still is easier to grasp from the research paper than from the scalaz – documentation, which only links to that paper and cites the four requirements. The scalaz classes / traits used here do not have a single line of documentation.
Is this the Scala culture? I found that very disturbing and miles apart from the usual JavaDoc in the “Java culture”.
Well that last point was directed at the scalaz developers.
@Shelby:
You state that “an empty container is incompatible with Applicative, if the emptiness is not a type parameter of the container, because the empty container can not provide an empty element to the function application, if the function is not expecting it (i.e. does not accept an empty type).”
If my understanding of the applicative pattern and your rmentioning of containers in that context is correct, than the idea is to apply the function to all elements of the container.
As the empty container does not contain elements, why would you expect the function to be applied at all?
Or if you expect the function to be applied to an empty containers once and for non-empty containers applied for every element in the container, can this not naturally be solved by a pattern-match against the empty container?
It appears to me you are assuming that providing type information is required to call the function – even in the case of empty containers.
This however is not compatible with type-erasure and has to be solved by explicitly holding the type information somewhere – like in a monad that enriches containers with runtime-type-information.
In that case the type information is only a (memory) cost in that cases where it really is required and not for the majority of cases where an untyped null is enough that is required.
In the research paper i see a lot of examples where an empty case is dealt with and none of them requires any type information.
You wrote:
“Of course, side-effects are not supposed to matter in a 100% pure language such as Haskell.”
How is this to be understood in the presence of the IO monad and the Applicative’s whole point is the control over side-effects?
From the paper (exclamation mark added):
“the essence of Applicative programming: computations
have a fixed structure, given by the pure function, and a sequence of subcomputations,
given by the effectful(!) arguments.”
Horst Makitta Friday, March 4, 2011 at 2:36 pm
75.@Horst
I was resisting sharing this before it is ready for public consumption, but I think it will help to see the [implementation and documentation I am working on][51], and caveat that is intended to be Copute code, so translate it to Scala code, using the transformation that I outlined in one of my first comments on this page, wherein the static methods get separated out and become a trait+object in Scala. Also that link may break in future (e.g. those types will not all stay in same file), and that is just very rough pseudo-code at this point, but hopefully at least my documentation therein will clarify Applicative (and Monad). The documentation would make less sense if I copy+pasted it here, because it refers to Copute syntax. Hopefully it is not too foreign to Java/Scala programmers.
I think when you review the examples in my documentation at the link above, you will see how Applicative applies to an empty container. It took me a while to unravel that research paper too, so that is why I took some care to write it down, so I won’t forget the details.
Since my prior comment, I realized that since Copute will have pure functions, then we only need to make the input of Applicative#apply a wrapped pure function, in order to allow Monad to be subtype of Applicative and avoid the potential semantic breakage alleged in my prior comment. Since Scala doesn’t enforce pure functions, I don’t know how to avoid the potential semantic issue in Scala (run a unit test to enforce that no subtype of Monad is ever empty without an empty element type, i.e. no Nill allowed?). In Copute, (afaics) we will be free to use Nil or Option[T] strategy for empty containers, because can enforce the aforementioned pure function. So the problem is solved in theory for my design, but I don’t think for Scala. But I don’t want to say that for sure yet (vaporware). Any way, any thing compelling could be potentially be adopted by Scala I suppose.
I think the research paper is referring to argument effects, e.g. returning new locally created instances, which is a pure operation (referentially transparent). If the arguments are externally stateful effects, e.g. IO or State monad, then indeed I see a difference between the semantics of Applicative#ap and Monad#ap, even for Haskell. But as per the little I understand about the Haskell programming and type classes, if the type inference calls Applicative#ap or Monad#ap, then you are supposed to expect that the semantics can totally change, because in Haskell there is no subtyping (Monad is not subtype of Applicative, even if they share the same method signature). In Haskell you can be explicit about expected type, when you expect a conflict. (but how do you know when I new conflict will be added globally in general cases?) And that is why I am currently against structural typing unless I learn that this is not the case, or not a problem.
[Shelby Moore III][27] Friday, March 4, 2011 at 3:56 pm
76.@Horst
**Correction**: I have just clarified in my linked documentation translated as follows.
> The semantics of Applicative[T]#ap do not allow a subtype which is non-invertible to an unwrapped T, i.e. empty container not allowed, unless there is a subtype of the subtype (or a subtype of T) that represents the null state (e.g. the subtype, or T, is an Option type).
Perhaps that is implicit in the four laws for Applicative. I will check that next.
Thus I think Applicative#ap can implemented for an empty container using Nil in Scala, by returning a (Nil : List[B]), where I assume that is the correct Scala syntax for compile-time cast of List[Nothing] (a/k/a Nil) to the expected return type B of the curried function.
Alternatively Applicative#ap could be implemented for an empty container using Option elements, by returning the wrapped curried function with (None : Option[B]). This requires every possible composition function to input Option. So now I am seeing some good justification for the Nil strategy of Scala’s library.
Also studying the situation more, I realize there is no difference between the side-effects of the Applicative#ap and Monad#ap. I think I was conflating in my mind with Monad#bind.
So I no longer see any problem, nor a requirement for pure function per to solve it.
Hopefully I have helped explain Applicative and provide further insight in how to simplify the explanation of Monad.
Thus I don’t yet know an example of why Morris is dissatisfied with the Scala libraries.
P.S. note in the linked docs in my prior comment, I am currently using Maybe as synonymous to Option.
[Shelby Moore III][27] Friday, March 4, 2011 at 5:24 pm
77. I still don’t like Nil (extends List[Nothing]), because unlike None (extends Option[Nothing]), a Nil can throw an exception on accessing head or tail. In general for all programming, I want to always force unboxing and make throwing an exception illegal (for the scalable composition reason explained in an early comment on this page).
So I realized there is a third way to represent an empty list (or in general a container), which never throws an exception and I think it is the optimum one compared against the two alternatives I presented in the prior comment.
trait OptionList[+T]
case class List[+T] extends OptionList[T]
case object Nil extends OptionList[Nothing]
(or case class Nil[T] extends OptionList[T])
The point is that when you have a List, then it can’t be empty and no need to unbox, and when you have an OptionList, you must unbox (i.e. match-case) it before you can access any members of List, e.g. head or tail. If you get a case Nil upon unboxing, then it has no members from List that would throw an exception in Scala’s current Nil, e.g. head and tail.
Now I need to discover why Scala didn’t do their Nil this way? Maybe there is an important tradeoff I am not aware of yet.
[Shelby Moore III][27] Friday, March 4, 2011 at 10:45 pm
78. A variance annotation on _Sub_ was missing in [**my prior comment**][44], and should be as follows:
I am not sure if that is legal in Scala, but seems to me that is the only way to express that Sub’s type parameter is X.
[Shelby Moore III][27] Wednesday, March 9, 2011 at 4:51 pm
79. Please ignore the prior comment, I forgot the pre tags and so the scala code was butchered because it contains the less than character which opens a tag in HTML.
A variance annotation on _Sub_ was missing in [**my prior comment**][44], and should be as follows:
I am not sure if that is legal in Scala, but seems to me that is the only way to express that Sub’s type parameter is X.
[Shelby Moore III][27] Wednesday, March 9, 2011 at 4:53 pm
80. The pre tags didn’t work. Let me try again using html entities for less than and greater than characters…(I thought wordpress blogs was a solid product given its widespread use, amazing)
A variance annotation on _Sub_ was missing in [**my prior comment**][44], and should be as follows:
trait InjMonad[+Sub[_] <:InjMonad[Sub[_],X],+X]{
def bind[Y, S[Y] >: Sub[Y]] : (X => S[Y]) => S[Y]
}
Without that change, then Some and None would not subtypes of Option, because _Sub_ was invariant.
Also I am thinking the following is more correct, but I haven’t compiled any of my code on this page:
trait InjMonad[+Sub[X] <:InjMonad[Sub[X],X],+X]{
def bind[Y, S[Y] >: Sub[Y]] : (X => S[Y]) => S[Y]
}
I am not sure if that is legal in Scala, but seems to me that is the only way to express that Sub’s type parameter is X.
[Shelby Moore III][27] Wednesday, March 9, 2011 at 4:57 pm
81. Correction:
trait InjMonad[+Sub[X] <:InjMonad[Sub,X],+X]
[Shelby Moore III][27] Wednesday, March 9, 2011 at 5:51 pm
82. My prior idea for expressing X to be the Sub’s type parameter is not retracted.
However, my suggestion for a covariant annotation on Sub, is erroneous for the reason I had stated in prior comments– Sub’s lifted state may not be invertible (e.g. None has no value of type X) and thus there may be no mapping from a Sub to its supertype. Thus the correction to restore an invariant Sub, and keep the other idea, is:
trait InjMonad[Sub[X] <:InjMonad[Sub,X],+X]{
def bind[Y] : (X => Sub[Y]) => Sub[Y]
}
It was incorrect when I wrote that this invariant Sub prevents Some and None subtype of Option. Some#bind and None#bind type signatures contain Option, not Some and None respectively.
I can not think of a useful subtype that would overload bind, but in which case, in my paradigm the subtype could multiply inherit from Monad with distinct Sub. In Daniel’s typeclass paradigm this would entail adding another singleton implicit object that inherits from Monad.
[Shelby Moore III][27] Wednesday, March 9, 2011 at 7:42 pm
83. The Monay example does not compile…
scala> trait Monadic[+M[_]] {
| def unit[A](a: A): M[A]
| def bind[A, B](m: M[A])(f: A => M[B]):M[B]
| }
:9: error: covariant type M occurs in contravariant position in type M[A] of value m
def bind[A, B](m: M[A])(f: A => M[B]):M[B]
^
:9: error: covariant type M occurs in contravariant position in type (A) => M[B] of value f
def bind[A, B](m: M[A])(f: A => M[B]):M[B]
^
Dejan Dukic Wednesday, September 28, 2011 at 4:19 am
84. I can get the Scala code to compile by removing the + on M[_], then either of two strategies as follows.
1\. Cast to List or Thing when you need to use an implicit, for example as follows.
val nums : List[Option[Int]] = List(Some(1), Some(2), Some(3))
val nums2 = sequence(nums)
2\. Define everything for every subtype you need, for example as follows for Some.
case class Some[+A](value: A) extends Option[A] {
def bind[B](f: A => Option[B]) = f(value)
def bind[B](f: A => Some[B]) = f(value)
}
implicit object SomeMonad extends Monad[Some] {
def unit[A](a: A) = Some(a)
def bind[A, B](opt: Some[A])(f: A => Some[B]) = opt bind f
}
val nums = List(Some(1), Some(2), Some(3))
val nums2 = sequence(nums)
Imho, this is repeating oneself. This is why I did not implement the category theory library this way in my Scala derivative language. If you refer to my prior verbose comments (apologies), I am reasonably sure that I finally worked out the elegant solution:
<http://copute.com/dev/docs/Copute/ref/std/> (link not guaranteed to remain stable long-term)
In short, I automate the creation of the implicits (both at definition and use site) with the compiler (they become ’static’ in the interface, i.e. trait), and I use higher-kinded subtyping to eliminate the need to put the bind in the implicit singleton objects. Only the static functions, i.e. ‘unit’ (which I name ‘lift’) go in the implicit singleton objects, and this happens behind the scenes automated by the compiler. I suspect this is much less unwieldy than Scalaz, but I can’t say for sure, because I can’t (or won’t take the time to) understand Scalaz.
I [commented more][52] about this design decision at stackoverflow.
Shelby Moore III Thursday, September 29, 2011 at 6:50 am
85. Thank you for this post. To me this post is the best introduction into monads I’ve ever seen. Clear, concise, accessible, and most important without noise this post should be read by anyone who wants to understand this apparently inaccessible topic.
[Nicolae Caralicea][53] Friday, October 7, 2011 at 1:36 pm
86. Thanks for your post, it really helped me to finally understand Monads and how to thread values through them.
Roberto Salama Sunday, October 16, 2011 at 2:40 pm
87. A great read and I could not agree morewith your opening sentiment. Perhaps the person commenting on your “failure” was referring to the mathematical definition of monad because you’ve done a thorough job elsewhere and newbies like me appreciate it.
I’m not entirely sure that the mathematical connection to natural transformations helps, personally, or even if I’ve got the connection right, but if you are interested I’ve taken a stab here: <http://scalaquant.blogspot.com/2011/12/shortest-blog-article-about-monads-ever.html>. Sorry the URL is deceptive.
Functors and Monads are good stuff to shove in one’s head. I think one source of confusion relates merely to the fact that a functor ships both objects and morphisms, whereas the building blocks are provided separately in the language (constructor and map, essentially, and map is back-to-front).
How long before Functoral Programming becomes a buzzword, I wonder?
[Peter][54] Wednesday, December 28, 2011 at 4:29 pm
88. Thank you very much!! This is the first Monad tutorial which I really grasp. Excellent writing.
Thanks a lot for your time!!!!
Arsene Tuesday, February 14, 2012 at 3:16 am
### Post a Comment
Comments are automatically formatted. Markup are either stripped or will cause large blocks of text to be eaten, depending on the phase of the moon. Code snippets should be wrapped in `<pre>...</pre>` tags. Indentation within pre tags will be preserved, and _most_ instances of "<" and ">" will work without a problem.
Please note that first-time commenters are moderated, so don't panic if your comment doesn't appear immediately.
Name *
Email *
Website
Notify me of followup comments via e-mail
‹ [Unveiling the Mysteries of GLL Part 2: The Problem Space][55]
* **Daniel Spiewak** is a software developer based out of Wisconsin, USA. Over the years, he has worked with Java, Scala, Ruby, C/C++, ML, Clojure and several experimental languages. He currently spends most of his free time researching parser theory and methodologies, particularly areas where the field intersects with functional language design, domain-specific languages and type theory. He can be reached by [email][56].
If you're feeling particularly masochistic, you can follow Daniel on Twitter [@djspiewak][57].
* ### Categories
* [.NET][58]
* [Cat][59]
* [Database][60]
* [DZone][61]
* [Eclipse][62]
* [Java][63]
* [Linux][64]
* [Ruby][65]
* [Scala][66]
* [Scripts][67]
* ### Related
[Scala Collections for the Easily Bored Part 2: One at a Time][68]
[58]: http://www.codecommit.com/blog/category/net "View all posts filed under .NET"
[59]: http://www.codecommit.com/blog/category/cat "View all posts filed under Cat"
[60]: http://www.codecommit.com/blog/category/database "View all posts filed under Database"
[61]: http://www.codecommit.com/blog/category/dzone "View all posts filed under DZone"
[62]: http://www.codecommit.com/blog/category/eclipse "View all posts filed under Eclipse"
[63]: http://www.codecommit.com/blog/category/java "View all posts filed under Java"
[64]: http://www.codecommit.com/blog/category/linux "View all posts filed under Linux"
[65]: http://www.codecommit.com/blog/category/ruby "View all posts filed under Ruby"
[66]: http://www.codecommit.com/blog/category/scala "View all posts filed under Scala"
[67]: http://www.codecommit.com/blog/category/scripts "View all posts filed under Scripts"
[68]: http://www.codecommit.com/blog/scala/scala-collections-for-the-easily-bored-part-2 "Permanent Link: Scala Collections for the Easily Bored Part 2: One at a Time"
[69]: http://www.codecommit.com/blog/scala/the-option-pattern "Permanent Link: The “Option” Pattern"
[70]: http://www.codecommit.com/blog/scala/is-scala-not-functional-enough "Permanent Link: Is Scala Not “Functional Enough”?"
[71]: http://www.codecommit.com/blog/scala/infinite-lists-for-the-finitely-patient "Permanent Link: Infinite Lists for the Finitely Patient"
[72]: http://www.codecommit.com/blog/java/bencode-stream-parsing-in-java "Permanent Link: Bencode Stream Parsing in Java"