I gave a talk today at the F(by) 2017 conference in Minsk, Belarus. The conference was great, and I would definitely recommend it in the future. Thank you very much to the organizers for the opportunity to present on Haskell.
I prepared for this talk differently than I've prepared for other talks in the past. I'm very comfortable writing up blog posts, but have always found slide preparation difficult. This time around, I wrote up the content in mostly-blog-post form first, and only created the slides after that was complete. Overall, this worked very well for me, and I'll try it again in the future. (If others want to share their approaches to preparing talks, I'd definitely be happy to hear them.)
As a result, I'm able to share the original write-up I did as well. For those who saw the live talk (or the video): you may want to skip towards the end, which covers some material there wasn't time for in the talk itself.
If you'd like to follow along with the slides, they're also available.
My name is Michael Snoyman. I work at a company called FP Complete. One of the things we do is help individuals and companies adopt Haskell, and functional programming in general. And that leads right in to the topic of my talk today:
What makes Haskell unique
Programmers today have a large number of languages to choose from when deciding what they will learn and use in their day to day coding. In order to make intelligent decisions about which languages to pursue, people need to be able to quickly learn and understand what distinguishes one language from another.
Given that this is a functional programming conference, it's probably no surprise to you that Haskell can be called a functional programming language. But there are lots of languages out there that can be called functional. Definitions vary, but let's take a particularly lax version of functional programming: first class functions, and higher order functions. Well, by this definition, even a language like C counts! You may want to limit the definition further to include syntactic support for closures, or some other features. Regardless, the same point remains:
Haskell may be functional, but that doesn't make it unique
In fact, there's a long list of features I could rattle off that could be used to describe Haskell.
- Functional
- Statically typed
- Pure
- Lazy
- Strongly typed
- Green threads
- Native executables
- Garbage collected
- Immutability
Some of these features, like being pure and lazy, are relatively rare in mainstream languages. Others, however, are commonplace. What I'm going to claim is that not one of these features on its own is enough to motivate new people—including people in this audience—to start using Haskell. Instead:
It's the combination of these features that makes Haskell unique
As an example: the intersection of purity, strong typing, and functional programming style lends itself to a high level form of expression which is simultaneously easy to write, easy to read, easy to modify, and efficient. I want to share some code examples in Haskell that demonstrate how the language encourages you to write code differently from other languages. And I'm going to try to claim that this "different" style is awesome, though it also has some downsides.
Async I/O and Concurrency
Let's start off with a use case that's pretty popular today. Look at this pseudocode and tell me what's wrong with it:
json1 := httpGet(url1)
json2 := httpGet(url2)
useJsonBodies(json1, json2)
Given the heading of this slide, you may have guessed it: this is blocking code. It will tie up an entire thread waiting for the response body from each of these requests to come back. Instead, we should be using asynchronous I/O calls to allow more efficient usage of system resources. One common approach is to use callbacks:
httpGetA(url1, |json1| =>
httpGetA(url2, |json2| =>
useJsonBodies(json1, json2)
)
)
You may recognize this coding style as "callback hell." There are plenty of techniques in common languages to work around that, usually based on the idea of promises or futures. And you may have heard something about how JavaScript futures are a monad, and expect me to be talking about how Haskell does monads better. But I'm not going to do that at all. Instead, I want to show you what the asynchronous version of the code looks like in Haskell:
json1 <- httpGet url1
json2 <- httpGet url2
useJsonBodies json1 json2
This may surprise you, since this looks exactly like the blocking pseudocode I showed above. It turns out that Haskell has a powerful runtime system. It will automatically convert your blocking-style code into asynchronous system calls, and automatically handle all of the work of scheduling threads and waking them up when data is available.
This is pretty great, but it's hardly unique to Haskell. Erlang and Go, as two popular examples, both have this as well. If we want to see what makes Haskell different...
we have to go deeper.
Concurrency
It's pretty lame that we need to wait for our first HTTP request to complete before even starting our second. What we'd like to do is kick off both requests at the same time. You may be imagining some really hairy APIs with threads, and mutable variables, and locks. But here's how you do this in Haskell:
(json1, json2) <- concurrently
(httpGet url1)
(httpGet url2)
useJsonBodies json1 json2
Haskell has a green thread implementation which makes forking threads cheap. The async library provides a powerful, high level interface for performing actions in parallel, without bothering with the low level aspects of locking primitives and mutable variables. And this builds naturally on top of the async I/O system already described, keeping system resource usage cheap.
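By the way, concurrently is itself built from smaller pieces in the async library. Here's a rough sketch of doing the same thing manually with async and wait (httpGet, url1, url2, and useJsonBodies are the assumed helpers from the examples above):

import Control.Concurrent.Async (async, wait)

-- Fork each request onto its own cheap green thread, then block
-- until both results have arrived.
fetchBoth :: IO ()
fetchBoth = do
  a1 <- async (httpGet url1) -- returns immediately with a handle
  a2 <- async (httpGet url2)
  json1 <- wait a1
  json2 <- wait a2
  useJsonBodies json1 json2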
Canceling
What we've seen already is elegant in Haskell, but it's not terribly difficult to achieve in other languages. Let's take it to the next level. Instead of needing both JSON response bodies, we only need one: whichever one comes back first. In pseudocode, this might look like:
promise1 := httpGet(url1)
promise2 := httpGet(url2)
result := newMutex()
promise1.andThen(|json1| =>
result.set(json1)
promise2.cancel())
promise2.andThen(|json2| =>
result.set(json2)
promise1.cancel())
useJsonBody(result.get())
This code is tedious and error prone, but it gets the job done. As you can probably guess, there's a simple API for this in Haskell:
eitherJson <- race
(httpGet url1)
(httpGet url2)
case eitherJson of
Left json1 -> useJsonBody1 json1
Right json2 -> useJsonBody2 json2
At first, this may seem like it's just a well designed API. But there's quite a bit more going on under the surface. The Haskell runtime system itself supports the idea of an asynchronous exception, which allows us to cancel any other running thread. This feature is vital to making race work.
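To give a feel for how async exceptions power this, here's a simplified reconstruction of race built from the async library's primitives. This is my sketch, not the library's actual implementation, which is more careful about exception safety (it uses withAsync-style bracketing):

import Control.Concurrent.Async (async, cancel, waitEither)

-- Run both actions, take whichever finishes first, and cancel
-- the loser by sending it an asynchronous exception.
raceSketch :: IO a -> IO b -> IO (Either a b)
raceSketch left right = do
  a1 <- async left
  a2 <- async right
  res <- waitEither a1 a2
  cancel a1 -- canceling an already-finished thread is a no-op
  cancel a2
  pure res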
And here's the final piece in the puzzle. All of the thread scheduling and canceling logic I've described doesn't just apply to async I/O calls. It works for CPU-intensive tasks as well. That means you can fork thousands of threads, and even if one of them is busy performing computation, other threads will not be starved. Plus, you can interrupt these long-running computations:
let tenSeconds = 10 * 1000 * 1000
timeout tenSeconds expensiveComputation
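timeout lives in System.Timeout and returns a Maybe so you can tell whether the action finished in time. A small sketch, where expensiveComputation stands in for any action with a printable result:

import System.Timeout (timeout)

main :: IO ()
main = do
  let tenSeconds = 10 * 1000 * 1000 -- timeout takes microseconds
  mres <- timeout tenSeconds expensiveComputation
  case mres of
    Nothing  -> putStrLn "Gave up after ten seconds"
    Just res -> print res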
Summary: concurrency and async I/O
Advantages
- Cheap threads
- Simple API
- Highly responsive
Disadvantages
- Complicated runtime system
- Need to be aware of async exceptions when writing code
Immutability and purity
Most programming languages out there default to mutability: a variable or field in a data structure can be changed at any time. Haskell is different in two ways:
- Values are immutable by default; mutability must be indicated explicitly with a distinct variable type
- Mutating a mutable variable is considered a side effect, and that side effect is tracked by the type system
For example, the following Haskell-like code is impossible:
let mut total = 0
loop i =
if i > 1000000
then total
else total += i; loop (i + 1)
in loop 1
From pure code, we cannot create, read, or modify a mutable variable. We also need to say what kind of mutable variable we want:
total <- newIORef 0
let loop i =
if i > 1000000
then readIORef total
else do
modifyIORef total (+ i)
loop (i + 1)
loop 1
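For reference, here's a complete, runnable version of that fragment. The main wrapper is my addition, and I've used the strict modifyIORef' variant to avoid a thunk build-up we'll meet again in the laziness section:

import Data.IORef

main :: IO ()
main = do
  total <- newIORef (0 :: Int)
  let loop i =
        if i > 1000000
          then readIORef total
          else do
            modifyIORef' total (+ i) -- strict update
            loop (i + 1)
  result <- loop 1
  print result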
This is a lot of ceremony for a simple algorithm. Of course, the recommended Haskell way of doing this would be to avoid mutable variables, and use a more natural functional style.
let loop i total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
Besides pushing us towards this supposedly better functional approach, why is immutable, pure code such a nice thing?
Reasoning about code
You'll often hear Haskellers throw around a phrase "reasoning about code." Personally, I think the phrase is used to mean too many different things. But let me give you an example that I think is accurate. Let's look at some pseudocode:
// results.txt
Alice,32
Bob,55
Charlie,22
func main() {
results := readResultsFromFile("results.txt")
printScoreRange(results)
print("First result was by: " + results[0].name)
}
func printScoreRange(results: Vector<TestResult>) {
...
}
If you look at the code above, what do you expect the output to be? I think it would be reasonable to guess something like:
Lowest: 22
Highest: 55
First result was by: Alice
However, now let's throw in another piece of information: the definition of printScoreRange:
func printScoreRange(results: Vector<TestResult>) {
results.sortBy(|result| => result.score)
print("Lowest: " + results[0].score)
print("Highest: " + results[results.len() - 1].score)
}
Suddenly our assumptions change. We can see that this function mutates the results value passed to it. If we're passing mutable references to vectors in this made-up language, then our output is going to look more like:
Lowest: 22
Highest: 55
First result was by: Charlie
The original results value in our main function has been modified. This is what I mean by hurting our ability to reason about code: it's no longer sufficient to look at just the main function to understand what will happen. Instead, we're required to understand what may possibly be occurring in the rest of our program to mutate our variables.
In Haskell, the code would instead look like:
main :: IO ()
main = do
results <- readResultsFromFile "results.txt"
printScoreRange results
putStrLn $ "First result was by: " ++ name (head results)
printScoreRange :: [TestResult] -> IO ()
printScoreRange results = do
let results' = sortOn score results
putStrLn $ "Lowest: " ++ show (score (head results'))
putStrLn $ "Highest: " ++ show (score (last results'))
We know that it's impossible for printScoreRange to modify the results value we have in the main function. Looking at only this bit of code in main is sufficient to know what will happen with the results value.
Data races
Even more powerful than the single threaded case is how immutability affects multithreaded applications. Ignoring the insanity of multiple threads trying to output to the console at the same time, we can easily parallelize our code:
main :: IO ()
main = do
results <- readResultsFromFile "results.txt"
  concurrently_ (printFirstResult results) (printScoreRange results)
printFirstResult results =
putStrLn $ "First result was by: " ++ name (head results)
printScoreRange results = do
let results' = sortOn score results
putStrLn $ "Lowest: " ++ show (score (head results'))
putStrLn $ "Highest: " ++ show (score (last results'))
There's no need to worry about concurrent accesses to data structures. It's impossible for the other threads to alter our data. If you do want other threads to affect your local data, you'll need to be more explicit about it, which we'll get back to.
Mutability when needed
One thing you may be worried about is how this affects performance. For example, it's much more efficient to sort a vector using mutable access instead of only pure operations. Haskell has two tricks for that. The first is the ability to explicitly create mutable data structures, and mutate them in place. This breaks all of the guarantees I already mentioned, but if you need the performance, it's available. And unlike mutable-by-default approaches, you now know exactly which pieces of data you need to handle with care when coding to avoid tripping yourself up.
The other approach is to create a mutable copy of the original data, perform your mutable algorithm on it, and then freeze the new copy into an immutable version. With sorting, this looks something like:
sortMutable :: MutableVector a -> ST (MutableVector a)
sortMutable = ... -- normal sorting algorithm
sortImmutable :: Vector a -> Vector a
sortImmutable orig = runST $ do
mutable <- newMutableVector (length orig)
copyValues orig mutable
sortMutable mutable
freeze mutable
ST is something we use to have temporary and local mutable effects. Because of how it's implemented, we know that none of the effects can be visible from outside of our function, and that for the same input, the sortImmutable function will always have the same output. While this approach requires an extra memory buffer and an extra copy of the elements in the vector, it completely avoids any worry of your data being changed behind your back.
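The vector API above is made up for the slide, but the ST machinery is real. Here's a small, self-contained example of the same idea using an STRef (my example, not from the talk): the mutable cell is created, mutated, and thrown away entirely inside runST, so callers see an ordinary pure function.

import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Pure from the outside, mutable on the inside.
sumTo :: Int -> Int
sumTo n = runST $ do
  total <- newSTRef 0
  forM_ [1 .. n] $ \i -> modifySTRef' total (+ i)
  readSTRef total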
Summary: immutability and purity
Advantages
- Easier to reason about code
- Avoid many cases of data races
- Functions are more reliable, returning the same output for the same input
Disadvantages
- Lots of ceremony if you actually want mutation
- Some runtime performance hit for mutable algorithms
Software Transactional Memory
Let's say you actually need to be able to mutate some values. And for fun, let's say you want to do this from multiple threads. A common example of this is a bank. Let's again play with some pseudocode:
runServer (|request| => {
from := accounts.lookup(request.from)
to := accounts.lookup(request.to)
accounts.set(request.from, from - request.amt)
accounts.set(request.to, to + request.amt)
})
This looks reasonable, except that if two requests come in at the same time for the same account, we can end up with a race condition. Consider something like this:
Thread 1: receive request: Alice gives $25
Thread 2: receive request: Alice receives $25
Thread 1: lookup that Alice has $50
Thread 2: lookup that Alice has $50
Thread 1: set Alice's account to $25
Thread 2: set Alice's account to $75
We know that we want Alice to end up with $50, but because of our data race, Alice ends up with $75. Or, if the threads ran differently, it could be $25. Neither of these is correct. In order to avoid this, we would typically deal with some kind of locking:
runServer (|request| => {
accounts.lock(request.from)
accounts.lock(request.to)
// same code as before
accounts.unlock(request.from)
accounts.unlock(request.to)
})
Unfortunately, this leads to deadlocks! Consider this scenario:
Thread 1: receive request: $50 from Alice to Bob
Thread 2: receive request: $50 from Bob to Alice
Thread 1: lock Alice
Thread 2: lock Bob
Thread 1: try to lock Bob, but can't, so wait
Thread 2: try to lock Alice, but can't, so wait
...
This kind of problem is the bane of many concurrent programs. Let me show you another approach. As you may guess, here's some Haskell:
runServer $ \request -> atomically $ do
let fromVar = lookup (from request) accounts
toVar = lookup (to request) accounts
origFrom <- readTVar fromVar
writeTVar fromVar (origFrom - amt request)
origTo <- readTVar toVar
writeTVar toVar (origTo + amt request)
There are helper functions to make this shorter, but I wanted to do it the long way to prove a point. This looks like exactly the kind of race condition I described before. However, that atomically function is vital here. It ensures that only a complete transaction is ever committed. If any of the variables we touch are mutated by another thread before our transaction is complete, all of our changes are rolled back, and the transaction is retried. No need for explicit locking, and therefore far fewer worries about data races and deadlocks.
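Those helper functions come from the stm package itself. For instance, using modifyTVar', the body of the transaction shrinks to something like this (the transfer wrapper and its Int balances are my own sketch):

import Control.Concurrent.STM

-- The same transfer, written with the modifyTVar' helper.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer fromVar toVar amt = do
  modifyTVar' fromVar (subtract amt)
  modifyTVar' toVar (+ amt)

You'd run it with atomically (transfer fromVar toVar (amt request)).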
A TVar is a "transactional variable." It's an alternative to the IORef that I mentioned earlier. There are other kinds of mutable variables in Haskell, including channels and MVars, which are like mutexes. This is what I meant when I said you need to be explicit about what kind of mutation you want in Haskell.
Purity's role
What do you think will happen with this program:
atomically $ do
buyBitcoins 3 -- side effects on my bank account
modifyTVar myBitcoinCount (+ 3)
Here, buyBitcoins is going off to some exchange and buying about $100,000 in bitcoin (or whatever ridiculous amount they're selling for now). I said before that, if the variables are modified while running, the transaction will be retried. It seems like this function is very dangerous, as it may result in me going about $10,000,000 into debt buying bitcoins!
This is where purity steps in. Inside atomically, you are not allowed to perform any side effects outside of STM itself. That means you can modify TVars, but you cannot read or write files, print to the console, fire the missiles, or place multi-million-dollar currency purchases. This may feel like a limitation, but the tradeoff is that it's perfectly safe for the runtime system to retry your transactions as many times as it wants.
Summary of STM
Advantages
- Makes concurrent data modification much easier
- Bypass many race conditions and deadlocks
Disadvantages
- Depends on purity to work at all
- Not really a disadvantage, you're already stuck with purity in Haskell
- Not really any other disadvantages, so just use it!
Laziness
It's a little cheeky of me to get this far into a talk about unique features of Haskell and ignore one of its most notable features: laziness. Laziness is much more of a double-edged sword than the other features I've talked about, and let me prove that by revisiting one of our previous examples.
let loop i total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
I didn't describe it before, but this function will sum up the numbers from 1 to 1,000,000. There are two problems with this function:
- There's a major performance bug in it
- It's much more cumbersome than it should be
Space leaks
The bane of laziness is space leaks, something you've probably heard about if you've read at all about Haskell. To understand this, let's look at how laziness is implemented. When you say something like:
let foo = 1 + 2
foo doesn't actually contain 3 right now. Instead, it contains an instruction to apply the operator + to the values 1 and 2. This kind of instruction is called a thunk. And as you might guess, storing the thunk is a lot more expensive than storing a simple integer. We'll see why this helps in a bit, but for now we just care about why it sucks. Let's look at what happens in our loop function:
let loop i total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
Each time we step through the loop, we have to compare i to the number 1,000,000. Therefore, we are forced to evaluate it, which means turning it into a simple integer. But we never look at the value of total. Instead of storing a simple integer, which would be cheap, we end up building a huge tree that looks like "add 1 to the result of adding 2 to the result of ... to 1,000,000." This is really bad: it uses more memory and more CPU than we'd like.
We can work around this in Haskell by being explicit about which values should be evaluated. There are a few ways to do this, but in our case, the easiest is:
let loop i !total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
All I've done is add an exclamation point in front of the total argument. This is known as a bang pattern, and it says "make sure this is evaluated before running the rest of this function." The need to do this in some cases is definitely a downside to Haskell's laziness. On the other hand, as we'll see shortly, you often don't need to bother if you use the right kinds of functions.
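For example, the strict left fold from Data.List forces the accumulator at every step, so this sum runs in constant space with no bang pattern in sight:

import Data.List (foldl')

-- No bang pattern needed: foldl' evaluates the accumulator eagerly.
total :: Int
total = foldl' (+) 0 [1 .. 1000000]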
Laziness is awesome
Let's go back to pseudocode and rewrite our summation:
total := 0
for(i := 1; i <= 1000000; i++) {
total += i
}
Pretty simple. But now let's modify this to only sum up the even numbers:
total := 0
for(i := 1; i <= 1000000; i++) {
if (isEven(i)) {
total += i
}
}
OK, that's fine. But now, let's sum up the indices modulo 13 (for some weird reason):
total := 0
for(i := 1; i <= 1000000; i++) {
if (isEven(i)) {
total += i % 13
}
}
Each of these modifications is fine on its own, but at this point it's getting harder to see the forest for the trees. And fortunately, each of these transformations was relatively simple. If some of the requirements were more complicated, fitting them into the for loop could be more challenging.
Let's go back to the beginning with Haskell. We saw how we could do it with a loop, but let's see the real way to sum the numbers from 1 to 1,000,000:
-- Bad
let loop i !total =
if i > 1000000
then total
else loop (i + 1) (total + i)
in loop 1 0
-- Awesome!
sum [1..1000000]
We use list range syntax to create a list with one million numbers in it. On its face, this looks terrible: we'd need to allocate about 8MB of data to hold onto these integers, when this should run in constant space. But this is exactly where laziness kicks in: instead of allocating all of these values immediately, we allocate a thunk. Each time we step through the list, our thunk generates one new integer and a new thunk for the rest of the list. We're never using more than a few machine words.
There are also other optimizations in GHC to avoid even allocating those thunks, but that's not something I'm going to cover today.
Anyway, let's continue. We can easily tweak this to only add up the even numbers:
sum (filter even [1..1000000])
This uses the filter higher order function, and likewise avoids allocating an entire list at once. And doing the silly modulo 13 trick:
sum (map (`mod` 13) (filter even [1..1000000]))
Laziness is definitely a mixed bag, but combined with the functional style of Haskell in general, it allows you to write higher level, declarative code, while keeping great performance.
Short circuiting for free
Lots of languages define && and || operators which stop evaluation early, e.g.:
foo() && bar()
bar is only called if foo returns true. Haskell works the same way, but these operators aren't special; they just use laziness!
False && _ = False
True && x = x
True || _ = True
False || x = x
This even scales up to functions working on lists of values, such as and, or, all, and any.
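A quick illustration of that scaling-up (my example): any stops as soon as it finds a match, so even an infinite list is fine.

-- Terminates with True despite the infinite input, because any
-- stops at the first element greater than 100.
found :: Bool
found = any (> 100) [1 ..]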
Other downsides
There's one other downside to laziness, plus a historical artifact attached to it. Laziness means that exceptions can be hiding inside any thunk; this is also known as partial values and partial functions. For example, what does this mean?
head []
An empty list has no first element, so the only thing head can do here is throw a runtime exception, and thanks to laziness that exception hides inside the thunk until something forces it. Generally speaking, partiality is frowned upon, and you should use total functions in Haskell.
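For example, a total replacement for head moves the empty-list case into the type, forcing callers to handle it (the standard library's Data.Maybe.listToMaybe does the same job):

-- Total: every input has a well-defined result.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x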
The historical artifact is that many bad functions are still easily available, and they should be avoided. head is arguably an example of that. Another is the lazy left fold function, foldl. In virtually all cases, you should replace it with the strict left fold foldl'.
Summary of laziness
Advantages
- More composable code
- Get efficient results from combining high level functions
- Short-circuiting like && and || is no longer a special case
Disadvantages
- Need to worry about space leaks
- Exceptions can be hiding in many places
- Unfortunately, some bad functions like foldl are still hanging around
Side note: there's a major overlap with Python generators or Rust iterators, but laziness in Haskell is far more pervasive than these other approaches.
Others
Due to time constraints, I'm not going to be able to go into detail on a bunch of other examples I wanted to talk about. Let me just throw out some quick thoughts on them.
Parser (and other) DSLs
- Operator overloading!
- Abstract type classes like Applicative and Alternative are a natural fit, e.g.: parseXMLElement <|> parseXMLText
- Able to reuse a huge number of existing library functions, e.g. optional, many
- General purpose do-notation is great
data Time = Time Hour Minutes Seconds (Maybe AmPm)
data AmPm = Am | Pm
parseAmPm :: Parser Time
parseAmPm = Time
<$> decimal
<*> (":" *> decimal)
<*> (":" *> decimal)
<*> optional (("AM" $> Am) <|> ("PM" $> Pm))
c/o @queertypes
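To make that snippet concrete, here's one self-contained rendering of it with attoparsec. The library choice, the plain Int fields, and the parseTime name are my assumptions; the slide leaves them abstract:

{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (optional, (<|>))
import Data.Attoparsec.Text (Parser, decimal, parseOnly)
import Data.Functor (($>))

data Time = Time Int Int Int (Maybe AmPm) deriving Show
data AmPm = Am | Pm deriving Show

-- "11:32:45PM" parses to: Time 11 32 45 (Just Pm)
parseTime :: Parser Time
parseTime = Time
  <$> decimal
  <*> (":" *> decimal)
  <*> (":" *> decimal)
  <*> optional (("AM" $> Am) <|> ("PM" $> Pm))

main :: IO ()
main = print (parseOnly parseTime "11:32:45PM")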
Advanced techniques
- Free monads
- Monad transformer stacks
- Lens, conduit, pipes, ...
- Lots of ways to do things in Haskell!
- It's a plus and a minus
- Recommendation: choose a useful subset of Haskell and its libraries, and define some best practices
Conclusion
- Haskell combines a lot of uncommon features
- Very few of those features are unique
- Combining those features allows you to write code very differently than in other languages
- If you want readable, robust, easy to maintain code: I think it's a great choice
- Be aware of the sharp edges: they do exist!