Lots of Scala developers know what a Monad
is.
A smaller number are aware what the definition of an Applicative Functor
.
It’s tempting to think of Applicative
as a “worse Monad
”.
This is not exactly wrong, but it doesn’t provide us with any insight into what Applicative
s are good for.
In this blog post, I’ll attempt to answer that question by looking at an example where requiring an Applicative
, rather than a Monad
gives us very different capabilities.
Writing a Command-Line Argument Parser
A fairly common programming problem is writing a parser for command line arguments.
That is, a function that takes a list of strings, such as --firstname Joe --surname Warren --tall
and converts this into some kind of domain object, such as:
case class Person(tall: Boolean, firstname: String, surname: String)
val joe = Person(true, "Joe", "Warren")
Let’s write a simple interface to a command line parser Fair warning, this is going to be simple demonstration code, rather than a fully featured parser.
We’ll start by defining an error type, because it’s more stylish clearer than littering String
everywhere.
case class ParseFailure(error: String)
And now, let’s define an interface.
/* Making the class sealed prevents other users from creating their `Parser` subclasses
* Our aim is to provide simple building blocks
* that can be combined into a more complex parser
* not to provide an interface that they have to implement
*
* The class is parameterized with a type `A`
* this is the type of object produced by a successful parse.
*/
sealed abstract class Parser[A] {
// defining a type alias saves us some typing
type PartialParseResult = Either[ParseFailure, (List[String], A)]
// Each parser can attempt to partially parse a block of command line arguments
// This will either return a `Left[ParseFailure]` if the arguments are invalid
// or it will "consume" some of the arguments, returning a `Right[(List[String], A)]`
// containing both a reduced list of arguments, and a value of type `A`
//
// We can mark this protected, because it isn't part of the external dsl
protected def partialParse(args: List[String]): PartialParseResult
// we parse a list of arguments by calling `partialParse`,
// and then validating that there are no arguments left unconsumed
// every Parser should implement `parse` in the same way, so we can mark this `final`
final def parse(args: List[String]): Either[ParseFailure, A] =
partialParse(args) flatMap {
case (List(), result) => Right(result)
case (remainingArgs, _) =>
Left(ParseFailure(
s"Unrecognised Arguments:${remainingArgs.mkString("\t\n", "\t\n", "")}"
))
}
}
Now let’s write a “concrete” parser.
We want this to be relatively simple, in this case, a boolean “flag” that will return true
or false depending on whether or not the flag is present.
By implementing this as an anonymous class, we prevent the user from pattern matching on the specific implementation of a Parser
.
This makes our Parser
type “opaque”; external code can’t look inside at the implementation details.
Creating the anonymous class via the apply
method on a Flag
object, gives us some nice syntactic sugar, and lets us write Flag("tall")
as if we were constructing an instance of a case class.
object Flag {
def apply(name: String): Parser[Boolean] = new Parser[Boolean] {
private val flag = s"--$name"
/* a flag is either present, or it is absent
* both valid states, so `partialParse can't fail
* and always returns a `Right` with a tuple
* The first value returned is the arguments list with the flag (if present) removed
* The second value is a boolean; whether the flag was passed
*/
override protected def partialParse(args: List[String]): PartialParseResult =
Right(args.filter(_ != flag), args.contains(flag))
}
}
In a similar vein, let’s add support for a (mandatory) String
argument.
This is going to require that an argument is passed, such as --firstname Joe
, and it’s going to have the type Parser[String]
.
Unlike the Flag
Parser, it is possible for this parser to fail, since either the value, or the entire argument could be missing.
object StringArg {
def apply(name: String): Parser[String] = new Parser[String] {
private val flag = s"--$name"
override protected def partialParse(args: List[String]): PartialParseResult = {
.span(_ != flag) match {
argscase (_, List()) =>
Left(ParseFailure(s"Missing argument $flag"))
case (_, List(_)) =>
Left(ParseFailure(s"Expected a value after $flag"))
case (prefix, _ :: value :: suffix) =>
Right((prefix ++ suffix, value))
}
}
}
}
Now, we have some simple building blocks, we need a way to compose them.
We’re going to do this by defining a Monad
instance for our Parser
type.
The world contains enough Monad tutorials, so I’m going to assume that you already know what a Monad is. But for completeness, I will include a (potentially inaccurate and definitely unhelpful) definition.
Monad
is a typeclass where instances have both apure
method, which takes a value, and returns that value “within” an instance of the monad (in our casepure[A](a: A): Parser[A]
), and aflatMap
method, which takes an instance of the monad of one type, and a function from that type, to an instance of the monad of a second type, and returns an instance of the monad of the second type (in our caseflatMap[A, B](fa: Parser[A])(f: A => Parser[B]): Parser[B]
), and also where instances must obey the “Monad Laws”: “Left Identity”, “Right Identity” and “Associativity”.
In my defence, I did say upfront that it would be unhelpful.
We’re going to use the Cats Monad
implementation.
If you’re following along at home, the required import is:
import cats.Monad
object Parser {
implicit def monadForParser: Monad[Parser] = new Monad[Parser] {
// pure creates a parser that always produces the same value
// no matter what the input is
override def pure[A](x: A): Parser[A] = new Parser[A] {
override protected def partialParse(args: List[String]): PartialParseResult =
Right((args, x))
}
/* flatMap runs the first parser
* then calls the provided function on the output to create a new parser
* and runs that on the arguments that weren't consumed by the first parser
* this implementation uses the `Either` Monad
* to compose the potential errors from the two parsers
*/
override def flatMap[A, B](fa: Parser[A])(f: A => Parser[B]): Parser[B] =
new Parser[B] {
override protected def partialParse(args: List[String]): PartialParseResult =
for {
(nextArgs, a) <- fa.partialParse(args)
<- f(a).partialParse(nextArgs)
result } yield result
}
/* cats Monad requires we write tailRecM for reasons of stack safety
* Because this is example code, lets more or less ignore it
* and go with a stock implementation
*/
override def tailRecM[A, B](a: A)(f: A => Parser[Either[A, B]]): Parser[B] =
flatMap(f(a)) {
case Right(b) => pure(b)
case Left(nextA) => tailRecM(nextA)(f)
}
}
}
Now let’s see our parser in action, and write a parser for the Person
case class mentioned above.
case class Person(isTall: Boolean, firstname: String, surname: String)
This is going to require another set of imports, as we’re using some cats syntactic sugar:
import cats.syntax.all._
We can use a “for comprehension” to provide an aesthetically pleasing way to “Monadically compose” our simple parsers
val personParser: Parser[Person] = for {
<- Flag("tall")
tall <- StringArg("firstname")
firstname <- StringArg("surname")
surname } yield Person(tall, firstname, surname)
And we can write some tests to validate that this works as expected:
import org.scalatest.freespec.AnyFreeSpec
import org.scalatest.matchers.should.Matchers
class ParserSpec extends AnyFreeSpec with Matchers {
"A Monadic Parser" - {
"Should be able to parse argument lists" - {
"basic clean parse" in {
val args = "--tall --firstname John --surname Doe".split(' ').toList
.parse(args) should be(
personParserRight(Person(true, "John", "Doe"))
)
}
"flags are optional" in {
val args = "--firstname John --surname Doe".split(' ').toList
.parse(args) should be(
personParserRight(Person(false, "John", "Doe"))
)
}
"arguments are not optional" in {
val args = "--tall --surname Doe".split(' ').toList
.parse(args) should be(
personParserLeft(ParseFailure("Missing argument --firstname"))
)
}
}
}
}
Monadic Shortcomings
A common thing to want a command line parser to do (on top of parsing arguments) is to print a simple “usage” summary of the arguments that are expected.
Let’s try to extend our Parser
interface in order to support this.
If you’re following along, at this point we’re more or less starting from scratch (in a separate package).
The only thing we share is the error type, ParseFailure
.
The Parser
interface is more or less unchanged, except for the addition of a “usage” method.
sealed abstract class PrintingParser[A] {
type PartialParseResult = Either[ParseFailure, (List[String], A)]
protected def partialParse(args: List[String]): PartialParseResult
final def parse(args: List[String]): Either[ParseFailure, A] =
partialParse(args) flatMap {
case (List(), result) => Right(result)
case (remainingArgs, _) =>
Left(ParseFailure(
s"Unrecognised Arguments:${remainingArgs.mkString("\t\n", "\t\n", "")}"))
}
def usage: String
}
Flag
and StringOption
are also more or less unchanged, other than by the addition of simple usage strings.
object Flag {
def apply(name: String): PrintingParser[Boolean] = new PrintingParser[Boolean] {
private val flag = s"--$name"
override protected def partialParse(args: List[String]): PartialParseResult = {
Right(args.filter(_ != flag), args.contains(flag))
}
override def usage: String = s"[$flag]"
}
}
object StringOption {
def apply(name: String): PrintingParser[String] = new PrintingParser[String] {
private val flag = s"--$name"
override protected def partialParse(args: List[String]): PartialParseResult = {
.span(_ != flag) match {
argscase (_, List()) =>
Left(ParseFailure(s"Missing argument $flag"))
case (_, List(_)) =>
Left(ParseFailure(s"Expected a value after $flag"))
case (prefix, _ :: value :: suffix) =>
Right((prefix ++ suffix, value))
}
}
override def usage: String = s"--$name <value>"
}
}
Now, let’s try to define a Monad
for our PrintingParser
.
Defining pure
is easy enough, a pure
Parser doesn’t require any arguments, so we can return an empty usage string, and leave the rest of the implementation unchanged
override def pure[A](x: A): PrintingParser[A] = new PrintingParser[A] {
override protected def partialParse(args: List[String]): PartialParseResult =
Right((args, x))
override def usage: String = ""
}
However, when we try to define flatMap
we run into some difficulties with the usage string
override def flatMap[A, B]
(fa: PrintingParser[A])
(f: A => PrintingParser[B]):
[B] =
PrintingParsernew PrintingParser[B] {
override protected def partialParse(args: List[String]): PartialParseResult = for {
(nextArgs, a) <- fa.partialParse(args)
<- f(a).partialParse(nextArgs)
result } yield result
override def usage: String = ???
}
Ideally, we’d like for this to include the usage strings from both the first and second parsers.
Unfortunately:
- We have a
fa: PrintingParser[A]
(the first parser); it gives us the first half of the usage string 😀 - We don’t have a
PrintingParser[B]
(the second parser), just a functionf: A => PrintingParser[B]
😞 - So we need a value of type
A
in order to get aPrintingParser[B]
🤔 - We can’t get an
A
out of ourPrintingParser[A]
without running it on some arguments, and we want to be able to useusage
without supplying an arguments list 🤬
Fortunately:
- This is where
Applicative
comes to the rescue 🦸♀️
Applicative
is a typeclass like Monad
, except instead of requiring flatMap
, it requires the (weaker) ap
.
(“Weaker” in this case means that we can build ap
out of pure
and flatMap
, but not the other way round.
Because of this, every Monad can be used as an Applicative, but not vice versa.)
ap
has the signature (when applied to Parser
) def ap[A, B](ff: PrintingParser[A => B])(fa: PrintingParser[A]): PrintingParser[B]
.
Compared to flatMap
, in ap
, the function sits “inside” the Parser, so we have direct access to both Parsers when composing them, and can look at both usage strings.
import cats.Applicative
object PrintingParser {
implicit def applicativeForPrintingParser: Applicative[PrintingParser] =
new Applicative[PrintingParser] {
override def pure[A](x: A): PrintingParser[A] = new PrintingParser[A] {
override protected def partialParse(args: List[String]): PartialParseResult =
Right((args, x))
override def usage: String = ""
}
override def ap[A, B]
(ff: PrintingParser[A => B])
(fa: PrintingParser[A]):
[B]
PrintingParser= new PrintingParser[B] {
override protected def partialParse(args: List[String]): PartialParseResult =
for {
(nextArgs, a) <- fa.partialParse(args)
(nextNextArgs, function) <- ff.partialParse(nextArgs)
} yield (nextNextArgs, function(a))
/* If we were to join both usage strings together with a space
* this would break the Monad identity laws
* Adding no space if either string is empty makes the Applicative lawful
* (to the best of my understanding)
*/
override def usage: String = (ff.usage, fa.usage) match {
case ("", "") => ""
case ("", snd) => snd
case (fst, "") => fst
case (fst, snd) => s"$fst $snd"
}
}
}
}
Because Applicative
doesn’t provide flatMap
, the code to compose our parsers together looks a little different.
import cats.syntax.all._
case class Person(tall: Boolean, firstname: String, surname: String)
val personParser = (
Flag("tall"),
StringOption("firstname"),
StringOption("surname")
).mapN(Person)
mapN
from cats, lets us compose a tuple full of applicatives of the same kind together.
I think it’s up for debate whether or not the Applicative or Monadic style of Person
parser is more readable.
But one thing that I do find noteworthy, is that Scala contains a fair amount of syntactic sugar for working with Monads, in the form of for comprehensions.
By comparison, Applicatives don’t have any syntactic sugar, and are build using the “standard” syntax of the Scala language, despite this, people have been able to define operators that allow for an elegant style when working with Applicatives.
The same statement could be made about Haskell, and “do notation” (Haskell syntactic sugar similar to a for comprehension) vs the <$>
and <*>
opperators.
Applicative Shortcomings
The downside to working with Applicative
rather than Monad
, is that it does limit the kinds of Parser
that it’s possible to write.
With a Monadic parser, it’s possible to use a result returned early in the parse, in order to decide which of two different parsers to use.
This may be best explained by example, let’s throw away our Person
case class, and make a more complex Algebraic Data Type (ADT):
sealed trait Person {
def firstname: String
}
case class PersonWithoutSurname(firstname: String) extends Person
case class PersonWithSurname(firstname: String, surname: String) extends Person
In this ADT, a Person
always has a firstname
and may have a surname
. This could possibly be represented by two different styles of command line argument
--has-surname --firstname John --surname Doe
--firstname John
And we could implement a Parser for this like so:
val personParser = for {
<- Flag("has-surname")
hasSurname <- StringArg("firstname")
firstname : Person <- if (hasSurname) {
personStringArg("surname").map( surname =>
PersonWithSurname(firstname, surname)
)
} else {
PersonWithoutSurname(firstname).pure[Parser]
}
} yield person
It is possible to make optional arguments without requiring a Monad
, (for example, using typeclasses such as Alternative
, which I won’t cover here).
With that said, the general problem of wanting to vary the Parser used as a function of the values produced earlier on cannot be solved without a Monad
.
There is a trade off between Applicative
which makes it possible to define classes with more functionality (such as Parsers with usage strings), vs Monad
, which gives us more freedom in how we compose our classes together.
This trade off still applies in languages without Monad
s, Applicative
s and other Higher Kinded Types (HKTs).
It’s equally possible to define a Parser
such as this one in a dynamically typed language such as Python.
But it would not be possible to provide both an automatic usage string, and flatMap
style composition as part of the same interface.
One of my favourite things about statically typed functional programming, is that it makes this kind of trade off easy to spot.
Appendix
If you want to run the code in this blogpost, it may be helpful to see the build.sbt
with the versions that were used to write it:
addCompilerPlugin("com.olegpy" %% "better-monadic-for" % "0.3.1")
lazy val root = (project in file("."))
.settings(
:= "parsing-blogpost",
name += "org.typelevel" %% "cats-core" % "2.1.0",
libraryDependencies += "org.scalatest" %% "scalatest" % "3.1.1" % "test"
libraryDependencies )
The code for this blogpost is available on BitBucket, here.
If you liked this blogpost, and can read Haskell, you may be interested in the optparse-applicative library. While optparse-applicative contains vastly more functionality than the parser discussed here, it did provide the inspiration for this post.