Monad is trash
This post is about some of my new ideas on programming language design, although the title sounds quite off.
But, monad is just trash.
Monad in Haskell is trash
Monad is used mostly for these purposes:
- Imperative paradigm with limited support of control flow.
- Side effect tracking, e.g.
IO a
which tracks I/O operations.
And the drawbacks are bizarre:
- When it comes to control flow, for example, one has to achieve early return via throwing exceptions and setting up handlers.
- The
IO a
effect is the "worst" effect, as mentioned in the Koka book, for covering up all the important sub-effects that users might need for fine-grained control:- Exceptions
- Divergence (a function might not terminate)
- Non-determinism (e.g. generating random numbers)
- Console
- Networking
- Filesystem
- UI (graphics system)
- Heap allocation/read/write
Why would Haskell put these altogether? Because making effects
extensible is not trivial at all. IO a
is just a plain data
type, imagine one prints some text to the console and writes some bytes
to a file, what return type would be suitable for the program entrypoint
main
?
Instead of calling it IO a
, there are essentially two
worlds in Haskell: the impure (effect-ful) and the
pure (effect-free). And as we know, getting back to the pure is
unsafe (unsafePerformIO
).
One still has the choice to use third-party libraries for extensible effects. Just see how many there are:
I'm not intrigued to know about their differences. See those language extensions, they might look terrifying for you, but I'm not scared. This is how it would look without good compile-time evaluation and static reflection support (if these two features remind you of C++, you got the joke). Check out Lean 4 and Scala 3, see how they trivialize the problems.
Koka trivializes extensible effects with scoped labels (i.e. effects could be duplicate). Note that it's nothing related to the fancy "algebraic effect" feature, it's only about (type system for) "effect types". So unsurprisingly, it's not new: Koka nailed it in 2014, more than a decade ago.
If you're still not convinced, let me show you these banger lines from Safer exceptions in Scala:
Dealing with one monad at a time is straightforward but dealing with several monads together is much less pleasant since monads do not compose well. A great number of techniques have been proposed, implemented, and promoted to deal with this, from monad transformers, to free monads, to tagless final. None of these techniques is universally liked, however; each introduces a complicated DSL with runtime overhead that is both hard to understand for non-experts and hard to debug. In the end, many developers prefer to work instead in a single βsuper-monadβ like ZIO that has error propagation built-in alongside other aspects. This one-size-fits-all approach can work nicely for frameworks that offer a fixed set of capabilities, but its fixed overhead and lack of flexibility make it unsuitable as the only provided solution for a general purpose programming language.
Okay, now, get ready for time to cook Koka...
Effect is far from enough
There are still further issues despite an effect is "monadic" or "algebraic", the most important one is called coeffect.
It sounds like some annoying category theorists speaking in tongues. But for most of the time it's just about a simple concept, "resource":
void *p = malloc(sizeof(int));
In the above example, one typically expects:
- Only
p
points to the resource (memory), - To close (reclaim) the resource wisely, or,
- To transfer the resource, and the transferee should take further care of it.
Here, the p
is a coeffect, a handle, a descriptor, a
resource, and any similar names you would drop.
Usually, people use the state monad ST s a
to
simulate this situation:
import Control.Monad.ST
import Data.STRef
app :: ST s Int
= do
app <- newSTRef 0
ref +1)
modifySTRef ref (
readSTRef ref
main :: IO ()
= print $ runST app main
If you're confused with the type parameter s
, the type
of variable ref
is STRef s Int
. If one
attempts to return ref
for leaking the resource, this
s
from ref
would not be able to reference that
from app
itself, since runST
has a pretty
special type:
runST :: (forall s. ST s a) -> a
-- ^ ^
-- | |
-- | +--- Impossible to reference that inner `s`.
-- |
-- +--- `s` is only available inside `app`.
So s
is a hack, and poor Haskell has to enable the
RankNTypes
language extension for this.
What's worse? Everyone is using this trick:
- Koka would try to
eliminate the internal
st<h>
effect by runningrunST
on the function. If the resource is leaked, the compiler is able to leave that specific effect, and does better than Haskell. - Rust's lifetime parameter is the same thing, and Clippy (Rust
linter) will tell you if one could elide a fully generic lifetime
parameter by, yup, running
runST
.
What's even worse? People really want to track other kinds of resources, e.g. files, sockets, threads, locks... It's stupid to have all of them built-in, but Koka has no better ideas yet.
And so far, you might already notice a terrible truth:
Koka marks the effect
alloc<h>
, if one leaks the allocated memory to its caller.
Which means, a coeffect system is often coupled with the effect system, they need to work well together, not just in isolation.
You would see people keep talking about the fancy algebraic effects, or the classic linear/affine logic (for coeffects). But the truth is, we need to take both.
Open problem
Now we've understood the background of this open problem. There are already many attempts from the greats:
- Combining Effects and Coeffects via Grading (ICFP '16)
- The
Syntax and Semantics of Quantitative Type Theory (LICS '18)
- This paper is just about coeffects, but we have an implementation to play around: Idris 2!
- A Crash Course in Idris 2 shows examples of monadic effects and its quantitative type feature together.
- Effects and Coeffects in Call-by-Push-Value (OOPSLA '24)
I haven't dived deep enough into these works. Not to mention in that ICFP '16 paper above, there are 16 additional laws beyond the basic typing rules for the effect-over-coeffect and coeffect-over-effect setups. It's hard to convince myself such system is practical enough for the end users.
My attempts
I'm not a professional for making and proving novelty. I just try collecting these existing ideas.
Modality
First off, what to be attached with the information of a coeffect (e.g. how many times a variable should be used exactly, or at most)? The options are two: Adding new types or adding to existing constructs (e.g. bindings and functions).
Most systems decided to attach coeffects with a dedicated new set of types:
std::unique_ptr<T>
in C++, a unique pointer should be used exactly once (move semantics), and one could achieve this in a custom type with its copy constructor and copy assignment operator all deleted. And then people shoot their feet usinguse-after-move
for fun.Box<T>
type andCopy
trait in Rust, much saner.- That ICFP '16 paper, and many linear/affine logic papers, mostly you
could see
box T
/unbox T
type operators, and some even usedπ T
/π T
to look cool.
It's just straightforward to use a type former for constructing a coeffect, and a type eliminator for destructing it. People are so used to it. The drawbacks are not apparent, until you relate to monads:
- It's hard to compose or extend such constructs:
- You've seen the types
Box<T>
,&T
and&mut T
in Rust coming out of nowhere. You've seen implicit rules like&Box<T>
would be seamlessly used as&T
, but when it fails to work, one writesa.as_ref()
instead of&a
. - You don't see any resources like files are able to track using
exactly the approach for types like
&T
. The manual tells you to use RAII and theDrop
trait.
- You've seen the types
- More and more primitive types are added to the compiler, more rules
are added for them being poorly composable and extendable.
- You might feel weird about why even the
Box<T>
type comes from the void, isn't there just source code for it? Then read more carefully and you would find the comment above it:
- You might feel weird about why even the
The declaration of the
Box
struct must be kept in sync with the compiler or ICEs will happen.
So the alternative approach looks quite intriguing: Adding (co)effects to existing types. That's how modality works in Koka and Idris 2.
- Koka adds effect types on function types and (so naturally) the function definitions.
- Idris 2 adds coeffects on function parameters only.
So it might be interesting if we can specify effects and coeffects in:
- Function parameters
- Variable declarators (a.k.a. bindings), not just parameters
- Function types
We will give code examples to show how. Here comes the next question.
Handling coeffects
How to define a function that closes/reclaims a resource? Of course,
we should not use the s
/runST
trick again.
My initial thought tells me one could also specify this behavior using parameters and variable declarators. But the biggest concern might be, if there is a variable with unlimited usage (e.g. of a primitive arithmetic type), every use of it might not be considered as some resource reclamation. Or simply it's just not a resource, since it won't be running out.
By far, we should end up with something like Koka and Idris 2 altogether.
Wrapping up
π‘ Below are the personal ideas. Feel free to judge.
Let's write some code in a new language toy like C. We declare the
prototypes of malloc
and free
, as the
lowest-level memory management primitives (although
aligned_alloc
should be better):
void *malloc(size_t sz);
void free(void *p);
We define a new effect for memory allocation:
[[effect]]
typedef Alloc(typename T);
We write their overloaded functions for convenience. But we enforce that overloaded definitions should be always inline and fully evaluated at compile-time. They won't really reside in any translation units.
[[overload, move(Alloc(T))]]
// ^~~~~~~~~~~~~^ effect and coeffect of this function.
*
T (typename T)
malloc{
void *p [[move(Alloc(T))]] = malloc(sizeof(T));
// ^~~~~~~~~~~~~^ coeffect attached to the declarator:
// `p` is used exactly once.
return p;
}
And the overloaded free
to reclaim the memory:
[[overload]]
void
(typename T, T *p [[move(Alloc(T))]])
free// ^~~~~~~~~~~~~^
// "closes" the coeffect.
{
(p);
free}
Now, some interesting examples that would be rejected by the typechecker:
void f0() {
int *p = malloc(); // type parameter `T` is inferred.
// β error: variable `p` must be used or returned.
}
void f1() {
int *p = malloc();
(p); // type parameter `T` is inferred.
free
(p); // β error: variable `p` is already moved.
free}
void f2() {
int *p = malloc();
if (true) {
(p);
freereturn;
}
// β error: variable `p` must be used in all branches.
}
int * // β error: function `f3` must be annotated with effect `Alloc(int)`.
()
f3{
return malloc();
}
More thoughts?
We're still pretty far from changing the game, obviously. Some problems that are still very challenging:
- Flexible pointer types. Pointers could be:
- Of weird semantics with the
void *
. - Immutable and mutable references (or views) of some existing pointers. There are related practices in ATS using view types for accesses.
- Restrict (for aliasing), just like
Box<T>
andstd::unique_ptr<T>
. - In, out, inout parameters.
- Used for pointer arithmetic.
- Of weird semantics with the
- Fully/partially initialized and uninitialized memory.
- For the problems above, their interactions with struct fields.
And finally, I should get back for the journey...