The Koka Programming Language

(Daan Leijen, 2024-05-30)

1. Getting started

Welcome to Koka – a strongly typed functional-style language with effect types and handlers.

Why Koka? A Tour of Koka Install Discussion forum Github Libraries

Note: Koka v3 is a research language that is currently under development and not ready for production use. Nevertheless, the language is stable and the compiler implements the full specification. The main things lacking at the moment are (async) libraries and package management.

News:

2024-05-30: Koka v3.1.2 released which fixes Koka installs outside a VS Code workspace.
2024-05-30: Read the paper on “The Functional Essence of Binary Search Trees” by Anton Lorenzen, Daan Leijen, Sam Lindley, and Wouter Swierstra to be presented at PLDI'24 on June 27.
2024-05-30: View the talk on the design and compilation of efficient effect handlers in Koka as part of Xavier Leroy's beautiful lecture series on control structures and algebraic effects at the Collège de France (with many other invited talks available online).
2024-03-04: Koka v3.1.1 released with a language server crash fix.
2024-02-14: Koka v3.1.0 released with a concurrent build system and improved language service over the stdio protocol. Redesign of named effect typing to match the formal system more closely. See samples/handlers/named for examples.
2024-01-25: Koka v3.0.4 released with improved VS Code hover and inlay information. Splits std/core in multiple modules, fixes bug in infinite expansion for implicits and various other small improvements.
2024-01-13: Koka v3.0.1 released with improved VS Code integration and inlay hints. Initial support for locally qualified names and implicit parameters (see the samples/syntax). Various small bug fixes.
2023-12-30: Koka v2.6.0 released with VS Code language integration with type information, jump to definition, run test functions directly from the editor, automatic Koka installation, and many more things. Special thanks to Tim Whiting and Fredrik Wieczerkowski for all their work on making this possible!
2023-12-27: Update of the technical report on "The functional essence of binary trees" where we use fully-in-place programming and the new hole contexts to create fully verified functional implementations of binary search tree algorithms with performance on par with imperative C implementations.
2023-07-03: Koka v2.4.2 released: add support for fip and fbip keywords described in “FP²: Fully in-Place Functional Programming” (ICFP'23) [pdf]. Various fixes and performance improvements.
2021-02-04 (pinned) The Context Free youtube channel posted a short and fun video about effects in Koka (and 12 (!) other languages).
2021-09-01 (pinned) The ICFP'21 tutorial “Programming with Effect Handlers and FBIP in Koka” is now available on youtube.
2022-02-07: Koka v2.4.0 released: improved specialization and int operations, add rbtree-fbip sample, improve grammar (pub (instead of public, remove private (as everything is private by default now)), final ctl (instead of brk), underscores in number literals, etc), rename double to float64, various bug fixes.
2021-12-27: Koka v2.3.8 released: improved int performance, various bug fixes, update wasm backend, initial conan support, fix js backend.
2021-11-26: Koka v2.3.6 released: maybe-like types are already value types, but now also no longer need heap allocation if not nested (and [Just(1)] uses the same heap space as [1]), improved atomic refcounting (by Anton Lorenzen), improved specialization (by Steven Fontanella), various small fixes, add std/os/readline, fix build on freeBSD
2021-10-15: Koka v2.3.2 released, with initial wasm support (use --target=wasm, and install emscripten and wasmtime), improved reuse specialization (by Anton Lorenzen), and various bug fixes.
2021-09-29: Koka v2.3.1 released, with improved TRMC optimizations, and improved reuse (the rbtree benchmark is as fast as C++ now), and faster effect operations. Experimental: allow elision of -> in anonymous function expressions (e.g. xs.map( fn(x) x + 1 )) and operation clauses. Command line options changed a bit with .koka as the standard output directory.
2021-09-20: Koka v2.3.0 released, with new brace elision and if/match conditions without parenthesis. Updated the javascript backend using ES6 modules and BigInt. new module std/num/int64, improved effect operation performance.
2021-09-05: Koka v2.2.1 released, with initial parallel tasks, the binary-trees benchmark, and brace elision.
2021-08-26: Koka v2.2.0 released, improved simplification (by Rashika B), cross-module specialization (Steven Fontanella), and borrowing annotations with improved reuse analysis (Anton Lorenzen).
2021-08-26: At 12:30 EST was the live Koka tutorial at ICFP'21, see it on youtube.
2021-08-23: “Generalized Evidence Passing for Effect Handlers”, by Ningning Xie and Daan Leijen presented at ICFP'21. See it on youtube or read the paper.
2021-08-22: “First-class Named Effect Handlers”, by Youyou Cong, Ningning Xie, and Daan Leijen presented at HOPE'21. See it on youtube or read the paper.
2021-06-23: Koka v2.1.9 released, initial cross-module specialization (by Steven Fontanella).
2021-06-17: Koka v2.1.8 released, initial Apple M1 support.
The Perceus paper won a distinguished paper award at PLDI'21!
2021-06-10: Koka v2.1.6 released.
2021-05-31: Koka v2.1.4 released.
2021-05-01: Koka v2.1.2 released.
2021-03-08: Koka v2.1.1 released.
2021-02-14: Koka v2.0.16 released.
2020-12-12: Koka v2.0.14 released.
2020-12-02: Koka v2.0.12 released.
2020-11-29: Perceus technical report publised (pdf).

1.1. Installing the compiler

1.1.1. Install with the VS Code editor

The easiest way to start with Koka is to use the excellent VS Code editor and install the Koka extension. Go to the extension panel, search for Koka and install the official extension as shown on the right.

Installing the extension also prompts to install the latest Koka compiler on your platform (available for Windows x64, MacOS x64 and arm64, and Linux x64).

Once installed, the samples directory is opened. You can also open this manually from the command panel (Ctrl/Cmd+Shift+P), and running the Koka: Open samples command (when you start typing the command will surface to the top). Open for example the basic/caesar.kk file: When you click run debug (or optimized) Koka compiles and runs the function, showing the output in the VS Code terminal.

vscode-caesar

Press and hold ctrl+alt (or ctrl+option on MacOS) to show inlay hints – showing inferred types, fully qualified names, and implicit arguments.

vscode-inlay-off

versus

vscode-inlay-on

1.1.2. Manual Installation

On Windows (x64), open a cmd prompt and use:

curl -sSL -o %tmp%\install-koka.bat https://github.com/koka-lang/koka/releases/latest/download/install.bat && %tmp%\install-koka.bat

On Linux (x64) and macOS (x64, arm64 (M1/M2)), you can install Koka using:

curl -sSL https://github.com/koka-lang/koka/releases/latest/download/install.sh | sh

(If you previously installed Koka on macOS using brew, do an brew uninstall koka first). On other platforms it is usually easy to build Koka from source instead.

After installation, verify if Koka installed correctly:

$ koka
 _         _
| |       | |
| | _ ___ | | _ __ _
| |/ / _ \| |/ / _' |  welcome to the koka interactive compiler
|   ( (_) |   ( (_| |  version 2.4.0, Feb  7 2022, libc x64 (gcc)
|_|\_\___/|_|\_\__,_|  type :? for help, and :q to quit

loading: std/core
loading: std/core/types
loading: std/core/hnd
>

Type :q to exit the interactive environment.

For detailed installation instructions and other platforms see the releases page. It is also straightforward to build the compiler from source.

1.2. Running the compiler

vscode-codelens

Note that when using the VS Code editor, you can directly compile and run public functions that are named main, example..., or test... from the editor environment.

Of course, we can also run the compiler directly from a command line or use the interactive environment.

1.2.1. Command line usage

You can compile a Koka source as (note that all samples are pre-installed):

$ koka samples/basic/caesar.kk
compile: samples/basic/caesar.kk
loading: std/core
loading: std/core/types
...
check  : samples/basic/caesar
linking: samples_basic_caesar
created: .koka/v2.3.1/gcc-debug/samples_basic_caesar

and run the resulting executable:

$ .koka/v2.3.1/gcc-debug/samples_basic_caesar
plain  : Koka is a well-typed language
encoded: Krnd lv d zhoo-wbshg odqjxdjh
cracked: Koka is a well-typed language

The -O2 flag builds an optimized program. Let's try it on a purely functional implementation of balanced insertion in a red-black tree (rbtree.kk):

$ koka -O2 -o kk-rbtree samples/basic/rbtree.kk
...
linking: samples_basic_rbtree
created: .koka/v2.3.1/gcc-drelease/samples_basic_rbtree
created: kk-rbtree

$ time ./kk-rbtree
420000
real    0m0.626s

(On Windows you can give the --kktime option to see the elapsed time). We can compare this against an in-place updating C++ implementation using stl::map (rbtree.cpp) (which also uses a red-black tree internally):

$ clang++ --std=c++17 -o cpp-rbtree -O3 /usr/local/share/koka/v2.3.1/lib/samples/basic/rbtree.cpp
$ time ./cpp-rbtree
420000
real    0m0.667s

The excellent performance relative to C++ here (on Ubuntu 20.04 with an AMD 5950X) is the result of Perceus automatically transforming the fast path of the pure functional rebalancing to use mostly in-place updates, closely mimicking the imperative rebalancing code of the hand optimized C++ library.

1.2.2. Running the interactive compiler

Without giving any input files, the interactive environment runs by default:

$ koka
 _         _
| |       | |
| | _ ___ | | _ __ _
| |/ / _ \| |/ / _' |  welcome to the koka interactive compiler
|   ( (_) |   ( (_| |  version 2.3.1, Sep 21 2021, libc x64 (clang-cl)
|_|\_\___/|_|\_\__,_|  type :? for help, and :q to quit

loading: std/core
loading: std/core/types
loading: std/core/hnd
>

Now you can test some expressions:

> println("hi koka")
check  : interactive
check  : interactive
linking: interactive
created: .koka\v2.3.1\clang-cl-debug\interactive

hi koka

> :t "hi"
string

> :t println("hi")
console ()

Or load a demo (use tab completion to avoid typing too much):

> :l samples/basic/fibonacci
> main()
...

The 10000th fibonacci number is 33644764876431783266621612005107543310302148460680063906564769974680081442166662368155595513633734025582065332680836159373734790483865268263040892463056431887354544369559827491606602099884183933864652731300088830269235673613135117579297437854413752130520504347701602264758318906527890855154366159582987279682987510631200575428783453215515103870818298969791613127856265033195487140214287532698187962046936097879900350962302291026368131493195275630227837628441540360584402572114334961180023091208287046088923962328835461505776583271252546093591128203925285393434620904245248929403901706233888991085841065183173360437470737908552631764325733993712871937587746897479926305837065742830161637408969178426378624212835258112820516370298089332099905707920064367426202389783111470054074998459250360633560933883831923386783056136435351892133279732908133732642652633989763922723407882928177953580570993691049175470808931841056146322338217465637321248226383092103297701648054726243842374862411453093812206564914032751086643394517512161526545361333111314042436854805106765843493523836959653428071768775328348234345557366719731392746273629108210679280784718035329131176778924659089938635459327894523777674406192240337638674004021330343297496902028328145933418826817683893072003634795623117103101291953169794607632737589253530772552375943788434504067715555779056450443016640119462580972216729758615026968443146952034614932291105970676243268515992834709891284706740862008587135016260312071903172086094081298321581077282076353186624611278245537208532365305775956430072517744315051539600905168603220349163222640885248852433158051534849622434848299380905070483482449327453732624567755879089187190803662058009594743150052402532709746995318770724376825907419939632265984147498193609285223945039707165443156421328157688908058783183404917434556270520223564846495196112460268313970975069382648706613264507665074611512677522748621598642530711298441182622661057163515069260029861704945425047491378115154139941550671256271197133252763631939606902895650288268608362241082050562430701794976171121233066073310059947366875

You can also set command line options in the interactive environment using :set <options>. For example, we can load the rbtree example again and print out the elapsed runtime with --showtime:

> :set --showtime
> :l samples/basic/rbtree.kk
> main()
...

420000
info: elapsed: 4.104s, user: 4.046s, sys: 0.062s, rss: 231mb

and then enable optimizations with -O2 and run again (on Windows with an AMD 5950X):

> :set -O2
> :r
> main()
...

420000
info: elapsed: 0.670s, user: 0.656s, sys: 0.015s, rss: 198mb

And finally we quit the interpreter:

> :q

I think of my body as a side effect of my mind.
  -- Carrie Fisher (1956)

What next?

Basic Koka syntax Browse the Library documentation

2. Why Koka?

There are many new languages being designed, but only few bring fundamentally new concepts – like Haskell with pure versus monadic programming, or Rust with borrow checking. Koka distinguishes itself through effect typing, effect handlers, and Perceus memory management:

2.1. Minimal but General

Koka has a small core set of orthogonal, well-studied language features – but each of these is as general and composable as possible, such that we do not need further “special” extensions. Core features include first-class functions, a higher-rank impredicative polymorphic type- and effect system, algebraic data types, and effect handlers.

fun hello-ten()
  var i := 0
  while { i < 10 }
    println("hello")
    i := i + 1
fun hello-ten()
  var i := 0
  while { i < 10 }
    println("hello")
    i := i + 1

As an example of the min-gen design principle, Koka implements most control-flow primitives as regular functions. An anonymous function can be written as fn(){ <body> }; but as a syntactic convenience, any function without arguments can be shortened further to use just braces, as { <body> }. Moreover, using brace elision, any indented block automatically gets curly braces.

We can write a whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () loop now using regular function calls as shown in the example, where the call to whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () is desugared to whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> ()( fn(){ i < 10 }, fn(){ ... } ).

This also naturally leads to consistency: an expression between parenthesis is always evaluated before a function call, whereas an expression between braces (ah, suspenders!) is suspended and may be never evaluated or more than once (as in our example). This is inconsistent in most other languages where often the predicate of a whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () loop is written in parenthesis but may be evaluated multiple times.

Learn more about basic syntax

2.2. Effect Typing

Koka infers and tracks the effect of every function in its type – and a function type has 3 parts: the argument types, the effect type, and the type of the result. For example:

fun sqr    : (int) -> total int       // total: mathematical total function    
fun divide : (int,int) -> exn int     // exn: may raise an exception (partial)  
fun turing : (tape) -> div int        // div: may not terminate (diverge)  
fun print  : (string) -> console ()   // console: may write to the console  
fun rand   : () -> ndet int           // ndet: non-deterministic  
fun sqr    : (intstd/core/types/int: V) -> totalstd/core/types/total: E intstd/core/types/int: V       // total: mathematical total function    
fun divide : (intstd/core/types/int: V,intstd/core/types/int: V) -> exnstd/core/exn/exn: (E, V) -> V intstd/core/types/int: V     // exn: may raise an exception (partial)  
fun turing : (tape) -> divstd/core/types/div: X intstd/core/types/int: V        // div: may not terminate (diverge)  
fun print  : (stringstd/core/types/string: V) -> consolestd/core/console/console: X ()   // console: may write to the console  
fun rand   : () -> ndetstd/core/types/ndet: X intstd/core/types/int: V           // ndet: non-deterministic

The precise effect typing gives Koka rock-solid semantics and deep safety guarantees backed by well-studied category theory, which makes Koka particularly easy to reason about for both humans and compilers. (Given the importance of effect typing, the name Koka was derived from the Japanese word for effective (効果, こうか, Kōka)).

A function without any effect is called totalstd/core/types/total: E and corresponds to mathematically total functions – a good place to be. Then we have effects for partial functions that can raise exceptions (exnstd/core/exn/exn: (E, V) -> V), and potentially non-terminating functions as divstd/core/types/div: X (divergent). The combination of exnstd/core/exn/exn: (E, V) -> V and divstd/core/types/div: X is called purestd/core/pure: E as that corresponds to Haskell's notion of purity. On top of that we find mutability (as ststd/core/types/st: H -> E) up to full non-deterministic side effects in iostd/core/io: E.

Effects can be polymorphic as well. Consider mapping a function over a list:

fun map( xs : list<a>, f : a -> e b ) : e list<b> 
  match xs
    Cons(x,xx) -> Cons( f(x), map(xx,f) )
    Nil        -> Nil  
fun map( xs : liststd/core/types/list: V -> V<a>, f : a -> e b ) : e liststd/core/types/list: V -> V<b> 
  match xs
    Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>(x,xx) -> Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>( f(x), map(xx,f) )
    Nilstd/core/types/Nil: forall<a> list<a>        -> Nilstd/core/types/Nil: forall<a> list<a>

Single letter types are polymorphic (aka, generic), and Koka infers that you map from a list of elements a to a list of elements of type b. Since map itself has no intrinsic effect, the effect of applying map is exactly the effect of the function f that is applied, namely e.

Learn more about effect types

2.3. Effect Handlers

Another example of the min-gen design principle: instead of various special language and compiler extensions to support exceptions, generators, async/await etc., Koka has full support for algebraic effect handlers – these lets you define advanced control abstractions like async/await as a user library in a typed and composable way.

Here is an example of an effect definition with one control (ctl) operation to yield intstd/core/types/int: V values:

effect yield
  ctl yield( i : int ) : bool
effectwhy/yield: (E, V) -> V yieldwhy/yield: (E, V) -> V
  ctl yield( ii: int : intstd/core/types/int: V ) : boolstd/core/types/bool: V

Once the effect is declared, we can use it for example to yield the elements of a list:

fun traverse( xs : list<int> ) : yield () 
  match xs 
    Cons(x,xx) -> if yield(x) then traverse(xx) else ()
    Nil        -> ()
fun traversewhy/traverse: (xs : list<int>) -> yield ()( xsxs: list<int> : liststd/core/types/list: V -> V<intstd/core/types/int: V> )result: -> yield () : yieldwhy/yield: (E, V) -> V (std/core/types/unit: V)std/core/types/unit: V 
  match xsxs: list<int> 
    Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>(xx: int,xxxx: list<int>) -> if yieldwhy/yield: (i : int) -> yield bool(xx: int) then traversewhy/traverse: (xs : list<int>) -> yield ()(xxxx: list<int>) else (std/core/types/Unit: ())std/core/types/Unit: ()
    Nilstd/core/types/Nil: forall<a> list<a>        -> (std/core/types/Unit: ())std/core/types/Unit: ()

The traversewhy/traverse: (xs : list<int>) -> yield () function calls yieldwhy/yield: (i : int) -> yield bool and therefore gets the yieldwhy/yield: (E, V) -> V effect in its type, and if we want to use traversewhy/traverse: (xs : list<int>) -> yield (), we need to handle the yieldwhy/yield: (E, V) -> V effect. This is much like defining an exception handler, except we can receive values (here an intstd/core/types/int: V), and we can resume to the call-site with a result (here, with a boolean that determines if we keep traversing):

fun print-elems() : console () 
  with ctl yield(i)
    println("yielded " ++ i.show)
    resume(i<=2)
  traverse([1,2,3,4])
fun print-elemswhy/print-elems: () -> console ()()result: -> console () : consolestd/core/console/console: X (std/core/types/unit: V)std/core/types/unit: V 
  withhandler: (() -> <yield,console> ()) -> console () ctl yieldyield: (i : int, resume : (bool) -> console ()) -> console ()
(ii: int)
    printlnstd/core/console/string/println: (s : string) -> console ()("yielded "literal: string
count= 8 ++std/core/types/(++): (x : string, y : string) -> console string ii: int.showstd/core/int/show: (i : int) -> console string)
    resumeresume: (bool) -> console ()(ii: int<=std/core/int/(<=): (x : int, y : int) -> console bool2literal: int
dec  = 2
hex8 = 0x02
bit8 = 0b00000010)
  traversewhy/traverse: (xs : list<int>) -> <yield,console> ()([std/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001,2literal: int
dec  = 2
hex8 = 0x02
bit8 = 0b00000010,3literal: int
dec  = 3
hex8 = 0x03
bit8 = 0b00000011,4literal: int
dec  = 4
hex8 = 0x04
bit8 = 0b00000100]std/core/types/Nil: forall<a> list<a>)

The with statement dynamically binds the handler for yieldwhy/yield: (i : int) -> yield bool control operation over the rest of the scope, in this case traversewhy/traverse: (xs : list<int>) -> yield ()([1,2,3,4]). Every time yieldwhy/yield: (i : int) -> yield bool is called, our control handler is called, prints the current value, and resumes to the call-site with a boolean result. The dynamic binding here is quite safe since we still use static typing! Indeed, the handler discharges the yieldwhy/yield: (E, V) -> V effect – and replaces it with a consolestd/core/console/console: X effect (due to println). When we run the example, we get:

yielded: 1
yielded: 2
yielded: 3

Learn more about with statements

Learn more about effect handlers

2.4. Perceus Optimized Reference Counting

Perceus is the compiler optimized reference counting technique that Koka uses for automatic memory management [13, 22]. This (together with evidence passing [23–25]) enables Koka to compile directly to plain C code without needing a garbage collector or runtime system.

Perceus uses extensive static analysis to aggressively optimize the reference counts. Here the strong semantic foundation of Koka helps a lot: inductive data types cannot form cycles, and potential sharing across threads can be reliably determined.

Normally we need to make a fundamental choice when managing memory:

We either use manual memory management (C, C++, Rust) and we get the best performance but at a significant programming burden,
Or, we use garbage collection (OCaml, C#, Java, Go, etc.) but but now we need a runtime system and pay a price in performance, memory usage, and unpredictable latencies.

perceus-perf

With Perceus, we hope to cross this gap and our goal is to be within 2x of the performance of C/C++. Initial benchmarks are encouraging and show Koka to be close to C performance on various memory intensive benchmarks.

See benchmarks

Read the Perceus technical report

2.5. Reuse Analysis

Perceus also performs reuse analysis as part of reference counting analysis. This pairs pattern matches with constructors of the same size and reuses them in-place if possible. Take for example, the map function over lists:

fun map( xs : list<a>, f : a -> e b ) : e list<b>
  match xs
    Cons(x,xx) -> Cons( f(x), map(xx,f) )
    Nil        -> Nil
fun map( xs : liststd/core/types/list: V -> V<a>, f : a -> e b ) : e liststd/core/types/list: V -> V<b>
  match xs
    Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>(x,xx) -> Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>( f(x), map(xx,f) )
    Nilstd/core/types/Nil: forall<a> list<a>        -> Nilstd/core/types/Nil: forall<a> list<a>

Here the matched Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a> can be reused by the new Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a> in the branch. This means if we map over a list that is not shared, like list(1,100000).map(sqr).sumstd/core/list/sum: (xs : list<int>) -> int, then the list is updated in-place without any extra allocation. This is very effective for many functional style programs.

void map( list_t xs, function_t f, 
          list_t* res) 
{
  while (is_Cons(xs)) {
    if (is_unique(xs)) {    // if xs is not shared..
      box_t y = apply(dup(f),xs->head);      
      if (yielding()) { ... } // if f yields to a general ctl operation..
      else {
        xs->head = y;      
        *res = xs;          // update previous node in-place
        res = &xs->tail;    // set the result address for the next node
        xs = xs->tail;      // .. and continue with the next node
      }
    }
    else { ... }            // slow path allocates fresh nodes
  }
  *res = Nil;  
}

Moreover, the Koka compiler also implements tail-recursion modulo cons (TRMC) [11, 12] and instead of using a recursive call, the function is eventually optimized into an in-place updating loop for the fast path, similar to the C code example on the right.

Importantly, the reuse optimization is guaranteed and a programmer can see when the optimization applies. This leads to a new programming technique we call FBIP: functional but in-place. Just like tail-recursion allows us to express loops with regular function calls, reuse analysis allows us to express many imperative algorithms in a purely functional style.

Learn more about FBIP

Read the paper on fully in-place functional programming

Read the paper on generalized tail-recursion modulo cons

2.6. Specialization

As another example of the effectiveness of Perceus and the strong semantics of the Koka core language, we can look at the red-black tree example and look at the code generated when folding a binary tree. The red-black tree is defined as:

type color  
  Red
  Black

type tree<k,a> 
  Leaf
  Node(color : color, left : tree<k,a>, key : k, value : a, right : tree<k,a>)
type colorwhy/color: V  
  Redwhy/Red: color
  Blackwhy/Black: color

type treewhy/tree: (V, V) -> V<kk: V,aa: V> 
  Leafwhy/Leaf: forall<a,b> tree<a,b>
  Nodewhy/Node: forall<a,b> (color : color, left : tree<a,b>, key : a, value : b, right : tree<a,b>) -> tree<a,b>(color : colorwhy/color: V, left : treewhy/tree: (V, V) -> V<kk: V,aa: V>, key : kk: V, value : aa: V, right : treewhy/tree: (V, V) -> V<kk: V,aa: V>)

We can generically fold over a tree t with a function f as:

fun fold(t : tree<k,a>, acc : b, f : (k, a, b) -> b) : b
  match t
    Node(_,l,k,v,r) -> r.fold( f(k,v,l.fold(acc,f)), f)
    Leaf            -> acc
fun foldwhy/fold: forall<a,b,c> (t : tree<c,a>, acc : b, f : (c, a, b) -> b) -> b(tt: tree<$514,$512> : treewhy/tree: (V, V) -> V<kk: V,aa: V>, accacc: $513 : bb: V, ff: ($514, $512, $513) -> $513 : (kk: V, aa: V, bb: V) -> bstd/core/types/total: E)result: -> total 598 : bstd/core/types/total: E
  match tt: tree<$514,$512>
    Nodewhy/Node: forall<a,b> (color : color, left : tree<a,b>, key : a, value : b, right : tree<a,b>) -> tree<a,b>(_,ll: tree<$514,$512>,kk: $514,vv: $512,rr: tree<$514,$512>) -> rr: tree<$514,$512>.foldwhy/fold: (t : tree<$514,$512>, acc : $513, f : ($514, $512, $513) -> $513) -> $513( ff: ($514, $512, $513) -> $513(kk: $514,vv: $512,ll: tree<$514,$512>.foldwhy/fold: (t : tree<$514,$512>, acc : $513, f : ($514, $512, $513) -> $513) -> $513(accacc: $513,ff: ($514, $512, $513) -> $513)), ff: ($514, $512, $513) -> $513)
    Leafwhy/Leaf: forall<a,b> tree<a,b>            -> accacc: $513

This is used in the example to count all the Truestd/core/types/True: bool values in a tree t : treewhy/tree: (V, V) -> V<k,boolstd/core/types/bool: V> as:

val count = t.fold(0, fn(k,v,acc) if v then acc+1 else acc)
val count = t.fold(0, fn(k,v,acc) if v then acc+1 else acc)

This may look quite expensive where we pass a polymorphic first-class function that uses arbitrary precision integer arithmetic. However, the Koka compiler first specializes the fold definition to the passed function, then simplifies the resulting monomorphic code, and finally applies Perceus to insert reference count instructions. This results in the following internal core code:

fun spec-fold(t : tree<k,bool>, acc : int) : int
  match t
    Node(_,l,k,v,r) -> 
      if unique(t) then { drop(k); free(t) } else { dup(l); dup(r) } // perceus inserted
      val x = if v then 1 else 0
      spec-fold(r, spec-fold(l,acc) + x) 
    Leaf -> 
      drop(t)
      acc

val count = spec-fold(t,0)
fun spec-fold(t : treewhy/tree: (V, V) -> V<k,boolstd/core/types/bool: V>, acc : intstd/core/types/int: V) : intstd/core/types/int: V
  match t
    Nodewhy/Node: forall<a,b> (color : color, left : tree<a,b>, key : a, value : b, right : tree<a,b>) -> tree<a,b>(_,l,k,v,r) -> 
      if uniquestd/core/unique: () -> ndet int(t) then { drop(k); free(t) } else { dup(l); dup(r) } // perceus inserted
      val x = if v then 1 else 0
      spec-fold(r, spec-fold(l,acc) + x) 
    Leafwhy/Leaf: forall<a,b> tree<a,b> -> 
      drop(t)
      acc

val count = spec-fold(t,0)

When compiled via the C backend, the generated assembly instructions on arm64 become:

spec_fold:
  ...                       
  LOOP0:   
    mov  x21, x0              ; x20 is t, x21 = acc (x19 = koka context _ctx)
  LOOP1:                      ; the "match(t)" point
    cmp  x20, #9              ; is t a Leaf?
    b.eq LBB15_1              ;   if so, goto Leaf brach
  LBB15_5:                    ;   otherwise, this is the Node(_,l,k,v,r) branch
    mov  x23, x20             ; load the fields of t:
    ldp  x22, x0, [x20, #8]   ;   x22 = l, x0 = k   (ldp == load pair)
    ldp  x24, x20, [x20, #24] ;   x24 = v, x20 = r  
    ldr  w8, [x23, #4]        ;   w8 = reference count (0 is unique)
    cbnz w8, LBB15_11         ; if t is not unique, goto cold path to dup the members
    tbz  w0, #0, LBB15_13     ; if k is allocated (bit 0 is 0), goto cold path to free it
  LBB15_7:                   
    mov  x0, x23              ; call free(t)  
    bl   _mi_free
  LBB15_8:              
    mov  x0, x22              ; call spec_fold(l,acc,_ctx)
    mov  x1, x21              
    mov  x2, x19
    bl   spec_fold
    cmp  x24, #1              ; boxed value is False? 
    b.eq LOOP0                ;   if v is False, the result in x0 is the accumulator
    add  x21, x0, #4          ; otherwise add 1 (as a small int 4*n)
    orr  x8, x21, #1          ;   check for bigint or overflow in one test 
    cmp  x8, w21, sxtw        ;   (see kklib/include/integer.h for details)
    b.eq LOOP1                ; and tail-call into spec_fold if no overflow or bigint
    mov  w1, #5               ; otherwise, use generic bigint addition              
    mov  x2, x19
    bl   _kk_integer_add_generic
    b    LOOP0   
  ...

The polymorphic fold with its higher order parameter is eventually compiled into a tight loop with close to optimal assembly instructions.

advancedHere we see too that the node t is freed explicitly as soon as it is no longer live. This is usually earlier than scope-based deallocation (like RAII) and therefore Perceus can guarantee to be garbage-free where in a (cycle-free) program objects are always immediatedly deallocated as soon as they become unreachable [13–15, 22]. Moreover, it is deterministic and behaves just like regular malloc/free calls. Reference counting may still seem expensive compared to trace-based garbage collection which only (re)visits all live objects and never needs to free objects explicitly. However, Perceus usually frees an object right after its last use (like in our example), and thus the memory is still in the cache reducing the cost of freeing it. Also, Perceus never (re)visits live objects arbitrarily which may trash the caches especially if the live set is large. As such, we think the deterministic behavior of Perceus together with the garbage-free property may work out better in practice.

Read the technical report on garbage-free and frame-limited reuse

Read the technical report on fully in-place functional programming

3. A Tour of Koka

This is a short introduction to the Koka programming language.

Koka is a function-oriented language that separates pure values from side-effecting computation (Given the importance of effect typing, the name Koka was derived from the Japanese word for effective (効果, こうか, Kōka)).

3.1. Basics

3.1.1. Hello world

As usual, we start with the familiar Hello world program:

fun main()
  println("Hello world!") // println output
fun maintour/main: () -> console ()()result: -> console ()
  printlnstd/core/console/string/println: (s : string) -> console ()("Hello world!"literal: string
count= 12) // println output

Functions are declared using the fun keyword (and anonymous functions with fn). Due to brace elision, any indented blocks implicitly get curly braces, and the example can also be written as:

fun main() {
  println("Hello world!") // println output
}
fun maintour/main: () -> console ()() {
  println("Hello world!") // println output
}

using explicit braces. Here is another short example program that encodes a string using the Caesar cipher, where each lower-case letter in a string is replaced by the letter three places up in the alphabet:

fun main() { println(caesar("koka is fun")) }

fun encode( s : string, shift : int )
  fun encode-char(c)
    if c < 'a' || c > 'z' then return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  s.map(encode-char)

fun caesar( s : string ) : string
  s.encode( 3 )
fun encodetour/encode: (s : string, shift : int) -> string( ss: string : stringstd/core/types/string: V, shiftshift: int : intstd/core/types/int: V )result: -> total string
  fun encode-charencode-char: (c : char) -> char(cc: char)result: -> total char
    if cc: char <std/core/char/(<): (char, char) -> bool 'a'literal: char
unicode= u0061 ||std/core/types/(||): (x : bool, y : bool) -> bool cc: char >std/core/char/(>): (char, char) -> bool 'z'literal: char
unicode= u007A then returnreturn: char cc: char
    val basebase: int = (cc: char -std/core/char/(-): (c : char, d : char) -> char 'a'literal: char
unicode= u0061).intstd/core/char/int: (char) -> int
    val rotrot: int  = (basebase: int +std/core/int/(+): (x : int, y : int) -> int shiftshift: int) %std/core/int/(%): (int, int) -> int 26literal: int
dec  = 26
hex8 = 0x1A
bit8 = 0b00011010
    (rotrot: int.charstd/core/char/int/char: (i : int) -> char +std/core/char/(+): (c : char, d : char) -> char 'a'literal: char
unicode= u0061)
  ss: string.mapstd/core/list/string/map: (s : string, f : (char) -> char) -> string(encode-charencode-char: (c : char) -> char)

fun caesartour/caesar: (s : string) -> string( ss: string : stringstd/core/types/string: V )result: -> total string : stringstd/core/types/string: V
  ss: string.encodetour/encode: (s : string, shift : int) -> string( 3literal: int
dec  = 3
hex8 = 0x03
bit8 = 0b00000011 )

In this example, we declare a local function encode-char which encodes a single character c. The final statement s.map(encode-char) applies the encode-char function to each character in the string s, returning a new string where each character is Caesar encoded. The result of the final statement in a function is also the return value of that function, and you can generally leave out an explicit return keyword.

3.1.2. Dot selection

Koka is a function-oriented language where functions and data form the core of the language (in contrast to objects for example). In particular, the expression s.encode(3) does not select the encode method from the stringstd/core/types/string: V object, but it is simply syntactic sugar for the function call encode(s,3) where s becomes the first argument. Similarly, c.int converts a character to an integer by calling int(c) (and both expressions are equivalent). The dot notation is intuïtive and quite convenient to chain multiple calls together, as in:

fun showit( s : string )
  s.encode(3).count.println
fun showittour/showit: (s : string) -> console ()( ss: string : stringstd/core/types/string: V )result: -> console ()
  ss: string.encodetour/encode: (s : string, shift : int) -> console string(3literal: int
dec  = 3
hex8 = 0x03
bit8 = 0b00000011).countstd/core/string/count: (s : string) -> console int.printlnstd/core/console/show/println: (x : int, @implicit/show : (int) -> string) -> console ()
?show=int/show

for example (where the body desugars as println(count(encode(s,3)))). An advantage of the dot notation as syntactic sugar for function calls is that it is easy to extend the ‘primitive’ methods of any data type: just write a new function that takes that type as its first argument. In most object-oriented languages one would need to add that method to the class definition itself which is not always possible if such class came as a library for example.

3.1.3. Type Inference

Koka is also strongly typed. It uses a powerful type inference engine to infer most types, and types generally do not get in the way. In particular, you can always leave out the types of any local variables. This is the case for example for base and rot values in the previous example; hover with the mouse over the example to see the types that were inferred by Koka. Generally, it is good practice though to write type annotations for function parameters and the function result since it both helps with type inference, and it provides useful documentation with better feedback from the compiler.

For the encode function it is actually essential to give the type of the s parameter: since the map function is defined for both liststd/core/types/list: V -> V and stringstd/core/types/string: V types and the program is ambiguous without an annotation. Try to load the example in the editor and remove the annotation to see what error Koka produces.

3.1.4. Anonymous Functions and Trailing Lambdas

Koka also allows for anonymous function expressions using the fn keyword. For example, instead of declaring the encode-char function, we can also pass it directly to the map function as a function expression:

fun encode2( s : string, shift : int )
  s.map( fn(c)
    if c < 'a' || c > 'z' then return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  )
fun encode2tour/encode2: (s : string, shift : int) -> string( ss: string : stringstd/core/types/string: V, shiftshift: int : intstd/core/types/int: V )result: -> total string
  ss: string.mapstd/core/list/string/map: (s : string, f : (char) -> char) -> string( fnfn: (c : char) -> char(cc: char)
    if cc: char <std/core/char/(<): (char, char) -> bool 'a'literal: char
unicode= u0061 ||std/core/types/(||): (x : bool, y : bool) -> bool cc: char >std/core/char/(>): (char, char) -> bool 'z'literal: char
unicode= u007A then returnreturn: char cc: char
    val basebase: int = (cc: char -std/core/char/(-): (c : char, d : char) -> char 'a'literal: char
unicode= u0061).intstd/core/char/int: (char) -> int
    val rotrot: int  = (basebase: int +std/core/int/(+): (x : int, y : int) -> int shiftshift: int) %std/core/int/(%): (int, int) -> int 26literal: int
dec  = 26
hex8 = 0x1A
bit8 = 0b00011010
    (rotrot: int.charstd/core/char/int/char: (i : int) -> char +std/core/char/(+): (c : char, d : char) -> char 'a'literal: char
unicode= u0061)
  )

It is a bit annoying we had to put the final right-parenthesis after the last brace in the previous example. As a convenience, Koka allows anonymous functions to follow the function call instead – this is also known as trailing lambdas. For example, here is how we can print the numbers 1 to 10:

fun main() { print10() }

fun print10()
  for(1,10) fn(i)
    println(i)
fun print10tour/print10: () -> console ()()result: -> console ()
  forstd/core/range/for: (start : int, end : int, action : (int) -> console ()) -> console ()(1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001,10literal: int
dec  = 10
hex8 = 0x0A
bit8 = 0b00001010) fnfn: (i : int) -> console ()(ii: int)
    printlnstd/core/console/show/println: (x : int, @implicit/show : (int) -> string) -> console ()
?show=int/show(ii: int)

which is desugared to for( 1, 10, fn(i){ println(i) } ). (In fact, since we pass the i argument directly to println, we could have also passed the function itself directly as for(1,10,println).)

Anonymous functions without any arguments can be shortened further by leaving out the fn keyword as well and just use braces directly. Here is an example using the repeat function:

fun main() { printhi10() }

fun printhi10()
  repeat(10)
    println("hi")
fun printhi10tour/printhi10: () -> console ()()result: -> console ()
  repeatstd/core/repeat: (n : int, action : () -> console ()) -> console ()(10literal: int
dec  = 10
hex8 = 0x0A
bit8 = 0b00001010)
    printlnstd/core/console/string/println: (s : string) -> console ()("hi"literal: string
count= 2)

where the body desugars to repeat( 10, { println(hi) } ) which desugars further to repeat( 10, fn(){ println(hi)} ). The is especially convenient for the whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () loop since this is not a built-in control flow construct but just a regular function:

fun main() { print11() }

fun print11()
  var i := 10
  while { i >= 0 }
    println(i)
    i := i - 1
fun print11tour/print11: () -> <console,div> ()()result: -> <console,div> ()
  var ii: local-var<$1802,int> := 10literal: int
dec  = 10
hex8 = 0x0A
bit8 = 0b00001010
  whilestd/core/while: (predicate : () -> <div,local<$1802>,console> bool, action : () -> <div,local<$1802>,console> ()) -> <div,local<$1802>,console> () { ii: int >=std/core/int/(>=): (x : int, y : int) -> <local<$1802>,div,console> bool 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000 }
    printlnstd/core/console/show/println: (x : int, @implicit/show : (int) -> string) -> <console,local<$1802>,div> ()
?show=int/show(ii: int)
    ii: local-var<$1802,int> :=std/core/types/local-set: (v : local-var<$1802,int>, assigned : int) -> <local<$1802>,console,div> () ii: int -std/core/int/(-): (x : int, y : int) -> <local<$1802>,console,div> int 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001

Note how the first argument to whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () is in braces instead of the usual parenthesis. In Koka, an expression between parenthesis is always evaluated before a function call, whereas an expression between braces (ah, suspenders!) is suspended and may be never evaluated or more than once (as in our example).

3.1.5. With Statements

To the best of our knowledge, Koka was the first language to have generalized trailing lambdas. It was also one of the first languages to have dot notation (This was independently developed but it turns out the D language has a similar feature (called UFCS) which predates dot-notation). Another novel syntactical feature is the with statement. With the ease of passing a function block as a parameter, these often become nested. For example:

fun twice(f)
  f()
  f()

fun test-twice()
  twice
    twice
      println("hi")
fun twicetour/twice: forall<a,e> (f : () -> e a) -> e a(ff: () -> _1952 _1953)result: -> 4910 total
  ff: () -> _1952 _1953()
  ff: () -> _1952 _1953()

fun test-twicetour/test-twice: () -> console ()()result: -> console ()
  twicetour/twice: (f : () -> console ()) -> console ()
    twicetour/twice: (f : () -> console ()) -> console ()
      printlnstd/core/console/string/println: (s : string) -> console ()("hi"literal: string
count= 2)

where "hi" is printed four times (note: this desugars to twicetour/twice: forall<a,e> (f : () -> e a) -> e a( fn(){ twicetour/twice: forall<a,e> (f : () -> e a) -> e a( fn(){ println("hi") }) })). Using the with statement we can write this more concisely as:

pub fun test-with1()
  with twice
  with twice
  println("hi")
pub fun test-with1tour/test-with1: () -> console ()()result: -> console ()
  withwith: () -> console () twicetour/twice: (f : () -> console ()) -> console ()
  withwith: () -> console () twicetour/twice: (f : () -> console ()) -> console ()
  printlnstd/core/console/string/println: (s : string) -> console ()("hi"literal: string
count= 2)

The with statement essentially puts all statements that follow it into an anynomous function block and passes that as the last parameter. In general:

translation

with f(e1,...,eN)
<body>
with f(e1,...,eN)
<body>

f(e1,...,eN, fn(){ <body> })
f(e1,...,eN, fn(){ <body> })

Moreover, a with statement can also bind a variable parameter as:

translation

with x <- f(e1,...,eN)
<body>
with x <- f(e1,...,eN)
<body>

f(e1,...,eN, fn(x){ <body> })
f(e1,...,eN, fn(x){ <body> })

Here is an example using foreach to span over the rest of the function body:

pub fun test-with2() {
  with x <- list(1,10).foreach
  println(x)
}
pub fun test-with2tour/test-with2: () -> console ()()result: -> console () {
  withwith: (x : int) -> console () xx: int <- liststd/core/list/list: (lo : int, hi : int) -> console list<int>(1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001,10literal: int
dec  = 10
hex8 = 0x0A
bit8 = 0b00001010).foreachstd/core/list/foreach: (xs : list<int>, action : (int) -> console ()) -> console ()
  printlnstd/core/console/show/println: (x : int, @implicit/show : (int) -> string) -> console ()
?show=int/show(xx: int)
}

which desugars to list(1,10).foreach( fn(x){ println(x) } ). This is a bit reminiscent of Haskell do notation. Using the with statement this way may look a bit strange at first but is very convenient in practice – it helps thinking of with as a closure over the rest of the lexical scope.

With Finally

As another example, the finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a function takes as its first argument a function that is run when exiting the scope – either normally, or through an “exception” (i.e. when an effect operation does not resume). Again, with is a natural fit:

fun test-finally()
  with finally{ println("exiting..") }
  println("entering..")
  throw("oops") + 42
fun test-finallytour/test-finally: () -> <console,exn> int()result: -> <console,exn> int
  withwith: () -> <console,exn> int finallystd/core/hnd/finally: (fin : () -> <console,exn> (), action : () -> <console,exn> int) -> <console,exn> int{ printlnstd/core/console/string/println: (s : string) -> <console,exn> ()("exiting.."literal: string
count= 9) }
  printlnstd/core/console/string/println: (s : string) -> <console,exn> ()("entering.."literal: string
count= 10)
  throwstd/core/exn/throw: (message : string, info : ? exception-info) -> <exn,console> int("oops"literal: string
count= 4) +std/core/int/(+): (x : int, y : int) -> <exn,console> int 42literal: int
dec  = 42
hex8 = 0x2A
bit8 = 0b00101010

which desugars to finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a(fn(){ println(...) }, fn(){ println("entering"); throwstd/core/exn/throw: forall<a> (message : string, info : ? exception-info) -> exn a("oops") + 42 }), and prints:

entering..
exiting..
uncaught exception: oops

This is another example of the min-gen principle: many languages have have special built-in support for this kind of pattern, like a defer statement, but in Koka it is all just function applications with minimal syntactic sugar.

With Handlers

The with statement is especially useful in combination with effect handlers. An effect describes an abstract set of operations whose concrete implementation can be dynamically bound by a handler. Here is an example of an effect handler for emitting messages:

// declare an abstract operation: emit, how it emits is defined dynamically by a handler.
effect fun emit(msg : string) : ()

// emit a standard greeting.
fun hello() : emit ()
  emit("hello world!")

// emit a standard greeting to the console.
pub fun hello-console1() : console ()
  with handler
    fun emit(msg) println(msg)
  hello()
// declare an abstract operation: emit, how it emits is defined dynamically by a handler.
effecttour/emit: (E, V) -> V fun emittour/emit: (E, V) -> V(msgmsg: string : stringstd/core/types/string: V) : (std/core/types/unit: V)std/core/types/unit: V

// emit a standard greeting.
fun hellotour/hello: () -> emit ()()result: -> emit () : emittour/emit: (E, V) -> V (std/core/types/unit: V)std/core/types/unit: V
  emittour/emit: (msg : string) -> emit ()("hello world!"literal: string
count= 12)

// emit a standard greeting to the console.
pub fun hello-console1tour/hello-console1: () -> console ()()result: -> console () : consolestd/core/console/console: X (std/core/types/unit: V)std/core/types/unit: V
  withwith: () -> <emit,console> () handlerhandler: (() -> <emit,console> ()) -> console ()
    fun emitemit: (msg : string) -> console ()(msgmsg: string) printlnstd/core/console/string/println: (s : string) -> console ()(msgmsg: string)
  hellotour/hello: () -> <emit,console> ()()

In this example, the with expression desugars to (handler{ fun emittour/emit: (msg : string) -> emit ()(msg){ println(msg) } })( fn(){ hellotour/hello: () -> emit ()() } ). Generally, a handler{ <ops> } expression takes as its last argument a function block so it can be used directly with with.

Moreover, as a convenience, we can leave out the handler keyword for effects that define just one operation (like emittour/emit: (E, V) -> V):

translation

with val op = <expr>
with fun op(x){ <body> }
with ctl op(x){ <body> }
with val op = <expr>
with fun op(x){ <body> }
with ctl op(x){ <body> }

with handler{ val op = <expr> }
with handler{ fun op(x){ <body> } }
with handler{ ctl op(x){ <body> } }
with handler{ val op = <expr> }
with handler{ fun op(x){ <body> } }
with handler{ ctl op(x){ <body> } }

Using this convenience, we can write the previous example in more concisely as:

pub fun hello-console2()
  with fun emit(msg) println(msg)
  hello()
pub fun hello-console2()
  with fun emittour/emit: (msg : string) -> emit ()(msg) println(msg)
  hellotour/hello: () -> emit ()()

Intuitively, we can view the handler with fun emittour/emit: (msg : string) -> emit () as a (statically typed) dynamic binding of the function emittour/emit: (msg : string) -> emit () over the rest of the scope.

3.1.6. Optional and Named Parameters

Being a function-oriented language, Koka has powerful support for function calls where it supports both optional and named parameters. For example, the function replace-allstd/core/string/replace-all: (s : string, pattern : string, repl : string) -> string takes a string, a pattern (named pattern), and a replacement string (named repl):

fun main() { println(world()) }

fun world()
  replace-all("hi there", "there", "world")  // returns "hi world"
fun worldtour/world: () -> string()result: -> total string
  replace-allstd/core/string/replace-all: (s : string, pattern : string, repl : string) -> string("hi there"literal: string
count= 8, "there"literal: string
count= 5, "world"literal: string
count= 5)  // returns "hi world"

Using named parameters, we can also write the function call as:

fun main() { println(world2()) }

fun world2()
  "hi there".replace-all( repl="world", pattern="there" )
fun world2tour/world2: () -> string()result: -> total string
  "hi there"literal: string
count= 8.replace-allstd/core/string/replace-all: (s : string, pattern : string, repl : string) -> string( repl="world"literal: string
count= 5, pattern="there"literal: string
count= 5 )

Optional parameters let you specify default values for parameters that do not need to be provided at a call-site. As an example, let's define a function sublisttour/sublist: forall<a> (xs : list<a>, start : int, len : ? int) -> list<a> that takes a list, a start position, and the length len of the desired sublist. We can make the len parameter optional and by default return all elements following the start position by picking the length of the input list by default:

fun main() { println( ['a','b','c'].sublist(1).string ) }

fun sublist( xs : list<a>, start : int, len : int = xs.length ) : list<a>
  if start <= 0 return xs.take(len)
  match xs
    Nil        -> Nil
    Cons(_,xx) -> xx.sublist(start - 1, len)
fun sublisttour/sublist: forall<a> (xs : list<a>, start : int, len : ? int) -> list<a>( xsxs: list<$2519> : liststd/core/types/list: V -> V<aa: V>, startstart: int : intstd/core/types/int: V, lenlen: ? int : intstd/core/types/int: V = xsxs: list<$2519>.lengthstd/core/list/length: (xs : list<$2519>) -> int )result: -> total list<2596> : liststd/core/types/list: V -> V<aa: V>
  if startstart: int <=std/core/int/(<=): (x : int, y : int) -> bool 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000 returnreturn: list<$2519> xsxs: list<$2519>.takestd/core/list/take: (xs : list<$2519>, n : int) -> list<$2519>(lenlen: int)std/core/types/Unit: ()
  match xsxs: list<$2519>
    Nilstd/core/types/Nil: forall<a> list<a>        -> Nilstd/core/types/Nil: forall<a> list<a>
    Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>(_,xxxx: list<$2519>) -> xxxx: list<$2519>.sublisttour/sublist: (xs : list<$2519>, start : int, len : ? int) -> list<$2519>(startstart: int -std/core/int/(-): (x : int, y : int) -> int 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001, lenlen: int)

Hover over the sublisttour/sublist: forall<a> (xs : list<a>, start : int, len : ? int) -> list<a> identifier to see its full type, where the len parameter has gotten an optional intstd/core/types/int: V type signified by the question mark: ? intstd/core/types/int: V.

3.1.7. A larger example: cracking Caesar encoding

Here is a slightly larger program inspired by an example in Graham Hutton's most excellent "Programming in Haskell" book:

fun main()
  test-uncaesar()

fun encode( s : string, shift : int )
  fun encode-char(c)
    if c < 'a' || c > 'z' return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  s.map(encode-char)

// The letter frequency table for English
val english = [8.2,1.5,2.8,4.3,12.7,2.2,
               2.0,6.1,7.0,0.2,0.8,4.0,2.4,
               6.7,7.5,1.9,0.1, 6.0,6.3,9.1,
               2.8,1.0,2.4,0.2,2.0,0.1]

// Small helper functions
fun percent( n : int, m : int )
  100.0 * (n.float64 / m.float64)

fun rotate( xs : list<a>, n : int ) : list<a>
  xs.drop(n) ++ xs.take(n)

// Calculate a frequency table for a string
fun freqs( s : string ) : list<float64>
  val lowers = list('a','z')
  val occurs = lowers.map( fn(c) s.count(c.string) )
  val total  = occurs.sum
  occurs.map( fn(i) percent(i,total) )

// Calculate how well two frequency tables match according
// to the _chi-square_ statistic.
fun chisqr( xs : list<float64>, ys : list<float64> ) : float64
  zipwith(xs,ys, fn(x,y) ((x - y)^2.0)/y ).foldr(0.0,(+))

// Crack a Caesar encoded string
fun uncaesar( s : string ) : string
  val table  = freqs(s)                   // build a frequency table for `s`
  val chitab = list(0,25).map fn(n)       // build a list of chisqr numbers for each shift between 0 and 25
                 chisqr( table.rotate(n), english )

  val min    = chitab.minimum()           // find the mininal element
  val shift  = chitab.index-of( fn(f) f == min ).negate  // and use its position as our shift
  s.encode( shift )

fun test-uncaesar()
  println( uncaesar( "nrnd lv d ixq odqjxdjh" ) )
// The letter frequency table for English
val englishtour/english: list<float64> = [std/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>8.2literal: float64
hex64= 0x1.0666666666666p3,1.5literal: float64
hex64= 0x1.8p0,2.8literal: float64
hex64= 0x1.6666666666666p1,4.3literal: float64
hex64= 0x1.1333333333333p2,12.7literal: float64
hex64= 0x1.9666666666666p3,2.2literal: float64
hex64= 0x1.199999999999ap1,
               2.0literal: float64
hex64= 0x1p1,6.1literal: float64
hex64= 0x1.8666666666666p2,7.0literal: float64
hex64= 0x1.cp2,0.2literal: float64
hex64= 0x1.999999999999ap-3,0.8literal: float64
hex64= 0x1.999999999999ap-1,4.0literal: float64
hex64= 0x1p2,2.4literal: float64
hex64= 0x1.3333333333333p1,
               6.7literal: float64
hex64= 0x1.acccccccccccdp2,7.5literal: float64
hex64= 0x1.ep2,1.9literal: float64
hex64= 0x1.e666666666666p0,0.1literal: float64
hex64= 0x1.999999999999ap-4, 6.0literal: float64
hex64= 0x1.8p2,6.3literal: float64
hex64= 0x1.9333333333333p2,9.1literal: float64
hex64= 0x1.2333333333333p3,
               2.8literal: float64
hex64= 0x1.6666666666666p1,1.0literal: float64
hex64= 0x1p0,2.4literal: float64
hex64= 0x1.3333333333333p1,0.2literal: float64
hex64= 0x1.999999999999ap-3,2.0literal: float64
hex64= 0x1p1,0.1literal: float64
hex64= 0x1.999999999999ap-4]std/core/types/Nil: forall<a> list<a>

// Small helper functions
fun percenttour/percent: (n : int, m : int) -> float64( nn: int : intstd/core/types/int: V, mm: int : intstd/core/types/int: V )result: -> total float64
  100.0literal: float64
hex64= 0x1.9p6 *std/num/float64/(*): (x : float64, y : float64) -> float64 (nn: int.float64std/num/float64/float64: (i : int) -> float64 /std/num/float64/(/): (x : float64, y : float64) -> float64 mm: int.float64std/num/float64/float64: (i : int) -> float64)

fun rotatetour/rotate: forall<a> (xs : list<a>, n : int) -> list<a>( xsxs: list<$2808> : liststd/core/types/list: V -> V<aa: V>, nn: int : intstd/core/types/int: V )result: -> total list<2838> : liststd/core/types/list: V -> V<aa: V>
  xsxs: list<$2808>.dropstd/core/list/drop: (xs : list<$2808>, n : int) -> list<$2808>(nn: int) ++std/core/list/(++): (xs : list<$2808>, ys : list<$2808>) -> list<$2808> xsxs: list<$2808>.takestd/core/list/take: (xs : list<$2808>, n : int) -> list<$2808>(nn: int)

// Calculate a frequency table for a string
fun freqstour/freqs: (s : string) -> list<float64>( ss: string : stringstd/core/types/string: V )result: -> total list<float64> : liststd/core/types/list: V -> V<float64std/core/types/float64: V>
  val lowerslowers: list<char> = liststd/core/list/char/list: (lo : char, hi : char) -> list<char>('a'literal: char
unicode= u0061,'z'literal: char
unicode= u007A)
  val occursoccurs: list<int> = lowerslowers: list<char>.mapstd/core/list/map: (xs : list<char>, f : (char) -> int) -> list<int>( fnfn: (c : char) -> int(cc: char) ss: string.countstd/core/string/stringpat/count: (s : string, pattern : string) -> int(cc: char.stringstd/core/string/char/string: (c : char) -> string) )
  val totaltotal: int  = occursoccurs: list<int>.sumstd/core/list/sum: (xs : list<int>) -> int
  occursoccurs: list<int>.mapstd/core/list/map: (xs : list<int>, f : (int) -> float64) -> list<float64>( fnfn: (i : int) -> float64(ii: int) percenttour/percent: (n : int, m : int) -> float64(ii: int,totaltotal: int) )

// Calculate how well two frequency tables match according
// to the _chi-square_ statistic.
fun chisqrtour/chisqr: (xs : list<float64>, ys : list<float64>) -> float64( xsxs: list<float64> : liststd/core/types/list: V -> V<float64std/core/types/float64: V>, ysys: list<float64> : liststd/core/types/list: V -> V<float64std/core/types/float64: V> )result: -> total float64 : float64std/core/types/float64: V
  zipwithstd/core/list/zipwith: (xs : list<float64>, ys : list<float64>, f : (float64, float64) -> float64) -> list<float64>(xsxs: list<float64>,ysys: list<float64>, fnfn: (x : float64, y : float64) -> float64(xx: float64,yy: float64) ((xx: float64 -std/num/float64/(-): (x : float64, y : float64) -> float64 yy: float64)^std/num/float64/(^): (f : float64, p : float64) -> float642.0literal: float64
hex64= 0x1p1)/std/num/float64/(/): (x : float64, y : float64) -> float64yy: float64 ).foldrstd/core/list/foldr: (xs : list<float64>, z : float64, f : (float64, float64) -> float64) -> float64(0.0literal: float64
hex64= 0x0p+0,(+)std/num/float64/(+): (x : float64, y : float64) -> float64)

// Crack a Caesar encoded string
fun uncaesartour/uncaesar: (s : string) -> string( ss: string : stringstd/core/types/string: V )result: -> total string : stringstd/core/types/string: V
  val tabletable: list<float64>  = freqstour/freqs: (s : string) -> list<float64>(ss: string)                   // build a frequency table for `s`
  val chitabchitab: list<float64> = liststd/core/list/list: (lo : int, hi : int) -> list<int>(0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000,25literal: int
dec  = 25
hex8 = 0x19
bit8 = 0b00011001).mapstd/core/list/map: (xs : list<int>, f : (int) -> float64) -> list<float64> fnfn: (n : int) -> float64(nn: int)       // build a list of chisqr numbers for each shift between 0 and 25
                 chisqrtour/chisqr: (xs : list<float64>, ys : list<float64>) -> float64( tabletable: list<float64>.rotatetour/rotate: (xs : list<float64>, n : int) -> list<float64>(nn: int), englishtour/english: list<float64> )

  val minmin: float64    = chitabchitab: list<float64>.minimumstd/num/float64/minimum: (xs : list<float64>) -> float64()           // find the mininal element
  val shiftshift: int  = chitabchitab: list<float64>.index-ofstd/core/list/index-of: (xs : list<float64>, pred : (float64) -> bool) -> int( fnfn: (f : float64) -> bool(ff: float64) ff: float64 ==std/num/float64/(==): (x : float64, y : float64) -> bool minmin: float64 ).negatestd/core/int/negate: (i : int) -> int  // and use its position as our shift
  ss: string.encodetour/encode: (s : string, shift : int) -> string( shiftshift: int )

fun test-uncaesartour/test-uncaesar: () -> console ()()result: -> console ()
  printlnstd/core/console/string/println: (s : string) -> console ()( uncaesartour/uncaesar: (s : string) -> console string( "nrnd lv d ixq odqjxdjh"literal: string
count= 22 ) )

The val keyword declares a static value. In the example, the value englishtour/english: list<float64> is a list of floating point numbers (of type float64std/core/types/float64: V) denoting the average frequency for each letter. The function freqstour/freqs: (s : string) -> list<float64> builds a frequency table for a specific string, while the function chisqrtour/chisqr: (xs : list<float64>, ys : list<float64>) -> float64 calculates how well two frequency tables match. In the function crack these functions are used to find a shift value that results in a string whose frequency table matches the englishtour/english: list<float64> one the closest – and we use that to decode the string. You can try out this example directly in the interactive environment:

> :l samples/basic/caesar.kk

3.2. Effect types

A novel part about Koka is that it automatically infers all the side effects that occur in a function. The absence of any effect is denoted as totalstd/core/types/total: E (or <>) and corresponds to pure mathematical functions. If a function can raise an exception the effect is exnstd/core/exn/exn: (E, V) -> V, and if a function may not terminate the effect is divstd/core/types/div: X (for divergence). The combination of exnstd/core/exn/exn: (E, V) -> V and divstd/core/types/div: X is purestd/core/pure: E and corresponds directly to Haskell's notion of purity. Non- deterministic functions get the ndetstd/core/types/ndet: X effect. The ‘worst’ effect is iostd/core/io: E and means that a program can raise exceptions, not terminate, be non- deterministic, read and write to the heap, and do any input/output operations. Here are some examples of effectful functions:

fun square1( x : int ) : total int   { x*x }
fun square2( x : int ) : console int { println( "a not so secret side-effect" ); x*x }
fun square3( x : int ) : div int     { x * square3( x ) }
fun square4( x : int ) : exn int     { throw( "oops" ); x*x }
fun square1tour/square1: (x : int) -> int( xx: int : intstd/core/types/int: V )result: -> total int : totalstd/core/types/total: E intstd/core/types/int: V   { xx: int*std/core/int/(*): (int, int) -> intxx: int }
fun square2tour/square2: (x : int) -> console int( xx: int : intstd/core/types/int: V )result: -> console int : consolestd/core/console/console: X intstd/core/types/int: V { printlnstd/core/console/string/println: (s : string) -> console ()( "a not so secret side-effect"literal: string
count= 27 ); xx: int*std/core/int/(*): (int, int) -> console intxx: int }
fun square3tour/square3: (x : int) -> div int( xx: int : intstd/core/types/int: V )result: -> div int : divstd/core/types/div: X intstd/core/types/int: V     { xx: int *std/core/int/(*): (int, int) -> div int square3tour/square3: (x : int) -> div int( xx: int ) }
fun square4tour/square4: (x : int) -> exn int( xx: int : intstd/core/types/int: V )result: -> exn int : exnstd/core/exn/exn: (E, V) -> V intstd/core/types/int: V     { throwstd/core/exn/throw: (message : string, info : ? exception-info) -> exn _3297( "oops"literal: string
count= 4 ); xx: int*std/core/int/(*): (int, int) -> exn intxx: int }

When the effect is totalstd/core/types/total: E we usually leave it out in the type annotation. For example, when we write:

fun square5( x : int ) : int
  x*x
fun square5tour/square5: (x : int) -> int( xx: int : intstd/core/types/int: V )result: -> total int : intstd/core/types/int: V
  xx: int*std/core/int/(*): (int, int) -> intxx: int

the assumed effect is totalstd/core/types/total: E. Sometimes, we write an effectful function, but are not interested in explicitly writing down its effect type. In that case, we can use a wildcard type which stands for some inferred type. A wildcard type is denoted by writing an identifier prefixed with an underscore, or even just an underscore by itself:

fun square6( x : int ) : _e int
  println("I did not want to write down the \"console\" effect")
  x*x
fun square6tour/square6: (x : int) -> console int( xx: int : intstd/core/types/int: V )result: -> console int : _e_e: E intstd/core/types/int: V
  printlnstd/core/console/string/println: (s : string) -> console ()("I did not want to write down the \"console\" effect"literal: string
count= 49)
  xx: int*std/core/int/(*): (int, int) -> console intxx: int

Hover over square6tour/square6: (x : int) -> console int to see the inferred effect for _e.

3.2.1. Semantics of effects

The inferred effects are not just considered as some extra type information on functions. On the contrary, through the inference of effects, Koka has a very strong connection to its denotational semantics. In particular, the full type of a Koka functions corresponds directly to the type signature of the mathematical function that describes its denotational semantics. For example, using 〚t〛 to translate a type t into its corresponding mathematical type signature, we have:

〚`intstd/core/types/int: V -> totalstd/core/types/total: E intstd/core/types/int: V`〛	=
〚`intstd/core/types/int: V -> exnstd/core/exn/exn: (E, V) -> V intstd/core/types/int: V`〛	=
〚`intstd/core/types/int: V -> purestd/core/pure: E intstd/core/types/int: V`〛	=
〚`intstd/core/types/int: V -> <ststd/core/types/st: H -> E<h>,purestd/core/pure: E> intstd/core/types/int: V`〛	=

In the above translation, we use as a sum where we have either a unit (i.e. exception) or a type , and we use for a product consisting of a pair of a heap and a type . From the above correspondence, we can immediately see that a totalstd/core/types/total: E function is truly total in the mathematical sense, while a stateful function (ststd/core/types/st: H -> E<h>) that can raise exceptions or not terminate (purestd/core/pure: E) takes an implicit heap parameter, and either does not terminate () or returns an updated heap together with either a value or an exception ().

We believe that this semantic correspondence is the true power of full effect types and it enables effective equational reasoning about the code by a programmer. For almost all other existing programming languages, even the most basic semantics immediately include complex effects like heap manipulation and divergence. In contrast, Koka allows a layered semantics where we can easily separate out nicely behaved parts, which is essential for many domains, like safe LINQ queries, parallel tasks, tier-splitting, sand-boxed mobile code, etc.

3.2.2. Combining effects

Often, a function contains multiple effects, for example:

fun combine-effects()
  val i = srandom-int() // non-deterministic
  throw("oops")         // exception raising
  combine-effects()     // and non-terminating
fun combine-effectstour/combine-effects: forall<a> () -> <pure,ndet> a()result: -> <exn,ndet,div> _3363
  val ii: int = srandom-intstd/num/random/srandom-int: () -> <ndet,exn,div> int() // non-deterministic
  throwstd/core/exn/throw: (message : string, info : ? exception-info) -> <exn,ndet,div> _3373("oops"literal: string
count= 4)         // exception raising
  combine-effectstour/combine-effects: () -> <exn,ndet,div> _3363()     // and non-terminating

The effect assigned to combine-effectstour/combine-effects: forall<a> () -> <pure,ndet> a are ndetstd/core/types/ndet: X, divstd/core/types/div: X, and exnstd/core/exn/exn: (E, V) -> V. We can write such combination as a row of effects as <divstd/core/types/div: X,exnstd/core/exn/exn: (E, V) -> V,ndetstd/core/types/ndet: X>. When you hover over the combine-effectstour/combine-effects: forall<a> () -> <pure,ndet> a identifiers, you will see that the type inferred is really <purestd/core/pure: E,ndetstd/core/types/ndet: X> where purestd/core/pure: E is a type alias defined as:

alias pure = <div,exn>
alias purestd/core/pure: E = <divstd/core/types/div: X,exnstd/core/exn/exn: (E, V) -> V>

3.2.3. Polymorphic effects

Many functions are polymorphic in their effect. For example, the mapstd/core/list/map: forall<a,b,e> (xs : list<a>, f : (a) -> e b) -> e list<b> function applies a function f to each element of a (finite) list. As such, the effect depends on the effect of f, and the type of map becomes:

map : (xs : list<a>, f : (a) -> e b) -> e list<b>
map : (xs : liststd/core/types/list: V -> V<a>, f : (a) -> e b) -> e liststd/core/types/list: V -> V<b>

We use single letters (possibly followed by digits) for polymorphic types. Here, the map functions takes a list with elements of some type a, and a function f that takes an element of type a and returns a new element of type b. The final result is a list with elements of type b. Moreover, the effect of the applied function e is also the effect of the map function itself; indeed, this function has no other effect by itself since it does not diverge, nor raises exceptions.

We can use the notation <l|e> to extend an effect e with another effect l. This is used for example in the whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () function which has type: whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () : ( pred : () -> <divstd/core/types/div: X|e> boolstd/core/types/bool: V, action : () -> <divstd/core/types/div: X|e> () ) -> <divstd/core/types/div: X|e> (). The whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () function takes a predicate function and an action to perform, both with effect <divstd/core/types/div: X|e>. Indeed, since while may diverge depending on the predicate its effect must include divergence.

The reader may be worried that the type of whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> () forces the predicate and action to have exactly the same effect <divstd/core/types/div: X|e>, which even includes divergence. However, when effects are inferred at the call-site, both the effects of predicate and action are extended automatically until they match. This ensures we take the union of the effects in the predicate and action. Take for example the following loop:

fun looptest()
  while { is-odd(srandom-int()) }
    throw("odd")
fun looptesttour/looptest: () -> <pure,ndet> ()()result: -> <pure,ndet> ()
  whilestd/core/while: (predicate : () -> <div,ndet,exn> bool, action : () -> <div,ndet,exn> ()) -> <div,ndet,exn> () { is-oddstd/core/int/is-odd: (i : int) -> <ndet,div,exn> bool(srandom-intstd/num/random/srandom-int: () -> <ndet,div,exn> int()) }
    throwstd/core/exn/throw: (message : string, info : ? exception-info) -> <exn,div,ndet> ()("odd"literal: string
count= 3)

Koka infers that the predicate odd(srandom-int()) has effect <ndetstd/core/types/ndet: X|e1> while the action has effect <exnstd/core/exn/exn: (E, V) -> V|e2> for some e1 and e2. When applying whilestd/core/while: forall<e> (predicate : () -> <div|e> bool, action : () -> <div|e> ()) -> <div|e> (), those effects are unified to the type <exnstd/core/exn/exn: (E, V) -> V,ndetstd/core/types/ndet: X,divstd/core/types/div: X|e3> for some e3.

3.2.4. Local Mutable Variables

The Fibonacci numbers are a sequence where each subsequent Fibonacci number is the sum of the previous two, where fibtour/fib: (n : int) -> div int(0) == 0 and fibtour/fib: (n : int) -> div int(1) == 1. We can easily calculate Fibonacci numbers using a recursive function:

fun main() { println(fib(10)) }

fun fib(n : int) : div int
  if n <= 0   then 0
  elif n == 1 then 1
  else fib(n - 1) + fib(n - 2)
fun fibtour/fib: (n : int) -> div int(nn: int : intstd/core/types/int: V)result: -> div int : divstd/core/types/div: X intstd/core/types/int: V
  if nn: int <=std/core/int/(<=): (x : int, y : int) -> div bool 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000   then 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000
  elif nn: int ==std/core/int/(==): (x : int, y : int) -> div bool 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001 then 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001
  else fibtour/fib: (n : int) -> div int(nn: int -std/core/int/(-): (x : int, y : int) -> div int 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001) +std/core/int/(+): (x : int, y : int) -> div int fibtour/fib: (n : int) -> div int(nn: int -std/core/int/(-): (x : int, y : int) -> div int 2literal: int
dec  = 2
hex8 = 0x02
bit8 = 0b00000010)

Note that the type inference engine is currently not powerful enough to prove that this recursive function always terminates, which leads to inclusion of the divergence effect divstd/core/types/div: X in the result type.

Here is another version of the Fibonacci function but this time implemented using local mutable variables. We use the repeat function to iterate n times:

fun main() { println(fib2(10)) }

fun fib2(n)
  var x := 0
  var y := 1
  repeat(n)
    val y0 = y
    y := x+y
    x := y0
  x
fun fib2tour/fib2: (n : int) -> int(nn: int)result: -> total int
  var xx: local-var<$3509,int> := 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000
  var yy: local-var<$3509,int> := 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001
  repeatstd/core/repeat: (n : int, action : () -> (local<$3509>) ()) -> (local<$3509>) ()(nn: int)
    val y0y0: int = yy: int
    yy: local-var<$3509,int> :=std/core/types/local-set: (v : local-var<$3509,int>, assigned : int) -> (local<$3509>) () xx: int+std/core/int/(+): (x : int, y : int) -> (local<$3509>) intyy: int
    xx: local-var<$3509,int> :=std/core/types/local-set: (v : local-var<$3509,int>, assigned : int) -> (local<$3509>) () y0y0: int
  xx: int

In contrast to a val declaration that binds an immutable value (as in val y0 = y), a var declaration declares a mutable variable, where the (:=) operator can assign a new value to the variable. Internally, the var declarations use a state effect handler which ensures that the state has the proper semantics even if resuming multiple times.

However, that also means that mutable local variables are not quite first-class and we cannot pass them as parameters to other functions for example (as they are always dereferenced). The lifetime of mutable local variable cannot exceed its lexical scope. For example, you get a type error if a local variable escapes through a function expression:

fun wrong() : (() -> console ())
  var x := 1
  (fn(){ x := x + 1; println(x) })
fun wrong() : (() -> consolestd/core/console/console: X ())
  var x := 1
  (fn(){ x := x + 1; println(x) })

This restriction allows for a clean semantics but also for (future) optimizations that are not possible for general mutable reference cells.

3.2.5. Reference Cells and Isolated state

Koka also has first-class heap allocated mutable reference cells. A reference to an integer is allocated using val r = refstd/core/types/ref: forall<a,h> (value : a) -> (alloc<h>) ref<h,a>(0) (since the reference itself is actually a value!), and can be dereferenced using the bang operator, as !r. We can write the Fibonacci function using reference cells as:

fun main() { println(fib3(10)) }

fun fib3(n)
  val x = ref(0)
  val y = ref(1)
  repeat(n)
    val y0 = !y
    y := !x + !y
    x := y0
  !x
fun fib3tour/fib3: (n : int) -> int(nn: int)result: -> total int
  val xx: ref<_3650,int> = refstd/core/types/ref: (value : int) -> <alloc<_3650>,read<_3650>,write<_3650>> ref<_3650,int>(0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000)
  val yy: ref<_3650,int> = refstd/core/types/ref: (value : int) -> <alloc<_3650>,read<_3650>,write<_3650>> ref<_3650,int>(1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001)
  repeatstd/core/repeat: (n : int, action : () -> <read<_3650>,write<_3650>,alloc<_3650>> ()) -> <read<_3650>,write<_3650>,alloc<_3650>> ()(nn: int)
    val y0y0: int = !std/core/types/ref/(!): (ref : ref<_3650,int>) -> <read<_3650>,write<_3650>,alloc<_3650>> intyy: ref<_3650,int>
    yy: ref<_3650,int> :=std/core/types/set: (ref : ref<_3650,int>, assigned : int) -> <write<_3650>,read<_3650>,alloc<_3650>> () !std/core/types/ref/(!): (ref : ref<_3650,int>) -> <read<_3650>,write<_3650>,alloc<_3650>> intxx: ref<_3650,int> +std/core/int/(+): (x : int, y : int) -> <read<_3650>,write<_3650>,alloc<_3650>> int !std/core/types/ref/(!): (ref : ref<_3650,int>) -> <read<_3650>,write<_3650>,alloc<_3650>> intyy: ref<_3650,int>
    xx: ref<_3650,int> :=std/core/types/set: (ref : ref<_3650,int>, assigned : int) -> <write<_3650>,read<_3650>,alloc<_3650>> () y0y0: int
  !std/core/types/ref/(!): (ref : ref<_3650,int>) -> <read<_3650>,write<_3650>,alloc<_3650>> intxx: ref<_3650,int>

As we can see, using var declarations are generally preferred as these behave better under multiple resumptions, but also are syntactically more concise as they do not need a dereferencing operator. (Nevertheless, we still need reference cells as those are first-class while var variables cannot be passed to other functions.)

When we look at the types inferred for the references, we see that x and y have type refstd/core/types/ref: (H, V) -> V<h,intstd/core/types/int: V> which stands for a reference to a mutable value of type intstd/core/types/int: V in some heap h. The effects on heaps are allocation as heap<h>, reading from a heap as readstd/core/types/read: H -> X<h> and writing to a heap as writestd/core/types/write: H -> X<h>. The combination of these effects is called stateful and denoted with the alias ststd/core/types/st: H -> E<h>.

Clearly, the effect of the body of fib3tour/fib3: (n : int) -> int is ststd/core/types/st: H -> E<h>; but when we hover over fib3tour/fib3: (n : int) -> int, we see the type inferred is actually the totalstd/core/types/total: E effect: (n:intstd/core/types/int: V) -> intstd/core/types/int: V. Indeed, even though fib3tour/fib3: (n : int) -> int is stateful inside, its side-effects can never be observed. It turns out that we can safely discard the ststd/core/types/st: H -> E<h> effect whenever the heap type h cannot be referenced outside this function, i.e. it is not part of an argument or return type. More formally, the Koka compiler proves this by showing that a function is fully polymorphic in the heap type h and applies the runstd/core/types/run: forall<e,a> (action : forall<h> () -> <alloc<h>,read<h>,write<h>|e> a) -> e a function (corresponding to runST in Haskell) to discard the ststd/core/types/st: H -> E<h> effect.

The Garsia-Wachs algorithm is a nice example where side-effects are used internally across function definitions and data structures, but where the final algorithm itself behaves like a pure function, see the samples/basic/garsia-wachs.kk.

3.3. Data Types

3.3.1. Structs

An important aspect of a function-oriented language is to be able to define rich data types over which the functions work. A common data type is that of a struct or record. Here is an example of a struct that contains information about a person:

struct person
  age : int
  name : string
  realname : string = name

val brian = Person( 29, "Brian" )
struct persontour/person: V
  ageage: int : intstd/core/types/int: V
  namename: string : stringstd/core/types/string: V
  realnamerealname: string : stringstd/core/types/string: V = namename: string

val briantour/brian: person = Persontour/Person: (age : int, name : string, realname : string) -> person( 29literal: int
dec  = 29
hex8 = 0x1D
bit8 = 0b00011101, "Brian"literal: string
count= 5 )

Every struct (and other data types) come with constructor functions to create instances, as in Persontour/Person: (age : int, name : string, realname : string) -> person(19,"Brian"). Moreover, these constructors can use named arguments so we can also call the constructor as Persontour/Person: (age : int, name : string, realname : string) -> person( name = "Brian", age = 19, realname = "Brian H. Griffin" ) which is quite close to regular record syntax but without any special rules; it is just functions all the way down!

Also, Koka automatically generates accessor functions for each field in a struct (or other data type), and we can access the age of a persontour/person: V as briantour/brian: person.age (which is of course just syntactic sugar for age(briantour/brian: person)).

3.3.2. Copying

By default, all structs (and other data types) are immutable. Instead of directly mutating a field in a struct, we usually return a new struct where the fields are updated. For example, here is a birthdaytour/birthday: (p : person) -> person function that increments the age field:

fun main()
  println( brian.birthday.age )

struct person
  age : int
  name : string
  realname : string = name

val brian = Person( 29, "Brian" )

fun birthday( p : person ) : person
  p( age = p.age + 1 )
fun birthdaytour/birthday: (p : person) -> person( pp: person : persontour/person: V )result: -> total person : persontour/person: V
  pp: person( age = pp: person.agetour/person/age: (person : person) -> int +std/core/int/(+): (x : int, y : int) -> int 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001 )

Here, birthdaytour/birthday: (p : person) -> person returns a fresh persontour/person: V which is equal to p but with the age incremented. The syntax p(...) is syntactic sugar for calling the copy constructor of a persontour/person: V. This constructor is also automatically generated for each data type, and is internally generated as:

fun main()
  println( brian.copy().age )

struct person( age : int, name : string, realname : string = name )

val brian = Person( 29, "Brian" )

fun copy( p, age = p.age, name = p.name, realname = p.realname )
  Person(age, name, realname)
fun copytour/copy: (p : person, age : ? int, name : ? string, realname : ? string) -> person( pp: person, ageage: ? int = pp: person.agetour/person/age: (person : person) -> int, namename: ? string = pp: person.nametour/person/name: (person : person) -> string, realnamerealname: ? string = pp: person.realnametour/person/realname: (person : person) -> string )result: -> total person
  Persontour/Person: (age : int, name : string, realname : string) -> person(ageage: int, namename: string, realnamerealname: string)

When arguments follow a data value, as in p( age = p.age + 1), it is expanded to call this copy function, as in p.copytour/copy: (p : person, age : ? int, name : ? string, realname : ? string) -> person( age = p.age + 1 ). In adherence with the min-gen principle, there are no special rules for record updates but using plain function calls with optional and named parameters.

3.3.3. Alternatives (or Unions)

Koka also supports algebraic data types where there are multiple alternatives. For example, here is an enumeration:

type color
  Red
  Green
  Blue
type color
  Red
  Green
  Blue

Special cases of these enumerated types are the voidstd/core/types/void: V type which has no alternatives (and therefore there exists no value with this type), the unit type (()) which has just one constructor, also written as () (and therefore, there exists only one value with the type (()), namely ()), and finally the boolean type boolstd/core/types/bool: V with two constructors Truestd/core/types/True: bool and Falsestd/core/types/False: bool.

type void

type ()
  ()

type bool
  False
  True
type voidstd/core/types/void: V

type ()
  ()

type boolstd/core/types/bool: V
  Falsestd/core/types/False: bool
  Truestd/core/types/True: bool

Constructors can have parameters. For example, here is how to create a number type which is either an integer or the infinity value:

type number
  Infinity
  Integer( i : int )
type number
  Infinity
  Integer( i : intstd/core/types/int: V )

We can create such number by writing Integer(1) or Infinity. Moreover, data types can be polymorphic and recursive. Here is the definition of the liststd/core/types/list: V -> V type which is either empty (Nilstd/core/types/Nil: forall<a> list<a>) or is a head element followed by a tail list (Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>):

type list<a>
  Nil
  Cons{ head : a; tail : list<a> }
type liststd/core/types/list: V -> V<a>
  Nilstd/core/types/Nil: forall<a> list<a>
  Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>{ head : a; tail : liststd/core/types/list: V -> V<a> }

Koka automatically generates accessor functions for each named parameter. For lists for example, we can access the head of a list as Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>(1,Nilstd/core/types/Nil: forall<a> list<a>).head.

We can now also see that struct types are just syntactic sugar for a regular type with a single constructor of the same name as the type:

translation

struct tp { <fields> }
struct tp { <fields> }

type tp {
  Tp { <fields> }
}
type tp {
  Tp { <fields> }
}

For example, our earlier persontour/person: V struct, defined as

struct person{ age : int; name : string; realname : string = name }
struct persontour/person: V{ age : intstd/core/types/int: V; name : stringstd/core/types/string: V; realname : stringstd/core/types/string: V = name }

desugars to:

type person
  Person{ age : int; name : string; realname : string = name }
type persontour/person: V
  Persontour/Person: (age : int, name : string, realname : string) -> person{ age : intstd/core/types/int: V; name : stringstd/core/types/string: V; realname : stringstd/core/types/string: V = name }

or with brace elision as:

type person
  Person
    age : int
    name : string
    realname : string = name
type persontour/person: V
  Persontour/Person: (age : int, name : string, realname : string) -> person
    age : intstd/core/types/int: V
    name : stringstd/core/types/string: V
    realname : stringstd/core/types/string: V = name

3.3.4. Matching

Todo

3.3.5. Extensible Data Types

Todo

3.3.6. Inductive, Co-inductive, and Recursive Types

For the purposes of equational reasoning and termination checking, a type declaration is limited to finite inductive types. There are two more declarations, namely co type and rec type that allow for co-inductive types, and arbitrary recursive types respectively.

3.3.7. Value Types

Value types are (non-recursive) data types that are not heap allocated but passed on the stack as a value. Since data types are immutable, semantically these types are equivalent but value types can be more efficient as they avoid heap allocation and reference counting (or more expensive as they need copying instead of sharing a reference).

By default, any non-recursive inductive data type of a size up to 3 machine words (= 24 bytes on a 64-bit platform) is treated as a value type. For example, tuples and 3-tuples are passed and returned by value. Usually, that means that such tuples are for example returned in registers when compiling with optimization.

We can also force a type to be compiled as a value type by using the value keyword in front of a type or struct declaration:

value struct argb{ alpha: int; color-red: int; color-green: int; color-blue: int }
value struct argbtour/argb: V{ alphatour/argb/alpha: (argb : argb) -> int: intstd/core/types/int: V; color-redtour/argb/color-red: (argb : argb) -> int: intstd/core/types/int: V; color-greentour/argb/color-green: (argb : argb) -> int: intstd/core/types/int: V; color-bluetour/argb/color-blue: (argb : argb) -> int: intstd/core/types/int: V }

advanced

Boxing

To support generic polymorphism, sometimes value types are boxed. For example, a list is polymorphic in its elements. That means that if we construct a list of tuples, like [(1,Truestd/core/types/True: bool)], that the element (1,2) will be boxed and heap allocated – essentially the compiler transforms this expression into [Boxstd/core/types/Box: forall<a> (unbox : a) -> box<a>((1,Truestd/core/types/True: bool)] internally.

Note that for regular data types and intstd/core/types/int: V's boxing is free (as in isomorphic). Moreover, value types up to 63 bits (on a 64-bit platform) are boxed in-place and do not require heap allocation (like int32std/core/types/int32: V). The float64std/core/types/float64: V type is also specialized; by default the Koka compiler only heap allocates float64std/core/types/float64: Vs when their absolute value is outside the range 2^-511 up to 2⁵¹² (excluding infinity and NaN)).

For performance sensitive code we may specialize certain polymorphic data types to reduce allocations due to boxing. For example:

type mylist
  MyCons{ head1: int; head2: bool; mytail: mylist }
  MyNil
type mylisttour/mylist: V
  MyConstour/MyCons: (head1 : int, head2 : bool, mytail : mylist) -> mylist{ head1: intstd/core/types/int: V; head2: boolstd/core/types/bool: V; mytail: mylisttour/mylist: V }
  MyNiltour/MyNil: mylist

Our previous example becomes MyConstour/MyCons: (head1 : int, head2 : bool, mytail : mylist) -> mylist(1,Truestd/core/types/True: bool,MyNiltour/MyNil: mylist) now and is more efficient as it only needs one allocation for the MyConstour/MyCons: (head1 : int, head2 : bool, mytail : mylist) -> mylist without an indirection to a tuple. In the future we hope to extend Koka to perform specialization automatically or by using special directives.

3.4. Effect Handlers

Effect handlers [9, 10, 21] are a novel way to define control-flow abstractions and dynamic binding as user defined handlers – no need anymore to add special compiler extensions for exceptions, iterators, async-await, probabilistic programming, etc. Moreover, these handlers can be composed freely so the interaction between, say, async-await and exceptions are well-defined.

3.4.1. Handling

Let's start with defining an exception effect of our own. The effect declaration defines a new type together with operations, for now we use the most general control (ctl) operation:

effect raise
  ctl raise( msg : string ) : a
effecttour/raise: (E, V) -> V raisetour/raise: (E, V) -> V
  ctl raise( msgmsg: string : stringstd/core/types/string: V ) : aa: V

This defines an effect type raisetour/raise: (E, V) -> V together with an operation raisetour/raise: forall<a> (msg : string) -> raise a of type (msg : stringstd/core/types/string: V) -> raisetour/raise: (E, V) -> V a. With the effect signature declared, we can already use the operations:

fun safe-divide( x : int, y : int ) : raise int
  if y==0 then raise("div-by-zero") else x / y
fun safe-dividetour/safe-divide: (x : int, y : int) -> raise int( xx: int : intstd/core/types/int: V, yy: int : intstd/core/types/int: V )result: -> raise int : raisetour/raise: (E, V) -> V intstd/core/types/int: V
  if yy: int==std/core/int/(==): (x : int, y : int) -> raise bool0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000 then raisetour/raise: (msg : string) -> raise int("div-by-zero"literal: string
count= 11) else xx: int /std/core/int/(/): (x : int, y : int) -> raise int yy: int

where we see that the safe-dividetour/safe-divide: (x : int, y : int) -> raise int function gets the raisetour/raise: (E, V) -> V effect (since we use the raisetour/raise: forall<a> (msg : string) -> raise a operation in the body). Such an effect type means that we can only evaluate the function in a context where raisetour/raise: (E, V) -> V is handled (in other words, where it is “dynamically bound”, or where we “have the raisetour/raise: (E, V) -> V capability”).

We can handle the effect by giving a concrete definition for the raisetour/raise: forall<a> (msg : string) -> raise a operation. For example, we may always return a default value:

fun raise-const() : int
  with handler
    ctl raise(msg) 42
  8 + safe-divide(1,0)
fun raise-consttour/raise-const: () -> int()result: -> total int : intstd/core/types/int: V
  withwith: () -> raise int handlerhandler: (() -> raise int) -> int
    ctl raiseraise: (msg : string, resume : (_4095) -> int) -> int
(msgmsg: string) 42literal: int
dec  = 42
hex8 = 0x2A
bit8 = 0b00101010
  8literal: int
dec  = 8
hex8 = 0x08
bit8 = 0b00001000 +std/core/int/(+): (x : int, y : int) -> raise int safe-dividetour/safe-divide: (x : int, y : int) -> raise int(1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001,0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000)

The call raise-consttour/raise-const: () -> int() evaluates to 42 (not 50). When a raisetour/raise: forall<a> (msg : string) -> raise a is called (in safe-dividetour/safe-divide: (x : int, y : int) -> raise int), it will yield to its innermost handler, unwind the stack, and only then evaluate the operation definition – in this case just directly returning 42 from the point where the handler is defined. Now we can see why it is called a control operation as raisetour/raise: forall<a> (msg : string) -> raise a changes the regular linear control-flow and yields right back to its innermost handler from the original call site. Also note that raise-consttour/raise-const: () -> int is totalstd/core/types/total: E again and the handler discharged the raisetour/raise: (E, V) -> V effect.

The handler{ <ops> } expression is a function that itself expects a function argument over which the handler is scoped, as in (handler{ <ops> })(action). This works well in combination with the with statement of course. As a syntactic convenience, for single operations we can leave out the handler keyword which is translated as:

translation

with ctl op(<args>){ <body> }
with ctl op(<args>){ <body> }

with handler
  ctl op(<args>){ <body> }
with handler
  ctl op(<args>){ <body> }

With this translation, we can write the previous example more concisely as:

fun raise-const1() : int
  with ctl raise(msg) 42
  8 + safe-divide(1,0)
fun raise-const1tour/raise-const1: () -> int()result: -> total int : intstd/core/types/int: V
  withhandler: (() -> raise int) -> int ctl raiseraise: (msg : string, resume : (_4174) -> int) -> int
(msgmsg: string) 42literal: int
dec  = 42
hex8 = 0x2A
bit8 = 0b00101010
  8literal: int
dec  = 8
hex8 = 0x08
bit8 = 0b00001000 +std/core/int/(+): (x : int, y : int) -> raise int safe-dividetour/safe-divide: (x : int, y : int) -> raise int(1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001,0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000)

which eventually expands to (handler{ ctl raisetour/raise: forall<a> (msg : string) -> raise a(msg){ 42 } })(fn(){ 8 + safe-dividetour/safe-divide: (x : int, y : int) -> raise int(1,0) }).

We have a similar syntactic convenience for effects with one operation where the name of the effect and operation are the same. We can define such an effect by just declaring its operation which implicitly declares an effect type of the same name:

translation

effect ctl op(<parameters>) : <result-type>
effect ctl op(<parameters>) : <result-type>

effect op {
  ctl op(<parameters>) : <result-type>
}
effect op {
  ctl op(<parameters>) : <result-type>
}

That means we can declare our raisetour/raise: (E, V) -> V effect signature also more concisely as:

effect ctl raise( msg : string ) : a
effect ctl raisetour/raise: forall<a> (msg : string) -> raise a( msg : stringstd/core/types/string: V ) : a

3.4.2. Resuming

The power of effect handlers is not just that we can yield to the innermost handler, but that we can also resume back to the call site with a result.

Let's define a asktour/ask: (V, E, V) -> V<a> effect that allows us to get a contextual value of type a:

effect ask<a>                   // or: effect<a> ctl ask() : a
  ctl ask() : a

fun add-twice() : ask<int> int
  ask() + ask()
effecttour/ask: (V, E, V) -> V asktour/ask: (V, E, V) -> V<aa: V>                   // or: effect<a> ctl ask() : a
  ctl ask() : aa: V

fun add-twicetour/add-twice: () -> (ask<int>) int()result: -> (ask<int>) int : asktour/ask: (V, E, V) -> V<intstd/core/types/int: V> intstd/core/types/int: V
  asktour/ask: () -> (ask<int>) int() +std/core/int/(+): (x : int, y : int) -> (ask<int>) int asktour/ask: () -> (ask<int>) int()

The add-twicetour/add-twice: () -> (ask<int>) int function can ask for numbers but it is unaware of how these are provided – the effect signature just specifies an contextual API. We can handle it by always resuming with a constant for example:

fun ask-const() : int
  with ctl ask() resume(21)
  add-twice()
fun ask-consttour/ask-const: () -> int()result: -> total int : intstd/core/types/int: V
  withhandler: (() -> (ask<int>) int) -> int ctl askask: (resume : (int) -> int) -> int
() resumeresume: (int) -> int(21literal: int
dec  = 21
hex8 = 0x15
bit8 = 0b00010101)
  add-twicetour/add-twice: () -> (ask<int>) int()

where ask-consttour/ask-const: () -> int() evaluates to 42. Or by returning random values, like:

fun ask-random() : random int
  with ctl ask() resume(random-int())
  add-twice()
fun ask-randomtour/ask-random: () -> random int()result: -> random int : randomstd/num/random/random: (E, V) -> V intstd/core/types/int: V
  withhandler: (() -> <ask<int>,random> int) -> random int ctl askask: (resume : (int) -> random int) -> random int
() resumeresume: (int) -> random int(random-intstd/num/random/random-int: () -> random int())
  add-twicetour/add-twice: () -> <ask<int>,random> int()

where ask-randomtour/ask-random: () -> random int() now handled the asktour/ask: (V, E, V) -> V<intstd/core/types/int: V> effect, but itself now has randomstd/num/random/random: (E, V) -> V effect (see std/num/random). The resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b function is implicitly bound by a ctl operation and resumes back to the call-site with the given result.

As we saw in the exception example, we do not need to call resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b and can also directly return into our handler scope. For example, we may only want to handle a asktour/ask: forall<a> () -> (ask<a>) a once, but after that give up:

fun ask-once() : int
  var count := 0
  with ctl ask()
    count := count + 1
    if count <= 1 then resume(42) else 0
  add-twice()
fun ask-oncetour/ask-once: () -> int()result: -> total int : intstd/core/types/int: V
  var countcount: local-var<$4567,int> := 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000
  withhandler: (() -> <ask<int>,local<$4567>> int) -> (local<$4567>) int ctl askask: (resume : (int) -> (local<$4567>) int) -> (local<$4567>) int
()
    countcount: local-var<$4567,int> :=std/core/types/local-set: (v : local-var<$4567,int>, assigned : int) -> (local<$4567>) () countcount: int +std/core/int/(+): (x : int, y : int) -> (local<$4567>) int 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001
    if countcount: int <=std/core/int/(<=): (x : int, y : int) -> (local<$4567>) bool 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001 then resumeresume: (int) -> (local<$4567>) int(42literal: int
dec  = 42
hex8 = 0x2A
bit8 = 0b00101010) else 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000
  add-twicetour/add-twice: () -> (ask<int>) int()

Here ask-oncetour/ask-once: () -> int() evaluates to 0 since the second call to asktour/ask: forall<a> () -> (ask<a>) a does not resume, (and returns directly 0 in the ask-oncetour/ask-once: () -> int context). This pattern can for example be used to implement the concept of fuel in a setting where a computation is only allowed to take a limited amount of steps.

3.4.3. Tail-Resumptive Operations

A ctl operation is one of the most general ways to define operations since we get a first-class resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b function. However, almost all operations in practice turn out to be tail-resumptive: that is, they resume exactly once with their final result value. To make this more convenient, we can declare fun operations that do this by construction, i.e.

translation

with fun op(<args>){ <body> }
with fun op(<args>){ <body> }

with ctl op(<args>){ val f = fn(){ <body> }; resume( f() ) }
with ctl op(<args>){ val f = fn(){ <body> }; resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b( f() ) }

(The translation is defined via an intermediate function f so return works as expected).

With this syntactic sugar, we can write our earlier ask-consttour/ask-const: () -> int example using a fun operation instead:

fun ask-const2() : int
  with fun ask() 21
  add-twice()
fun ask-const2tour/ask-const2: () -> int()result: -> total int : intstd/core/types/int: V
  withhandler: (() -> (ask<int>) int) -> int fun askask: () -> int() 21literal: int
dec  = 21
hex8 = 0x15
bit8 = 0b00010101
  add-twicetour/add-twice: () -> (ask<int>) int()

This also conveys better that even though asktour/ask: forall<a> () -> (ask<a>) a is dynamically bound, it behaves just like a regular function without changing the control-flow.

Moreover, operations declared as fun are much more efficient than general ctl operations. The Koka compiler uses (generalized) evidence passing [23–25] to pass down handler information to each call-site. At the call to asktour/ask: forall<a> () -> (ask<a>) a in add-twicetour/add-twice: () -> (ask<int>) int, it selects the handler from the evidence vector and when the operation is a tail-resumptive fun, it calls it directly as a regular function (except with an adjusted evidence vector for its context). Unlike a general ctl operation, there is no need to yield upward to the handler, capture the stack, and eventually resume again. This gives fun (and val) operations a performance cost very similar to virtual method calls which can be quite efficient.

For even a bit more performance, you can also declare upfront that any operation definition must be tail-resumptive, as:

effect ask<a>
  fun ask() : a
effect asktour/ask: (V, E, V) -> V<a>
  fun asktour/ask: forall<a> () -> (ask<a>) a() : a

This restricts all handler definitions for the asktour/ask: (V, E, V) -> V effect to use fun definitions for the asktour/ask: forall<a> () -> (ask<a>) a operation. However, it increases the ability to reason about the code, and the compiler can optimize such calls a bit more as it no longer needs to check at run-time if the handler happens to define the operation as tail-resumptive.

advancedFor even better performance, one can mark the effect as linear (Section 3.4.12). Such effects are statically guaranteed to never use a general control operation and never need to capture a resumption. During compilation, this removes the need for the monadic transformation and improves performance of any effect polymorphic function that uses such effects as well (like map or foldrstd/core/list/foldr: forall<a,b,e> (xs : list<a>, z : b, f : (a, b) -> e b) -> e b). Examples of linear effects are state (ststd/core/types/st: H -> E) and builtin effects (like iostd/core/io: E or consolestd/core/console/console: X).

Value Operations

A common subset of operations always tail-resume with a single value; these are essentially dynamically bound variables (but statically typed!). Such operations can be declared as a val with the following translation:

translation

with val v = <expr>
with val v = <expr>

val x = <expr>
with fun v(){ x }
val x = <expr>
with fun v(){ x }

val x = <expr>
with ctl v(){ resume(x) }
val x = <expr>
with ctl v(){ resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b(x) }

For an example of the use of value operations, consider a pretty printer that produces pretty strings from documents:

fun pretty( d : doc ) : string
fun prettytour/pretty: (d : doc) -> width string( d : doctour/doc: V ) : stringstd/core/types/string: V

Unfortunately, it has a hard-coded maximum display width of 40 deep down in the code of prettytour/pretty: (d : doc) -> width string:

fun pretty-internal( line : string ) : string
  line.truncate(40)
fun pretty-internaltour/pretty-internal: (line : string) -> width string( line : stringstd/core/types/string: V ) : stringstd/core/types/string: V
  line.truncate(40)

To abstract over the width we have a couple of choices: we could make the width a regular parameter but now we need to explicitly add the parameter to all functions in the library and manually thread them around. Another option is a global mutable variable but that leaks side-effects and is non-modular.

Or, we can define it as a value operation instead:

effect val width : int
effecttour/width: (E, V) -> V val widthtour/width: (E, V) -> V : intstd/core/types/int: V

This also allows us to refer to the widthtour/width: -> width int operation as if it was a regular value (even though internally it invokes the operation). So, the check for the width in the pretty printer can be written as:

fun pretty-internal( line : string ) : width string
  line.truncate(width)
fun pretty-internaltour/pretty-internal: (line : string) -> width string( lineline: string : stringstd/core/types/string: V )result: -> width string : widthtour/width: (E, V) -> V stringstd/core/types/string: V
  lineline: string.truncatestd/core/sslice/string/truncate: (s : string, count : int) -> width string(widthtour/width: -> width int)

When using the pretty printer we can bind the widthtour/width: -> width int as a regular effect handler:

fun pretty-thin(d : doc) : string
  with val width = 40
  pretty(d)
fun pretty-thintour/pretty-thin: (d : doc) -> string(dd: doc : doctour/doc: V)result: -> total string : stringstd/core/types/string: V
  withhandler: (() -> width string) -> string val widthwidth: () -> int = 40literal: int
dec  = 40
hex8 = 0x28
bit8 = 0b00101000
  prettytour/pretty: (d : doc) -> width string(dd: doc)

Note that we did not need to change the structure of the original library functions. However the types of the functions still change to include the widthtour/width: (E, V) -> V effect as these now require the widthtour/width: -> width int value to be handled at some point. For example, the type of prettytour/pretty: (d : doc) -> width string becomes:

fun pretty( d : doc ) : width string
fun prettytour/pretty: (d : doc) -> width string( d : doctour/doc: V ) : widthtour/width: (E, V) -> V stringstd/core/types/string: V

as is requires the widthtour/width: (E, V) -> V effect to be handled (aka, the "dynamic binding for widthtour/width: -> width int : intstd/core/types/int: V to be defined“, aka, the ”widthtour/width: (E, V) -> V capability").

3.4.4. Abstracting Handlers

As another example, a writer effect is quite common where values are collected by a handler. For example, we can define an emittour/emit: (E, V) -> V effect to emit messages:

effect fun emit( msg : string ) : ()
effect fun emittour/emit: (msg : string) -> emit ()( msg : stringstd/core/types/string: V ) : ()

fun ehello() : emit ()
  emit("hello")
  emit("world")
fun ehellotour/ehello: () -> emit ()()result: -> emit () : emittour/emit: (E, V) -> V (std/core/types/unit: V)std/core/types/unit: V
  emittour/emit: (msg : string) -> emit ()("hello"literal: string
count= 5)
  emittour/emit: (msg : string) -> emit ()("world"literal: string
count= 5)

We can define for example a handler that prints the emitted messages directly to the console:

fun ehello-console() : console ()
  with fun emit(msg) println(msg)
  ehello()
fun ehello-consoletour/ehello-console: () -> console ()()result: -> console () : consolestd/core/console/console: X (std/core/types/unit: V)std/core/types/unit: V
  withhandler: (() -> <emit,console> ()) -> console () fun emitemit: (msg : string) -> console ()(msgmsg: string) printlnstd/core/console/string/println: (s : string) -> console ()(msgmsg: string)
  ehellotour/ehello: () -> <emit,console> ()()

Here the handler is defined directly, but we can also abstract the handler for emitting to the console into a separate function:

fun emit-console( action )
  with fun emit(msg) println(msg)
  action()
fun emit-consoletour/emit-console: forall<a,e> (action : () -> <console,emit|e> a) -> <console|e> a( actionaction: () -> <console,emit|_5319> _5235 )result: -> <console|11650> 11566
  withhandler: (() -> <emit,console|_5319> _5235) -> <console|_5319> _5235 fun emitemit: (msg : string) -> <console|_5319> ()(msgmsg: string) printlnstd/core/console/string/println: (s : string) -> <console|_5319> ()(msgmsg: string)
  actionaction: () -> <console,emit|_5319> _5235()

where emit-consoletour/emit-console: forall<a,e> (action : () -> <console,emit|e> a) -> <console|e> a has the inferred type (action : () -> <emittour/emit: (E, V) -> V,consolestd/core/console/console: X|e> a) -> <consolestd/core/console/console: X|e> a (hover over the source to see the inferred types) where the action can have use the effects emittour/emit: (E, V) -> V, consolestd/core/console/console: X, and any other effects e, and where the final effect is just <consolestd/core/console/console: X|e> as the emittour/emit: (E, V) -> V effect is discharged by the handler.

Note, we could have written the above too as:

val emit-console2 = handler
  fun emit(msg) println(msg)
val emit-console2tour/emit-console2: forall<a,e> (() -> <console,emit|e> a) -> <console|e> a = handlerhandler: forall<a,e> (() -> <console,emit|e> a) -> <console|e> a
  fun emitemit: (msg : string) -> <console|_5421> ()(msgmsg: string) printlnstd/core/console/string/println: (s : string) -> <console|_5421> ()(msgmsg: string)

since a handler{ ... } expression is a function itself (and thus a value). Generally we prefer the earlier definition though as it allows further parameters like an initial state.

Since with works generally, we can use the abstracted handlers just like regular handlers, and our earlier example can be written as:

fun ehello-console2() : console ()
  with emit-console
  ehello()
fun ehello-console2tour/ehello-console2: () -> console ()()result: -> console () : consolestd/core/console/console: X (std/core/types/unit: V)std/core/types/unit: V
  withwith: () -> <console,emit> () emit-consoletour/emit-console: (action : () -> <console,emit> ()) -> console ()
  ehellotour/ehello: () -> <emit,console> ()()

(which expands to emit-consoletour/emit-console: forall<a,e> (action : () -> <console,emit|e> a) -> <console|e> a( fn(){ ehellotour/ehello: () -> emit ()() } )). Another useful handler may collect all emitted messages as a list of lines:

fun emit-collect( action : () -> <emit|e> () ) : e string
  var lines := []
  with handler
    return(x)     lines.reverse.join("\n")
    fun emit(msg) lines := Cons(msg,lines)
  action()

fun ehello-commit() : string
  with emit-collect
  ehello()
fun emit-collecttour/emit-collect: forall<e> (action : () -> <emit|e> ()) -> e string( actionaction: () -> <emit|$5450> () : () -> <emittour/emit: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> (std/core/types/unit: V)std/core/types/unit: V )result: -> 5733 string : ee: E stringstd/core/types/string: V
  var lineslines: local-var<$5459,list<string>> := [std/core/types/Nil: forall<a> list<a>]std/core/types/Nil: forall<a> list<a>
  withwith: () -> <emit|$5450> () handlerhandler: (() -> <emit,local<$5459>|$5450> ()) -> <local<$5459>|$5450> string
    returnreturn: (x : _5477) -> <local<$5459>|_5497> string(xx: _5477)     lineslines: list<string>.reversestd/core/list/reverse: (xs : list<string>) -> <local<$5459>|_5497> list<string>.joinstd/core/list/joinsep/join: (xs : list<string>, sep : string) -> <local<$5459>|_5497> string("\n"literal: string
count= 1)
    fun emitemit: (msg : string) -> <local<$5459>|$5450> ()(msgmsg: string) lineslines: local-var<$5459,list<string>> :=std/core/types/local-set: (v : local-var<$5459,list<string>>, assigned : list<string>) -> <local<$5459>|$5450> () Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>(msgmsg: string,lineslines: list<string>)
  actionaction: () -> <emit|$5450> ()()

fun ehello-committour/ehello-commit: () -> string()result: -> total string : stringstd/core/types/string: V
  withwith: () -> emit () emit-collecttour/emit-collect: (action : () -> emit ()) -> string
  ehellotour/ehello: () -> emit ()()

This is a total handler and only discharges the emittour/emit: (E, V) -> V effect.

3.4.5. Return Operations

In the previous emit-collecttour/emit-collect: forall<e> (action : () -> <emit|e> ()) -> e string example we saw the use of a return operation. Such operation changes the final result of the action of a handler. For example, consider our earlier used-defined exception effect raisetour/raise: (E, V) -> V. We can define a general handler that transforms any exceptional action into one that returns a maybestd/core/types/maybe: V -> V type:

fun raise-maybe( action : () -> <raise|e> a ) : e maybe<a>
  with handler
    return(x)      Just(x)   // normal return: wrap in Just
    ctl raise(msg) Nothing   // exception: return Nothing directly
  action()


fun div42()
  (raise-maybe{ safe-divide(1,0) }).default(42)
fun raise-maybetour/raise-maybe: forall<a,e> (action : () -> <raise|e> a) -> e maybe<a>( actionaction: () -> <raise|$5915> $5914 : () -> <raisetour/raise: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V )result: -> 6009 maybe<6008> : ee: E maybestd/core/types/maybe: V -> V<aa: V>
  withwith: () -> <raise|$5915> $5914 handlerhandler: (() -> <raise|$5915> $5914) -> $5915 maybe<$5914>
    returnreturn: (x : $5914) -> _5920 maybe<$5914>(xx: $5914)      Juststd/core/types/Just: forall<a> (value : a) -> maybe<a>(xx: $5914)   // normal return: wrap in Just
    ctl raiseraise: (msg : string, resume : (_5931) -> $5915 maybe<$5914>) -> $5915 maybe<$5914>
(msgmsg: string) Nothingstd/core/types/Nothing: forall<a> maybe<a>   // exception: return Nothing directly
  actionaction: () -> <raise|$5915> $5914()


fun div42tour/div42: () -> int()result: -> total int
  (raise-maybetour/raise-maybe: (action : () -> raise int) -> maybe<int>{ safe-dividetour/safe-divide: (x : int, y : int) -> raise int(1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001,0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000) }).defaultstd/core/maybe/default: (m : maybe<int>, nothing : int) -> int(42literal: int
dec  = 42
hex8 = 0x2A
bit8 = 0b00101010)

(where the body of div42tour/div42: () -> int desugars to default( raise-maybetour/raise-maybe: forall<a,e> (action : () -> <raise|e> a) -> e maybe<a>(fn(){ safe-dividetour/safe-divide: (x : int, y : int) -> raise int(1,0) }), 42 )).

A State Effect

For more examples of the use of return operations, we look at a the state effect. In its most general form it has just a set and gettour/get: forall<a> () -> (state<a>) a operation:

effect state<a>
  fun get() : a
  fun set( x : a ) : ()

fun sumdown( sum : int = 0 ) : <state<int>,div> int
  val i = get()
  if i <= 0 then sum else
    set( i - 1 )
    sumdown( sum + i )
effecttour/state: (V, E, V) -> V statetour/state: (V, E, V) -> V<aa: V>
  fun get() : aa: V
  fun set( xx: $6224 : aa: V ) : (std/core/types/unit: V)std/core/types/unit: V

fun sumdowntour/sumdown: (sum : ? int) -> <div,state<int>> int( sumsum: ? int : intstd/core/types/int: V = 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000 )result: -> <state<int>,div> int : <std/core/types/total: Estatetour/state: (V, E, V) -> V<intstd/core/types/int: V>,divstd/core/types/div: X> intstd/core/types/int: V
  val ii: int = gettour/get: () -> <state<int>,div> int()
  if ii: int <=std/core/int/(<=): (x : int, y : int) -> <div,state<int>> bool 0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000 then sumsum: int else
    settour/set: (x : int) -> <state<int>,div> ()( ii: int -std/core/int/(-): (x : int, y : int) -> <state<int>,div> int 1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001 )
    sumdowntour/sumdown: (sum : ? int) -> <div,state<int>> int( sumsum: int +std/core/int/(+): (x : int, y : int) -> <div,state<int>> int ii: int )

We can define a generic state handler most easily by using var declarations:

fun state( init : a, action : () -> <state<a>,div|e> b ) : <div|e> b
  var st := init
  with handler
    fun get()  st
    fun set(i) st := i
  action()
fun statetour/state: forall<a,b,e> (init : a, action : () -> <state<a>,div|e> b) -> <div|e> b( initinit: $6343 : aa: V, actionaction: () -> <state<$6343>,div|$6345> $6344 : () -> <statetour/state: (V, E, V) -> V<aa: V>,divstd/core/types/div: X|std/core/types/effect-extend: (X, E) -> Eee: E> bb: V )result: -> <div|6602> 6601 : <divstd/core/types/div: X|std/core/types/effect-extend: (X, E) -> Eee: E> bb: V
  var stst: local-var<$6354,$6343> := initinit: $6343
  withwith: () -> <state<$6343>,div|$6345> $6344 handlerhandler: (() -> <state<$6343>,local<$6354>,div|$6345> $6344) -> <local<$6354>,div|$6345> $6344
    fun getget: () -> <local<$6354>,div|$6345> $6343()  stst: $6343
    fun setset: (i : $6343) -> <local<$6354>,div|$6345> ()(ii: $6343) stst: local-var<$6354,$6343> :=std/core/types/local-set: (v : local-var<$6354,$6343>, assigned : $6343) -> <local<$6354>,div|$6345> () ii: $6343
  actionaction: () -> <state<$6343>,div|$6345> $6344()

where statetour/state: forall<a,b,e> (init : a, action : () -> <state<a>,div|e> b) -> <div|e> b(10){ sumdowntour/sumdown: (sum : ? int) -> <div,state<int>> int() } evaluates to 55.

3.4.6. Combining Handlers

advancedWhat makes effect handlers a good control-flow abstraction? There are three fundamental advantages with regard to other approaches:

Effect handlers can have simple (Hindley-Milner) types. This unlike shift/reset for example as that needs type rules with answer types (as the type of shift depends on the context of its matching reset).
The scope of an effect handler is delimited by the handler definition. This is just like shift/reset but unlike call/cc. Delimiting the scope of a resumption has various good properties, like efficient implementation strategies, but also that it allows for modular composition (see also Oleg Kiselyov's “against call/cc”).
Effect handlers can be composed freely. This is unlike general monads which need monad transformers to compose in particular ways. Essentially effect handlers can compose freely because every effect handler can be expressed eventually as an instance of a free monad which do compose. This also means that some monads cannot be expressed as an effect handler (namely the non-algebraic ones). A particular example of this is the continuation monad (which can express call/cc).

The Koka compiler internally uses monads and shift/reset to compile effect handlers though, and it compiles handlers into to an internal free monad based on multi-prompt delimited control [4, 25]. By inlining the monadic bind we are able to generate efficient C code that only allocates continuations in the case one is actually yielding up to a general ctl operation.

A great property of effect handlers is that they can be freely composed together. For example, suppose we have a function that calls raisetour/raise: forall<a> (msg : string) -> raise a if the state is an odd number:

fun no-odds() : <raise,state<int>> int
  val i = get()
  if i.is-odd then raise("no odds") else
    set(i / 2)
    i
fun no-oddstour/no-odds: () -> <raise,state<int>> int()result: -> <raise,state<int>> int : <std/core/types/total: Eraisetour/raise: (E, V) -> V,statetour/state: (V, E, V) -> V<intstd/core/types/int: V>> intstd/core/types/int: V
  val ii: int = gettour/get: () -> <state<int>,raise> int()
  if ii: int.is-oddstd/core/int/is-odd: (i : int) -> <raise,state<int>> bool then raisetour/raise: (msg : string) -> <raise,state<int>> int("no odds"literal: string
count= 7) else
    settour/set: (x : int) -> <state<int>,raise> ()(ii: int /std/core/int/(/): (x : int, y : int) -> <state<int>,raise> int 2literal: int
dec  = 2
hex8 = 0x02
bit8 = 0b00000010)
    ii: int

then we can compose a pstatetour/pstate: forall<a,b,e> (init : a, action : () -> <state<a>,div|e> b) -> <div|e> (b, a) and raise-maybetour/raise-maybe: forall<a,e> (action : () -> <raise|e> a) -> e maybe<a> handler together to handle the effects:

fun state-raise(init) : div (maybe<int>,int)
  with pstate(init)
  with raise-maybe
  no-odds()
fun state-raisetour/state-raise: (init : int) -> div (maybe<int>, int)(initinit: int)result: -> div (maybe<int>, int) : divstd/core/types/div: X (std/core/types/tuple2: (V, V) -> Vmaybestd/core/types/maybe: V -> V<intstd/core/types/int: V>,intstd/core/types/int: V)
  withwith: () -> <state<int>,div> maybe<int> pstatetour/pstate: (init : int, action : () -> <state<int>,div> maybe<int>) -> div (maybe<int>, int)(initinit: int)
  withwith: () -> <raise,state<int>,div> int raise-maybetour/raise-maybe: (action : () -> <raise,state<int>,div> int) -> <state<int>,div> maybe<int>
  no-oddstour/no-odds: () -> <raise,state<int>,div> int()

where both the statetour/state: (V, E, V) -> V<intstd/core/types/int: V> and raisetour/raise: (E, V) -> V effects are discharged by the respective handlers. Note the type reflects that we always return a pair with as a first element either Nothingstd/core/types/Nothing: forall<a> maybe<a> (if raisetour/raise: forall<a> (msg : string) -> raise a was called) or a Juststd/core/types/Just: forall<a> (value : a) -> maybe<a> with the final result, and as the second element the final state. This corresponds to how we usually combine state and exceptions where the state (or heap) has set to the state at the point the exception happened.

However, if we combine the handlers in the opposite order, we get a form of transactional state where we either get an exception (and no final state), or we get a pair of the result with the final state:

fun raise-state(init) : div maybe<(int,int)>
  with raise-maybe
  with pstate(init)
  no-odds()
fun raise-statetour/raise-state: (init : int) -> div maybe<(int, int)>(initinit: int)result: -> div maybe<(int, int)> : divstd/core/types/div: X maybestd/core/types/maybe: V -> V<(std/core/types/tuple2: (V, V) -> Vintstd/core/types/int: V,intstd/core/types/int: V)>
  withwith: () -> <raise,div> (int, int) raise-maybetour/raise-maybe: (action : () -> <raise,div> (int, int)) -> div maybe<(int, int)>
  withwith: () -> <state<int>,div,raise> int pstatetour/pstate: (init : int, action : () -> <state<int>,div,raise> int) -> <div,raise> (int, int)(initinit: int)
  no-oddstour/no-odds: () -> <raise,state<int>,div> int()

3.4.7. Masking Effects

Similar to masking signals in Unix, we can mask effects to not be handled by their innermost effect handler. The expression mask<eff>(action) modularly masks any effect operations in eff inside the action. For example, consider two nested handlers for the emittour/emit: (msg : string) -> emit () operation:

fun mask-emit()
  with fun emit(msg) println("outer:" ++ msg)
  with fun emit(msg) println("inner:" ++ msg)
  emit("hi")
  mask<emit>
    emit("there")
fun mask-emittour/mask-emit: () -> console ()()result: -> console ()
  withhandler: (() -> <emit,console> ()) -> console () fun emitemit: (msg : string) -> console ()(msgmsg: string) printlnstd/core/console/string/println: (s : string) -> console ()("outer:"literal: string
count= 6 ++std/core/types/(++): (x : string, y : string) -> console string msgmsg: string)
  withhandler: (() -> <emit,console,emit> ()) -> <console,emit> () fun emitemit: (msg : string) -> <console,emit> ()(msgmsg: string) printlnstd/core/console/string/println: (s : string) -> <console,emit> ()("inner:"literal: string
count= 6 ++std/core/types/(++): (x : string, y : string) -> <console,emit> string msgmsg: string)
  emittour/emit: (msg : string) -> <emit,emit,console> ()("hi"literal: string
count= 2)
  mask<emittour/emit: (E, V) -> V>
    emittour/emit: (msg : string) -> <emit,console> ()("there"literal: string
count= 5)

If we call mask-emittour/mask-emit: () -> console ()() it prints:

inner: hi
outer: there

The second call to emittour/emit: (msg : string) -> emit () is masked and therefore it skips the innermost handler and is handled subsequently by the outer handler (i.e. mask only masks an operation once for its innermost handler).

The type of mask<l> for some effect label l is (action: () -> e a) -> <l|e> a where it injects the effect l into the final effect result <l|e> (even thought the mask itself never actually performs any operation in l – it only masks any operations of l in action).

This type usually leads to duplicate effect labels, for example, the effect of mask<emittour/emit: (msg : string) -> emit ()>{ emittour/emit: (msg : string) -> emit ()("there") } is <emittour/emit: (E, V) -> V,emittour/emit: (E, V) -> V> signifying that there need to be two handlers for emittour/emit: (E, V) -> V: in this case, one to skip over, and one to subsequently handle the masked operation.

Effect Abstraction

The previous example is not very useful, but generally we can use mask to hide internal effect handling from higher-order functions. For example, consider the following function that needs to handle internal exceptions:

fun mask-print( action : () -> e int ) : e int
  with ctl raise(msg) 42
  val x = mask<raise>(action)
  if x.is-odd then raise("wrong")   // internal exception
  x
fun mask-printtour/mask-print: forall<e> (action : () -> e int) -> e int( actionaction: () -> $7603 int : () -> ee: E intstd/core/types/int: V )result: -> 7698 int : ee: E intstd/core/types/int: V
  withhandler: (() -> <raise|$7603> int) -> $7603 int ctl raiseraise: (msg : string, resume : (_7609) -> $7603 int) -> $7603 int
(msgmsg: string) 42literal: int
dec  = 42
hex8 = 0x2A
bit8 = 0b00101010
  val xx: int = mask<raisetour/raise: (E, V) -> V>(actionaction: () -> $7603 int)
  if xx: int.is-oddstd/core/int/is-odd: (i : int) -> <raise|$7603> bool then raisetour/raise: (msg : string) -> <raise|$7603> ()("wrong"literal: string
count= 5)std/core/types/Unit: ()   // internal exception
  xx: int

Here the type of mask-printtour/mask-print: forall<e> (action : () -> e int) -> e int does not expose at all that we handle the raisetour/raise: (E, V) -> V effect internally for specific code and it is fully abstract – even if the action itself would call raisetour/raise: forall<a> (msg : string) -> raise a, it would neatly skip the internal handler due to the mask<raisetour/raise: forall<a> (msg : string) -> raise a> expression.

If we would leave out the mask, and call action() directly, then the inferred type of action would be () -> <raisetour/raise: (E, V) -> V|e> intstd/core/types/int: V instead, showing that the raisetour/raise: (E, V) -> V effect would be handled. Note that this is usually the desired behaviour since in the majority of cases we want to handle the effects in a particular way when defining handler abstractions. The cases where mask is needed are much less common in our experience.

advanced

State as a Combined Effect

Another nice use-case for mask occurs when modeling state directly using effect handlers without using mutable local variables [1]. We can do this using two separate operations peektour/peek: forall<a> -> (peek<a>) a and poketour/poke: forall<a> (x : a) -> (poke<a>) ():

effect<a> val peek : a             // get the state
effect<a> ctl poke( x : a ) : ()   // set the state to x
effecttour/peek: (V, E, V) -> V<aa: V> val peektour/peek: (V, E, V) -> V : aa: V             // get the state
effecttour/poke: (V, E, V) -> V<aa: V> ctl poketour/poke: (V, E, V) -> V( xx: $7966 : aa: V ) : (std/core/types/unit: V)std/core/types/unit: V   // set the state to x

We can now define a generic state handler as:

fun ppstate( init : a, action : () -> <peek<a>,poke<a>|e> b ) : e b
  with val peek = init
  with ctl poke(x)
    mask<peek>
      with val peek = x
      resume(())
  action()
fun ppstatetour/ppstate: forall<a,b,e> (init : a, action : () -> <peek<a>,poke<a>|e> b) -> e b( initinit: $8019 : aa: V, actionaction: () -> <peek<$8019>,poke<$8019>|$8021> $8020 : () -> <peektour/peek: (V, E, V) -> V<aa: V>,poketour/poke: (V, E, V) -> V<aa: V>|std/core/types/effect-extend: (X, E) -> Eee: E> bb: V )result: -> 8249 8248 : ee: E bb: V
  withhandler: (() -> <peek<$8019>|$8021> $8020) -> $8021 $8020 val peekpeek: () -> $8021 $8019 = initinit: $8019
  withhandler: (() -> <poke<$8019>,peek<$8019>|$8021> $8020) -> <peek<$8019>|$8021> $8020 ctl pokepoke: (x : $8019, resume : (()) -> <peek<$8019>|$8021> $8020) -> <peek<$8019>|$8021> $8020
(xx: $8019)
    mask<peek_1: V>
      withhandler: (() -> <peek<$8019>|$8021> $8020) -> $8021 $8020 val peekpeek: () -> $8021 $8019 = xx: $8019
      resumeresume: (()) -> <peek<$8019>|$8021> $8020((std/core/types/Unit: ())std/core/types/Unit: ())
  actionaction: () -> <peek<$8019>,poke<$8019>|$8021> $8020()

In the handler for poketour/poke: forall<a> (x : a) -> (poke<a>) () we resume under a fresh handler for peektour/peek: forall<a> -> (peek<a>) a that is bound to the new state. This means though there will be an ever increasing “stack” of handlers for peektour/peek: forall<a> -> (peek<a>) a. To keep the type from growing infinitely, we need to mask out any potential operation to a previous handler of peektour/peek: forall<a> -> (peek<a>) a which is why the mask is needed. (Another way of looking at this is to just follow the typing: action has a peektour/peek: (V, E, V) -> V effect, and unifies with the effect of the poketour/poke: forall<a> (x : a) -> (poke<a>) () operation definition. Since it handles its own peektour/peek: (V, E, V) -> V effect, it needs to be injected back in with a mask.)

(Note: since the handler stack grows indefinitely on every poketour/poke: forall<a> (x : a) -> (poke<a>) () this example is mostly of theoretical interest. However, we are looking into a stack smashing technique where we detect at runtime that a mask can discard a handler frame from the stack.)

3.4.8. Overriding Handlers

A common use for masking is to override handlers. For example, consider overriding the behavour of emittour/emit: (msg : string) -> emit ():

fun emit-quoted1( action : () -> <emit,emit|e> a ) : <emit|e> a
  with fun emit(msg) emit("\"" ++ msg ++ "\"")
  action()
fun emit-quoted1tour/emit-quoted1: forall<a,e> (action : () -> <emit,emit|e> a) -> <emit|e> a( actionaction: () -> <emit,emit|$8260> $8259 : () -> <emittour/emit: (E, V) -> V,emittour/emit: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V )result: -> <emit|8352> 8351 : <emittour/emit: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V
  withhandler: (() -> <emit,emit|$8260> $8259) -> <emit|$8260> $8259 fun emitemit: (msg : string) -> <emit|$8260> ()(msgmsg: string) emittour/emit: (msg : string) -> <emit|$8260> ()("\""literal: string
count= 1 ++std/core/types/(++): (x : string, y : string) -> <emit|$8260> string msgmsg: string ++std/core/types/(++): (x : string, y : string) -> <emit|$8260> string "\""literal: string
count= 1)
  actionaction: () -> <emit,emit|$8260> $8259()

Here, the handler for emittour/emit: (msg : string) -> emit () calls itself emittour/emit: (msg : string) -> emit () to actually emit the newly quoted string. The effect type inferred for emit-quoted1tour/emit-quoted1: forall<a,e> (action : () -> <emit,emit|e> a) -> <emit|e> a is (action : () -> <emittour/emit: (E, V) -> V,emittour/emit: (E, V) -> V|e> a) -> <emittour/emit: (E, V) -> V|e> a. This is not the nicest type as it exposes that action is evaluated under (at least) two emittour/emit: (E, V) -> V handlers (and someone could use mask inside action to use the outer emittour/emit: (E, V) -> V handler).

The override keyword keeps the type nice and fully overrides the previous handler which is no longer accessible from action:

fun emit-quoted2( action : () -> <emit|e> a ) : <emit|e> a
  with override fun emit(msg) emit("\"" ++ msg ++ "\"" )
  action()
fun emit-quoted2tour/emit-quoted2: forall<a,e> (action : () -> <emit|e> a) -> <emit|e> a( actionaction: () -> <emit|$8360> $8359 : () -> <emittour/emit: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V )result: -> <emit|8466> 8465 : <emittour/emit: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V
  withhandler: (() -> <emit,emit|$8360> $8359) -> <emit|$8360> $8359 override fun emitemit: (msg : string) -> <emit|$8360> ()(msgmsg: string) emittour/emit: (msg : string) -> <emit|$8360> ()("\""literal: string
count= 1 ++std/core/types/(++): (x : string, y : string) -> <emit|$8360> string msgmsg: string ++std/core/types/(++): (x : string, y : string) -> <emit|$8360> string "\""literal: string
count= 1 )
  actionaction: () -> <emit|$8360> $8359()

This of course applies to any handler or value, for example, to temporarily increase the widthtour/width: -> width int while pretty printing, we can override the widthtour/width: -> width int as:

fun extra-wide( action )
  with override val width = 2*width
  action()
fun extra-widetour/extra-wide: forall<a,e> (action : () -> <width|e> a) -> <width|e> a( actionaction: () -> <width|_8488> _8475 )result: -> <width|18053> 18040
  withhandler: (() -> <width,width|_8488> _8475) -> <width|_8488> _8475 override val widthwidth: () -> <width|_8488> int = 2literal: int
dec  = 2
hex8 = 0x02
bit8 = 0b00000010*std/core/int/(*): (int, int) -> <width|_8488> intwidthtour/width: -> width int
  actionaction: () -> <width|_8488> _8475()

advanced

Mask Behind

Unfortunately, we cannot modularly define overriding with just mask; if we add mask outside of the emittour/emit: (msg : string) -> emit () handler, the emittour/emit: (msg : string) -> emit () call inside the operation definition would get masked and skip our intended handler. On the other hand, if we add mask just over action all its emittour/emit: (msg : string) -> emit () calls would be masked for our intended handler!

For this situation, there is another primitive that only “masks the masks”. The expression mask behind<eff> has type (() -> <eff|e> a) -> <eff,eff|e> a and only masks any masked operations but not the direct ones. The override keyword is defined in terms of this primitive:

translation

with override handler<eff> { <ops> }
<body>
with override handler<eff> { <ops> }
<body>

(handler<eff> { <ops> })(mask behind<eff>{ <body> })
(handler<eff> { <ops> })(mask behind<eff>{ <body> })

This ensures any operation calls in <body> go the newly defined handler while any masked operations are masked one more level and skip both of the two innermost handlers.

3.4.9. Side-effect Isolation

3.4.10. Resuming more than once

Since resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b is a first-class function (well, almost, see raw control), it is possible to store it in a list for example to implement a scheduler, but it is also possible to invoke it more than once. This can be used to implement backtracking or probabilistic programming models.

A common example of multiple resumptions is the choicetour/choice: (E, V) -> V effect:

effect ctl choice() : bool

fun xor() : choice bool
  val p = choice()
  val q = choice()
  if p then !q else q
effecttour/choice: (E, V) -> V ctl choicetour/choice: (E, V) -> V() : boolstd/core/types/bool: V

fun xortour/xor: () -> choice bool()result: -> choice bool : choicetour/choice: (E, V) -> V boolstd/core/types/bool: V
  val pp: bool = choicetour/choice: () -> choice bool()
  val qq: bool = choicetour/choice: () -> choice bool()
  if pp: bool then !std/core/types/bool/(!): (b : bool) -> choice boolqq: bool else qq: bool

One possible implementation just uses random numbers:

fun choice-random(action : () -> <choice,random|e> a) : <random|e> a
  with fun choice() random-bool()
  action()
fun choice-randomtour/choice-random: forall<a,e> (action : () -> <choice,random|e> a) -> <random|e> a(actionaction: () -> <choice,random|$8712> $8711 : () -> <choicetour/choice: (E, V) -> V,randomstd/num/random/random: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V)result: -> <random|8776> 8775 : <randomstd/num/random/random: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V
  withhandler: (() -> <choice,random|$8712> $8711) -> <random|$8712> $8711 fun choicechoice: () -> <random|$8712> bool() random-boolstd/num/random/random-bool: () -> <random|$8712> bool()
  actionaction: () -> <choice,random|$8712> $8711()

Where choice-randomtour/choice-random: forall<a,e> (action : () -> <choice,random|e> a) -> <random|e> a(xortour/xor: () -> choice bool) returns Truestd/core/types/True: bool and Falsestd/core/types/False: bool at random.

However, we can also resume multiple times, once with Falsestd/core/types/False: bool and once with Truestd/core/types/True: bool, to return all possible outcomes. This also changes the handler type to return a list of all results of the action, and we need a return clause to wrap the result of the action in a singleton list:

fun choice-all(action : () -> <choice|e> a) : e list<a>
  with handler
    return(x)    [x]
    ctl choice() resume(False) ++ resume(True)
  action()
fun choice-alltour/choice-all: forall<a,e> (action : () -> <choice|e> a) -> e list<a>(actionaction: () -> <choice|$8784> $8783 : () -> <choicetour/choice: (E, V) -> V|std/core/types/effect-extend: (X, E) -> Eee: E> aa: V)result: -> 8883 list<8882> : ee: E liststd/core/types/list: V -> V<aa: V>
  withwith: () -> <choice|$8784> $8783 handlerhandler: (() -> <choice|$8784> $8783) -> $8784 list<$8783>
    returnreturn: (x : $8783) -> _8787 list<$8783>(xx: $8783)    [std/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>xx: $8783]std/core/types/Nil: forall<a> list<a>
    ctl choicechoice: (resume : (bool) -> $8784 list<$8783>) -> $8784 list<$8783>
() resumeresume: (bool) -> $8784 list<$8783>(Falsestd/core/types/False: bool) ++std/core/list/(++): (xs : list<$8783>, ys : list<$8783>) -> $8784 list<$8783> resumeresume: (bool) -> $8784 list<$8783>(Truestd/core/types/True: bool)
  actionaction: () -> <choice|$8784> $8783()

where choice-alltour/choice-all: forall<a,e> (action : () -> <choice|e> a) -> e list<a>(xortour/xor: () -> choice bool) returns [Falsestd/core/types/False: bool,Truestd/core/types/True: bool,Truestd/core/types/True: bool,Falsestd/core/types/False: bool].

Resuming more than once interacts in interesting ways with the state effect. Consider the following example that uses both choicetour/choice: (E, V) -> V and statetour/state: (V, E, V) -> V:

fun surprising() : <choice,state<int>> bool
  val p = choice()
  val i = get()
  set(i+1)
  if i>0 && p then xor() else False
fun surprisingtour/surprising: () -> <choice,state<int>> bool()result: -> <choice,state<int>> bool : <std/core/types/total: Echoicetour/choice: (E, V) -> V,statetour/state: (V, E, V) -> V<intstd/core/types/int: V>> boolstd/core/types/bool: V
  val pp: bool = choicetour/choice: () -> <choice,state<int>> bool()
  val ii: int = gettour/get: () -> <state<int>,choice> int()
  settour/set: (x : int) -> <state<int>,choice> ()(ii: int+std/core/int/(+): (x : int, y : int) -> <state<int>,choice> int1literal: int
dec  = 1
hex8 = 0x01
bit8 = 0b00000001)
  if ii: int>std/core/int/(>): (x : int, y : int) -> <choice,state<int>> bool0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000 &&std/core/types/(&&): (x : bool, y : bool) -> <choice,state<int>> bool pp: bool then xortour/xor: () -> <choice,state<int>> bool() else Falsestd/core/types/False: bool

We can combine the handlers in two interesting ways:

fun state-choice() : div (list<bool>,int)
  pstate(0)
    choice-all(surprising)

fun choice-state() : div list<(bool,int)>
  choice-all
    pstate(0,surprising)
fun state-choicetour/state-choice: () -> div (list<bool>, int)()result: -> div (list<bool>, int) : divstd/core/types/div: X (std/core/types/tuple2: (V, V) -> Vliststd/core/types/list: V -> V<boolstd/core/types/bool: V>,intstd/core/types/int: V)
  pstatetour/pstate: (init : int, action : () -> <state<int>,div> list<bool>) -> div (list<bool>, int)(0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000)
    choice-alltour/choice-all: (action : () -> <choice,state<int>,div> bool) -> <state<int>,div> list<bool>(surprisingtour/surprising: () -> <choice,state<int>,div> bool)

fun choice-statetour/choice-state: () -> div list<(bool, int)>()result: -> div list<(bool, int)> : divstd/core/types/div: X liststd/core/types/list: V -> V<(std/core/types/tuple2: (V, V) -> Vboolstd/core/types/bool: V,intstd/core/types/int: V)>
  choice-alltour/choice-all: (action : () -> <choice,div> (bool, int)) -> div list<(bool, int)>
    pstatetour/pstate: (init : int, action : () -> <state<int>,div,choice> bool) -> <div,choice> (bool, int)(0literal: int
dec  = 0
hex8 = 0x00
bit8 = 0b00000000,surprisingtour/surprising: () -> <choice,state<int>,div> bool)

In state-choicetour/state-choice: () -> div (list<bool>, int)() the pstatetour/pstate: forall<a,b,e> (init : a, action : () -> <state<a>,div|e> b) -> <div|e> (b, a) is the outer handler and becomes like a global state over all resumption strands in choice-alltour/choice-all: forall<a,e> (action : () -> <choice|e> a) -> e list<a>, and thus after the first resume the i>0 (&&)std/core/types/(&&): (x : bool, y : bool) -> bool p condition in surprisingtour/surprising: () -> <choice,state<int>> bool is Truestd/core/types/True: bool, and we get ([Falsestd/core/types/False: bool,Falsestd/core/types/False: bool,Truestd/core/types/True: bool,Truestd/core/types/True: bool,Falsestd/core/types/False: bool],2).

In choice-statetour/choice-state: () -> div list<(bool, int)>() the pstatetour/pstate: forall<a,b,e> (init : a, action : () -> <state<a>,div|e> b) -> <div|e> (b, a) is the inner handler and becomes like transactional state, where the state becomes local to each resumption strand in choice-alltour/choice-all: forall<a,e> (action : () -> <choice|e> a) -> e list<a>. Now i is always 0 at first and thus we get [(Falsestd/core/types/False: bool,1),(Falsestd/core/types/False: bool,1)].

advancedThis example also shows how var state is correctly saved and restored on resumptions (as part of the stack) and this is essential to the correct composition of effect handlers. If var declarations were instead heap allocated or captured by reference, they would no longer be local to their scope and side effects could “leak” across different resumptions.

3.4.11. Initially and Finally

With arbitrary effect handlers we need to be careful when interacting with external resources like files. Generally, operations can never resume (like exceptions), resume exactly once (giving the usual linear control flow), or resume more than once. To robustly handle these different cases, Koka provides the finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a and initiallystd/core/hnd/initially: forall<a,e> (init : (int) -> e (), action : () -> e a) -> e a functions. Suppose we have the following low-level file operations on file handles:

type fhandle
fun fopen( path : string )   : <exn,filesys> fhandle
fun hreadline( h : fhandle ) : <exn,filesys> string
fun hclose( h : fhandle )    : <exn,filesys> ()
type fhandle
fun fopen( path : stringstd/core/types/string: V )   : <exnstd/core/exn/exn: (E, V) -> V,filesys> fhandle
fun hreadline( h : fhandle ) : <exnstd/core/exn/exn: (E, V) -> V,filesys> stringstd/core/types/string: V
fun hclose( h : fhandle )    : <exnstd/core/exn/exn: (E, V) -> V,filesys> ()

Using these primitives, we can declare a fread effect to read from a file:

effect fun fread() : string

fun with-file( path : string, action : () -> <fread,exn,filesys|e> a ) : <exn,filesys|e> a
  val h = fopen(path)
  with handler
    return(x)   { hclose(h); x }
    fun fread() hreadline(h)
  action()
effect fun fread() : stringstd/core/types/string: V

fun with-file( path : stringstd/core/types/string: V, action : () -> <fread,exnstd/core/exn/exn: (E, V) -> V,filesys|e> a ) : <exnstd/core/exn/exn: (E, V) -> V,filesys|e> a
  val h = fopen(path)
  with handler
    return(x)   { hclose(h); x }
    fun fread() hreadline(h)
  action()

However, as it stands it would fail to close the file handle if an exceptional effect inside action is used (i.e. or any operation that never resumes). The finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a function handles these situations, and takes as its first argument a function that is always executed when either returning normally, or when unwinding for a non-resuming operation. So, a more robust way to write with-file is:

fun with-file( path : string, action : () -> <fread,exn,filesys|e> a ) : <exn,filesys|e> a
  val h = fopen(path)
  with finally
    hclose(h)
  with fun fread()
    hreadline(h)
  action()
fun with-file( path : stringstd/core/types/string: V, action : () -> <fread,exnstd/core/exn/exn: (E, V) -> V,filesys|e> a ) : <exnstd/core/exn/exn: (E, V) -> V,filesys|e> a
  val h = fopen(path)
  with finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a
    hclose(h)
  with fun fread()
    hreadline(h)
  action()

The current definition is robust for operations that never resume, or operations that resume once – but there is still trouble when resuming more than once. If someone calls choicetour/choice: () -> choice bool inside the action, the second time it resumes the file handle will be closed again which is probably not intended. There is active research into using the type system to statically prevent this from happening.

Another way to work with multiple resumptions is to use the initiallystd/core/hnd/initially: forall<a,e> (init : (int) -> e (), action : () -> e a) -> e a function. This function takes 2 arguments: the first argument is a function that is called the first time initiallystd/core/hnd/initially: forall<a,e> (init : (int) -> e (), action : () -> e a) -> e a is evaluated, and subsequently every time a particular resumption is resumed more than once.

Raw Control

advancedUse raw ctl for raw control operations which do not automatically finalize. With raw ctl one can use the implicitly bound resumption context rcontext to either resume (as rcontext.resumestd/core/hnd/resume: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : a) -> e b(x)), or to finalize a resumption (as rcontext.finalizestd/core/hnd/finalize: forall<a,e,e1,b> (r : resume-context<a,e,e1,b>, x : b) -> e b) which runs all finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a handlers to clean up resources. This allows one to store an rcontext as a first class value to resume or finalize later even from a different scope. Of course, it needs to be used with care since it is now the programmers' resposibility to ensure the resumption is eventually resumed or finalized (such that any resources can be released).

3.4.12. Linear Effects

Todo. Use linear effect to declare effects whose operations are always tail-resumptive and use only linear effects themselves (and thus resume exactly once). This removes monadic translation for such effects and can make code that uses only linear effects more compact and efficient.

3.4.13. Named and Scoped Handlers

Todo. See named-handlers.

3.5. FBIP: Functional but In-Place

With Perceus reuse analysis we can write algorithms that dynamically adapt to use in-place mutation when possible (and use copying when used persistently). Importantly, you can rely on this optimization happening, e.g. see the match patterns and pair them to same-sized constructors in each branch.

This style of programming leads to a new paradigm that we call FBIP: “functional but in place”. Just like tail-call optimization lets us describe loops in terms of regular function calls, reuse analysis lets us describe in-place mutating imperative algorithms in a purely functional way (and get persistence as well).

Note. FBIP is still active research. In particular we'd like to add ways to add annotations to ensure reuse is taking place.

3.5.1. Tree Rebalancing

As an example, we consider insertion into a red-black tree [3]. A polymorphic version of this example is part of the samples directory when you have installed Koka and can be loaded as :l samples/basic/rbtree. We define red-black trees as:

type color
  Red
  Black

type tree
  Leaf
  Node(color: color, left: tree, key: int, value: bool, right: tree)
type color
  Red
  Black

type treetour/tree: V
  Leaf
  Node(color: color, left: treetour/tree: V, key: intstd/core/types/int: V, value: boolstd/core/types/bool: V, right: treetour/tree: V)

The red-black tree has the invariant that the number of black nodes from the root to any of the leaves is the same, and that a red node is never a parent of red node. Together this ensures that the trees are always balanced. When inserting nodes, the invariants need to be maintained by rebalancing the nodes when needed. Okasaki's algorithm [19] implements this elegantly and functionally:

fun balance-left( l : tree, k : int, v : bool, r : tree ): tree
  match l
    Node(_, Node(Red, lx, kx, vx, rx), ky, vy, ry)
      -> Node(Red, Node(Black, lx, kx, vx, rx), ky, vy, Node(Black, ry, k, v, r))
    ...

fun ins( t : tree, k : int, v : bool ): tree
  match t
    Leaf -> Node(Red, Leaf, k, v, Leaf)
    Node(Red, l, kx, vx, r)
      -> if k < kx then Node(Red, ins(l, k, v), kx, vx, r)
         ...
    Node(Black, l, kx, vx, r)
      -> if k < kx && is-red(l) then balance-left(ins(l,k,v), kx, vx, r)
         ...
fun balance-left( l : treetour/tree: V, k : intstd/core/types/int: V, v : boolstd/core/types/bool: V, r : treetour/tree: V ): treetour/tree: V
  match l
    Node(_, Node(Red, lx, kx, vx, rx), ky, vy, ry)
      -> Node(Red, Node(Black, lx, kx, vx, rx), ky, vy, Node(Black, ry, k, v, r))
    ...

fun ins( t : treetour/tree: V, k : intstd/core/types/int: V, v : boolstd/core/types/bool: V ): treetour/tree: V
  match t
    Leaf -> Node(Red, Leaf, k, v, Leaf)
    Node(Red, l, kx, vx, r)
      -> if k < kx then Node(Red, ins(l, k, v), kx, vx, r)
         ...
    Node(Black, l, kx, vx, r)
      -> if k < kx (&&)std/core/types/(&&): (x : bool, y : bool) -> bool is-red(l) then balance-left(ins(l,k,v), kx, vx, r)
         ...

The Koka compiler will inline the balance-left function. At that point, every matched Node constructor in the ins function has a corresponding Node allocation – if we consider all branches we can see that we either match one Node and allocate one, or we match three nodes deep and allocate three. Every Node is actually reused in the fast path without doing any allocations! When studying the generated code, we can see the Perceus assigns the fields in the nodes in the fast path in-place much like the usual non-persistent rebalancing algorithm in C would do.

Essentially this means that for a unique tree, the purely functional algorithm above adapts at runtime to an in-place mutating re-balancing algorithm (without any further allocation). Moreover, if we use the tree persistently [20], and the tree is shared or has shared parts, the algorithm adapts to copying exactly the shared spine of the tree (and no more), while still rebalancing in place for any unshared parts.

3.5.2. Morris Traversal

As another example of FBIP, consider mapping a function f over all elements in a binary tree in-order as shown in the tmap-inordertour/tmap-inorder: (t : tree, f : (int) -> int) -> tree example:

type tree
  Tip
  Bin( left: tree, value : int, right: tree )

fun tmap-inorder( t : tree, f : int -> int ) : tree
  match t
    Bin(l,x,r) -> Bin( l.tmap-inorder(f), f(x), r.tmap-inorder(f) )
    Tip        -> Tip
type treetour/tree: V
  Tiptour/Tip: tree
  Bintour/Bin: (left : tree, value : int, right : tree) -> tree( left: treetour/tree: V, value : intstd/core/types/int: V, right: treetour/tree: V )

fun tmap-inordertour/tmap-inorder: (t : tree, f : (int) -> int) -> tree( tt: tree : treetour/tree: V, ff: (int) -> int : intstd/core/types/int: V -> intstd/core/types/int: V )result: -> total tree : treetour/tree: V
  match tt: tree
    Bintour/Bin: (left : tree, value : int, right : tree) -> tree(ll: tree,xx: int,rr: tree) -> Bintour/Bin: (left : tree, value : int, right : tree) -> tree( ll: tree.tmap-inordertour/tmap-inorder: (t : tree, f : (int) -> int) -> tree(ff: (int) -> int), ff: (int) -> int(xx: int), rr: tree.tmap-inordertour/tmap-inorder: (t : tree, f : (int) -> int) -> tree(ff: (int) -> int) )
    Tiptour/Tip: tree        -> Tiptour/Tip: tree

This is already quite efficient as all the Bintour/Bin: (left : tree, value : int, right : tree) -> tree and Tiptour/Tip: tree nodes are reused in-place when t is unique. However, the tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree function is not tail-recursive and thus uses as much stack space as the depth of the tree.

void inorder( tree* root, void (*f)(tree* t) ) {
  tree* cursor = root;
  while (cursor != NULL /* Tip */) {
    if (cursor->left == NULL) {
      // no left tree, go down the right
      f(cursor->value);
      cursor = cursor->right;
    } else {
      // has a left tree
      tree* pre = cursor->left;  // find the predecessor
      while(pre->right != NULL && pre->right != cursor) {
        pre = pre->right;
      }
      if (pre->right == NULL) {
        // first visit, remember to visit right tree
        pre->right = cursor;
        cursor = cursor->left;
      } else {
        // already set, restore
        f(cursor->value);
        pre->right = NULL;
        cursor = cursor->right;
      }
    }
  }
}

In 1968, Knuth posed the problem of visiting a tree in-order while using no extra stack- or heap space [7] (For readers not familiar with the problem it might be fun to try this in your favorite imperative language first and see that it is not easy to do). Since then, numerous solutions have appeared in the literature. A particularly elegant solution was proposed by Morris [18]. This is an in-place mutating algorithm that swaps pointers in the tree to “remember” which parts are unvisited. It is beyond this tutorial to give a full explanation, but a C implementation is shown here on the side. The traversal essentially uses a right-threaded tree to keep track of which nodes to visit. The algorithm is subtle, though. Since it transforms the tree into an intermediate graph, we need to state invariants over the so-called Morris loops [16] to prove its correctness.

We can derive a functional and more intuitive solution using the FBIP technique. We start by defining an explicit visitor data structure that keeps track of which parts of the tree we still need to visit. In Koka we define this data type as visitortour/visitor: V:

type visitor
  Done
  BinR( right:tree, value : int, visit : visitor )
  BinL( left:tree, value : int, visit : visitor )
type visitortour/visitor: V
  Donetour/Done: visitor
  BinRtour/BinR: (right : tree, value : int, visit : visitor) -> visitor( right:treetour/tree: V, value : intstd/core/types/int: V, visit : visitortour/visitor: V )
  BinLtour/BinL: (left : tree, value : int, visit : visitor) -> visitor( left:treetour/tree: V, value : intstd/core/types/int: V, visit : visitortour/visitor: V )

(As an aside, Conor McBride [17] describes how we can generically derive a zipper [6] visitor for any recursive type as a list of the derivative of that type, namely . In our case, the algebraic representation of the inductive treetour/tree: V type is . Calculating the derivative and by further simplification, we get , which corresponds exactly to our visitortour/visitor: V data type.)

We also keep track of which directiontour/direction: V in the tree we are going, either Uptour/Up: direction or Downtour/Down: direction the tree.

type direction
  Up
  Down
type directiontour/direction: V
  Uptour/Up: direction
  Downtour/Down: direction

We start our traversal by going downward into the tree with an empty visitor, expressed as tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree(f, t, Donetour/Done: visitor, Downtour/Down: direction):

fun tmap( f : int -> int, t : tree, visit : visitor, d : direction )
  match d
    Down -> match t     // going down a left spine
      Bin(l,x,r) -> tmap(f,l,BinR(r,x,visit),Down) // A
      Tip        -> tmap(f,Tip,visit,Up)           // B
    Up -> match visit   // go up through the visitor
      Done        -> t                             // C
      BinR(r,x,v) -> tmap(f,r,BinL(t,f(x),v),Down) // D
      BinL(l,x,v) -> tmap(f,Bin(l,x,t),v,Up)       // E
fun tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree( ff: (int) -> int : intstd/core/types/int: V -> intstd/core/types/int: V, tt: tree : treetour/tree: V, visitvisit: visitor : visitortour/visitor: V, dd: direction : directiontour/direction: V )result: -> div tree
  match dd: direction
    Downtour/Down: direction -> match tt: tree     // going down a left spine
      Bintour/Bin: (left : tree, value : int, right : tree) -> tree(ll: tree,xx: int,rr: tree) -> tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree(ff: (int) -> int,ll: tree,BinRtour/BinR: (right : tree, value : int, visit : visitor) -> visitor(rr: tree,xx: int,visitvisit: visitor),Downtour/Down: direction) // A
      Tiptour/Tip: tree        -> tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree(ff: (int) -> int,Tiptour/Tip: tree,visitvisit: visitor,Uptour/Up: direction)           // B
    Uptour/Up: direction -> match visitvisit: visitor   // go up through the visitor
      Donetour/Done: visitor        -> tt: tree                             // C
      BinRtour/BinR: (right : tree, value : int, visit : visitor) -> visitor(rr: tree,xx: int,vv: visitor) -> tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree(ff: (int) -> int,rr: tree,BinLtour/BinL: (left : tree, value : int, visit : visitor) -> visitor(tt: tree,ff: (int) -> div int(xx: int),vv: visitor),Downtour/Down: direction) // D
      BinLtour/BinL: (left : tree, value : int, visit : visitor) -> visitor(ll: tree,xx: int,vv: visitor) -> tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree(ff: (int) -> int,Bintour/Bin: (left : tree, value : int, right : tree) -> tree(ll: tree,xx: int,tt: tree),vv: visitor,Uptour/Up: direction)       // E

The key idea is that we are either Donetour/Done: visitor (C), or, on going downward in a left spine we remember all the right trees we still need to visit in a BinRtour/BinR: (right : tree, value : int, visit : visitor) -> visitor (A) or, going upward again (B), we remember the left tree that we just constructed as a BinLtour/BinL: (left : tree, value : int, visit : visitor) -> visitor while visiting right trees (D). When we come back up (E), we restore the original tree with the result values. Note that we apply the function f to the saved value in branch D (as we visit in-order), but the functional implementation makes it easy to specify a pre-order traversal by applying f in branch A, or a post-order traversal by applying f in branch E.

Looking at each branch we can see that each Bintour/Bin: (left : tree, value : int, right : tree) -> tree matches up with a BinRtour/BinR: (right : tree, value : int, visit : visitor) -> visitor, each BinRtour/BinR: (right : tree, value : int, visit : visitor) -> visitor with a BinLtour/BinL: (left : tree, value : int, visit : visitor) -> visitor, and finally each BinLtour/BinL: (left : tree, value : int, visit : visitor) -> visitor with a Bintour/Bin: (left : tree, value : int, right : tree) -> tree. Since they all have the same size, if the tree is unique, each branch updates the tree nodes in-place at runtime without any allocation, where the visitortour/visitor: V structure is effectively overlaid over the tree nodes while traversing the tree. Since all tmaptour/tmap: (f : (int) -> int, t : tree, visit : visitor, d : direction) -> div tree calls are tail calls, this also compiles to a tight loop and thus needs no extra stack- or heap space.

Finally, just like with re-balancing tree insertion, the algorithm as specified is still purely functional: it uses in-place updating when a unique tree is passed, but it also adapts gracefully to the persistent case where the input tree is shared, or where parts of the input tree are shared, making a single copy of those parts of the tree.

Read the Perceus technical report

4. Koka language specification

This is the draft language specification of the Koka language, version v3.1.2
Currently only the lexical and context-free grammar are specified. The standard libraries are documented separately.

4.1. Lexical syntax

We define the grammar and lexical syntax of the language using standard BNF notation where non-terminals are generated by alternative patterns:

nonterm

::=

pattern₁ | pattern₂

In the patterns, we use the following notations:

`terminal`		A terminal symbol (in ASCII)
`x1B`		A character with hexadecimal code 1B
`A..F`		The characters from “A” to “F” (using ASCII, i.e. `x61..x66`)

( pattern )		Grouping
[ pattern ]		Optional occurrence of pattern
{ pattern }		Zero or more occurrences of pattern
{ pattern }_n		Exactly n occurrences of pattern
pattern₁ \| pattern₂		Choice: either pattern₁ or pattern₂

pattern_<!diff>		Difference: elements generated by pattern except those in diff
nonterm_[lex]		Generate nonterm by drawing lexemes from lex

Care must be taken to distinguish meta-syntax such as | and ) from concrete terminal symbols as | and ). In the specification the order of the productions is not important and at each point the longest matching lexeme is preferred. For example, even though fun is a reserved word, the word functional is considered a single identifier.

4.1.1. Source code

Source code consists of a sequence of unicode characters. Valid characters in actual program code consist strictly of ASCII characters which range from 0 to 127. Only comments, string literals, and character literals are allowed to contain extended unicode characters. The grammar is designed such that a lexical analyzer and parser can directly work on UTF-8 encoded source files without actually doing UTF-8 decoding or unicode category identification.

4.2. Lexical grammar

In the specification of the lexical grammar all white space is explicit and there is no implicit white space between juxtaposed symbols. The lexical token stream is generated by the non-terminal lex which consists of lexemes and whitespace.

Before doing lexical analysis, there is a linefeed character inserted at the start and end of the input, which makes it easier to specify line comments and directives.

4.2.1. Lexical tokens

lex	::=	lexeme \| whitespace
lexeme	::=	conid \| qconid
	\|	varid \| qvarid
	\|	op \| opid \| qopid \| wildcard
	\|	integer \| float \| stringlit \| charlit
	\|	reserved \| opreserved
	\|	special

The main program consists of whitespace or lexeme's. The context-free grammar will draw it's lexemes from the lex production.

4.2.2. Identifiers

anyid	::=	varid \| qvarid \| opid \| qopid \| conid \| qconid

qconid	::=	modulepath conid
qvarid	::=	modulepath lowerid
modulepath	::=	lowerid `(/)std/core/int/(/): (x : int, y : int) -> int` { lowerid `(/)std/core/int/(/): (x : int, y : int) -> int` }

conid	::=	upperid
varid	::=	lowerid_<!reserved>

lowerid	::=	lower idtail
upperid	::=	upper idtail
wildcard	::=	`_` idtail
typevarid	::=	letter { digit }

idtail	::=	{ idchar } [ idfinal ]
idchar	::=	letter \| digit \| `_` \| `-`
idfinal	::=	{ `'` }

reserved	::=	`infix` \| `infixr` \| `infixl`
	\|	`module` \| `import` \| `as`
	\|	`pub` \| `abstract`
	\|	`type` \| `struct` \| `alias` \| `effect` \| `con`
	\|	`forall` \| `exists` \| `some`
	\|	`fun` \| `fn` \| `val` \| `var` \| `extern`
	\|	`if` \| `then` \| `else` \| `elif`
	\|	`match` \| `return` \| `with` \| `in`
	\|	`handle` \| `handler` \| `mask`
	\|	`ctl` \| `final` \| `raw`
	\|	`override` \| `named`
	\|	`interface` \| `break` \| `continue` \| `unsafe`	(future reserved words)

specialid	::=	`co` \| `rec` \| `open` \| `extendstd/core/sslice/extend: (slice : sslice, count : int) -> sslice` \| `behind`
	\|	`linear` \| `value` \| `reference`
	\|	`inline` \| `noinline` \| `initiallystd/core/hnd/initially: forall<a,e> (init : (int) -> e (), action : () -> e a) -> e a` \| `finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a`
	\|	`js` \| `c` \| `cs` \| `file`

Identifiers always start with a letter, may contain underscores and dashes, and can end with prime ticks. Like in Haskell, constructors always begin with an uppercase letter while regular identifiers are lowercase. The rationale is to visibly distinguish constants from variables in pattern matches. Here are some example of valid identifiers:

x
concat1
visit-left
is-nil
x'
Cons
True  
x
concat1
visit-left
is-nilstd/core/types/is-nil: forall<a> (list : list<a>) -> bool
x'
Consstd/core/types/Cons: forall<a> (head : a, tail : list<a>) -> list<a>
Truestd/core/types/True: bool

To avoid confusion with the subtraction operator, the occurrences of dashes are restricted in identifiers. After lexical analysis, only identifiers where each dash is surrounded on both sides with a letter are accepted:

fold-right
n-1        // illegal, a digit cannot follow a dash
n - 1      // n minus 1
n-x-1      // illegal, a digit cannot follow a dash
n-x - 1    // identifier "n-x" minus 1
n - x - 1  // n minus x minus 1

Qualified identifiers are prefixed with a module path. Module paths can be partial as long as they are unambiguous.

core/map
std/core/(&)

4.2.3. Operators and symbols

qopid	::=	modulepath opid
opid	::=	`(` symbols `)`

op	::=	symbols_{<!opreserved \| optype>} \| `\|\|`

symbols	::=	symbol { symbol }\| `/`
symbol	::=	`$` \| `(%)std/core/int/(%): (int, int) -> int` \| `&` \| `()std/core/int/(): (int, int) -> int` \| `+`
	\|	`~` \| `!` \| `\` \| `(^{)std/core/int/(}): (i : int, exp : int) -> int` \| `#`
	\|	`=` \| `.` \| `:` \| `-` \| `?`
	\|	anglebar
anglebar	::=	`<` \| `>` \| `\|`

opreserved	::=	`=` \| `.` \| `:` \| `->`
optype	::=	anglebar anglebar { anglebar }

special	::=	`{` \| `}` \| `(` \| `)` \| `[` \| `]` \| `\|` \| `;` \| `,`

4.2.4. Literals

charlit	::=	`'` (char_{<!' \| \>} \| escape) `'`
stringlit	::=	`"` { char_{<!" \| \>} \| escape } `"`
	\|	`r` { `#` }_n `"` rawchars_n `"` { `#` }_n	(n >= 0)
rawchars_n	::=	{ anychar }_{<!{ anychar } " { # }_n { anychar }>}

escape	::=	`\` ( charesc \| hexesc )
charesc	::=	`n` \| `r` \| `t` \| `\` \| `"` \| `'`
hexesc	::=	`x` { hexdigit }₂ \| `u` { hexdigit }₄ \| `U` { hexdigit }₆

float	::=	[ `-` ] (decfloat \| hexfloat)
decfloat	::=	decimal (`.` digits [ decexp ] \| decexp)
decexp	::=	(`e` \| `E`) exponent
hexfloat	::=	hexadecimal (`.` hexdigits [ hexexp ] \| hexexp)
hexexp	::=	(`p` \| `P`) exponent
exponent	::=	[ `-` \| `+` ] digit { digit }

integer	::=	[ `-` ] (decimal \| hexadecimal)
decimal	::=	`0` \| posdigit [ [ `_` ] digits ]
hexadecimal	::=	`0` (`x` \| `X`) hexdigits
digits	::=	digit { digit } { `_` digit { digit } }
hexdigits	::=	hexdigit { hexdigit } { `_` hexdigit { hexdigit } }

4.2.5. White space

whitespace	::=	white { white } \| newline
white	::=	space
	\|	linecomment \| blockcomment
	\|	linedirective

linecomment	::=	`//` { char \| tab }
linedirective	::=	newline `#` { char \| tab }

blockcomment	::=	`/` blockpart* { blockcomment blockpart } `*/`	(allows nested comments)
blockpart	::=	{ anychar }_{<!{ anychar } (/\|/) { anychar }>}

4.2.6. Character classes

letter	::=	upper \| lower
upper	::=	`A..Z`	(i.e. `x41..x5A`)
lower	::=	`a..z`	(i.e. `x61..x7A`)
digit	::=	`0..9`	(i.e. `x30..x39`)
posdigit	::=	`1..9`
hexdigit	::=	`a..f` \| `A..F` \| digit

anychar	::=	char \| tab \| newline	(in comments and raw strings)
newline	::=	[ return ] linefeed	(windows or unix style end of line)

space	::=	`x20`	(a space)
tab	::=	`x09`	(a tab (`\t`))
linefeed	::=	`x0A`	(a line feed (`\n`))
return	::=	`x0D`	(a carriage return (`\r`))

char	::=	unicode_{<!control \| surrogate \| bidi>}	(includes space)
unicode	::=	`x00..x10FFFF`
control	::=	`x00..x1F` \| `x7F` \| `x80..9F`	(C0, DEL, and C1)
surrogate	::=	`xD800..xDFFF`
bidi	::=	`x200E` \| `x200F` \| `x202A..x202E` \| `x2066..x2069`	(bi-directional text control)

Actual program code consists only of 7-bit ASCII characters while only comments and literals can contain extended unicode characters. As such, a lexical analyzer can directly process UTF-8 encoded input as a sequence of bytes without needing UTF-8 decoding or unicode character classification¹. For security [2], some character ranges are excluded: the C0 and C1 control codes (except for space, tab, carriage return, and line feeds), surrogate characters, and bi-directional text control characters.

4.3. Layout

Just like programming languages like Haskell, Python, JavaScript, Scala, Go, etc., there is a layout rule which automatically adds braces and semicolons at appropriate places:

Any block that is indented is automatically wrapped with curly braces:

fun show-messages1( msgs : list<string> ) : console ()  
  msgs.foreach fn(msg)
    println(msg)
fun show-messages1spec/show-messages1: (msgs : list<string>) -> console ()( msgsmsgs: list<string> : liststd/core/types/list: V -> V<stringstd/core/types/string: V> )result: -> console () : consolestd/core/console/console: X (std/core/types/unit: V)std/core/types/unit: V  
  msgs.foreach fn(msg)
    println(msg)

is elaborated to:

fun show-messages1( msgs : list<string> ) : console () {
  msgs.foreach fn(msg) {
    println(msg)
  }
}
fun show-messages1spec/show-messages1: (msgs : list<string>) -> console ()( msgs : liststd/core/types/list: V -> V<stringstd/core/types/string: V> ) : consolestd/core/console/console: X () {
  msgs.foreach fn(msg) {
    println(msg)
  }
}

Any statements and declarations that are aligned in a block are terminated with semicolons, that is:

fun show-messages2( msgs : list<string> ) : console ()  
  msgs.foreach fn(msg)
    println(msg)
    println("--")
  println("done")
fun show-messages2spec/show-messages2: (msgs : list<string>) -> console ()( msgsmsgs: list<string> : liststd/core/types/list: V -> V<stringstd/core/types/string: V> )result: -> console () : consolestd/core/console/console: X (std/core/types/unit: V)std/core/types/unit: V  
  msgs.foreach fn(msg)
    println(msg)
    println("--")
  println("done")

is fully elaborated to:

fun show-messages2( msgs : list<string> ) : console () {
  msgs.foreach fn(msg){
    println(msg);
    println("--");
  };
  println("done");
}
fun show-messages2spec/show-messages2: (msgs : list<string>) -> console ()( msgs : liststd/core/types/list: V -> V<stringstd/core/types/string: V> ) : consolestd/core/console/console: X () {
  msgs.foreach fn(msg){
    println(msg);
    println("--");
  };
  println("done");
}

Long expressions or declarations can still be indented without getting braces or semicolons if it is clear from the start- or previous token that the line continues an expression or declaration. Here is a contrived example:

fun eq2( x : int, 
          y : int ) : io bool  
  print("calc " ++
        "equ" ++
        "ality")
  val result = if (x == y)
                then True
                else False
  result  
fun eq2spec/eq2: (x : int, y : int) -> io bool( xx: int : intstd/core/types/int: V, 
          y :std/core/types/int: V intstd/core/types/int: V ) :std/core/io: E iostd/core/io: E boolstd/core/types/bool: V  
  print("calc " ++
        "equ" ++
        "ality")
  val result = if x: int(xstd/core/int/(==): (x : int, y : int) -> <console,alloc<global>,div,exn,fsys,ndet,net,read<global>,ui,write<global>> bool == y)
                then Truestd/core/types/True: bool
                else Falsestd/core/types/False: bool
  result

is elaborated to:

fun eq2( x : int, 
          y : int ) : io bool {  
  print("calc " ++
        "equ" ++
        "ality");
  val result = if (x == y)
                then True
                else False;
  result
}
fun eq2( x : intstd/core/types/int: V, 
          y : intstd/core/types/int: V ) : iostd/core/io: E boolstd/core/types/bool: V {  
  print("calc " ++
        "equ" ++
        "ality");
  val result = if (x == y)
                then Truestd/core/types/True: bool
                else Falsestd/core/types/False: bool;
  result
}

Here the long string expression is indented but no braces or semicolons are inserted as the previous lines end with an operator (++). Similarly, in the if expression no braces or semicolons are inserted as the indented lines start with then and else respectively. In the parameter declaration, the , signifies the continuation.
More precisely, for long expressions and declarations, indented or aligned lines do not get braced or semicolons if:

The line starts with a clear expression or declaration start continuation token, namely: an operator (including .), then, else, elif, a closing brace (), >, ], or }), or one of ,, ->, { , =, |, ::, ., :=.
The previous line ends with a clear expression or declaration end continuation token, namely an operator (including .), an open brace ((, <, [, or {), or ,.

The layout algorithm is performed on the token stream in-between lexing and parsing, and is independent of both. In particular, there are no intricate dependencies with the parser (which leads to very complex layout rules, as is the case in languages like Haskell or JavaScript).

Moreover, in contrast to purely token-based layout rules (as in Scala or Go for example), the visual indentation in a Koka program corresponds directly to how the compiler interprets the statements. Many tricky layout examples in other programming languages are often based on a mismatch between the visual representation and how a compiler interprets the tokens – with Koka's layout rule such issues are largely avoided.

Of course, it is still allowed to explicitly use semicolons and braces, which can be used for example to put multiple statements on a single line:

fun equal-line( x : int, y : int ) : io bool {
  print("calculate equality"); (x == y)
}  
fun equal-linespec/equal-line: (x : int, y : int) -> io bool( xx: int : intstd/core/types/int: V, yy: int : intstd/core/types/int: V )result: -> io bool : iostd/core/io: E boolstd/core/types/bool: V {
  printstd/core/console/string/print: (s : string) -> <console,alloc<global>,div,exn,fsys,ndet,net,read<global>,ui,write<global>> ()("calculate equality"literal: string
count= 18); (xx: int ==std/core/int/(==): (x : int, y : int) -> <console,alloc<global>,div,exn,fsys,ndet,net,read<global>,ui,write<global>> bool yy: int)
}

The layout algorithm also checks for invalid layouts where the layout would not visually correspond to how the compiler interprets the tokens. In particular, it is illegal to indent less than the layout context or to put comments into the indentation (because of tabs or potential unicode characters). For example, the program:

fun equal( x : int, y : int ) : io bool {   
    print("calculate equality")
  result = if (x == y) then True   // wrong: too little indentation
  /* wrong */      else False
    result
}  
fun equal( x : intstd/core/types/int: V, y : intstd/core/types/int: V ) : iostd/core/io: E boolstd/core/types/bool: V {   
    print("calculate equality")
  result = if (x == y) then Truestd/core/types/True: bool   // wrong: too little indentation
  /* wrong */      else Falsestd/core/types/False: bool
    result
}

is rejected. In order to facilitate code generation or source code compression, compilers are also required to support a mode where the layout rule is not applied and no braces or semicolons are inserted. A recognized command line flag for that mode should be --nolayout.

4.3.1. The layout algorithm

To define the layout algorithm formally, we first establish some terminology:

A new line is started after every linefeed character.
Any non-white token is called a lexeme, where a line without lexemes is called blank.
The indentation of a lexeme is the column number of its first character on that line (starting at 1), and the indentation of a line is the indentation of the first lexeme on the line.
A lexeme is an expression continuation if it is the first lexeme on a line, and the lexeme is a start continuation token, or the previous lexeme is an end continuation token (as defined in the previous section).

Because braces can be nested, we use a layout stack of strictly increasing indentations. The top indentation on the layout stack holds the layout indentation. The initial layout stack contains the single value 0 (which is never popped). We now proceed through the token stream where we perform the following operations in order: first brace insertion, then layout stack operations, and finally semicolon insertion:

Brace insertion: For each non-blank line, consider the first lexeme on the line. If the indentation is larger than the layout indentation, and the lexeme is not an expression continuation, then insert an open brace { before the lexeme. If the indention is less than the layout indentation, and the lexeme is not already a closing brace, insert a closing brace } before the lexeme.
Layout stack operations: If the previous lexeme was an open brace { or the start of the lexical token sequence, we push the indentation of the current lexeme on the layout stack. The pushed indentation must be larger than the previous layout indentation (unless the current lexeme is a closing brace). When a closing brace } is encountered the top indentation is popped from the layout stack.
Semicolon insertion: For each non-blank line, the indentation must be equal or larger to the layout indentation. If the indentation is equal to the layout indentation, and the first lexeme on the line is not an expression continuation, a semicolon is inserted before the lexeme. Also, a semicolon is always inserted before a closing brace } and before the end of the token sequence.

As defined, braces are inserted around any indented blocks, semicolons are inserted whenever statements or declarations are aligned (unless the lexeme happens to be a clear expression continuation). To simplify the grammar specification, a semicolon is also always inserted before a closing brace and the end of the source. This allows us to specify many grammar elements as ended by semicolons instead of separated by semicolons which is more difficult to specify for a LALR(1) grammar.

The layout can be implemented as a separate transformation on the lexical token stream (see the 50 line Haskell implementation in the Koka compiler), or directly as part of the lexer (see the Flex implementation)

4.3.2. Implementation

There is a full Flex (Lex) implementation of lexical analysis and the layout algorithm. Ultimately, the Flex implementation serves as the specification, and this document and the Flex implementation should always be in agreement.

4.4. Context-free syntax

The grammar specification starts with the non terminal module which draws its lexical tokens from lex where all whitespace tokens are implicitly ignored.

4.4.1. Modules

module_{[ lex ]}	::=	[ moduledecl ] modulebody

moduledecl	::=	semis moduleid
moduleid	::=	qvarid \| varid

modulebody	::=	`{` semis declarations `}` semis
	\|	semis declarations

semis	::=	{ `;` }
semi	::=	`;` semis

4.4.2. Top level declarations

declarations	::=	{ importdecl } { fixitydecl } topdecls

importdecl	::=	[ pub ] `import` [ moduleid `=` ] moduleid semi

fixitydecl	::=	[ pub ] fixity integer identifier { `,` identifier } semi
fixity	::=	`infixl` \| `infixr` \| `infix`

topdecls	::=	{ topdecl semi }
topdecl	::=	[ pub ] puredecl
	\|	[ pub ] aliasdecl
	\|	[ pub ] externdecl
	\|	[ pubabstract ] typedecl
	\|	[ pubabstract ] effectdecl

pub	::=	`pub`
pubabstract	::=	`pub` \| `abstract`

4.4.3. Type Declarations

aliasdecl	::=	`alias` typeid [ typeparams ] [ kannot ] `=` type

typedecl	::=	typemod `type` typeid [ typeparams ] [ kannot ] [ typebody ]
	\|	structmod `struct` typeid [ typeparams ] [ kannot ] [ conparams ]

typemod	::=	`co` \| `rec` \| `open` \| `extendstd/core/sslice/extend: (slice : sslice, count : int) -> sslice` \| structmod
structmod	::=	`value` \| `reference`

typeid	::=	varid \| `[]` \| `(` { `,` } `)` \| `<` `>` \| `<` \| `>`

typeparams	::=	`<` [ tbinders ] `>`
tbinders	::=	tbinder { `,` tbinder }
tbinder	::=	varid [ kannot ]
typebody	::=	`{` semis { constructor semi } `}`

constructor	::=	[ pub ] [ `con` ] conid [ typeparams ] [ conparams ]
conparams	::=	`{` semis { parameter semi } `}`

4.4.4. Value and Function Declarations

puredecl	::=	[ inlinemod ] `val` valdecl
	\|	[ inlinemod ] `fun` fundecl
inlinemod	::=	`inline` \| `noinline`

valdecl	::=	binder `=` blockexpr
binder	::=	identifier [ `:` type ]

fundecl	::=	funid funbody
funbody	::=	funparam blockexpr
funparam	::=	[ typeparams ] pparameters [ `:` tresult ] [ qualifier ]
funid	::=	identifier
	\|	`[` { `,` } `]`	(indexing operator)

parameters	::=	`(` [ parameter { `,` parameter } ] `)`
parameter	::=	[ borrow ] paramid [ `:` type ] [ `=` expr ]

pparameters	::=	`(` [ pparameter { `,` pparameter } ] `)`	(pattern matching parameters)
pparameter	::=	[ borrow ] pattern [ `:` type ] [ `=` expr ]

paramid	::=	identifier \| wildcard
borrow	::=	`^`	(not allowed from conparams)

qidentifier	::=	qvarid \| qidop \| identifier
identifier	::=	varid \| idop

qoperator	::=	op
qconstructor	::=	conid \| qconid

4.4.5. Statements

block	::=	`{` semis { statement semi } `}`

statement	::=	decl
	\|	withstat
	\|	withstat `in` expr
	\|	returnexpr
	\|	basicexpr

decl	::=	`fun` fundecl
	\|	`val` apattern `=` blockexpr	(local values can use a pattern binding)
	\|	`var` binder `:=` blockexpr

4.4.6. Expressions

blockexpr	::=	expr	(block is interpreted as statements)

expr	::=	withexpr
		block	(interpreted as `fn(){...}`)
		returnexpr
		valexpr
		basicexpr

basicexpr	::=	ifexpr
	\|	fnexpr
	\|	matchexpr
	\|	handlerexpr
	\|	opexpr

ifexpr	::=	`if` ntlexpr `then` blockexpr { elif } [ `else` blockexpr ]
	\|	`if` ntlexpr `return` expr
elif	::=	`elif` ntlexpr `then` blockexpr

matchexpr	::=	`match` ntlexpr `{` semis { matchrule semi } `}`
returnexpr	::=	`return` expr
fnexpr	::=	`fn` funbody	(anonymous lambda expression)
valexpr	::=	`val` apattern `=` blockexpr `in` expr

withexpr	::=	withstat `in` expr
withstat	::=	`with` basicexpr
		`with` binder `<-` basicexpr
		`with` [ `override` ] heff opclause	(with single operation)
		`with` binder `<-` heff opclause	(with named single operation)

4.4.7. Operator expressions

For simplicity, we parse all operators as if they are left associative with the same precedence. We assume that a separate pass in the compiler will use the fixity declarations that are in scope to properly associate all operators in an expressions.

opexpr	::=	prefixexpr { qoperator prefixexpr }
prefixexpr	::=	{ `!` \| `~` } appexpr
appexpr	::=	appexpr `(` [ arguments ] `)`	(regular application)
	\|	appexpr `[` [ arguments ] `]`	(index operation)
	\|	appexpr (fnexpr \| block)	(trailing lambda expression)
	\|	appexpr `.` atom
	\|	atom

ntlexpr	::=	ntlprefixexpr { qoperator ntlprefixexpr }	(non trailing lambda expression)
ntlprefixexpr	::=	{ `!` \| `~` } ntlappexpr
ntlappexpr	::=	ntlappexpr `(` [ arguments ] `)`	(regular application)
	\|	ntlappexpr `[` [ arguments ] `]`	(index operation)
	\|	ntlappexpr `.` atom
	\|	atom

arguments	::=	argument { `,` argument }
argument	::=	[ identifier `=` ] expr

4.4.8. Atomic expressions

atom	::=	qidentifier
	\|	qconstructor
	\|	literal
	\|	mask
	\|	`(` `)`	(unit)
	\|	`(` annexpr `)`	(parenthesized expression)
	\|	`(` annexprs `)`	(tuple expression)
	\|	`[` [ annexpr { `,` annexprs } [ `,` ] ] `]`	(list expression)

literal	::=	natural \| float \| charlit \| stringlit
mask	::=	`mask` [ `behind` ] `<` tbasic `>`

annexprs	::=	annexpr { `,` annexpr }
annexpr	::=	expr [ `:` typescheme ]

4.4.9. Matching

matchrule	::=	patterns [ `\|` expr ] `->` blockexpr

apattern	::=	pattern [ typescheme ]
pattern	::=	identifier
	\|	identifier `as` apattern	(named pattern)
	\|	qconstructor [`(` [ patargs ] `)`]
	\|	`(` [ apatterns ] `)`	(unit, parenthesized pattern, tuple pattern)
	\|	`[` [ apatterns ] `]`	(list pattern)
	\|	literal
	\|	wildcard

patterns	::=	pattern { `,` pattern }
apatterns	::=	apattern { `,` apattern }
patargs	::=	patarg { `,` patarg }
patarg	::=	[ identifier `=` ] apattern	(possibly named parameter)

4.4.10. Effect Declarations

effectdecl	::=	[ named ] effectmod `effect` varid [ typeparams ] [ kannot ] [ opdecls ]
	\|	[ named ] effectmod `effect` [ typeparams ] [ kannot ] opdecl
	\|	named effectmod `effect` varid [ typeparams ] [ kannot ] `in` type [ opdecls ]
effectmod	::=	[ `linear` ] [ `rec` ]
named	::=	`named`

opdecls	::=	`{` semis { opdecl semi } `}`
opdecl	::=	[ pub ] `val` identifier [ typeparams ] `:` tatom
	\|	[ pub ] (`fun` \| `ctl`) identifier [ typeparams ] parameters `:` tatom

4.4.11. Handler Expressions

handlerexpr	::=	[ `override` ] `handler` heff opclauses
	\|	[ `override` ] `handle` heff `(` expr `)` opclauses
	\|	`named` `handler` heff opclauses
	\|	`named` `handle` heff `(` expr `)` opclauses
heff	::=	[ `<` tbasic `>` ]

opclauses	::=	`{` semis { opclausex semi } `}`

opclausex	\|	opclause
	\|	`finallystd/core/hnd/finally: forall<a,e> (fin : () -> e (), action : () -> e a) -> e a` blockexpr
	\|	`initiallystd/core/hnd/initially: forall<a,e> (init : (int) -> e (), action : () -> e a) -> e a` `(` oparg `)` blockexpr

opclause	::=	`val` qidentifier [ type ] `=` blockexpr
	\|	`fun` qidentifier opargs blockexpr
	\|	[ ctlmod ]`ctl` qidentifier opargs blockexpr
	\|	`return` `(` oparg `)` blockexpr
ctlmod	::=	`final` \| `raw`

opargs	::=	`(` [ oparg { `,` oparg } ] `)`
oparg	::=	paramid [ `:` type ]

4.4.12. Type schemes

typescheme	::=	somes foralls tarrow [ qualifier ]
type	::=	foralls tarrow [ qualifier ]

foralls	::=	[ `forall` typeparams ]
some	::=	[ `some` typeparams ]

qualifier	::=	`with` `(` predicates `)`

predicates	::=	predicate { `,` predicate }
predicate	::=	typeapp	(interface)

4.4.13. Types

tarrow	::=	tatom [ `->` tresult ]
tresult	::=	tatom [ tbasic ]

tatom	::=	tbasic
	\|	`<` anntype { `,` anntype } [ `\|` tatom ] `>`
	\|	`<` `>`

tbasic	::=	typeapp
	\|	`(` `)`	(unit type)
	\|	`(` tparam `)`	(parenthesized type or type parameter)
	\|	`(` tparam { `,` tparam } `)`	(tuple type or parameters)
	\|	`[` anntype `]`	(list type)

typeapp	::=	typecon [ `<` anntype { `,` anntype } `>` ]

typecon	::=	varid \| qvarid
	\|	wildcard
	\|	`(` `,` { `,` } `)`	(tuple constructor)
	\|	`[` `]`	(list constructor)
	\|	`(` `->` `)`	(function constructor)

tparam	::=	[ varid `:` ] anntype
anntype	::=	type [ kannot ]

4.4.14. Kinds

kannot	::=	`::` kind

kind	::=	`(` kind { `,` kind } `)` `->` kind
	\|	katom `->` kind
	\|	katom

katom	::=	`V`	(value type)
	\|	`X`	(effect type)
	\|	`E`	(effect row)
	\|	`H`	(heap type)
	\|	`P`	(predicate type)
	\|	`S`	(scope type)
	\|	`HX`	(handled effect type)
	\|	`HX1`	(handled linear effect type)

4.4.15. Implementation

As a companion to the Flex lexical implementation, there is a full Bison(Yacc) LALR(1) implementation available. Again, the Bison parser functions as the specification of the grammar and this document should always be in agreement with that implementation.

References

[1]Dariusz Biernacki, Maciej Piróg, Piotr Polesiuk, and Filip Sieczkowski. “Handle with Care: Relational Interpretation of Algebraic Effects and Handlers.” Proc. ACM Program. Lang. 2 (POPL’17 issue): 8:1–8:30. Dec. 2017. doi:10.1145/3158096. 🔎

[2]Nicholas Boucher, and Ross Anderson. “Trojan Source: Invisible Vulnerabilities.” Preprint. 2021. arXiv:cs.CR/2111.00169. 🔎

[3]Leo J Guibas, and Robert Sedgewick. “A Dichromatic Framework for Balanced Trees.” In 19th Annual Symposium on Foundations of Computer Science (sfcs 1978), 8–21. IEEE. 1978. 🔎

[4]Carl A. Gunter, Didier Rémy, and Jon G. Riecke. “A Generalization of Exceptions and Control in ML-like Languages.” In Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture, 12–23. FPCA ’95. ACM, La Jolla, California, USA. 1995. doi:10.1145/224164.224173. 🔎

[5]Daniel Hillerström, and Sam Lindley. “Liberating Effects with Rows and Handlers.” In Proceedings of the 1st International Workshop on Type-Driven Development, 15–27. TyDe 2016. Nara, Japan. 2016. doi:10.1145/2976022.2976033. 🔎

[6]Gérard P. Huet. “The Zipper.” Journal of Functional Programming 7 (5): 549–554. 1997. 🔎

[7]Donald E. Knuth. The Art of Computer Programming, Volume 1 (3rd Ed.): Fundamental Algorithms. Addison Wesley Longman Publishing Co., Inc. 1997. 🔎

[8]Daan Leijen. “Extensible Records with Scoped Labels.” In Proceedings of the 2005 Symposium on Trends in Functional Programming, 297–312. 2005. 🔎

[9]Daan Leijen. “Structured Asynchrony with Algebraic Effects.” In Proceedings of the 2nd ACM SIGPLAN International Workshop on Type-Driven Development, 16–29. TyDe 2017. Oxford, UK. 2017. doi:10.1145/3122975.3122977. 🔎

[10]Daan Leijen. “Type Directed Compilation of Row-Typed Algebraic Effects.” In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL’17), 486–499. Paris, France. Jan. 2017. doi:10.1145/3009837.3009872. 🔎

[11]Daan Leijen, and Anton Lorenzen. Tail Recursion Modulo Context – An Equational Approach. MSR-TR-2022-18. Microsoft Research. Jul. 2022. 🔎

[12]Daan Leijen, and Anton Lorenzen. “Tail Recursion Modulo Context: An Equational Approach.” Proc. ACM Program. Lang. 7 (POPL). Jan. 2023. doi:10.1145/3571233. See also [11]. 🔎

[13]Anton Lorenzen, and Daan Leijen. Reference Counting with Frame Limited Reuse. MSR-TR-2021-30. Microsoft Research. Nov. 2022. https://www.microsoft.com/en-us/research/publication/reference-counting-with-frame-limited-reuse-extended-version/. 🔎

[14]Anton Lorenzen, Daan Leijen, and Wouter Swierstra. FP: Fully in-Place Functional Programming. MSR-TR-2023-19. Microsoft Research. May 2023. https://www.microsoft.com/en-us/research/publication/fp2-fully-in-place-functional-programming. 🔎

[15]Anton Lorenzen, Daan Leijen, and Wouter Swierstra. “FP: Fully in-Place Functional Programming.” In Proceedings of the 28th ACM SIGPLAN International Conference on Functional Programming (ICFP’2023). ICFP’23. Seattle, WA, USA. Sep. 2023. See also [14]. 🔎

[16]Prabhaker Mateti, and Ravi Manghirmalani. “Morris’ Tree Traversal Algorithm Reconsidered.” Science of Computer Programming 11 (1): 29–43. 1988. doi:10.1016/0167-6423(88)90063-9. 🔎

[17]Conor McBride. “The Derivative of a Regular Type Is Its Type of One-Hole Contexts.” 2001. http://strictlypositive.org/diff.pdf. (Extended Abstract). 🔎

[18]Joseph M. Morris. “Traversing Binary Trees Simply and Cheaply.” Information Processing Letters 9 (5): 197–200. 1979. doi:10.1016/0020-0190(79)90068-1. 🔎

[19]Chris Okasaki. “Red-Black Trees in a Functional Setting.” Journal of Functional Programming 9 (4). Cambridge University Press: 471–477. 1999. doi:10.1017/S0956796899003494. 🔎

[20]Chris Okasaki. Purely Functional Data Structures. Colombia University, New York. Jun. 1999. 🔎

[21]Gordon D. Plotkin, and Matija Pretnar. “Handlers of Algebraic Effects.” In 18th European Symposium on Programming Languages and Systems, 80–94. ESOP’09. York, UK. Mar. 2009. doi:10.1007/978-3-642-00590-9_7. 🔎

[22]Alex Reinking, Ningning Xie, Leonardo de Moura, and Daan Leijen. “Perceus: Garbage Free Reference Counting with Reuse.” In Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 96–111. PLDI 2021. Virtual, Canada. 2021. doi:10.1145/3453483.3454032. 🔎

[23]Ningning Xie, Jonathan Brachthäuser, Phillip Schuster, Daniel Hillerström, and Daan Leijen. “Effect Handlers, Evidently.” In Proceedings of the 25th ACM SIGPLAN International Conference on Functional Programming (ICFP’2020). ICFP ’20. Jersey City, NJ. Aug. 2020. 🔎

[24]Ningning Xie, and Daan Leijen. Generized Evidence Passing for Effect Handlers – Efficient Compilation of Effect Handlers to C. MSR-TR-2021-5. Microsoft Research. Mar. 2021. https://www.microsoft.com/en-us/research/publication/generalized-evidence-passing-for-effect-handlers/. 🔎

[25]Ningning Xie, and Daan Leijen. “Generized Evidence Passing for Effect Handlers – Efficient Compilation of Effect Handlers to C.” In Proceedings of the 26th ACM SIGPLAN International Conference on Functional Programming (ICFP’2021). ICFP ’21. Virtual. Aug. 2021. 🔎

Appendix

A. Full grammar specification

A.1. Lexical syntax

A.2. Context-free syntax

^1.This is used for example in the Flex implementation. In particular, we only need to adapt the char definition:

char	::=	unicode_{<!control \| surrogate \| bidi>}
unicode	::=	`x00..x7F`	(ASCII)
	\|	(`xC2..xDF`) cont
	\|	`xE0` (`xA0..xBF`) cont	(exclude overlong encodings)
	\|	(`xE1..xEF`) cont cont
	\|	`xF0` (`x90..xBF`) cont cont	(exclude overlong encodings)
	\|	(`xF1..xF3`) cont cont cont
	\|	`xF4` (`x80..x8F`) cont cont	(no codepoint larger than `x10FFFF`)
cont	::=	`x80..xBF`
surrogate	::=	`xED` (`xA0..xBF`) cont
control	::=	`x00..x1F`
	\|	`x7F`
	\|	`xC2` (`x80..x9F`)
bidi	::=	`xE2` `0x80` (`0x8E..0x8F`)	(left-to-right mark (`u200E`) and right-to-left mark (`u200F`))
	\|	`xE2` `0x80` (`0xAA..0xAE`)	(left-to-right embedding (`u202A`) up to right-to-left override (`u202E`))
	\|	`xE2` `0x81` (`0xA6..0xA9`)	(left-to-right isolate (`u2066`) up to pop directional isolate (`u2069`))

↩

lex	::=	lexeme \| whitespace
lexeme	::=	conid \| qconid
	\|	varid \| qvarid
	\|	op \| opid \| qopid \| wildcard
	\|	integer \| float \| stringlit \| charlit
	\|	reserved \| opreserved
	\|	special