# 1. Getting started
Welcome to the Koka book, which provides an overview and formal specification of the language. For more background information, see:
- The library documentation.
- The Koka research page and the slides of a talk presented at Lang.Next (April 2012).
- The source code of the Koka compiler.
- The article Algebraic Effects for Functional Programming [3] about the algebraic effects in Koka.
- An article about the type system and semantics of Koka [2].
## 1.1. Installing the compiler
At this point there are no binary releases of Koka and you need to build the compiler yourself. Fortunately, Koka has few dependencies and builds without problems on most common platforms, e.g. Windows, Mac OS X, and Unix.
The following programs are required to build Koka:
- The Haskell platform (version 7.4 or later).
- The NodeJS runtime (version 4.2 LTS or later).
- Some version of Git for version control.
All these programs are very easy to install on most platforms. Now we can build Koka itself:

1. First clone the Koka sources with algebraic effects support:

   ```
   > git clone https://github.com/koka-lang/koka.git
   ```

   You can also use the flag `-b dev` to get the latest development version.

2. Go to the newly created Koka directory:

   ```
   > cd koka
   ```

3. Install any needed Node libraries using the Node package manager:

   ```
   > npm install
   ```

   If you are running on Mac OS X or Unix, you may have to run this as `sudo npm install` so that the `npm` package manager has enough permissions to install the `jake` and `madoko` tools.

4. Finally, build the compiler and run the Koka interactive environment:

   ```
   > jake
   ```

   You can type `jake help` to see an overview of all make targets.
The excellent Sublime text editor is recommended to edit Koka programs. You can install support for Koka programs using

```
> jake sublime
```

After this, `.kk` files will be properly highlighted. It is also recommended to use the newly installed `snow` color theme, which is designed to work well with Koka files.
## 1.2. Running the interactive compiler
After running a plain `jake` command, the Koka interactive environment will start:

```
 _          _
| |        | |
| | __ ___ | | __ __ _
| |/ // _ \| |/ // _` | welcome to the koka interpreter
|   <| (_) |   <| (_| | version 0.7.0-dev (debug), Jun 30 2016
|_|\_\\___/|_|\_\\__,_| type :? for help

loading: std/core
```
Now you can test some expressions:

```
> println("hi koka")
hi koka

> :t "hi"
string

> :t println("hi")
console ()
```
Or load a demo:

```
> :l demo/collatz
compile: lib/demo/collatz.kk
check  : demo/collatz
modules:
  demo/collatz

> main()
Collatz(27) took 111 steps.
```
And quit the interpreter:

```
> :q

Before the effect one believes in different causes than one does after the effect.
  -- Friedrich Nietzsche
```
You can also run examples in the browser by setting the host:

```
> :set --host=browser
> 1+2
```

A browser-specific demo to try is, for example, `demo/dom/conway.kk`.
## 1.3. Algebraic effect handlers
A novel feature of Koka is a compiled and typed implementation of algebraic
effect handlers (described in detail in [3]).
In the interactive environment, you can load various demo files with algebraic
effects which are located in the test/algeff directory. This is by default
included in the search path, so we can load them directly using
the load (`:l`) command:

```
> :l scoped
```
Use the `:?` command to get an overview of all commands. After loading the `scoped` demo, we can run it directly from the interpreter:
```
> :l scoped
compile: test/algeff/scoped.kk
check  : scoped
modules:
  scoped

> main()
[[3],[2,1],[1,2],[1,1,1]]
(state=12, [[3],[2,1],[1,2],[1,1,1]])
[(state=1, [3]),(state=5, [2,1]),(state=5, [1,2]),(state=9, [1,1,1])]
[[3]]
[42]
```
Some interesting demos are:

- `common.kk`: Various examples from the paper “Algebraic Effects for Functional Programming” [3]. Shows how to implement common control-flow abstractions like exceptions, state, iterators, ambiguity, and asynchronous programming.
- `scoped.kk`: Various examples from the paper “Effect handlers in Scope” [4].
- `nim.kk`: Various examples from the paper “Liberating effects with rows and handlers” [1].
- `async*.kk`: Various asynchronous effect examples.
- `parser.kk`: Implements parser combinators as an effect.
## 1.4. A primer on effect handlers
Another small demo is `effs2`, which demonstrates the ambiguity and state effects:

```
> :l effs2
compile: test/algeff/effs2.kk
check  : effs2
modules:
  effs1

> main()
[False,True,True,False]
[False,False,True,True,False]
[False,False]
```
The `effs2.kk` module starts by defining the `amb` effect:

```
effect amb {
  fun flip() : bool
}
```

This declares `amb` as a new effect with a single operation `flip` of type `() -> amb bool`.
We can use this as:

```
fun xor() : amb bool {
  val p = flip()
  val q = flip()
  (p||q) && not(p&&q)
}
```
where the type of `xor` reflects that it has an `amb` effect and returns a boolean.
Next, let's write a handler for this effect:

```
val amb = handler {
  return x -> [x]
  flip()   -> resume(False) + resume(True)
}
```
When a `flip` operation is issued, this handler will catch it dynamically. In the above handler, we resume twice: once with a `False` value as the result for `flip`, and once with a `True` value. The `return` clause wraps each final result in a list, and the lists produced by the two resumptions in the `flip` clause are concatenated. The type of the `amb` handler is a function that removes the `amb` effect from its argument and returns a list of results:
```
> :t amb
forall<a,e> (action : () -> <amb|e> a) -> e list<a>
```
We can now run the `xor` function using the `amb` handler to handle the `flip` operations:

```
fun test1() {
  amb(xor).show.println
}
```
Here we used dot syntax, introduced in Section 2.2. If we run `test1` in the interpreter, we see all possible results:

```
> test1()
[False,True,True,False]
```
### Adding state
Let's combine the ambiguity effect with state. The definition of the state effect is polymorphic in its value:

```
effect state<s> {
  fun get() : s
  fun set( i : s ) : ()
}
```
Next we define a function that uses both ambiguity and the state effect:

```
fun foo() : <amb,state<int>> bool {
  val p = flip()
  val i = get()
  set(i+1)
  if (i>0 && p) then xor() else False
}
```
The handler for the `state` effect takes a local parameter that is propagated through the `resume` function (as its second argument):

```
val state = handler(i) {
  return x -> x
  get()    -> resume(i,i)
  set(j)   -> resume((),j)
}
```
The type of the `state` handler takes an initial state as an extra argument too:

```
> :t state
forall<a,b,e> (i : a, action : () -> <state<a>|e> b) -> e b
```
We can now combine the ambiguity handler with the state handler in two ways (see Section 2.4 for a primer on anonymous function syntax):

```
fun test2() {
  state(0){ amb(foo) }
}

fun test3() {
  amb{ state(0,foo) }
}
```
In `test2` the state handler is outside and every ambiguity execution modifies the same global state. In `test3` the state handler is inside and every ambiguity execution has its own local state instead. Can you predict the outcomes of running the tests?
```
> test2()
[False,False,True,True,False]

> test3()
[False,False]
```

# 2. An overview of Koka
This is a short introduction to the Koka programming language meant for programmers familiar with languages like C++, C#, or JavaScript.
Koka is a function-oriented language that separates pure values from side-effecting computations (the word ‘koka’ (or 効果) means “effect” or “effective” in Japanese). Koka is also flexible and fun: Koka has many features that help programmers to easily change their data types and code organization correctly even in large-scale programs, while having a small strongly-typed language core with a familiar JavaScript-like syntax.
## 2.1. Hello world
As usual, we start with the familiar Hello world program:

```
fun main() {
  println("Hello world!") // println output
}
```
Koka uses familiar curly-braces syntax, where `//` starts a line comment. Functions are declared using the `fun` keyword.
If you are reading this on Rise4Fun, you can click the load in editor button in the upper right corner of the example to load it into the editor and run the program.
Here is another short example program that encodes a string using the Caesar cipher, where each lower-case letter in a string is replaced by the letter three places up in the alphabet:

```
fun main() {
  println(caesar("koka is fun"))
}

fun encode( s : string, shift : int ) {
  fun encode-char(c) {
    if (c < 'a' || c > 'z') return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  }
  s.map(encode-char)
}

fun caesar( s : string ) : string {
  s.encode( 3 )
}
```
In this example, we declare a local function encode-char which encodes a
single character c. The final statement s.map(encode-char) applies the
encode-char function to each character in the string s, returning a
new string where each character is Caesar encoded. The result of the final
statement in a function is also the return value of that function, and you can
generally leave out an explicit return keyword.
Similarly, Koka's grammar is constructed in such a way that no semi-colons
are needed to separate statements.
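As a tiny illustration of this implicit-result rule (a sketch; the `inc` functions are hypothetical helpers, not from this book), the following two definitions are equivalent:

```
// hypothetical example: the result of the last expression is returned
fun inc1( x : int ) : int {
  x + 1            // implicit result; no 'return' or semicolon needed
}

fun inc2( x : int ) : int {
  return x + 1     // an explicit 'return' gives the same function
}
```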
## 2.2. Dot selection
Koka is a function-oriented language where functions and data form the core of the language (in contrast to objects, for example). In particular, the expression `s.encode(3)` does not select the `encode` method from the `string` object; it is simply syntactic sugar for the function call `encode(s,3)` where `s` becomes the first argument. Similarly, `c.int` converts a character to an integer by calling `int(c)` (and both expressions are equivalent). The dot notation is intuitive and quite convenient for chaining multiple calls together, as in:

```
fun showit( s : string ) {
  s.encode(3).count.println
}
```
for example (where the body desugars to `println(count(encode(s,3)))`). An advantage of the dot notation as syntactic sugar for function calls is that it is easy to extend the ‘primitive’ methods of any data type: just write a new function that takes that type as its first argument. In most object-oriented languages one would need to add such a method to the class definition itself, which is not always possible, for example when the class comes from a library.
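For instance, the following sketch (with a hypothetical `excite` function, not part of the standard library) extends strings with a new dot-callable operation simply by taking a `string` as the first argument:

```
// hypothetical extension: any function whose first argument is a string
// becomes available in dot notation on strings
fun excite( s : string ) : string {
  s + "!"
}

fun main() {
  "hello".excite.println   // equivalent to println(excite("hello"))
}
```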
## 2.3. Types
Koka is also strongly typed. It uses a powerful type inference engine to
infer most types, and types generally do not get in the way. In
particular, you can always leave out the types of any local variables.
This is the case for example for base and rot values in the
previous example; hover with the mouse over the example to see the types
that were inferred by Koka. Generally, it is good practice though to
write type annotations for function parameters and the function result
since it both helps with type inference, and it provides useful
documentation with better feedback from the compiler.
For the `encode` function it is actually essential to give the type of the `s` parameter: the `map` function is defined for both the `list` and `string` types, so the program is ambiguous without an annotation. Try to load the example in the editor and remove the annotation to see what error Koka produces.
## 2.4. Anonymous functions
Koka also allows for anonymous function expressions. For example, instead of
declaring the encode-char function, we could just have passed it directly to
the map function as a function expression:
```
fun encode2( s : string, shift : int ) {
  s.map( fun(c) {
    if (c < 'a' || c > 'z') return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  });
}
```
It is a bit annoying we had to put the final right-parenthesis after the last
brace. As a convenience, Koka allows anonymous functions to follow
the function call instead. For example, here is how we can print the numbers
1 to 10:
```
fun main() { print10() }

fun print10() {
  for(1,10) fun(i) {
    println(i)
  }
}
```
which is desugared to `for( 1, 10, fun(i){ println(i) } )`. In fact, since we pass the `i` argument directly to `println`, we can also pass the function itself directly, and write `for(1,10,println)`.
Anonymous functions without any arguments can be shortened further by leaving
out the fun keyword and just using braces directly. Here is an example using
the repeat function:
```
fun main() { printhi10() }

fun printhi10() {
  repeat(10) {
    println("hi")
  }
}
```
where the body desugars to `repeat( 10, fun(){ println("hi") } )`. This is especially convenient for the `while` loop, since `while` is not a built-in operator in Koka but just a regular function:

```
fun main() { print11() }

fun print11() {
  var i := 10
  while { i >= 0 } {
    println(i)
    i := i - 1
  }
}
```
Note how the first argument to `while` is in braces instead of the usual parentheses: Koka always makes it explicit whether code is evaluated before a function is called (in between parentheses), or whether code is evaluated (potentially multiple times) by the called function instead (in between braces).
## 2.5. Effect types
A novel part about Koka is that it automatically infers all the side effects that occur in a function. The absence of any effect is denoted as `total` (or `<>`) and corresponds to pure mathematical functions. If a function can raise an exception the effect is `exn`, and if a function may not terminate the effect is `div` (for divergence). The combination of `exn` and `div` is `pure` and corresponds directly to Haskell's notion of purity. Non-deterministic functions get the `ndet` effect. The ‘worst’ effect is `io` and means that a program can raise exceptions, not terminate, be non-deterministic, read and write to the heap, and do any input/output operations.
Here are some examples of effectful functions:
```
fun square1( x : int ) : total int {
  return x*x
}

fun square2( x : int ) : io int {
  println( "a not so secret side-effect" )
  return x*x
}

fun square3( x : int ) : div int {
  square3( x )
  return x*x
}

fun square4( x : int ) : exn int {
  error( "oops" )
  return x*x
}
```
When the effect is `total` we usually leave it out in the type annotation. For example, when we write:

```
fun square5( x : int ) : int {
  x*x
}
```
Then the assumed effect is `total`. Sometimes, we write an effectful function, but are not interested in explicitly writing down its effect type. In that case, we can use a wildcard type, which stands for some inferred type. A wildcard type is denoted by writing an identifier prefixed with an underscore, or even just an underscore by itself:

```
fun square6( x : int ) : _e int {
  println("I did not want to write down the io effect")
  return x*x
}
```

Hover over `square6` to see the inferred effect for `_e`.
## 2.6. Semantics of effects
The inferred effects are not just considered as some extra type information on functions. On the contrary, through the inference of effects, Koka has a very strong connection to its denotational semantics. In particular, the full type of a Koka function corresponds directly to the type signature of the mathematical function that describes its denotational semantics. For example, using 〚t〛 to translate a type t into its corresponding mathematical type signature, we have:
```
〚int -> total int〛          = ℤ32 → ℤ32
〚int -> exn int〛            = ℤ32 → (1 + ℤ32)
〚int -> pure int〛           = ℤ32 → (1 + ℤ32)⊥
〚int -> <st<h>,pure> int〛   = (Heap × ℤ32) → (Heap × (1 + ℤ32))⊥
```
In the above translation, we use 1 + t as a sum where we have either a unit 1 (i.e. exception) or a type t, and we use Heap × t for a product consisting of a pair of a heap and a type t. From the above correspondence, we can immediately see that a `total` function is truly total in the mathematical sense, while a stateful function (`st<h>`) that can raise exceptions or not terminate (`pure`) takes an implicit heap parameter, and either does not terminate (⊥) or returns an updated heap together with either a value or an exception (1).
We believe that this semantic correspondence is the true power of full effect types and it enables effective equational reasoning about the code by a programmer. For almost all other existing programming languages, even the most basic semantics immediately include complex effects like heap manipulation and divergence. In contrast, Koka allows a layered semantics where we can easily separate out nicely behaved parts, which is essential for many domains, like safe LINQ queries, parallel tasks, tier-splitting, sand-boxed mobile code, etc.
## 2.7. Combining effects
Often, a function contains multiple effects, for example:
```
fun combine-effects() {
  val i = random-int()   // non-deterministic
  error("hi")            // exception raising
  combine-effects()      // and non-terminating
}
```
The effect assigned to `combine-effects` is the combination of `ndet`, `div`, and `exn`. We can write such a combination as a row of effects: `<div,exn,ndet>`. When you hover over the `combine-effects` identifier, you will see that the inferred type is really `<pure,ndet>`, where `pure` is a type alias defined as

```
alias pure = <div,exn>
```
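To make the role of the alias concrete, the following sketch annotates a function with the expanded row directly; under the alias above, the signatures `<div,exn,ndet>` and `<pure,ndet>` should be interchangeable (the function itself is hypothetical):

```
// hypothetical example: writing the effect row out in full
fun retry-or-fail() : <div,exn,ndet> int {
  val i = random-int()            // ndet
  if (odd?(i)) then error("odd")  // exn
  else retry-or-fail()            // div
}
```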
## 2.8. Polymorphic effects
Many functions are polymorphic in their effect. For example, the `map` function applies a function `f` to each element of a (finite) list. As such, the effect depends on the effect of `f`, and the type of `map` becomes:

```
map : (xs : list<a>, f : (a) -> e b) -> e list<b>
```
We use single letters (possibly followed by digits) for polymorphic types. Here, the `map` function takes a list with elements of some type `a`, and a function `f` that takes an element of type `a` and returns a new element of type `b`. The final result is a list with elements of type `b`. Moreover, the effect `e` of the applied function is also the effect of the `map` function itself; indeed, `map` has no other effect by itself since it neither diverges nor raises exceptions.
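As a sketch of this effect polymorphism (the two functions below are hypothetical illustrations, not from this book), the instantiation of `e` follows from the function argument passed to `map`:

```
// hypothetical examples: map's effect is the effect of its function argument
fun doubled( xs : list<int> ) : total list<int> {
  xs.map( fun(x) { x*2 } )          // total function, so e = total
}

fun shown( xs : list<int> ) : console list<()> {
  xs.map( fun(x) { println(x) } )   // println has the console effect, so e = console
}
```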
We can use the notation `<l|e>` to extend an effect `e` with another effect `l`. This is used for example in the `while` function, which has type:

```
while : ( pred : () -> <div|e> bool, action : () -> <div|e> () ) -> <div|e> ()
```
The `while` function takes a predicate function and an action to perform, both with effect `<div|e>`. Indeed, since `while` may diverge depending on the predicate, its effect must include divergence.

The reader may be worried that the type of `while` forces the predicate and action to have exactly the same effect `<div|e>`, which even includes divergence. However, when effects are inferred at the call-site, both the effects of the predicate and action are extended automatically until they match. This ensures we take the union of the effects in the predicate and action. Take for example the following loop:
```
fun main() { looptest() }

fun looptest() {
  while { odd?(random-int()) } {
    error("<b>")
  }
}
```
In the above program, Koka infers that the predicate `odd?(random-int())` has effect `<ndet|e1>` while the action has effect `<exn|e2>` for some `e1` and `e2`. When applying `while`, those effects are unified to the type `<exn,ndet,div|e3>` for some `e3` (which can be seen by hovering over the `looptest` identifier).
## 2.9. Isolated state
The Fibonacci numbers are a sequence where each subsequent Fibonacci number is the sum of the previous two, where `fib(0) == 0` and `fib(1) == 1`. We can easily calculate Fibonacci numbers using a recursive function:

```
fun main() { println(fib(10)) }

fun fib(n : int) : div int {
  if (n <= 0)   then 0
  elif (n == 1) then 1
  else fib(n - 1) + fib(n - 2)
}
```
Note that the type inference engine is currently not powerful enough to prove that this recursive function always terminates, which leads to the inclusion of the divergence effect `div` in the result type.
Here is another version of the Fibonacci function but this time
implemented using local state. We use the repeat function to
iterate n times:
```
fun main() { println(fib2(10)) }

fun fib2(n) {
  var x := 0
  var y := 1
  repeat(n) {
    val y0 = y
    y := x+y
    x := y0
  }
  return x
}
```
The var declaration declares a variable that can be assigned to using the
(:=) operator. In contrast, a regular equality sign, as in y0 = y,
introduces an immutable value y0. For clarity, one can actually write val y0 = y
for such a declaration too, but we usually leave out the val keyword.
Local variables declared using var are actually syntactic sugar for
allocating explicit references to mutable cells. A reference to a mutable
integer is allocated using r = ref(0) (since the reference itself is
actually a value!), and can be dereferenced using the bang operator, as !r.
The desugared version of our previous Fibonacci function can be written
using explicit references as:
fun fib3(n) {
  val x = ref(0)
  val y = ref(1)
  repeat(n) {
    val y0 = !y
    y := !x + !y
    x := y0
  }
  return !x
}
As we can see, using var declarations is quite convenient since such a
declaration automatically adds a dereferencing operator to all occurrences,
except on the left-hand side of an assignment.
When we look at the types inferred for the references, we see that x and y
have type ref<h,int> which stands for a reference to a mutable value of
type int in some heap h. The effects on heaps are allocation as
alloc<h>, reading from a heap as read<h>, and writing to a heap as
write<h>. The combination of these effects is called stateful and denoted
with the alias st<h>.
Clearly, the effect of the body of fib3 is st<h>; but when we hover over
fib3, we see the type inferred is actually the total effect: (n : int) -> int.
Indeed, even though fib3 is stateful inside, its side effects can
never be observed. It turns out that we can safely discard the st<h>
effect whenever the heap type h cannot be referenced outside this function,
i.e. it is not part of an argument or return type. More formally, the Koka
compiler proves this by showing that a function is fully polymorphic in the
heap type h and applies the run function (corresponding to runST in
Haskell) to discard the st<h> effect.
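To make the rule concrete, here is a small sketch (the function names are ours, not from the library): a function whose heap type stays internal gets a total type, while one that lets a reference escape keeps its heap effect.

```koka
// `h` does not occur in the argument or result type, so the
// compiler applies `run` and `counter` gets the total type () -> int.
fun counter() : int {
  val r = ref(0)
  repeat(10) { r := !r + 1 }
  !r
}

// Here the reference escapes in the result type `ref<h,int>`,
// so the allocation effect on heap `h` cannot be discarded.
fun escaped() : alloc<h> ref<h,int> {
  ref(0)
}
```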
The Garsia-Wachs algorithm is a nice example where side effects are used
internally across function definitions and data structures, but where the
final algorithm itself behaves like a pure function; see the
lib/demo/garsiaWachs.kk example in the distribution.
🔗#2.10. A larger example: cracking Caesar encoding
Enough about effects and imperative updates. Let's look at some more functional examples :-) For example, cracking Caesar encoded strings:
fun encode( s : string, shift : int ) {
  function encode-char(c) {
    if (c < 'a' || c > 'z') return c
    val base = (c - 'a').int
    val rot  = (base + shift) % 26
    (rot.char + 'a')
  }
  s.map(encode-char)
}

// The letter frequency table for English
val english = [8.2,1.5,2.8,4.3,12.7,2.2,2.0,6.1,7.0,0.2,
               0.8,4.0,2.4,6.7,7.5,1.9,0.1,6.0,6.3,9.1,
               2.8,1.0,2.4,0.2,2.0,0.1]

// Small helper functions
fun percent( n : int, m : int ) {
  100.0 * (n.double / m.double)
}

fun rotate( xs, n ) {
  xs.drop(n) + xs.take(n)
}

// Calculate a frequency table for a string
fun freqs( s : string ) : list<double> {
  val lowers = list('a','z')
  val occurs = lowers.map( fun(c){ s.count(c.string) })
  val total  = occurs.sum
  occurs.map( fun(i){ percent(i,total) } )
}

// Calculate how well two frequency tables match according
// to the _chi-square_ statistic.
fun chisqr( xs : list<double>, ys : list<double> ) : double {
  zipwith(xs,ys, fun(x,y){ ((x - y)^2.0)/y } ).foldr(0.0,(+))
}

// Crack a Caesar encoded string
fun uncaesar( s : string ) : string {
  val table  = freqs(s)                  // build a frequency table for `s`
  val chitab = list(0,25).map( fun(n) {  // build a list of chisqr numbers
                                         // for each shift between 0 and 25
                 chisqr( table.rotate(n), english )
               })
  val min    = chitab.minimum()          // find the minimal element
  val shift  = chitab.index-of( fun(f){ f == min } ).negate
                                         // and use its position as our shift
  s.encode( shift )
}

fun test-uncaesar() {
  println( uncaesar( "nrnd lv d ixq odqjxdjh" ) )
}
The val keyword declares a static value. In the example, the value english
is a list of floating point numbers (of type double) denoting the average
frequency for each letter. The function freqs builds a frequency table for a
specific string, while the function chisqr calculates how well two frequency
tables match. In the function uncaesar these functions are used to find a
shift value that results in a string whose frequency table matches the
english one the closest – and we use that to decode the string. Let's try
it out in the editor!
🔗#2.11. Optional and named parameters
Being a function-oriented language, Koka has powerful support for function
calls where it supports both optional and named parameters. For example, the
function replace-all takes a string, a pattern pattern, and
a replacement string repl:
fun world() {
  replace-all("hi there", "there", "world")   // returns "hi world"
}
Using named parameters, we can also write the function call as:
fun world2() {
  return "hi there".replace-all( repl="world", pattern="there" )
}
Optional parameters let you specify default values for parameters that do not
need to be provided at a call-site. As an example, let's define a function
sublist that takes a list, a start position, and the length len of the desired
sublist. We can make the len parameter optional: by default it is the length of
the input list, so that all elements following the start position are returned:
fun sublist( xs : list<a>, start : int, len : int = xs.length ) : list<a> {
  if (start <= 0) return xs.take(len)
  match(xs) {
    Nil        -> Nil
    Cons(_,xx) -> xx.sublist(start - 1, len)
  }
}
Hover over the sublist identifier to see its full type, where the len
parameter has gotten an optional int type signified by the question mark:
:?int.
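At a call-site, the optional parameter can then be omitted entirely or supplied by name; for example (illustrative only):

```koka
['a','b','c'].sublist(1)            // ['b','c']: len defaults to xs.length
['a','b','c'].sublist(1, len = 1)   // ['b']
```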
🔗#2.12. Structs
An important aspect of a function-oriented language is to be able to define rich data types over which the functions work. A common data type is that of a struct or record. Here is an example of a struct that contains information about a person:
struct person( age : int, name : string, realname : string = name )

val gaga = Person( 25, "Lady Gaga" )
Every struct (and other data types) comes with constructor functions to
create instances, as in Person( 25, "Lady Gaga" ). Moreover, these
constructors can use named arguments so we can also call the constructor
as Person( name = "Lady Gaga", age = 25, realname = "Stefani Joanne Angelina Germanotta" ),
which is quite close to regular record syntax but without any special rules;
it is just functions all the way down!
Also, Koka automatically generates accessor functions for each field in a
struct (or other data type), and we can access the age of a person as
gaga.age (which is of course just syntactic sugar for age(gaga)).
🔗#2.13. Copying
By default, all structs (and other data types) are immutable. Instead of
directly mutating a field in a struct, we usually return a new struct where
the fields are updated. For example, here is a birthday function that
increments the age field:
fun birthday( p : person ) : person {
  return p( age = p.age + 1 )
}
Here, birthday returns a fresh person which is equal to p but with the
age incremented. The syntax p(...) is sugar for calling the copy constructor of
a person. This constructor is also automatically generated for each data
type, and is basically defined as:
fun copy( p, age = p.age, name = p.name, rname = p.realname ) {
  return Person(age, name, rname)
}
When arguments follow a data value, as in p( age = age + 1 ), it is desugared
to call this copy function, as in p.copy( age = p.age + 1 ). Again, there are
no special rules for record updates and everything is just function calls with
optional and named parameters.
🔗#2.14. More data types
Koka also supports algebraic data types where there are multiple alternatives. For example, here is an enumeration:
type colors {
  Red
  Green
  Blue
}
Special cases of these enumerated types are the void type which has no
alternatives (and therefore there exists no value with this type), the unit
type () which has just one constructor, also written as () (and
therefore, there exists only one value with the type (), namely ()), and
finally the boolean type bool with two constructors True and False.
type void

type () {
  ()
}

type bool {
  False
  True
}
Constructors can have parameters. For example, here is how to create a
number type which is either an integer or the infinity value:
type number {
  Infinity
  Integer( i : int )
}
We can create such a number by writing Integer(1) or Infinity. Moreover,
data types can be polymorphic and recursive. Here is the definition of the
list type which is either empty (Nil) or is a head element followed by a
tail list (Cons):
type list<a> {
  Nil
  Cons( head : a, tail : list<a> )
}
Koka automatically generates accessor functions for each named parameter. For
lists, for example, we can access the head of a list as Cons(1,Nil).head.
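For instance, with the generated accessors one can write (a small sketch; the value names are ours):

```koka
val xs = Cons(1, Cons(2, Nil))
val h  = xs.head   // 1
val t  = xs.tail   // Cons(2,Nil)
```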
We can now also see that struct types are just syntactic sugar for a regular
type with a single constructor of the same name as the type. For example,
our earlier person struct, defined as
struct person( age : int, name : string, realname : string = name )
desugars to:
type person {
  Person( age : int, name : string, realname : string = name )
}
🔗#2.15. Matching
Todo
🔗#2.16. Inductive, co-inductive, and recursive types
For the purposes of equational reasoning and termination checking, a type
declaration is limited to finite inductive types. There are two more
declarations, namely cotype and rectype that allow for co-inductive types,
and arbitrary recursive types respectively.
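For example, an infinite stream of values could be declared as a co-inductive type (a hypothetical sketch, not drawn from the text above):

```koka
cotype stream<a> {
  Next( head : a, tail : stream<a> )
}
```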
🔗#3. Koka language specification
This is the draft language specification of the Koka language, version 0.7.
Currently only the lexical and context-free grammar are specified.
The standard libraries are documented separately.
🔗#3.1. Lexical syntax
We define the grammar and lexical syntax of the language using standard BNF notation where non-terminals are generated by alternative patterns:
| nonterm | ::= | pattern1 pattern2 |
In the patterns, we use the following notations:
terminal | A terminal symbol | |
x0A | A character with hexadecimal code 0A | |
a..f | The characters from a to f | |
| ( pattern ) | Grouping | |
| [ pattern ] | Optional occurrence of pattern | |
| { pattern } | Zero or more occurrences of pattern | |
| pattern1 pattern2 | Choice: either pattern1 or pattern2 | |
| pattern<!diff> | Difference: elements generated by pattern except those in diff | |
| pattern<postfix> | Postfix: any elements in pattern that are directly followed by postfix | |
| <prefix>pattern | Prefix: any elements in pattern that directly follow prefix | |
| nonterm[lex] | Generate nonterm by drawing lexemes from lex |
Care must be taken to distinguish meta-syntax such as | and )
from the concrete terminal symbols | and ). In the specification
the order of the productions is not important and at each point the
longest matching lexeme is preferred. For example, even though
function is a reserved word, the word functions is considered a
single identifier. A prefix or postfix pattern is included
when considering a longest match.
🔗#3.1.1. Source code
Source code consists of a sequence of 8-bit characters. Valid characters in actual program code consist strictly of ASCII characters, which range from 0 to 127 and can be encoded in 7 bits. Only comments, string literals, and character literals are allowed to contain extended 8-bit characters.
🔗#3.1.2. Encoding
A program source is assumed to be UTF-8 encoded which allows comments,
string literals, and character literals to contain (encoded) unicode
characters. Moreover, the grammar is designed such that a lexical
analyzer and parser can directly work on source files without doing UTF-8
decoding or unicode category identification. To further facilitate the
processing of UTF-8 encoded files the lexical analyzer ignores an initial
byte-order mark that some UTF-8 encoders insert. In particular, any
program source is allowed to start with three byte-order mark bytes
0xEF, 0xBB, and 0xBF, which are ignored.
🔗#3.2. Lexical grammar
In the specification of the lexical grammar all white space is explicit and there is no implicit white space between juxtaposed symbols. The lexical token stream is generated by the non-terminal lex which consists of lexemes and whitespace.
Before doing lexical analysis, there is a linefeed character inserted at the start and end of the input, which makes it easier to specify line comments and directives.
🔗#3.2.1. Lexical tokens
| lex | ::= | lexeme whitespace | |
| lexeme | ::= | conid qconid | |
| | | varid qvarid | ||
| | | op opid qopid wildcard | ||
| | | natural float string char | ||
| | | reserved opreserved | ||
| | | special funanon |
The main program consists of whitespace or lexemes. The context-free grammar will draw its lexemes from the lex production.
🔗#3.2.2. Identifiers
| anyid | ::= | varid qvarid opid qopid conid qconid | |
| qconid | ::= | modulepath conid | |
| qvarid | ::= | modulepath lowerid | |
| modulepath | ::= | lowerid / { lowerid / } | |
| conid | ::= | upperid | |
| varid | ::= | lowerid<!reserved> | |
| lowerid | ::= | lower idtail | |
| upperid | ::= | upper idtail | |
| wildcard | ::= | _ idtail | |
| typevarid | ::= | letter { digit } | |
| idtail | ::= | { idchar } [ idfinal ] | |
| idchar | ::= | letter digit _ - | |
| idfinal | ::= | ? { ' } | |
| funanon | ::= | (fun function)< <(> | (anonymous functions must be followed by a ( or <)) |
| reserved | ::= | infix infixr infixl prefix | |
| | | type cotype rectype alias | ||
| | | forall exists some | ||
| | | fun function val var con | ||
| | | if then else elif match return | ||
| | | import as | ||
| | | public private abstract | ||
| | | interface instance with | ||
| | | external inline include | ||
| | | effect handle handler linear | ||
| | | yield qualified hiding | (future reserved words) |
Identifiers always start with a letter, may contain underscores and dashes, and can end with a question mark or primes. Like in Haskell, constructors always begin with an uppercase letter while regular identifiers are lowercase. The rationale is to visibly distinguish constructors from variables in pattern matches. Here are some examples of valid identifiers:
x concat1 visit-left nil? x' Cons True
To avoid confusion with the subtraction operator, the occurrences of dashes are restricted in identifiers. After lexical analysis, only identifiers where each dash is surrounded on both sides with a letter are accepted:
fold-right
n-1 // illegal, a digit cannot follow a dash
n - 1 // n minus 1
n-x-1 // illegal, a digit cannot follow a dash
n-x - 1 // identifier "n-x" minus 1
n - x - 1 // n minus x minus 1
Qualified identifiers are prefixed with a module path. Module paths can be partial as long as they are unambiguous.
core/map
std/core/(&)
🔗#3.2.3. Operators and symbols
| qopid | ::= | modulepath opid | |
| opid | ::= | ( symbols ) | |
| op | ::= | symbols<! opreservedoptype> || | |
| symbols | ::= | symbol { symbol } / | |
| symbol | ::= | $ % & * + | |
| | | ~ ! \ ^ # | ||
| | | = . : - ? | ||
| | | anglebar | ||
| anglebar | ::= | < > | | |
| opreserved | ::= | = . : -> | |
| optype | ::= | anglebar anglebar { anglebar } | |
| special | ::= | { } ( ) [ ] | ; , lapp lidx | |
| lapp | ::= | <apply>( | |
| lidx | ::= | <apply>[ | |
| apply | ::= | ) ] anyid | |
🔗#3.2.4. Literals
| string | ::= | @" { graphic<"> utf8 space tab newline "" } " | (raw string) |
| | | " { graphic<"\> utf8 space escape } " | ||
| char | ::= | ' ( graphic<'\> utf8 space escape ) ' | |
| escape | ::= | \ ( charesc hexesc ) | |
| charesc | ::= | n r t \ " ' | |
| hexesc | ::= | x hexdigit2 u hexdigit4 U hexdigit4 hexdigit2 | |
| hexdigit4 | ::= | hexdigit hexdigit hexdigit hexdigit | |
| hexdigit2 | ::= | hexdigit hexdigit | |
| float | ::= | decimal . decimal [ exponent ] | |
| exponent | ::= | (e | E) [ - | + ] decimal | |
| natural | ::= | decimal 0 (x | X) hexadecimal | |
| decimal | ::= | digit { digit } | |
| hexadecimal | ::= | hexdigit { hexdigit } |
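For illustration, here are some lexemes generated by the literal productions above (the value names are ours):

```koka
val s1 = "hi\nthere"   // string with a character escape
val s2 = @"c:\temp"    // raw string: backslash is literal, "" yields a quote
val c1 = '\x41'        // character via a 2-digit hex escape ('A')
val f1 = 3.14e-2       // float with an exponent
val n1 = 0xFF          // hexadecimal natural
```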
🔗#3.2.5. White space
| whitespace | ::= | white { white } newline | |
| white | ::= | space | |
| | | linecomment blockcomment | ||
| | | linedirective | ||
| linecomment | ::= | // { linechar } | |
| linedirective | ::= | newline # { linechar } | |
| linechar | ::= | graphic utf8 space tab | |
| blockcomment | ::= | /* blockpart { blockcomment blockpart } */ | (allows nested comments) |
| blockpart | ::= | blockchars<blockchars (/**/) blockchars> | |
| blockchars | ::= | { blockchar } | |
| blockchar | ::= | graphic utf8 space tab newline |
🔗#3.2.6. Character classes
| letter | ::= | upper lower | |
| upper | ::= | A..Z | |
| lower | ::= | a..z | |
| digit | ::= | 0..9 | |
| hexdigit | ::= | a..f A..F digit | |
| newline | ::= | [ return ] linefeed | (windows or unix style end of line) |
| space | ::= | x20 | (a space) |
| tab | ::= | x09 | (a tab (\t)) |
| linefeed | ::= | x0A | (a line feed (\n)) |
| return | ::= | x0D | (a carriage return (\r)) |
| graphic | ::= | x21..x7E | (a visible character) |
| utf8 | ::= | xC0 x80 | (encoded 0 character) |
| | | (xC2..xDF) cont | ||
| | | xE0 (xA0..xBF) cont | ||
| | | (xE1..xEC) cont cont | ||
| | | xED (x80..x9F) cont | ||
| | | (xEE..xEF) cont cont | ||
| | | xF0 (x90..xBF) cont cont | ||
| | | (xF1..xF3) cont cont cont | ||
| | | xF4 (x80..x8F) cont cont | ||
| cont | ::= | x80..xBF |
🔗#3.2.7. Semicolons?
Many languages have rules to avoid writing semicolons to separate statements. Even though most of the time these rules are quite intuitive to use, their actual definition can be quite subtle. For example, both Scala and Go require remembering specific tokens to know precisely when semicolon insertion takes place. In the case of JavaScript and Haskell (I am sad to admit) the precise behavior is bizarrely complex where semicolon insertion depends on the interaction between the lexer and parser.
In Koka, the grammar is carefully constructed to not need any statement separator at all and semicolons are never required by the grammar. They are still allowed in the grammar but strictly to help new programmers that are used to putting semicolons at the end of statements.
The construction of a grammar that does not need statement separators is
also good from a human perspective. The reason semicolons are required is
to resolve ambiguities in the syntax. When such ambiguities do not occur
in the first place, that also removes a cognitive burden from the
programmer. In particular, Koka statements often start with a keyword,
like val or match, signifying intention to both programmer and parser.
In other cases, we restrict the expression grammar. For example, one
reason why C requires semicolons is due to prefix- and postfix operators.
If we write p ++ q the C parser needs a semicolon in order to know if
we meant p++; q or p; ++q. Such ambiguity is resolved in Koka by
not having postfix operators and restricting prefix operators to !
and ~.
One other reason that Koka can do without a statement separator is the effect inference system: without such effect inference subtle bugs may occur if we leave out semicolons. For example, consider the following function:
fun square( x : int ) {
  x * x
}
which returns the square of x. Suppose now that we forgot to put
in the multiplication operation, giving:
fun square-wrong( x : int ) {
  x x
}
The Koka grammar sees this as two separate statements now, i.e. as x; x,
returning x instead. In a language without effect inference it is hard
to detect such errors, but the Koka type system rejects this program:
> fun square-wrong(x:int) { x x }
^
((1),27): error: expression has no effect and is unused
hint: did you forget an operator? or is there a space between an application?
🔗#3.2.8. Implementation
There is a full Flex (Lex) implementation of the lexical analysis. Ultimately, the Flex implementation serves as the specification, and this document and the Flex implementation should always be in agreement.
🔗#3.3. Context-free syntax
The grammar specification starts with the non-terminal module, which draws its lexical tokens from lex where all whitespace tokens are implicitly ignored.
🔗#3.3.1. Modules
| module[ lex ] | ::= | [ moduledecl ] modulebody | |
| moduledecl | ::= | semis [ visibility ] moduleid | |
| moduleid | ::= | qvarid varid | |
| modulebody | ::= | { semis declarations } semis | |
| | | semis declarations | ||
| visibility | ::= | public private | |
| semis | ::= | { ; } |
🔗#3.3.2. Top level declarations
| declarations | ::= | { import semis } { fixitydecl semis } { topdecl semis } | |
| import | ::= | [ visibility ] import [ moduleid = ] moduleid | |
| fixitydecl | ::= | [ visibility ] fixity natural identifier { , identifier } | |
| fixity | ::= | infixl | |
| | | infixr | ||
| | | infix | ||
| topdecl | ::= | [ visibility ] puredecl | |
| | | [ visibility ] aliasdecl | ||
| | | [ visibility ] typedecl | ||
| | | [ visibility ] externdecl | ||
| | | abstract typedecl |
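The top-level declaration forms can be combined as in the following sketch (the operator and module names are invented, and this has not been checked against the compiler):

```koka
import demo/hello              // plain import
import hi = demo/hello         // import under a local module name
infixl 7 (*+)                  // fixity declaration: left-associative, precedence 7

public fun (*+)( x : int, y : int ) : int {
  x * y + y
}
```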
🔗#3.3.3. Type declarations
| aliasdecl | ::= | alias typeid [ typeparams ] [ kannot ] = type | |
| typedecl | ::= | typesort [ typemod ] typeid [ typeparams ] [ kannot ] [ typebody ] | |
| | | struct typeid [ typeparams ] [ kannot ] [ conparams ] | ||
| | | effect [ linear ] typeid [ typeparams ] [ kannot ] [ opdecls ] | ||
| typesort | ::= | type cotype rectype | |
| typemod | ::= | open extend | |
| typeid | ::= | varid | |
| | | [] | ||
| | | ( { , } ) | ||
| | | < > | ||
| | | < | > | ||
| typeparams | ::= | < [ tbinders ] > | |
| tbinders | ::= | tbinder { , tbinder } | |
| tbinder | ::= | varid [ kannot ] | |
| typebody | ::= | { semis { constructor semis } } | |
| constructor | ::= | [ con ] conid [ typeparams ] [ conparams ] | |
| conparams | ::= | lparen [ conparam { , conparam } ] ) | |
| conparam | ::= | [ paramid ] : paramtype [ = expr ] | |
| lparen | ::= | lapp ( | |
| opdecls | ::= | { semis { opdecl semis } } | |
| opdecl | ::= | [ visibility ] identifier [ typeparams ] opparams [ : tatom ] | |
| opparams | ::= | lparen [ opparam { , opparam } ] ) | |
| opparam | ::= | [ paramid ] : paramtype |
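For example, the type declaration forms might be written as follows (an illustrative sketch using the state effect familiar from the Koka papers; not checked against the compiler):

```koka
alias age = int                          // aliasdecl

struct person( name : string, age : age = 18 )   // struct with a default field

type color {                             // typedecl with constructors
  con Red
  con Green
  con Blue
}

effect state<s> {                        // effect declaration with operations
  fun get() : s
  fun put( x : s ) : ()
}
```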
🔗#3.3.4. Value and function declarations
| puredecl | ::= | val valdecl | |
| | | (fun | function) fundecl | |
| valdecl | ::= | binder = expr | |
| binder | ::= | identifier [ : type ] | |
| fundecl | ::= | somes funid fundef bodyexpr | |
| fundef | ::= | [ typeparams ] parameters [ : tresult ] [ qualifier ] | |
| funid | ::= | identifier | |
| | | [ { , } ] | (indexing operator) | |
| parameters | ::= | lparen [ parameter { , parameter } ] ) | |
| parameter | ::= | paramid [ : paramtype ] [ = expr ] | |
| paramid | ::= | identifier wildcard | |
| paramtype | ::= | type | |
| | | ? type | (optional parameter) | |
| qidentifier | ::= | qvarid qidop identifier | |
| identifier | ::= | varid idop | |
| qoperator | ::= | op | |
| qconstructor | ::= | conid qconid |
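A value and a function declaration following these productions (illustrative only; not checked against the compiler):

```koka
val pi : double = 3.141592653589793      // val valdecl

public fun dist( x : double, y : double ) : total double {
  sqrt(x*x + y*y)                        // fun fundecl with a result annotation
}
```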
🔗#3.3.5. Statements
| block | ::= | { semis { statement semis } } | |
| statement | ::= | expr<!funexpr> | |
| | | decl | ||
| decl | ::= | (fun | function) fundecl | |
| | | val apattern = valexpr | (local values can use a pattern binding) | |
| | | var binder := valexpr |
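The three local declaration forms can appear freely between statements in a block, as in this sketch:

```koka
fun example() : int {
  val (x, y) = (1, 2)            // local value with a pattern binding
  var total := x + y             // mutable local variable
  fun inc( i : int ) { i + 1 }   // local function declaration
  inc(total)
}
```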
🔗#3.3.6. Expressions
| bodyexpr | ::= | -> blockexpr | |
| | | block | ||
| blockexpr | ::= | expr | (implicitly wrapped in a block) |
| expr | ::= | returnexpr | |
| | | matchexpr | ||
| | | handlerexpr | ||
| | | funexpr | ||
| | | ifexpr | ||
| | | opexpr | ||
| ifexpr | ::= | if atom then { elif } [ else expr<!ifexpr> ] | |
| then | ::= | [ then ] expr<!ifexpr> | |
| elif | ::= | elif atom then | |
| matchexpr | ::= | match atom { semis { matchrule semis } } | |
| returnexpr | ::= | return opexpr | |
| funexpr | ::= | fun fundef block | (anonymous function) |
| | | block | (zero-argument anonymous function) | |
| handlerexpr | ::= | handler handlereff handlerpars { handlerrules } | |
| | | handle handlereff haction handlerpars { handlerrules } | ||
| haction | ::= | lparen expr ) |
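For example, an if expression with an elif branch, and an anonymous function expression (a sketch, not checked against the compiler):

```koka
fun classify( i : int ) : string {
  if (i < 0) then "negative"     // the condition is an atom, hence the parentheses
  elif (i == 0) then "zero"
  else "positive"
}

val twice = fun( x : int ) { x + x }   // funexpr: anonymous function
```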
🔗#3.3.7. Operator expressions
For simplicity, we parse all operators as if they are left associative with the same precedence. We assume that a separate pass in the compiler will use the fixity declarations that are in scope to properly associate all operators in an expression.
| opexpr | ::= | prefixexpr { qoperator prefixexpr } | |
| prefixexpr | ::= | { ! ~ } appexpr | |
| appexpr | ::= | appexpr lapp [ arguments ] ) | (regular application) |
| | | appexpr lidx [ arguments ] ] | (index operation) | |
| | | appexpr { funexpr } | (apply function expressions) | |
| | | appexpr . atom | ||
| | | atom | ||
| arguments | ::= | argument { , argument } | |
| argument | ::= | [ identifier = ] expr |
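To illustrate, the following expression is first parsed left-associatively and only later re-associated using the fixity declarations in scope:

```koka
// This grammar parses the right-hand side as ((1 + 2) * 3);
// a later compiler pass re-associates it to 1 + (2 * 3) using
// the fixity declarations for + and *.
val answer = 1 + 2 * 3
```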
🔗#3.3.8. Atomic expressions
| atom | ::= | qidentifier | |
| | | qconstructor | ||
| | | literal | ||
| | | ( ) | (unit) | |
| | | ( annexpr ) | (parenthesized expression) | |
| | | ( annexprs ) | (tuple expression) | |
| | | [ [ annexpr { , annexpr } [ , ] ] ] | (list expression) |
| literal | ::= | natural float char string | |
| annexprs | ::= | annexpr { , annexpr } | |
| annexpr | ::= | expr [ : typescheme ] |
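Some atomic expressions, written here as value declarations (an illustrative sketch):

```koka
val u    = ()               // unit
val pair = (1, "one")       // tuple expression
val xs   = [1, 2, 3,]       // list expression; a trailing comma is allowed
val n    = (42 : int)       // parenthesized expression with a type annotation
```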
🔗#3.3.9. Matching
| matchrule | ::= | patterns | expr -> blockexpr | |
| | | patterns bodyexpr | ||
| apattern | ::= | pattern [ : typescheme ] | |
| pattern | ::= | identifier | |
| | | wildcard | ||
| | | qconstructor [lparen [ patargs ] )] | ||
| | | ( [ apatterns ] ) | (unit, parenthesized pattern, tuple pattern) | |
| | | [ [ apatterns ] ] | (list pattern) | |
| | | apattern as identifier | (named pattern) | |
| | | literal | ||
| patterns | ::= | pattern { , pattern } | |
| apatterns | ::= | apattern { , apattern } | |
| patargs | ::= | patarg { , patarg } | |
| patarg | ::= | [ identifier = ] apattern | (possibly named parameter) |
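A match expression exercising several of these pattern forms (a sketch; not checked against the compiler):

```koka
fun describe( xs : list<int> ) : string {
  match(xs) {
    Nil                -> "empty"
    Cons(_, Nil)       -> "a singleton"            // constructor patterns
    Cons(x, _) | x < 0 -> "starts negative"        // rule with a guard expression
    _                  -> "something else"         // wildcard
  }
}
```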
🔗#3.3.10. Handlers
| handlerrules | ::= | semis handlerrule { semis handlerrule } semis | |
| handlerrule | ::= | qidentifier opargs bodyexpr | |
| | | return (lparen oparg ) | paramid) bodyexpr | |
| opargs | ::= | lparen [ oparg { , oparg } ] ) | |
| oparg | ::= | paramid [ : type ] | |
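A handler following these productions, in the style of the algebraic-effects paper (an illustrative sketch; the raise effect and its handler are invented for this example):

```koka
effect raise {
  fun raise( msg : string ) : a
}

fun safe-div( x : int, y : int ) : raise int {
  if (y == 0) then raise("divide by zero") else x / y
}

val catch-zero = handler {
  return x   -> x     // the return clause transforms the final result
  raise(msg) -> 0     // handle the operation; here we discard the resumption
}
```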
🔗#3.3.11. Type schemes
| typescheme | ::= | somes foralls tarrow [ qualifier ] | |
| type | ::= | foralls tarrow [ qualifier ] | |
| foralls | ::= | [ forall typeparams ] | |
| somes | ::= | [ some typeparams ] | |
| qualifier | ::= | with ( predicates ) | |
| predicates | ::= | predicate { , predicate } | |
| predicate | ::= | typeapp | (interface) |
🔗#3.3.12. Types
| tarrow | ::= | tatom [ -> tresult ] | |
| tresult | ::= | tatom [ tbasic ] | |
| tatom | ::= | tbasic | |
| | | < anntype { , anntype } [ | tatom ] > | ||
| | | < > | ||
| tbasic | ::= | typeapp | |
| | | ( ) | (unit type) | |
| | | ( tparam ) | (parenthesized type or type parameter) | |
| | | ( tparam { , tparam } ) | (tuple type or parameters) | |
| | | [ anntype ] | (list type) | |
| typeapp | ::= | typecon [ < anntype { , anntype } > ] | |
| typecon | ::= | varid qvarid | |
| | | wildcard | ||
| | | ( , { , } ) | (tuple constructor) | |
| | | [ ] | (list constructor) | |
| | | ( -> ) | (function constructor) | |
| tparam | ::= | [ varid : ] anntype | |
| anntype | ::= | type [ kannot ] |
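For example, a function type with an explicit effect row (a sketch, not checked against the compiler):

```koka
// tatom as an effect row: f may raise exceptions (exn) or diverge (div),
// and is polymorphic in the remaining effects e after the bar.
fun apply42( f : int -> <exn,div|e> int ) : <exn,div|e> int {
  f(42)
}
```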
🔗#3.3.13. Kinds
| kannot | ::= | :: kind | |
| kind | ::= | ( kind { , kind } ) -> kind | |
| | | katom -> kind | ||
| | | katom | ||
| katom | ::= | V | (value type) |
| | | X | (effect type) | |
| | | E | (effect row) | |
| | | H | (heap type) | |
| | | P | (predicate type) |
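A kind annotation in practice (illustrative; the alias is invented for this example):

```koka
// h is annotated with kind H (a heap) and a with kind V (a value type),
// matching the kind of the built-in ref type constructor.
alias cell<h :: H, a :: V> = ref<h,a>
```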
🔗#3.3.14. Implementation
As a companion to the Flex lexical implementation, there is a full Bison(Yacc) LALR(1) implementation available. Again, the Bison parser functions as the specification of the grammar and this document should always be in agreement with that implementation.