Somewhat recently I got a MacBook Pro for work. I already use other Apple products and I figured that being a UNIX based system it is going to be very much useable for work related things. And, of course, I was curious about their new M1 soc. At the end, I was a bit dissapointed with the user experience and the desktop environment, coming from Linux/X11.
After some rudimentary research, I found the following programs to be very useful, and I highly recommend them for improving your interactions with the Mac OS. All of the programs below are free software and can be installed using Homebrew.
For GUI/DE:
For LaTeX workflow:
Other useful utilities:
mpv
. Great media
player.
Leibniz equality states that two things are equal if they are
industinguishable by any predicates. For a type A
and a, b : A
:
Leq(A, a, b) ≡ ∏ (P : A → Prop), P a ⇔ P b
In the language of HoTT we can restate the definition slightly:
Leq(A, a, b) ≡ ∏ (P : A → hProp), P a = P b
where the last equality is equality of hProps: P a =_hProp P b
.
Since hProp : hSet
, we have that P a =_hProp P b : hProp
and,
hence Leq(A, a, b) : hProp
.
Because Leq
is a proposition it cannot describe fully the actual
identity type Id(A, a, b)
. But it does accurately desctibe the
equality in hSets.
Let A : hSet
. Then we can define Leq(A, a, b) <-> Id(A, a, b)
.
First, we define f : Id(A, a, b) → Leq(A, a, b)
by
f(p : a = b) ≡ λ (P : A → hProp). ap P p
In the other direction we define g : Leq(A, a, b) → Id(A, a, b)
.
Suppose we have H : Leq(A, a, b)
. Then we just pick h(x) ≡ (a = x)
,
and H(h) : (a = a) = (a = b)
, which we apply to 1_a
.
g(H : ∏ P, P a = P b) ≡
let h = (λx. a = x) : A → hProp in
transport (λ X. X) (H(h)) (idpath a)
So, for the equality on hSet
s, the Leibniz equality coincides with
the ML indentity type.
The question that I have is whether we can generalize it and define a complete tower of Leibniz equalities? This is what I mean, specifically.
For an (n+1)-Type A
and a, b : A
, we can define Leq_n(A, a, b) : n-Type
as:
Leq_n(A, a, b) ≡ ∏ (P : A → n-Type), P a =_{n-Type} P b
We can still reuse the same functions f
and g
, which type check
correctly. We can also show:
g(f(p : a = b)) = p
By induction
g(f(1_a)) = g(λP. ap P 1_a) = g (λP. 1_{P_a}) =
(idpath (h(a) = h(a)))*(1_a) = 1_a
So we have a section a = b → Leq_n(A, a, b)
. But showing that f(g(H)) = H
is complicated. It would be nice to see if it can be derived
somehow.
In order to show that f(g(H)) = H
, you need to show that for any P
,
ap P (transport (H (λx. a = x)) 1a) = H P
Note that all the elements in the image of f
satisfy this condition.
So perhaps it will make sense to restrict the type Leq(A, a, b)
to
contain only such H
s that satisfy the condition above?
I am not sure, but it is something that might be worth looking into.
Note that the definition of Leibniz equality that we consider here is different from the "unrestricted" Leibniz equality:
Leq_Type(A, a, b) ≡ ∏ (P : A → Type), P a ↔ P b
This Leibniz equality type has been studied in the context of intensional type theory in "Leibniz equality is isomorphic to Martin-Löf identity, parametrically" by Abel, Cockx, Devriese, Timany, and Wadler.
They show that, under the assumptions of parametricity/under the parametricity translation, the above Leq_Type
is isomorphic to the standard intensional equality type.
It might be interesting to compare the "truncated" and "unrestricted" Leibniz equality types.
For example, we can represent a type as a colimit of a sequence of n-truncations.
Does this connection somehow lift to representation of Leq_Type
in terms of Leq_n
's?
Let me know if you have any ideas on this topic.
An algebraic theory $T$ is given by a collection of operations with given arities, and a number of equations of the form $\Gamma \mid t = s$ between terms of $T$. Here $t$ and $s$ can contain free variables from $\Gamma$. A model $M$ for an algebraic theory $T$ is an underlying set (call it $M$ as well) which interprets all the operations, and for which all the equations of $T$ hold. A morphism of $T$-models is a function of the underlying sets that presevres the interpretation of all the operations.
The category $Mod_T$ of $T$-models comes with a forgetful functor $U : Mod_T \to Set$. It has a left adjoint $F : Set \to Mod_T$ which constructs a free $T$-model on the underlying set.
The category $Mod_T$ and the functor $U$ always satisfies a number of properties that can be used for showing that certain theories are not algebraic, i.e. that certain categories do not arise as models of algebraic theories.
One of the properties of $U$ is that it reflects isomorphisms: if $f : M \to M'$ is a morphism of $T$-models, and $f$ has a set-theoretic inverse $g$ (i.e. $g : M' \to M$ such that $U(f) \circ g = 1$ and $g \circ U(f) = 1$), then $f$ is an isomorphism of models.
This can be easily observed for monoids: let $f$ be a group homomorphism and let $g$ be it's set theoretic inverse. Then $g(ab) = g(f(g(a)) f(g(b))) = g (f (g(a)g(b))) = g(a)g(b)$.
Posets. A discrete two-element poset $X = \{0,1\}$ can be embedded into the poset $Y = \{0, 1\}$ with $0 \leq_Y 1$. Moreover, on a set-theoretic level, this injection is just an identity function and hence is an isomorphism. If the theory of posets were algebraic, then $X$ and $Y$ were isomorphic as posets, which is not the case.
Up to isomorphism, a subobject of a $T$-model $M$ is a $T$-model $M'$ such that $M' \subseteq M$ and the interpretation of all the operations on $M'$ coincides with the interpretations in $M$.
Like in many categories, subobjects for a given model form a lattice. The meet of submodels is just a set-theoretic intersection (can be computed from the pullback of one submodel along the other). However, the join of two submodels may not coincide with the set-theoretic union: if we have two submodels $M_1, M_2$, and a binary operation $f$ in the theory, then the set-theoretic union $M_1 \cup M_2$ will not contain $\bar{f}(x, y)$ for $x \in M_1, y \in M_2$ and $\bar{f}$ the interpretation of $f$ in $M$.
Given a set $X \subseteq M$ we define the closure of $X$ to be the smallest submodel $\bar{X} \subseteq M$ that contains $X$. Explicitly, $$ \bar{X} = \{ \bar{f}(x_1, \dots, x_n) \in M \mid x_1, \dots, x_n \in X, f \in T \} $$ where $\bar{f}$ is a $M$-interpretation of an $n$-ary operation $f$ from $T$. $\bar{X}$ behaves like the closure operation and satisfies all the expected properties.
Using the closure operation we can define the join of two submodels $M_1, M_2 \subseteq M$ as $$M_1 \vee M_2 = \overline{M_1 \cup M_2}$$. This can be generalized to unions over arbitrary families $\bigvee_{i \in \mathcal{I}} M_i$.
Finally, note the following: if $M_i, i \in \mathcal{I}$ is a directed family, then $\bigvee_{i \in \mathcal{I}} M_i = \bigcup_{i \in \mathcal{I}} M_i$. If at some point we add $\bar{f}(a, b)$ to the join, with $a \in M_a, b \in M_b$, then there is some $M_c$ that includes both $M_a$ and $M_b$; this submodel $M_c$ already contains $\bar{f}(a, b)$.
Now we are ready to give another counterexample.
Complete lattices. The theory of complete lattices in not algebraic. Consider the completion $\omega+1$ of $\omega$ with the top element $\infty$. Consider a directed family $D = \{ \{ 0, \dots, n \} \mid n \in \mathbb{N} \}$. The set-theoretic union $\bigcup D$ is just the natural numbers (not a complete lattice), but the join $\bigvee D$ is $\omega+1$.
Some time ago I've purchased an Adafruit HD44780 LCD as a fancy accessory for playing around with Raspberry Pi. It was lying around in my office for some time until recently. I finally got around to putting it together (many thanks to my office mate Carlo for helping me solder and wire the thing).
It came with a Python library for controlling it via a high-level API. However, I don't really like Python, and I really don't like the design of CircuitPython. So I decided to write an interface to the LCD in OCaml, that I originally wanted to use on a Raspberry Pi (using the bindings to the WiringPi library).
The code can be found in my fork of the ocaml-wiringpi repository under the lcd_lwt
branch.
It mainly consists of two parts:
the low-level interface for writing data to the display and invoking
some operations, and a high-level interface powered by a "cursor"
module.
The first is a low-level API that controls the display using 4 data pins (D4-D7) and two special "control" pins. The first control pin is EN ("enable pin") which is used to signal the start of data read/write operations. The RS ("register select pin") is used as a flag for selecting which registers to address. The low value denotes the instruction register, and the high value denotes the data register.
For example, to clear the display (instruction 0x01 = 0b00000001) one would send the following signals to the LCD:
Internally, this sequence of signals stores 0b00000001 in the instruction register IR.
Instead of sending the instructions to the LCD, we can instead send data, which ends up in the display data RAM (DDRAM). Suppose we want to display a character 'a' (ascii 97 = 0b1100001). We then do the following sequence of operations:
We implement this operation of writing 8 bits to either the instruction register or the data register as the following function:
write8 : Lcd.t -> ?char_mode:bool -> char -> unit
where char_mode
determines whether we write the data to the data
register (true
) or to the instruction register (false
).
To write out a string to be displayed we can use Bytes.iter
:
let write_bytes lcd bts =
Bytes.iter (fun c -> write8 lcd c ~char_mode:true) bts
The location in which we store the data is determined by the address counter, which is automatically incremented after each write. This means that we can just send a sequence of characters sequentially, without touching the address counter ourselves.
The way that operations with arguments are typically represented is by
using e.g. 4 bits to denote the operation and 4 bits to denote the
arguments. For instance, to activate the 2 line mode, we invoke the
"function set" operation which requires the upper 4 bits to be 0010
,
with the argument "2 line mode" which requires the lower 4 bits to be
1000
. To obtain the code for the whole operation we just take the
bitwise OR:
0b0000 1000 -- 2 line mode
0b0010 0000 -- function set
___________
0b0010 1000
This translates to the following mini program in OCaml:
let _lcd_2line = (0x08) in
let _lcd_functionset = (0x20) in
write8_unsafe lcd (_lcd_2line lor _lcd_functionset);
Let us a consider a simple example: shifting the characters. By default, the data that is actually displayed on the LCD is taken from the addresses 0x0..0x7 and 0x40..0x47 for the first and the second rows, resp. If we want to display further characters we can use this shift operation. (See FIGURE 5 in the docs).
To do this we invoke the "cursor/display shift" operation settin the appropriate bits for moving the display and the move direction:
let _lcd_cursorshift = (0x10)
(* 00010000 *)
let _lcd_displaymove = (0x08)
(* 00001000 *)
let _lcd_moveright = (0x04)
(* 00000100 *)
(* -------- *)
(* 00011100 *)
let shift_right lcd =
write8_unsafe lcd (_lcd_cursorshift lor _lcd_displaymove lor _lcd_moveright)
We can provide a slightly higher-level interface by keeping track of the cursor (and some other settings) in the program.
type Cursor.t = { x: int; y: int; visible: bool; blink: bool; _lcd: mono_lcd }
A /cursor/ is a record that combines the underlying mono_lcd
type (which stores the pin layout), the current position of the cursor x, y
and some settings (whether the cursor should be visible and blinking).
Then all the operation on the cursor are just functions Cursor.t -> Cursor.t
.
For example, a function that write out a string takes in a cursor, writes the underlying bytes using write_bytes
and updates the cursor position.
A function that sets the blinking flag writes out the desired bytes (as descirbed in the "operations with arguments") and updates the boolean flag.
Combination of those operations are just function composition:
let (|>) (m : Cursor.t) (f : Cursor.t -> 'b) = f m
let display_lines lcd l1 l2 =
let col_shift = 3 in
clear lcd;
let cur = Cursor.of_lcd lcd in
let open Cursor in
cur
|> set_visible false
|> set_blink false
|> set_position col_shift 0
|> write_string l1 ~wrap:false
|> set_position col_shift 1
|> write_string l2 ~wrap:false
You can find more usage examples in the lcd_lwt.ml file. Overall, I thought that OCaml was a good fit for this kind of programming, and the type system helps out a bit!
A continuation can be understood as a context that represents the
“rest of the program”. Notice that formulated in this way,
continuation is a meta-level abstraction. Ability to manipulate such
continuations as functions is obtained using control operators. The
Scheme programming language has traditionally been a ripe field for
research on control operators such as (call/cc)
.
Call-with-current-continuation is an example of an unbounded control operator. A different and a more general class of control operators are delimited control operators. The basic idea is that rather to capture the whole rest of the program as a continuation, delimited control operators capture delimited continuations, i.e. some part of the program with a hole.
In order to run the examples from this document in DrRacket make sure to put
(require racket/control)
in your file.
One of the first (and simplest) delimited control operations is
prompt/control. prompt
delimits the continuation, and control
captures the current continuation (up to the innermost prompt
).
(prompt (+ 1 (control k (k 3))))
The code is evaluated as follows:
(prompt (+ 1 (control k (k 3))))
=> (prompt ((λ (k) (k 3)) (λ (x) (+ 1 x))))
=> (prompt ((λ (x) (+ 1 x)) 3))
=> (prompt (+ 1 3)) => (prompt 4) => 4
Here (λ (x) (+ 1 x))
represents the delimited continuation (the
program inside prompt
without the control
part) on an object
level. When (control k body)
is evaluated inside a prompt
, it gets
replaced with (λ (k) body)
(capturing k
inside body
) and applied
to the continuation. In terms of reduction semantics one would have
the following reduction rules:
(prompt v) -> v
(prompt E[(control k body)]) -> (prompt ((λ (k) body) (λ (x) E[x])))
where the evaluation context E does not contain prompt
and x
is a
free variable in E.
In particular, if continuation is not invoked in body
, it is just
discarded, as in the following example:
(prompt (+ 1 (control k 3)))
A captured delimited continuation can be invoked multiple times:
(prompt
(+ 1 (control k (let ([x (k 1)] [y (k 2)])
(k (* x y))))))
In this case the continuation k = (λ (x) (+ 1 x))
, so when it is
invoked in the body of control
, x and y get set to 2 and 3,
respectively. Hence, the final result of that program is 1+2*3=7.
Nested continuations
(prompt
(+ 2 (control k (+ 1 (control k1 (k1 6))))))
=> (prompt
((λ (x) (+ 2 (control k (+ 1 x)))) 6))
=> (prompt
(+ 2 (control k (+ 1 6))))
=> (prompt
(+ 2 (control k 7)))
=> (prompt 7) => 7
We can utilize the prompt/control operators to implement a simple macro for a while loop with an ability to break out of it, sort of similar to the while/break statements in C.
(define-syntax-rule (break) (control k '()))
(define-syntax-rule (while cond body ...)
(prompt
(let loop ()
(when cond
body ...
(loop)))))
We define the (while)
construct as a simple recursive loop with just
one caveat – we wrap the whole thing in a prompt
. Then (break)
,
when evaluated, just discards the whole continuation with is bound to
the (while)
loop. We can test this macro by writing a procedure that
multiplies all the elements in the list.
(define (multl lst)
(define i 0)
(define fin (length lst))
(define res 1)
(while (< i fin)
(let ([val (list-ref lst i)])
(printf "The value of lst[~a] is ~a\n" i val)
(set! res (* res val))
(when (= val 0)
(begin (set! res 0) (break))))
(set! i (+ i 1)))
res)
By running this program we can observe that it finishes early whenever it encounters a zero in the list:
> (multl '(1 2 3))
The value of lst[0] is 1
The value of lst[1] is 2
The value of lst[2] is 3
6
> (multl '(1 2 0 3 4 5))
The value of lst[0] is 1
The value of lst[1] is 2
The value of lst[2] is 0
0
>
The situation with break
in C is slightly different; as it has
been pointed out to me, break is a statement, whereas in our toy
example break
is an expression. Due to the dichotomy of statements
and expressions in C this example is not very faithful to C semantics.
The (while)
macro that we have is actually buggy. The problem arises
when the body of the while loop contains an additional prompt
delimiter:
(while (< i 3) (writeln "Hi") (prompt (break)) (set! i (+ i 1)))
When running this example “Hi” is written to the standard output three time. Oops. To circumvent this problem we can use the tagged prompt-at/control-at operators with the following reduction semantics:
(prompt-at tag v) -> v
(prompt-at tag E[(control-at tag' k body)]) -> (prompt-at tag ((λ (k) body) (λ (x) E[x])))
where tag = tag'
and E
does not contain prompt-at tag
. So the
only difference between prompt-at/control-at and prompt/control is the
presence of prompt tags which allows for a more fine-grained matching
between delimiters and control operators.
We can sort of fix our while macro by creating a special tag
(define while-tag (make-continuation-prompt-tag 'tagje))
(define-syntax-rule (break) (control-at while-tag k '()))
(define-syntax-rule (while cond body ...)
(prompt-at while-tag
(let loop ()
(when cond
body ...
(loop)))))
The following program then prints “Hi” only once.
(define i 0)
(while (< i 3) (writeln "Hi") (prompt (break)) (set! i (+ i 1)))
The other pair of delimited control operators are shift
and reset
.
The reductions for reset/shift
are as follows.
(reset v) -> v
(reset E[(shift k body)]) -> (reset ((λ (k) body) (λ (x) (reset E[x]))))
where the evaluation context E does not contain reset
and x
is a
free variable in E. Contrast this with the reduction rules for
prompt/control
(prompt v) -> v
(prompt E[(control k body)]) -> (prompt ((λ (k) body) (λ (x) E[x])))
As you can notice, the difference between the prompt/control
reductions is that in the case when the continuation is captured, it
is wrapped in an additional reset
. Thus, any invocation of a bound
delimited continuation k
cannot escape to the outer scope.
To observe the practical difference, consider the following example.
(define (skip) '())
(define (bye) (println "Capturing and discarding the continuation...") 42)
(prompt
(let ([act (control k (begin
(k skip)
(k (λ () (control _ (bye))))
(k skip)))])
(act)
(println "Doing stuff")))
If we were to remove the second line (k (λ () (control _ (bye))))
in
the begin block, then this program would print “Doing stuff” twice (as
invoking (k skip)
binds the dummy function skip
to act
and
executes (begin (act) (println "Doing stuff"))
). With such a line
present, during the invocation of k
, act
gets bound to (λ () (control _ (bye)))
. Therefore, when act
is evaluated the
continuation is of the form (prompt E[(control _ bye)])
, which just
reduces to (prompt (bye))
and (prompt 42)
. So, the output of the
program above is
"Doing stuff"
"Capturing and discarding the continuation..."
42
If we replace prompt
with reset
and control
with shift
, as in
the code snippet below, then every invocation of k
wraps the
continuation in another reset
.
(reset
(let ([act (shift k (begin
(k skip)
(k (λ () (shift _ (bye))))
(k skip)))])
(act)
(println "Doing stuff")))
After some reductions, the terms is
(reset
((λ (k) (begin (k skip) (k (λ () (shift _ (bye)))) (K skip)))
(λ (x) (reset (begin (x) (println "Doing stuff"))))))
As you can see, the second invocation of k
results in (reset (begin (shift _ (bye)) (println "Doing stuff")))
, and thus
(k (λ () (shift _ (bye))))
returns!
The output of the snippet is thus
"Doing stuff"
"Capturing and discarding the continuation..."
"Doing stuff"
See references at https://docs.racket-lang.org/reference/cont.html.
Some interesting papers:
shift
is a static delimited control
operator and control
is a dynamic delimited control operator.
Definition. A type $p(x)$ for the theory $T$ is a collection of formulas with a free variable $x$ consitent with $T$.
-- see Michael Weiss' explanation of types.
Definition. A type $p(x)$ is principal if there is a formula $\alpha(x)$ such that $p(x)=\{ \psi(x) \mid \alpha \vdash \psi \}$
Definition. A type $p(x)$ is full if for every formula $\psi(x)$ either $\psi \in p(x)$ or $\neg \psi \in p(x)$.
A full type can be seen as a way of describing an a generic "element" of a theory. For any model $M$ and an element $m \in M$ we can consider the full type $p(m)$ generated by $m$. It contains all the formlas $\psi(m)$ that are true about $m$.
Omitting types theorem says that given a complete countable theory $T$ in a countable language, there is a model $M$ of $T$ which omits countably-many non-principal full types.
Why is the non-principal condition needed?
If a theory $T$ is complete, then any model of $T$ realizes all the full principal types. For if $p(x)$ is a complete principal isolated by $\psi(x)$, then $T \not\vdash \not \exists x.\psi(x)$, for otherwise the type $p(x)$ would be inconsistent with $T$. Because $T$ is complete it must be the case that $T \vdash \exists x. \psi(x)$. Then such $x$ realizes $p(x)$.
What happens if the theory $T$ is not complete?
Then it is not the case that all models of $T$ realize all the full principle types. For instance take Peano arithmetic, and take $T_1 = PA + Con(PA)$ and $T_2 = PA + \neg Con(PA)$. Both $T_1$ and $T_2$ are consistent, thus they have models $M_1$ and $M_2$. Then denote by $q(x)$ the type generated by the element $0$ in $M_1$, and by $p(x)$ the type generated by the element $0$ in $M_2$. Both of those types are isolated by the formula $\phi(x) := \neg \exists y. y < x$, because $\phi$ completely determines $0$. Albeit those types are types for different theories, they are both types for a common theory $PA$. Furthemore, those types are complete and isolated. However, one of the types states the consistency of $PA$ and the other one points out the inconsistency of $PA$. Thus both $q$ and $p$ cannot be realized in the same model.
Adding the DNE to CCCs collapses them into Boolean algebras: Suppose $\mathcal{C}$ is a Cartesian closed category. We can show that if $\mathcal{C}$ has a family of isomorphisms between $A$ and $(A \to 0) \to 0$, then $\mathcal{C}$ is a poset (and a Boolean algebra).
Proposition. For any object $A$ of $\mathcal{C}$, $A \times 0 \simeq 0$.
Proof. ${\mathit{\mathop{Hom}}}(A \times 0, B) \simeq {\mathit{\mathop{Hom}}}(0 \times A, B) \simeq {\mathit{\mathop{Hom}}}(0, B^A) \simeq 1$. Therefore, $A \times 0$ is an initial object in $\mathcal{C}$. Clearly $(\pi_2, \bot_{A \times 0})$ establishes the isomorphism.
Qed
Proposition. Given a morphism $f : A \to 0$, we can conclude that $A \simeq 0$.
Proof. The isomorphism is witnessed by $(f, \bot_A) : A \to A \times 0 \simeq 0$ (where $\bot_A : 0 \to A$ is the unique initial morphism).
Qed
We have seen that given the morphism $f : A \to 0$ we can construct an iso $A \simeq 0$. In particular, that means that there can be at most one arrow between $A$ and $0$; interpreting arrows as "proofs" or "proof object", that means that in the intuitionistic calculus there is at most one proof/realizer of the negation (up to isomorphism).
Now, returning to the original question, how do CCCs with double-negation elimination look like? Specifically, we want to show that if in $\mathcal{C}$ there is an isomorphism between $A$ and $0^{(0^A)}$ for all objects $A$, then the categorical structure collapses into a Boolean algebra.
We say that a category has double negation elimination if for every $A$, it is the case that $A \simeq 0^{(0^A)}$.
Proposition. Given a category with double negation elimination and two morphisms $f, g : A \to B$, it is the case that $f = g$.
Proof. Let $i$ be the isomorphism between $B$ and $0^{(0^B)}$. Then, $i \circ f, i \circ g \in {\mathit{\mathop{Hom}}}(A, 0^{(0^B)}) \simeq {\mathit{\mathop{Hom}}}(A \times (0^B), 0)$. But the latter set can contain at most one element; thus, $i \circ f = i \circ g$. Because $i$ is an isomorphism, $f = g$.
Qed
It might be worth noting that in "standard" cartesian closed categories, a morphism $i : B \to 0^{(0^B)}$ serve as a coequalizer of $f$ and $g$. Perhaps stretching the analogy a bit, we can view $i$ as "equalizes" two constructive proofs of $B$ into a single classical proof.
See also
Disclamer: these notes have been written a couple of years ago. While some of the basic facts are still the case, some details may be subject to change. Trust, but verify!
It’s hard to get into writing code that uses GHC API. The API itself is and the number of various functions and options significantly outnumber the amount of tutorials around.
In this note I will try to elaborate on some of the peculiar, interesting problems I’ve encountered during my experience writing code that uses GHC API and also provide various tips I find useful.
Many of the points I make in this note are actually trivial, but nevertheless I made all of the mistakes mentioned in here myself, perhaps due to my naive approach of quickly diving in and experimenting, instead of reading into the documentation and source code.
This note is an adaptation of a series of my blog posts on the subject.
In order to run examples presented in this note you'll have to
ghc-paths
library
ghc -package ghc file.hs
OR
ghc-7.8.x
library: ghc-pkg expose ghc-7.8.2
The up to date Haddocks for GHC are located here.
There are different ways of bringing contents of Haskell modules into scope, a process that is necessary for evaluating/interpreting bits of code on-the-fly. We will walk through some of the caveats of this process.
Imagine the following situation: we have a Haskell source file with code we want to load dynamically and evaluate. That is a basic task in the GHC API terms, but there are actually different ways of doing that.
Let us have a file 'test.hs' containing the code we want to access:
module Test (test) where
test :: Int
test = 123
The basic way to get the 'test' data would be to load Test
as an interpreted module:
import Control.Applicative
import DynFlags
import GHC
import GHC.Paths
import MonadUtils (liftIO)
import Unsafe.Coerce
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
runGhc (Just libdir) $ do
-- we have to call 'setSessionDynFlags' before doing
-- everything else
dflags <- getSessionDynFlags
-- If we want to make GHC interpret our code on the fly, we
-- ought to set those two flags, otherwise we
-- wouldn't be able to use 'setContext' below
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
, ghcLink = LinkInMemory
}
setTargets =<< sequence [guessTarget "test.hs" Nothing]
load LoadAllTargets
-- Bringing the module into the context
setContext [IIModule $ mkModuleName "Test"]
-- evaluating and running an action
act <- unsafeCoerce <$> compileExpr "print test"
liftIO act
The reason that we have to use HscInterpreted
and LinkInMemory
is that otherwise it would compile test.hs in the current directory and leave test.hi and test.o files, which we would not be able to load in the interpreted mode. However, setContext
, will try to bring the code in those files first, when looking for the module Test
.
$ ghc -package ghc --make Interp.hs
[1 of 1] Compiling Main ( Interp.hs, Interp.o )
Linking Interp ...
$ ./Interp
123
Yay, it works! But let's try something fancier like printing a list of integers, one-by-one. For that, we will use an awesome forM_
function.
--- the rest of the file is the same
--- ...
-- evaluating and running an action
act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
liftIO act
When we try to run it:
$ ./Interp
Interp: panic! (the 'impossible' happened)
(GHC version 7.8.2 for x86_64-apple-darwin):
Not in scope: ‘forM_’
Well, fair enough, where can we expect GHC to get the forM_
from? We have to bring Control.Monad
into the scope in order to do that.
This brings us to the next section
Naively, we might want to load Control.Monad
in a similar fashion as we did with loading test.hs
.
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
, ghcLink = LinkInMemory
}
setTargets =<< sequence [ guessTarget "test.hs" Nothing
, guessTarget "Control.Monad" Nothing]
load LoadAllTargets
-- Bringing the module into the context
setContext [IIModule $ mkModuleName "Test"]
-- evaluating and running an action
act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
liftIO act
Unfortunately, this attempt fails:
Interp: panic! (the 'impossible' happened)
(GHC version 7.8.2 for x86_64-apple-darwin):
module ‘Control.Monad’ is a package module
Huh, what? I thought guessTarget
works on all kinds of modules.
Well, it does. But it doesn't "load the module", it merely sets it as the target for compilation. Basically, it (together with load LoadAllTargets
) is equivalent to calling ghc --make
. And surely it doesn't make much sense trying to ghc --make Control.Monad
when Control.Monad
is a module from the base package. What we need to do instead is to bring the compiled Control.Monad
module into scope. Luckily it's not very hard to do with the help of the simpleImportDecl :: ModuleName -> ImportDecl name
function:
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
runGhc (Just libdir) $ do
-- we have to call 'setSessionDynFlags' before doing
-- everything else
dflags <- getSessionDynFlags
-- If we want to make GHC interpret our code on the fly, we
-- ought to set those two flags, otherwise we
-- wouldn't be able to use 'setContext' below
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
, ghcLink = LinkInMemory
}
setTargets =<< sequence [ guessTarget "test.hs" Nothing ]
load LoadAllTargets
-- Bringing the module into the context
setContext [ IIModule $ mkModuleName "Test"
, IIDecl . simpleImportDecl
. mkModuleName $ "Control.Monad" ]
-- evaluating and running an action
act <- unsafeCoerce <$> compileExpr "forM_ [1,2,test] print"
liftIO act
And this works like a charm:
$ ./Interp
1
2
123
What we have implemented so far corresponds to the :load* test.hs
command in GHCi, which gives us the full access to the source code of
the program. To illustrate this let's modify our test file:
module Test (test) where
test :: Int
test = 123
test2 :: String
test2 = "Hi"
Now, if we want to load that file as an interpreted module and evaluate
test2
nothing will stop us from doing so.
$ ./Interp2
(123,"Hi")
If we want to use the compiled module (like :load test.hs
in GHCi),
we have to bring Test
into the context the same way we dealt with
Control.Monad
:
main = defaultErrorHandler defaultFatalMessager defaultFlushOut $
runGhc (Just libdir) $ do
dflags <- getSessionDynFlags
setSessionDynFlags $ dflags { hscTarget = HscInterpreted
, ghcLink = LinkInMemory
}
setTargets =<< sequence [ guessTarget "Test" Nothing ]
load LoadAllTargets
-- Bringing the module into the context
setContext [ IIDecl $ simpleImportDecl (mkModuleName "Test")
, IIDecl $ simpleImportDecl (mkModuleName "Prelude")
]
printExpr "test"
printExpr "test2"
printExpr :: String -> Ghc ()
printExpr expr = do
liftIO $ putStrLn ("-- Going to print " ++ expr)
act <- unsafeCoerce <$> compileExpr ("print (" ++ expr ++ ")")
liftIO act
The output:
$ ./Interp2
-- Going to print test
123
-- Going to print test2
target: panic! (the 'impossible' happened)
(GHC version 7.6.3 for x86_64-apple-darwin):
Not in scope: `test2'
Perhaps you meant `test' (imported from Test)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
Our exercises above produced a number of GHC panics/erros. By default you can expect GHC to spew all the errors onto your screen, but for different purposes you might want to, e.g. log them.
Naturally at first I tried the exception handling mechanism:
-- Main.hs:
import GHC
import GHC.Paths
import MonadUtils
import Exception
import Panic
import Unsafe.Coerce
import System.IO.Unsafe
-- I thought this code would handle the exception
handleException :: (ExceptionMonad m, MonadIO m)
=> m a -> m (Either String a)
handleException m =
ghandle (\(ex :: SomeException) -> return (Left (show ex))) $
handleGhcException (\ge -> return (Left (showGhcException ge ""))) $
flip gfinally (liftIO restoreHandlers) $
m >>= return . Right
-- Initializations, needed if you want to compile code on the fly
initGhc :: Ghc ()
initGhc = do
dfs <- getSessionDynFlags
setSessionDynFlags $ dfs { hscTarget = HscInterpreted
, ghcLink = LinkInMemory }
return ()
-- main entry point
main = test >>= print
test :: IO (Either String Int)
test = handleException $ runGhc (Just libdir) $ do
initGhc
setTargets =<< sequence [ guessTarget "file1.hs" Nothing ]
graph <- depanal [] False
loaded <- load LoadAllTargets
-- when (failed loaded) $ throw LoadingException
setContext (map (IIModule . moduleName . ms_mod) graph)
let expr = "run"
res <- unsafePerformIO . unsafeCoerce <$> compileExpr expr
return res
-- file1.hs:
module Main where
main = return ()
run :: IO Int
run = do
n <- x
return (n+1)
The problem is when I run the 'test' function above I receive the following output:
h> test
test/file1.hs:4:10: Not in scope: `x'
Left "Cannot add module Main to context: not a home module"
it :: Either String Int
It appears that the exception handler did cat an error, but a peculiar one, and not that one I wanted to catch.
We've studied the GHC API together with Luite Stegeman and I think we've found a more or less satisfactory solution.
Errors are handled using the LogAction specified in the DynFlags for your GHC session. So to fix this you need to change 'log_action' parameter in dynFlags. For example, you can do this:
initGhc = do
..
ref <- liftIO $ newIORef ""
dfs <- getSessionDynFlags
setSessionDynFlags $ dfs { hscTarget = HscInterpreted
, ghcLink = LinkInMemory
, log_action = logHandler ref -- ^ this
}
-- LogAction == DynFlags -> Severity -> SrcSpan -> PprStyle -> MsgDoc -> IO ()
logHandler :: IORef String -> LogAction
logHandler ref dflags severity srcSpan style msg =
case severity of
SevError -> modifyIORef' ref (++ printDoc)
SevFatal -> modifyIORef' ref (++ printDoc)
_ -> return () -- ignore the rest
where cntx = initSDocContext dflags style
locMsg = mkLocMessage severity srcSpan msg
printDoc = show (runSDoc locMsg cntx)
A package database is a directory where the information about your installed packages is stored. For each package registered in the database there is a .conf file with the package details. The .conf file contains the package description (just like in the .cabal file) as well as path to binaries and a list of resolved dependencies:
$ cat aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a.conf
name: aeson
version: 0.6.1.0.1
id: aeson-0.6.1.0.1-5a107a6c6642055d7d5f98c65284796a
<..snip..>
import-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722
library-dirs: /home/dan/.cabal/lib/aeson-0.6.1.0.1/ghc-7.7.20130722
<..snip..>
depends: attoparsec-0.10.4.0-acffb7126aca47a107cf7722d75f1f5e
base-4.7.0.0-b67b4d8660168c197a2f385a9347434d
blaze-builder-0.3.1.1-9fd49ac1608ca25e284a8ac6908d5148
bytestring-0.10.3.0-66e3f5813c3dc8ef9647156d1743f0ef
<..snip..>
You can use ghc-pkg
to manage installed packages on your system. For
example, to list all the packages you've installed run ghc-pkg list
.
To list all the package databases that are automatically picked up by
ghc-pkg
do the following:
$ ghc-pkg nonexistentpkg
/home/dan/ghc/lib/ghc-7.7.20130722/package.conf.d
/home/dan/.ghc/i386-linux-7.7.20130722/package.conf.d
See ghc-pkg --help
or the online documentation for more details.
By default GHC knows only about two package databases: the global
package database (usually /usr/lib/ghc-something/
on Linux) and the
user-specific database (usually ~/.ghc/lib
). In order to pick up a
package that resides in a different package database you have to
employ some tricks.
For some reason GHC API does not export an clear and easy-to-use function that would allow you to do that, although the code we need is present in the GHC sources.
The way this whole thing works is the following:
GHC calls initPackages, which reads the database files and sets up the internal table of package information
The reading of package databases is performed via the
readPackageConfigs
function. It reads the user package database, the global package
database, the "GHC_PACKAGE_PATH" environment variable, and applies
the extraPkgConfs function, which is a dynflag and has the following
type: extraPkgConfs :: [PkgConfRef] -> [PkgConfRef]
(PkgConfRef
is a type representing the package database). The extraPkgConf
flag
is supposed to represent the -package-db
command line option.
Once the database is parsed, the loaded packages are stored in the
pkgDatabase
dynflag which is a list of PackageConfigs
So, in order to add a package database to the current session we have
to simply modify the extraPkgConfs
dynflag. Actually, there is
already a function present in the GHC source that does exactly what we
need: addPkgConfRef :: PkgConfRef -> DynP ()
. Unfortunately it's not
exported so we can't use it in our own code. I rolled my own functions
that I am using in the interactive-diagrams project, feel free to copy
them:
-- | Add a package database to the Ghc monad
#if __GLASGOW_HASKELL_ >= 707
addPkgDb :: GhcMonad m => FilePath -> m ()
#else
addPkgDb :: (MonadIO m, GhcMonad m) => FilePath -> m ()
#endif
addPkgDb fp = do
dfs <- getSessionDynFlags
let pkg = PkgConfFile fp
let dfs' = dfs { extraPkgConfs = (pkg:) . extraPkgConfs dfs }
setSessionDynFlags dfs'
#if __GLASGOW_HASKELL_ >= 707
_ <- initPackages dfs'
#else
_ <- liftIO $ initPackages dfs'
#endif
return ()
-- | Add a list of package databases to the Ghc monad
-- This should be equivalen to
-- > addPkgDbs ls = mapM_ addPkgDb ls
-- but it is actaully faster, because it does the package
-- reintialization after adding all the databases
#if __GLASGOW_HASKELL_ >= 707
addPkgDbs :: GhcMonad m => [FilePath] -> m ()
#else
addPkgDbs :: (MonadIO m, GhcMonad m) => [FilePath] -> m ()
#endif
addPkgDbs fps = do
dfs <- getSessionDynFlags
let pkgs = map PkgConfFile fps
let dfs' = dfs { extraPkgConfs = (pkgs ++) . extraPkgConfs dfs }
setSessionDynFlags dfs'
#if __GLASGOW_HASKELL_ >= 707
_ <- initPackages dfs'
#else
_ <- liftIO $ initPackages dfs'
#endif
return ()
extraPkgConfs