During my time tutoring a university course on functional programming languages with Haskell, I noticed that many students found the concept of type unification challenging. To help bridge this gap I wrote this summary, complete with visualized examples to make type unification more approachable and easier to grasp.
Motivation
Type unification lets us determine the types of an expression and helps us check if the types of an expression are consistent.
the map function in Haskell has the following type:
map ::1. argumentFunction to be appliedto each element(a→b)→2. argumentList of elements[a]→return valueProcessed list[b]
Its 1st argument is a function that will be applied to each element of the list in the 2nd argument (e.g. map abs [-1,-2,3,4] returns [1,2,3,4])
Imagine we define a function:
mapAbs=mapabs-- which is equivalent to: mapAbs xs = map abs xs
Where map and abs have these types:
map::(a->b)->[a]->[b]abs::Numa=>a->a
This function expects a list and will apply the abs funciton on each element. Since abs gives us the absolute value of a number the most general type of mapAbs will be stricter than map's type. As you can see, the arguments of mapAbs have to be derived from the class Num (e.g. Int, Float) and cannot be of type Char for example. Running :type map abs in GHCi returns the type of mapAbs:
mapAbs::Numb=>[b]->[b]
How can we determine the types of such expressions ourselves though? This is where type unification comes into play. First we will look at the typing rules, particularly the application rule and then we will look at some examples, where we will calculate the types of different expressions.
Typing Rules
There are several typing rules [1,2,3,4,5]. For type unification we are interested in the application rule.
Application Rule
(st)::γ(τ)s::σ→τ;t::ρandγ(σ)=γ(ρ)
Application Rule
s and t are two functions where t is passed as the first argument to s and γ is the most general unifier. σ is the type of the argument passed to the function s which is the function t . τ is the type of the return value of s and ρ is the type of the argument passed to t .
In γ(σ)=γ(ρ) we match the type of the argument of s with the type signature of t . Lastly, in (st)::γ(τ) we define the type of the application (st) as γ(τ) . Notice that γ(τ) is exactly what we want to determine with the type unification.
Similar to the application rule for one argument s and t1…n are functions where the functions t1…n are passed to s . γ is the most general unifier again . σ1…n are the types of the arguments passed to the function s1…n . τ is the type of the return value of s and ρ1…n are the types of the arguments passed to t1…n .
In γ(σ1…n)=γ(ρ1…n) we match the types of the arguments of s with the type signatures of t1…n . Lastly, in (st1…tn)::γ(τ) we define the type of the application (st1…tn) as γ(τ) .
Example
Given two functions f and g we want to determine the type of the application (fg) . Calculate the type of the application (f g).
TIP
It might help to imagine defining a function h which applies f to g:
h=(fg)
The types of the functions f and g are defined as follows:
f::(a->a->b)->a->bfgxs=gxsxsg::c->Int->cgxy=x-- This is the const function, but with a constraint on the-- type of the second argument which has to be of type Int here.
Application Rule
Applying the types defined above to the application rule gives us:
The type variables in ρ and σ must be distinct. If they are not, we have to rename them.
Calculation of γ
We find γ(a→a→b)=γ(c→Int→c) .
In the following we decompose the expression and use expressions that cannot be further decomposed to substitute all their occurrences. [1]
G
E
Explanation
∅
a→a→b≐c→Int→c
Decomposition: First we need to decompose the expression.
∅
a≐ca≐Intb≐c
Substitution: After the decomposition we see that a≐c cannot be further decomposed. Therefore, we substitute all occurrences of a with c_ and move a≐c to the left side.
a↦c
c≐Intb≐c
Substitution: Now we substitute all occurrences of c with IntInt. Notice that we substitute on the left and the right side.
a↦Intc↦Int
b≐Int
Substitution: Finally we need to substitute all occurrences of b. But since there are no further occurences of b there is nothing to substitute.
a↦Intc↦Intb↦Int
∅
With the empty set on the right side we are done and can create the most general unifier.
Now we can determine the type of (fg) by applying the type substitution to γ(a→b) :
(fg)::Int→Int=γ(a→b)■
Therefore, (f g) expects an argument of type Int and returns an Int.
Validation
We can validate the type calculation using the Haskell interpreter. Therefore, open ghci in your terminal and enter the following commands. If you don't have GHCi installed you can open GHCi in the terminal online on tutorialspoint.com, or you can run Haskell locally in your browser using WASM on VaibhavSagar.com/WebVM[1].
-- This will always print the type after evaluation:set+t-- Multiline input can be entered in GHCi by surrounding it with :{ and :}:{f::(a->a->b)->a->bfgxs=gxsxsg::c->Int->cgxy=x:}-- Now we can check the type of (f g):type(fg)
TIP
On hoogle.haskell.org you can search for functions by their type signature.
For a list of all available commands in GHCi type :help in the GHCi prompt.
Example (f length words)
How do we determine the type of the application (f length words)?
This is type of f:
f::(a->b)->(c->[a])->c->[b]fg1g2xs=mapg1$g2xs-- :type (f length words) -- output: (f length words) :: String -> [Int]-- f length words "a bb cccc" -- output: [1,2,4]
The builtin functions (from the Prelude library) length and words have the following types (Note that String = [Char][1])
Now that we have the constraints we can calculate the most general unifier
G
E
Explanation
∅
a→b≐[d]→Intc→[a]≐[Char]→[[Char]]
Decomposition: First we decompose the first expression.
∅
a≐[d]b≐Intc→[a]≐[Char]→[[Char]]
Substitution: a≐[d] cannot be further decomposed. Therefore, we substitute all occurrences of a with [d][d].
a↦[d]
b≐Intc→[[d]]≐[Char]→[[Char]]
Substitution of b (No further b's to substitute).
a↦[d]b↦Int
c→[[d]]≐[Char]→[[Char]]
Decomposition of the second expression.
a↦[d]b↦Int
c≐[Char][[d]]≐[[Char]]
Substitution: of c (No further c's to substitute).
a↦[d]b↦Intc↦[Char]
[[d]]≐[[Char]]
We unpack the lists on both sides.
a↦[d]b↦Intc↦[Char]
d≐Char
Substitution: d≐Char cannot be further decomposed. Therefore, we substitute all occurrences of d with CharChar.
a↦[Char]b↦Intc↦[Char]d↦Char
∅
The most general unifier (MGU) is:
γ={a↦[Char],b↦Int,c↦[Char],d↦Char}
The type of (f length words) is:
(flengthwords)::[Char]→[Int]=γ(c→[b])■
In other words (f length words) takes an argument of type String (String = [Char]) and returns a list of Int's. The returned list shows the length of the words in the passed string.
Do you want to try to find the type of (f2 length words) yourself?
f2::(a->b)->(c->a)->[c]->[b]f2g1g2xs=map(g1.g2)xs-- :type (f2 length words) -- output: (f2 length words) :: [String] -> [Int]-- f2 length words ["a", "b b", "c c c c"] -- output: [1,2,4]
Example Error
Calculate the type of the application (all g).
all::Foldablet=>(a->Bool)->ta->Bool-- ≅ (a -> Bool) -> [a] -> Boolg::(Orda,Numa)=>a->[Bool]-- ≅ a -> [Bool]gx=[x>5,x>50]-- Another function with the same type signature would be:g=zipWith(>)[5,50].replicate2
Where
map
andabs
have these types: