-- Module exports.
module Ast where
import Base
import Map
data Type = TC String | TV String | TAp Type Type deriving Eq
arr a b = TAp (TAp (TC "->") a) b
data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual
data Pat = PatLit Ast | PatVar String (Maybe Pat) | PatCon String [Pat]
data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Proof Pred
data Constr = Constr String [(String, Type)]
data ModExport = ExportVar String | ExportCon String [String]
data Pred = Pred String Type deriving Eq
data Qual = Qual [Pred] Type
instance Show Type where
showsPrec _ = \case
TC s -> (s++)
TV s -> (s++)
TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b
TAp a b -> showParen True $ shows a . (' ':) . shows b
instance Show Pred where
showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++)
instance Show Qual where
showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t
instance Show Extra where
showsPrec _ = \case
Basic s -> (s++)
Const i -> shows i
ChrCon c -> shows c
StrCon s -> shows s
Link im s _ -> (im++) . ('.':) . (s++)
instance Show Pat where
showsPrec _ = \case
PatLit e -> shows e
PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp
PatCon s ps -> (s++) . foldr (.) id (((' ':) .) . shows <$> ps)
showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++)
instance Show Ast where
showsPrec prec = \case
E e -> shows e
V s -> showVar s
A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y
L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t
Pa vsts -> ("\\cases{"++) . foldr (.) id (intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts) . ('}':)
Proof p -> ("{Proof "++) . shows p . ("}"++)
data Instance = Instance
-- Type, e.g. Int for Eq Int.
Type
-- Dictionary name, e.g. "{Eq Int}"
String
-- Context.
[Pred]
-- Method definitions
(Map String Ast)
data Assoc = NAssoc | LAssoc | RAssoc deriving Eq
data Neat = Neat
{ typeclasses :: Map String [String]
, instances :: Map String [Instance]
, topDefs :: [(String, Ast)]
, topDecls :: Map String Qual
-- | Typed ASTs, ready for compilation, including ADTs and methods,
-- e.g. (==), (Eq a => a -> a -> Bool, select-==)
, typedAsts :: Map String (Qual, Ast)
, dataCons :: Map String [Constr]
, type2Cons :: Map String [String]
, ffiImports :: Map String Type
, ffiExports :: Map String String
, moduleImports :: [String]
, moduleExports :: Maybe [String]
, opFixity :: Map String (Int, Assoc)
}
neatEmpty = Neat Tip Tip [] Tip Tip Tip Tip Tip Tip [] Nothing Tip
patVars = \case
PatLit _ -> []
PatVar s m -> s : maybe [] patVars m
PatCon _ args -> concat $ patVars <$> args
typeVars = \case
TC _ -> []
TV v -> [v]
TAp x y -> typeVars x `union` typeVars y
depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex ->
if vertex `elem` visited then st else second (vertex:)
$ depthFirstSearch relation (vertex:visited, sequence) (relation vertex)
spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex ->
if vertex `elem` visited then st else second ((:setSequence) . (vertex:))
$ depthFirstSearch relation (vertex:visited, []) (relation vertex)
scc ins outs = spanning . depthFirst where
depthFirst = snd . depthFirstSearch outs ([], [])
spanning = snd . spanningSearch ins ([], [])
Client-side compiler
Our next goal is a browser-based edition of our compiler.
Crossly
We first add a few features related to WebAssembly. Firstly, the wasm command
compiles Haskell to C intended to be compiled to wasm. The resulting binary
assumes the host environment supplies a few functions such as env.getchar.
Secondly, the warts command prints the RTS generated by the compiler, also as
C code intended to be compiled to wasm. We later use this to build compilers
that can go directly from Haskell to wasm.
We do some housekeeping. Given a Neat, type inference had produced a tuple of
a particular type that contained the data needed by the next phase. We change
it to produce a new Neat with an updated typedAsts field, so there’s one
fewer data type to occupy our thoughts and APIs. We no longer need to pick out
specific fields to pass to the next phase, as we simply pass everything.
We take a first stab at top-level type declarations. We treat them similarly to default typeclass methods, in that during type inference, we trust the symbol has its annotated type, and only afterwards that we verify the annotated type matches the inferred type. It’s more complex because we must process an entire strongly connected component at a time.
Adding modules has made a mess of our various functions for looking up data constructors, top-level variables, typeclasses, and so on. We reorganize them a little to standardize the logic for searching through the list of imports. This makes it easier to add support for lists of export symbols. We also replace the giant global operator precedence table with local tables.
We experiment with hash consing which reduces heap usage by maximizing sharing. However, it may cost too much, as our compiler has grown appreciably slower.
-- Separate fixity phase.
-- Export lists.
module Parser where
import Base
import Ast
import Map
-- Parser.
data ParserState = ParserState
{ readme :: [(Char, (Int, Int))]
, landin :: String
, indents :: [Int]
}
data Parser a = Parser { getParser :: ParserState -> Either String (a, ParserState) }
instance Functor Parser where fmap f x = pure f <*> x
instance Applicative Parser where
pure x = Parser \inp -> Right (x, inp)
(Parser f) <*> (Parser x) = Parser \inp -> do
(fun, t) <- f inp
(arg, u) <- x t
pure (fun arg, u)
instance Monad Parser where
return = pure
(Parser x) >>= f = Parser \inp -> do
(a, t) <- x inp
getParser (f a) t
instance Alternative Parser where
empty = bad ""
x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp
notFollowedBy p = do
saved <- Parser \pasta -> Right (pasta, pasta)
ret <- p *> pure (bad "") <|> pure (pure ())
Parser \_ -> Right ((), saved)
ret
parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] where
rowcol s rc = case s of
[] -> []
h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc)
advanceRC n (r, c)
| n `elem` [10, 11, 12, 13] = (r + 1, 1)
| n == 9 = (r, (c + 8)`mod`8)
| True = (r, c + 1)
indentOf pasta = case readme pasta of
[] -> 1
(_, (_, c)):_ -> c
ins c pasta = pasta { landin = c:landin pasta }
angle n pasta = case indents pasta of
m:ms | m == n -> ins ';' pasta
| n + 1 <= m -> ins '}' $ angle n pasta { indents = ms }
_ -> pasta
curly n pasta = case indents pasta of
m:ms | m + 1 <= n -> ins '{' pasta { indents = n:m:ms }
[] | 1 <= n -> ins '{' pasta { indents = [n] }
_ -> ins '{' . ins '}' $ angle n pasta
sat f = Parser \pasta -> case landin pasta of
c:t -> if f c then Right (c, pasta { landin = t }) else Left "unsat"
[] -> case readme pasta of
[] -> case indents pasta of
[] -> Left "EOF"
m:ms | m /= 0 && f '}' -> Right ('}', pasta { indents = ms })
_ -> Left "unsat"
(h, _):t | f h -> let
p' = pasta { readme = t }
in case h of
'}' -> case indents pasta of
0:ms -> Right (h, p' { indents = ms })
_ -> Left "unsat"
'{' -> Right (h, p' { indents = 0:indents p' })
_ -> Right (h, p')
_ -> Left "unsat"
char c = sat (c ==)
rawSat f = Parser \pasta -> case readme pasta of
[] -> Left "EOF"
(h, _):t -> if f h then Right (h, pasta { readme = t }) else Left "unsat"
eof = Parser \pasta -> case pasta of
ParserState [] [] _ -> Right ((), pasta)
_ -> badpos pasta "want eof"
comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *>
(rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True
spaces = isNewline <$> rawSat isSpace
whitespace = do
offside <- or <$> many (spaces <|> comment)
Parser \pasta -> Right ((), if offside then angle (indentOf pasta) pasta else pasta)
hexValue d
| d <= '9' = ord d - ord '0'
| d <= 'F' = 10 + ord d - ord 'A'
| d <= 'f' = 10 + ord d - ord 'a'
isNewline c = ord c `elem` [10, 11, 12, 13]
isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:")
isSmall c = c <= 'z' && 'a' <= c || c == '_'
small = sat isSmall
large = sat \x -> (x <= 'Z') && ('A' <= x)
hexit = sat \x -> (x <= '9') && ('0' <= x)
|| (x <= 'F') && ('A' <= x)
|| (x <= 'f') && ('a' <= x)
digit = sat \x -> (x <= '9') && ('0' <= x)
decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit
hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit
nameTailChar = small <|> large <|> digit <|> char '\''
nameTailed p = liftA2 (:) p $ many nameTailChar
escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure '\0' <|> char 'x' *> (chr <$> hexadecimal))
tokOne delim = escape <|> rawSat (delim /=)
charSeq = mapM char
tokChar = between (char '\'') (char '\'') (tokOne '\'')
quoteStr = between (char '"') (char '"') $ many $ many (charSeq "\\&") *> tokOne '"'
quasiquoteStr = charSeq "[r|" *> quasiquoteBody
quasiquoteBody = charSeq "|]" *> pure [] <|> (:) <$> rawSat (const True) <*> quasiquoteBody
tokStr = quoteStr <|> quasiquoteStr
integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal
literal = lexeme . fmap E $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr
varish = lexeme $ nameTailed small
bad s = Parser \pasta -> badpos pasta s
badpos pasta s = Left $ loc $ ": " ++ s where
loc = case readme pasta of
[] -> ("EOF"++)
(_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c
varId = do
s <- varish
if elem s
["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"]
then bad $ "reserved: " ++ s else pure s
varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol)
varSym = lexeme $ do
s <- varSymish
if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s
conId = lexeme $ nameTailed large
conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol
conSym = do
s <- conSymish
if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s
special c = lexeme $ sat (c ==)
comma = special ','
semicolon = special ';'
lParen = special '('
rParen = special ')'
lBrace = special '{'
rBrace = special '}'
lSquare = special '['
rSquare = special ']'
backquoted = between (char '`' *> whitespace) (char '`' *> whitespace)
lexeme f = f <* whitespace
lexemePrelude = whitespace *>
Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of
Left _ -> Right ((), curly (indentOf pasta) pasta)
Right _ -> Right ((), pasta)
curlyCheck f = do
Parser \pasta -> Right ((), pasta { indents = 0:indents pasta })
r <- f
Parser \pasta -> let pasta' = pasta { indents = tail $ indents pasta } in case readme pasta of
[] -> Right ((), curly 0 pasta')
('{', _):_ -> Right ((), pasta')
(_, (_, col)):_ -> Right ((), curly col pasta')
pure r
conOf (Constr s _) = s
specialCase (h:_) = '|':conOf h
mkCase t cs = insert (specialCase cs)
( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ sts) -> foldr arr (TV "case") $ snd <$> sts) cs
, E $ Basic "I")
mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", [])
scottEncode _ ":" _ = E $ Basic "CONS"
scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs)
scottConstr t cs (Constr s sts) = foldr (.) (insertWith (error $ "constructor conflict: " ++ s) s
(Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts)
) [insertWith (error $ "field conflict: " ++ field) field (Qual [] $ t `arr` ft, L s $ foldl A (V s) $ inj $ proj field) | (field, ft) <- sts, field /= ""]
where
ts = snd <$> sts
proj fd = foldr L (V fd) $ fst <$> sts
inj x = map (\(Constr s' _) -> if s' == s then x else V "undefined") cs
mkAdtDefs t cs = foldr (.) (mkCase t cs) $ scottConstr t cs <$> cs
mkFFIHelper n t acc = case t of
TC s -> acc
TAp (TC "IO") _ -> acc
TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc
updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs
addAdt t cs ders neat = foldr derive neat' ders where
neat' = neat
{ typedAsts = mkAdtDefs t cs $ typedAsts neat
, dataCons = updateDcs cs $ dataCons neat
, type2Cons = insert (typeName t) (concatMap cnames cs) $ type2Cons neat
}
typeName = \case
TAp x _ -> typeName x
TC c -> c
cnames (Constr s sts) = s : concatMap (\(s, _) -> if s == "" then [] else [s]) sts
derive = \case
"Eq" -> addInstance "Eq" (mkPreds "Eq") t
[("==", Pa $ map eqCase cs
)]
"Ord" -> addInstance "Ord" (mkPreds "Ord") t
[("<=", Pa $ zipWith (cmpCase (V "True") (V "True") (V "False")) cs $ iterate (Just (V "False"):) $ (Nothing:) $ repeat $ Just $ V "True")
,("compare", Pa $ zipWith (cmpCase (V "LT") (V "EQ") (V "GT")) cs $ iterate (Just (V "GT"):) $ (Nothing:) $ repeat $ Just $ V "LT")
]
"Show" -> addInstance "Show" (mkPreds "Show") t
[("showsPrec", L "prec" $ Pa $ map showCase cs
)]
der -> error $ "bad deriving: " ++ der
rpats = map (\(Constr con args) -> [PatCon con $ mkPatVar "r" . show <$> [1..length args]]) cs
prec0 = A (V "ord") (E $ ChrCon '\0')
showCase (Constr con args) = let as = show <$> [1..length args]
in ([PatCon con $ mkPatVar "" <$> as], case args of
[] -> A (V "++") (E $ StrCon con)
_ -> case con of
':':_ -> A (A (V "showParen") $ V "True") $ foldr1
(\f g -> A (A (V ".") f) g)
[ A (A (V "showsPrec") prec0) (V "1")
, A (V "++") (E $ StrCon $ ' ':con++" ")
, A (A (V "showsPrec") prec0) (V "2")
]
_ -> A (A (V "showParen") $ A (A (V "<=") prec0) $ V "prec")
$ A (A (V ".") $ A (V "++") (E $ StrCon con))
$ foldr (\f g -> A (A (V ".") f) g) (L "x" $ V "x")
$ map (\a -> A (A (V ".") (A (V ":") (E $ ChrCon ' '))) $ A (A (V "showsPrec") prec0) (V a)) as
)
mkPreds classId = Pred classId . TV <$> typeVars t
mkPatVar pre s = PatVar (pre ++ s) Nothing
eqCase (Constr con args) = let as = show <$> [1..length args]
in ([PatCon con $ mkPatVar "l" <$> as], Pa
[ ([PatCon con $ mkPatVar "r" <$> as], foldr (\x y -> (A (A (V "&&") x) y)) (V "True")
$ map (\n -> A (A (V "==") (V $ "l" ++ n)) (V $ "r" ++ n)) as)
, ([PatVar "_" Nothing], V "False")])
cmpCase lt eq gt (Constr con args) xs = ([PatCon con $ mkPatVar "l" <$> as], Pa $ zip rpats $ maybe cmpArgs id <$> xs) where
cmpArgs = foldr go eq as
go a etc = flip A (A (A (V "compare") (V $ 'l':a)) (V $ 'r':a)) $ Pa
[ ([PatCon "LT" []], lt)
, ([PatCon "EQ" []], etc)
, ([PatCon "GT" []], gt)
]
as = show <$> [1..length args]
addClass classId v (sigs, defs) neat = if not $ member classId $ typeclasses neat then neat
{ typeclasses = insert classId (keys sigs) $ typeclasses neat
, typedAsts = selectors $ typedAsts neat
, topDefs = defaults ++ topDefs neat
} else error $ "duplicate class: " ++ classId
where
vars = take (size sigs) $ show <$> [0..]
selectors = foldr (.) id $ zipWith (\var (s, Qual ps t) -> insertWith (error $ "method conflict: " ++ s) s (Qual (Pred classId v:ps) t,
L "@" $ A (V "@") $ foldr L (V var) vars)) vars $ toAscList sigs
defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs
addInstance classId ps ty ds neat = neat
{ instances = insertWith (++) classId [Instance ty name ps (fromList ds)] $ instances neat
} where
name = '{':classId ++ (' ':shows ty "}")
addTopDecl (s, t) neat = neat { topDecls = insert s t $ topDecls neat }
addForeignImport foreignname ourname t neat = neat
{ typedAsts = insertWith (error $ "import conflict: " ++ ourname) ourname (Qual [] t, mkFFIHelper 0 t $ A (E $ Basic "F") $ A (E $ Basic "NUM") $ E $ Link "{foreign}" foreignname $ Qual [] t) $ typedAsts neat
, ffiImports = insertWith (error $ "duplicate import: " ++ foreignname) foreignname t $ ffiImports neat
}
addForeignExport e f neat = neat { ffiExports = insertWith (error $ "duplicate export: " ++ e) e f $ ffiExports neat }
addDefs ds neat = neat { topDefs = ds ++ topDefs neat }
addImport im neat = neat { moduleImports = im:moduleImports neat }
addFixities os prec neat = neat { opFixity = foldr (\o tab -> insert o prec tab) (opFixity neat) os }
parseErrorRule = Parser \pasta -> case indents pasta of
m:ms | m /= 0 -> Right ('}', pasta { indents = ms })
_ -> badpos pasta "missing }"
res w@(h:_) = reservedSeq *> pure w <|> bad ("want \"" ++ w ++ "\"") where
reservedSeq = if elem w ["let", "where", "do", "of"]
then curlyCheck $ lexeme $ charSeq w *> notFollowedBy nameTailChar
else lexeme $ charSeq w *> notFollowedBy (if isSmall h then nameTailChar else sat isSymbol)
paren = between lParen rParen
braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon
joinIsFail t = A (L "join#" t) (V "fail#")
addLets ls x = L "let" $ foldr L (L "in" bodies) $ fst <$> ls where
bodies = foldr A x $ joinIsFail . snd <$> ls
qconop = conSym <|> res ":" <|> backquoted conId
qconsym = conSym <|> res ":"
op = qconsym <|> varSym <|> backquoted (conId <|> varId)
con = conId <|> paren qconsym
var = varId <|> paren varSym
tycon = do
s <- conId
pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s
aType =
lParen *>
( rParen *> pure (TC "()")
<|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen)
<|> tycon
<|> TV <$> varId
<|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare)))
bType = foldl1 TAp <$> some aType
_type = foldr1 arr <$> sepBy bType (res "->")
fixityDecl w a = do
res w
n <- lexeme integer
os <- sepBy1 op comma
pure $ addFixities os (n, a)
fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc
cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl
cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi
genDecl = (,) <$> var <* res "::" <*> (Qual <$> (scontext <* res "=>" <|> pure []) <*> _type)
classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls))
simpleClass = Pred <$> conId <*> _type
scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma)
instDecl = res "instance" *>
((\ps cl ty defs -> addInstance cl ps ty defs) <$>
(scontext <* res "=>" <|> pure [])
<*> conId <*> _type <*> (res "where" *> braceDef))
letin = addLets <$> between (res "let") (res "in") braceDef <*> expr
ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$>
(res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr)
listify = foldr (\h t -> A (A (V ":") h) t) (V "[]")
alts = joinIsFail . Pa <$> braceSep ((\x y -> ([x], y)) <$> pat <*> gdSep "->")
cas = flip A <$> between (res "case") (res "of") expr <*> alts
lamCase = curlyCheck (res "case") *> alts
nalts = joinIsFail . Pa <$> braceSep ((,) <$> many apat <*> gdSep "->")
lamCases = curlyCheck (res "cases") *> nalts
lam = res "\\" *> (lamCase <|> lamCases <|> liftA2 onePat (some apat) (res "->" *> expr))
flipPairize y x = A (A (V ",") x) y
moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma
thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ",")))
parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id)
rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr
section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()"))
maybePureUnit = maybe (V "pure" `A` V "()") id
stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr)
<|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr
<|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef)
doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt)
compQual =
(\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs)
<$> pat <*> (res "<-" *> expr)
<|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr
<|> addLets <$> (res "let" *> braceDef)
sqExpr = between lSquare rSquare $
((&) <$> expr <*>
( res ".." *>
( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr
<|> pure (A (V "enumFrom"))
)
<|> res "|" *>
((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma)
<|> (\t h -> listify (h:t)) <$> many (comma *> expr)
)
)
<|> pure (V "[]")
fbind = A <$> (V <$> var) <*> (res "=" *> expr)
fBinds v = (do
fbs <- between lBrace rBrace $ sepBy1 fbind comma
pure $ A (E $ Basic "{=") $ foldr A (E $ Basic "=}") $ v:fbs
) <|> pure v
atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section
<|> cas <|> lam <|> (paren comma *> pure (V ","))
<|> V <$> (con <|> var) <|> literal
>>= fBinds
aexp = foldl1 A <$> some atom
chain a = \case
[] -> a
A f b:rest -> case rest of
[] -> A (A f a) b
_ -> L "(" $ A (A (A f a) b) $ foldr A (V ")") rest
_ -> error "unreachable"
expr = chain <$> aexp <*> many (A <$> (V <$> op) <*> aexp)
gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> lSquare *> rSquare *> pure "[]"
apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing)
<|> flip PatVar Nothing <$> (res "_" *> pure "_")
<|> flip PatCon [] <$> gcon
<|> PatLit <$> literal
<|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" [])
<$> between lSquare rSquare (sepBy pat comma)
<|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" []))
where pairPat x y = PatCon "," [x, y]
patChain a = \case
[] -> a
PatCon f [b]:rest -> case rest of
[] -> PatCon f [a, b]
_ -> PatCon "{+" $ PatCon f [a, b] : rest
_ -> error "unreachable"
patAtom = PatCon <$> gcon <*> many apat <|> apat
pat = patChain <$> patAtom <*> many (PatCon <$> qconop <*> ((:[]) <$> patAtom))
maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id)
gdSep s = maybeWhere $ res s *> expr <|> foldr (\x no -> A (L "no#" x) no) (V "join#") <$> some (between (res "|") (res s) guards <*> expr)
guards = foldr1 (\f g -> \yes -> f (g yes)) <$> sepBy1 guard comma
guard = guardPat <$> pat <*> (res "<-" *> expr) <|> guardExpr <$> expr
<|> addLets <$> (res "let" *> braceDef)
guardExpr x yes = case x of
V "True" -> yes
_ -> A (A (A (V "if") x) yes) (V "no#")
guardPat p x yes = A (Pa [([p], yes), ([PatVar "_" Nothing], V "no#")]) x
onePat vs x = joinIsFail $ Pa [(vs, x)]
defOnePat vs x = Pa [(vs, x)]
opDef x f y rhs = [(f, defOnePat [x, y] rhs)]
leftyPat p expr = case pvars of
[] -> []
(h:t) -> let gen = '@':h in
(gen, expr):map (\v -> (v, A (Pa [([p], V v)]) $ V gen)) pvars
where
pvars = filter (/= "_") $ patVars p
def = liftA2 (\l r -> [(l, r)]) var (liftA2 defOnePat (many apat) $ gdSep "=")
<|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> gdSep "=" <|> leftyPat x <$> gdSep "=")
coalesce = \case
[] -> []
h@(s, x):t -> case t of
[] -> [h]
(s', x'):t' -> let
f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts'
f _ _ = error "bad multidef"
in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t
defSemi = coalesce . concat <$> sepBy1 def (some semicolon)
braceDef = concat <$> braceSep defSemi
simpleType c vs = foldl TAp (TC c) (map TV vs)
conop = conSym <|> backquoted conId
fieldDecl = (\vs t -> map (, t) vs) <$> sepBy1 var comma <*> (res "::" *> _type)
constr = (\x c y -> Constr c [("", x), ("", y)]) <$> bType <*> conop <*> bType
<|> Constr <$> conId <*>
( concat <$> between lBrace rBrace (fieldDecl `sepBy` comma)
<|> map ("",) <$> many aType)
dclass = conId
_deriving = (res "deriving" *> ((:[]) <$> dclass <|> paren (dclass `sepBy` comma))) <|> pure []
adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") <*> _deriving
impDecl = addImport <$> (res "import" *> conId)
tops = braceSep
$ adt
<|> classDecl
<|> addTopDecl <$> genDecl
<|> instDecl
<|> res "foreign" *>
( res "import" *> var *> (addForeignImport <$> lexeme tokStr <*> var <*> (res "::" *> _type))
<|> res "export" *> var *> (addForeignExport <$> lexeme tokStr <*> var)
)
<|> addDefs <$> defSemi
<|> fixity
<|> impDecl
export_ = ExportVar <$> varId <|> ExportCon <$> conId <*>
( paren ((:[]) <$> res ".." <|> sepBy (var <|> con) comma)
<|> pure []
)
exports = Just <$> paren (export_ `sepBy` comma)
<|> pure Nothing
haskell = between lexemePrelude eof $ some $ mayModule tops
mayModule p = res "module" *> ((,) <$> conId <*> ((,) <$> exports <* res "where" <*> p))
<|> ("Main",) . (Nothing,) <$> p
parseProgram s = fmap fst $ parse haskell s
-- Export lists.
-- Detect missing instances.
-- Top-level type annotations.
module Typer where
import Base
import Map
import Ast
import Parser
import Unify
-- Pattern compiler.
rewritePats searcher = \case
[] -> pure $ V "join#"
vsxs@((as0, _):_) -> case as0 of
[] -> pure $ foldr1 (A . L "join#") $ snd <$> vsxs
_ -> do
let k = length as0
n <- get
put $ n + k
let vs = take k $ (`shows` "#") <$> [n..]
cs <- flip mapM vsxs \(a:at, x) -> (a,) <$> foldM (\b (p, v) -> rewriteCase searcher v Tip [(p, b)]) x (zip at $ tail vs)
flip (foldr L) vs <$> rewriteCase searcher (head vs) Tip cs
patEq lit b x y = A (L "join#" $ A (A (A (V "if") (A (A (V "==") lit) b)) x) $ V "join#") y
rewriteCase searcher caseVar tab = \case
[] -> flush $ V "join#"
((v, x):rest) -> go v x rest
where
rec = rewriteCase searcher caseVar
go v x rest = case v of
PatLit lit -> flush =<< patEq lit (V caseVar) x <$> rec Tip rest
PatVar s m -> let x' = fill s (V caseVar) x in case m of
Nothing -> flush =<< A (L "join#" x') <$> rec Tip rest
Just v' -> go v' x' rest
PatCon con args -> rec (insertWith (flip (.)) con ((args, x):) tab) rest
flush onFail = case toAscList tab of
[] -> pure onFail
-- TODO: Check rest of `tab` lies in cs.
(firstC, _):_ -> do
let cs = either error id $ findCon searcher firstC
jumpTable <- mapM (\(Constr s ts) -> case mlookup s tab of
Nothing -> pure $ foldr L (V "join#") $ const "_" <$> ts
Just f -> rewritePats searcher $ f []
) cs
pure $ A (L "join#" $ foldl A (A (V $ specialCase cs) $ V caseVar) jumpTable) onFail
resolveFieldBinds searcher t = go t where
go t = case t of
E _ -> t
V _ -> t
A (E (Basic "{=")) (A rawExpr fbsAst) -> let
expr = go rawExpr
fromAst t = case t of
A (A (V f) body) rest -> (f, go body):fromAst rest
E (Basic "=}") -> []
fbs@((firstField, _):_) = fromAst fbsAst
(con, fields) = findField searcher firstField
cs = either error id $ findCon searcher con
newValue = foldl A (V con) [maybe (V $ "[old]"++f) id $ lookup f fbs | (f, _) <- fields]
initValue = foldl A expr [maybe (V "undefined") id $ lookup f fbs | (f, _) <- fields]
updater = foldr L newValue $ ("[old]"++) . fst <$> fields
inj x = map (\(Constr con' _) -> if con' == con then x else V "undefined") cs
allPresent = all (`elem` (fst <$> fields)) $ fst <$> fbs
isCon = case expr of
V (h:_) -> 'A' <= h && h <= 'Z'
_ -> False
in if allPresent
then if isCon then initValue else foldl A (A (V $ specialCase cs) expr) $ inj updater
else error "bad fields in update"
A x y -> A (go x) (go y)
L s x -> L s $ go x
fixFixity findPrec t = case t of
E _ -> pure t
V _ -> pure t
A x y -> A <$> go x <*> go y
L s b
| s == "(" -> infixer findPrec =<< go b
| True -> L s <$> go b
Pa vsxs -> Pa <$> mapM (\(ps, a) -> (,) <$> mapM pgo ps <*> go a) vsxs
where
go = fixFixity findPrec
pgo = patFixFixity findPrec
data OpTree = OpLeaf Ast | OpNode String Ast OpTree
infixer findPrec (A (A (A (V s) x) y) t) = go (OpNode s x (OpLeaf y)) t
where
go acc = \case
A (A (V s) z) rest -> do
acc' <- ins s z acc
go acc' rest
V ")" -> pure $ decode acc
_ -> error "unreachable"
ins s z t = case t of
OpNode s' x y -> isStronger findPrec s s' >>= \case
True -> OpNode s' x <$> ins s z y
False -> pure $ OpNode s (decode t) (OpLeaf z)
OpLeaf x -> pure $ OpNode s x (OpLeaf z)
decode = \case
OpNode f x y -> A (A (V f) x) (decode y)
OpLeaf x -> x
isStronger findPrec s s' = if prec <= prec'
then if prec == prec'
then if assoc == assoc'
then case assoc of
LAssoc -> pure False
RAssoc -> pure True
NAssoc -> Left $ "adjacent NAssoc: " ++ s ++ " vs " ++ s'
else Left $ "assoc mismatch: " ++ s ++ " vs " ++ s'
else pure False
else pure True
where
(prec, assoc) = findPrec s
(prec', assoc') = findPrec s'
patFixFixity findPrec p = case p of
PatLit _ -> pure p
PatVar s m -> PatVar s <$> maybe (pure Nothing) (fmap Just . go) m
PatCon "{+" args -> patFixer findPrec args
PatCon con args -> PatCon con <$> mapM go args
where
go = patFixFixity findPrec
data PopTree = PopLeaf Pat | PopNode String Pat PopTree
patFixer findPrec (PatCon f [a, b]:rest) = go seed rest where
seed = PopNode f a (PopLeaf b)
go acc = \case
[] -> pure $ decode acc
PatCon s [z]:rest -> do
acc' <- ins s z acc
go acc' rest
ins s z t = case t of
PopNode s' x y -> isStronger findPrec s s' >>= \case
True -> PopNode s' x <$> ins s z y
False -> pure $ PopNode s (decode t) (PopLeaf z)
PopLeaf x -> pure $ PopNode s x (PopLeaf z)
decode = \case
PopNode f x y -> PatCon f [x, decode y]
PopLeaf x -> x
secondM f (a, b) = (a,) <$> f b
patternCompile searcher t = astLink searcher $ resolveFieldBinds searcher $ evalState (go $ either error id $ fixFixity (findPrec searcher) t) 0 where
go t = case t of
E _ -> pure t
V _ -> pure t
A x y -> liftA2 A (go x) (go y)
L s x -> L s <$> go x
Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats searcher
-- Type inference.
instantiate' t n tab = case t of
TC s -> ((t, n), tab)
TV s -> case lookup s tab of
Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab)
Just v -> ((v, n), tab)
TAp x y -> let
((t1, n1), tab1) = instantiate' x n tab
((t2, n2), tab2) = instantiate' y n1 tab1
in ((TAp t1 t2, n2), tab2)
instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab)
instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where
((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps
proofApply sub a = case a of
Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty)
A x y -> A (proofApply sub x) (proofApply sub y)
L s t -> L s $ proofApply sub t
_ -> a
typeAstSub sub (t, a) = (apply sub t, proofApply sub a)
maybeFix s x = if go x then A (E $ Link "#" "fix" $ Qual [] $ arr (arr (TV "a") (TV "a")) (TV "a")) (L s x) else x where
go = \case
V v -> s == v
A x y -> go x || go y
L v x -> s /= v && go x
_ -> False
nonemptyTails [] = []
nonemptyTails xs@(x:xt) = xs : nonemptyTails xt
fv f bound = \case
V s | not (elem s bound) && f s -> [s]
A x y -> fv f bound x `union` fv f bound y
L s t -> fv f (s:bound) t
_ -> []
fill s a t = case t of
E _ -> t
V v -> if s == v then a else t
A x y -> A (fill s a x) (fill s a y)
L v u -> if s == v then t else L v $ fill s a u
simulFill tab = go [] where
go bnd t = case t of
V s | not $ elem s bnd -> maybe t id $ lookup s tab
A x y -> A (go bnd x) (go bnd y)
L s t' -> L s $ go (s:bnd) t'
_ -> t
triangulate vs defs x = foldr triangle x components where
tab = zip vs defs
ios = foldr (\(s, t) (ins, outs) -> let dsts = fv (`elem` vs) [] t in
(foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs))
(Tip, Tip) tab
components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs
triangle names expr = let
tnames = nonemptyTails names
appem vs = foldl1 A $ V <$> vs
suball x = simulFill (zip (init names) $ appem <$> init tnames) x
redef tns x = foldr L (suball x) tns
in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ redef xt $ maybe (error $ "oops: " ++ x) id $ lookup x tab) (suball expr) tnames
decodeLets x = decodeVars id x where
decodeVars f = \case
L "in" t -> decodeBodies id vs t
L v t -> decodeVars (f . (v:)) t
where
vs = f []
decodeBodies g [] x = ((vs, g []), x)
decodeBodies g (_:t) (A x y) = decodeBodies (g . (x:)) t y
infer msg typed loc ast csn@(cs, n) = case ast of
E x -> Right $ case x of
Basic bug -> error bug
Const n -> ((TC "Int", ast), csn)
ChrCon _ -> ((TC "Char", ast), csn)
StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn)
Link im s q -> insta q
V s -> maybe (Left $ "undefined: " ++ s) Right
$ either (\t -> ((t, ast), csn)) insta <$> lookup s loc
<|> insta . fst <$> mlookup s typed
A x y -> rec loc x (cs, n + 1) >>=
\((tx, ax), csn1) -> rec loc y csn1 >>=
\((ty, ay), (cs2, n2)) -> unifyMsg msg tx (arr ty va) cs2 >>=
\cs -> Right ((va, A ax ay), (cs, n2))
L "let" lets -> do
let ((vars, defs), x) = decodeLets lets
rec loc (triangulate vars defs x) csn
L s x -> first (\(t, a) -> (arr va t, L s a)) <$> rec ((s, Left va):loc) x (cs, n + 1)
where
rec = infer msg typed
va = TV $ show n
insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1))
where (Qual preds ty1, n1) = instantiate ty n
findInstance searcher qn@(q, n) p@(Pred cl ty) insts = case insts of
[] -> case ty of
TV _ -> let v = '*':show n in Right (((p, v):q, n + 1), V v)
_ -> Left $ "no instance: " ++ show p
(modName, Instance h name ps _):rest -> case match h ty of
Nothing -> findInstance searcher qn p rest
Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t)
<$> findProof searcher (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps
findProof searcher pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of
Nothing -> findInstance searcher psn pred $ findInstances searcher classId
Just s -> Right (psn, V s)
prove searcher psn a = case a of
Proof pred -> findProof searcher pred psn
A x y -> prove searcher psn x >>= \(psn1, x1) ->
second (A x1) <$> prove searcher psn1 y
L s t -> second (L s) <$> prove searcher psn t
_ -> Right (psn, a)
data Dep a = Dep ([String] -> Either String ([String], a))
instance Functor Dep where
fmap f = \(Dep mf) -> Dep \g -> do
(g', x) <- mf g
pure (g', f x)
instance Applicative Dep where
pure x = Dep \g -> Right (g, x)
(Dep mf) <*> (Dep mx) = Dep \g -> do
(g', f) <- mf g
(g'', x) <- mx g'
pure (g'', f x)
addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ())
badDep s = Dep $ const $ Left s
runDep (Dep f) = f []
unifyMsg s a b c = either (Left . (s++) . (": "++)) Right $ unify a b c
app01 s x y = maybe (A (L s x) y) snd $ go x where
go expr = case expr of
V v -> Just $ if s == v then (True, y) else (False, expr)
A l r -> do
(a, l') <- go l
(b, r') <- go r
if a && b then Nothing else pure (a || b, A l' r')
L v t -> if v == s then Just (False, expr) else second (L v) <$> go t
_ -> Just (False, expr)
optiApp t = case t of
A x y -> let
x' = optiApp x
y' = optiApp y
in case x' of
L s v -> app01 s v y'
_ -> A x' y'
L s x -> L s (optiApp x)
_ -> t
inferno searcher decls typed defmap syms = let
anno s = maybe (Left $ TV $ ' ':s) Right $ mlookup s decls
loc = zip syms $ anno <$> syms
principal ((acc, preds), (subs, n)) s = do
expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap)
((t, a), (ms, n)) <- infer s typed loc expr (subs, n)
case mlookup s decls of
Nothing -> do
soln <- unifyMsg s (TV (' ':s)) t ms
Right (((s, (t, a)):acc, preds), (soln, n))
Just qAnno -> do
let (Qual pAnno tAnno, n1) = instantiate qAnno n
soln <- maybe (Left $ s ++ ": match failed: " ++ show qAnno ++ " vs " ++ show (apply ms t)) Right $ match (apply ms t) tAnno
Right (((s, (t, a)):acc, pAnno ++ preds), (soln @@ ms, n1))
gatherPreds (acc, psn) (s, (t, a)) = do
(psn, a) <- prove searcher psn $ optiApp a
pure ((s, (t, a)):acc, psn)
in do
((stas, preds), (soln, _)) <- foldM principal (([], []), (Tip, 0)) syms
let ps = zip preds $ ("anno*"++) . show <$> [0..]
(stas, (ps, _)) <- foldM gatherPreds ([], (ps, 0)) $ second (typeAstSub soln) <$> stas
let
preds = fst <$> ps
dicts = snd <$> ps
tab = map (\s -> (s, foldl A (V s) $ V <$> dicts)) syms
applyDicts (s, (t, a)) = (s, (Qual preds t, foldr L (simulFill tab a) dicts))
pure $ map applyDicts stas
inferDefs searcher defs decls typed = do
let
insertUnique m (s, (_, t)) = case mlookup s m of
Nothing -> Right $ insert s t m
_ -> Left $ "duplicate: " ++ s
addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs)
graph = foldr addEdges (Tip, Tip) defs
defmap <- foldM insertUnique Tip defs
let
ins k = maybe [] id $ mlookup k $ fst graph
outs k = maybe [] id $ mlookup k $ snd graph
inferComponent typed syms = foldr (uncurry insert) typed <$> inferno searcher decls typed defmap syms
foldM inferComponent typed $ scc ins outs $ keys defmap
dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps)
inferTypeclasses searcher iMap typed = foldM inferInstance typed [(classId, inst) | (classId, insts) <- toAscList iMap, inst <- insts] where
inferInstance typed (classId, Instance ty name ps idefs) = let
dvs = map snd $ fst $ dictVars ps 0
perMethod s = do
let rawExpr = maybe (V $ "{default}" ++ s) id $ mlookup s idefs
expr <- snd <$> patternCompile searcher rawExpr
(ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right
$ infer s typed [] expr (Tip, 0)
qc <- typeOfMethod searcher s
let
(tx, ax) = typeAstSub sub ta
-- e.g. qc = Eq a => a -> a -> Bool
-- We instantiate: Eq a1 => a1 -> a1 -> Bool.
(Qual [Pred _ headT] tc, n1) = instantiate qc n
-- Mix the predicates `ps` with the type of `headT`, applying a
-- substitution such as (a1, [a]) so the variable names match.
-- e.g. Eq a => [a] -> [a] -> Bool
Just subc = match headT ty
(Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1
case match tx t2 of
Nothing -> Left "class/instance type conflict"
Just subx -> do
((ps3, _), tr) <- prove searcher (dictVars ps2 0) $ optiApp $ proofApply subx ax
if length ps2 /= length ps3
then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name
else pure tr
in do
ms <- mapM perMethod $ findSigs searcher classId
pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed
primAdts =
[ (TC "()", [Constr "()" []])
, (TC "Bool", [Constr "True" [], Constr "False" []])
, (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" $ map ("",) [TV "a", TAp (TC "[]") (TV "a")]])
, (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," $ map ("",) [TV "a", TV "b"]])
]
prims = let
ro = E . Basic
dyad s = TC s `arr` (TC s `arr` TC s)
wordy = foldr arr (TAp (TAp (TC ",") (TC "Word")) (TC "Word")) [TC "Word", TC "Word", TC "Word", TC "Word"]
bin s = A (ro "Q") (ro s)
in map (second (first $ Qual [])) $
[ ("doubleFromInt", (arr (TC "Int") (TC "Double"), A (ro "T") (ro "FLO")))
, ("intFromDouble", (arr (TC "Double") (TC "Int"), A (ro "T") (ro "OLF")))
, ("doubleFromWord", (arr (TC "Word") (TC "Double"), A (ro "T") (ro "FLW")))
, ("doubleAdd", (arr (TC "Double") (arr (TC "Double") (TC "Double")), A (ro "Q") (ro "FADD")))
, ("doubleSub", (arr (TC "Double") (arr (TC "Double") (TC "Double")), A (ro "Q") (ro "FSUB")))
, ("doubleMul", (arr (TC "Double") (arr (TC "Double") (TC "Double")), A (ro "Q") (ro "FMUL")))
, ("doubleDiv", (arr (TC "Double") (arr (TC "Double") (TC "Double")), A (ro "Q") (ro "FDIV")))
, ("doubleEq", (arr (TC "Double") (arr (TC "Double") (TC "Bool")), bin "FEQ"))
, ("doubleLE", (arr (TC "Double") (arr (TC "Double") (TC "Bool")), bin "FLE"))
, ("rawDouble", (arr (TC "Double") $ arr (arr (TC "Word") $ arr (TC "Word") $ TV "a") $ TV "a", A (ro "T") (ro "PAIR64")))
, ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ"))
, ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE"))
, ("wordLE", (arr (TC "Word") (arr (TC "Word") (TC "Bool")), bin "U_LE"))
, ("wordEq", (arr (TC "Word") (arr (TC "Word") (TC "Bool")), bin "EQ"))
, ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ"))
, ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE"))
, ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y"))
, ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I"))
, ("intFromWord", (arr (TC "Word") (TC "Int"), ro "I"))
, ("wordFromInt", (arr (TC "Int") (TC "Word"), ro "I"))
, ("chr", (arr (TC "Int") (TC "Char"), ro "I"))
, ("ord", (arr (TC "Char") (TC "Int"), ro "I"))
, ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C"))
, ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), ro "V"))
, ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR"))
, ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), ro "NEWREF"))
, ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")),
A (ro "T") (ro "READREF")))
, ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))),
A (A (ro "R") (ro "WRITEREF")) (ro "B")))
, ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END"))
, ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K")))
, ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
, ("join#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess")))
, ("word64Add", (wordy, A (ro "QQ") (ro "DADD")))
, ("word64Sub", (wordy, A (ro "QQ") (ro "DSUB")))
, ("word64Mul", (wordy, A (ro "QQ") (ro "DMUL")))
, ("word64Div", (wordy, A (ro "QQ") (ro "DDIV")))
, ("word64Mod", (wordy, A (ro "QQ") (ro "DMOD")))
]
++ map (\(s, v) -> (s, (dyad "Int", bin v)))
[ ("intAdd", "ADD")
, ("intSub", "SUB")
, ("intMul", "MUL")
, ("intDiv", "DIV")
, ("intMod", "MOD")
, ("intQuot", "QUOT")
, ("intRem", "REM")
, ("intXor", "XOR")
, ("intAnd", "AND")
, ("intOr", "OR")
]
++ map (\(s, v) -> (s, (dyad "Word", bin v)))
[ ("wordAdd", "ADD")
, ("wordSub", "SUB")
, ("wordMul", "MUL")
, ("wordDiv", "U_DIV")
, ("wordMod", "U_MOD")
, ("wordQuot", "U_DIV")
, ("wordRem", "U_MOD")
]
tabulateModules mods = foldM ins Tip mods where
ins tab (k, (mexs, prog)) = case mlookup k tab of
Nothing -> do
v <- maybe pure processExports mexs (foldr ($) neatEmpty{moduleImports = ["#"]} prog)
pure $ insert k v tab
Just _ -> Left $ "duplicate module: " ++ k
processExports exs neat = do
mes <- Just . concat <$> mapM (processExport neat) exs
pure neat { moduleExports = mes }
processExport neat = \case
ExportVar v -> case lookup v $ topDefs neat of
Nothing -> Left $ "bad export " ++ v
Just _ -> Right [v]
ExportCon c ns -> case mlookup c $ type2Cons neat of
Just cnames
| ns == [".."] -> Right cnames
| null delta -> Right ns
| True -> Left $ "bad exports: " ++ show delta
where delta = [n | n <- ns, not $ elem n cnames]
Nothing -> case mlookup c $ typeclasses neat of
Nothing -> Left $ "bad export " ++ c
Just methodNames
| ns == [".."] -> Right methodNames
| null delta -> Right ns
| True -> Left $ "bad exports: " ++ show delta
where delta = [n | n <- ns, not $ elem n methodNames]
data Searcher = Searcher
{ _neatTab :: Map String Neat
, _thisNeat :: Neat
, _mergedSigs :: Map String [[String]]
, _mergedInstances :: Map String [(String, Instance)]
}
searcherNew tab neat = Searcher tab neat mergedSigs mergedInstances where
mergedSigs = foldr (slowUnionWith (++)) Tip $ map (fmap (:[]) . typeclasses) $ neat : map (tab !) imps
mergedInstances = foldr (slowUnionWith (++)) Tip [fmap (map (im,)) $ instances x | (im, x) <- ("", neat) : map (\im -> (im, tab ! im)) imps]
imps = moduleImports neat
astLink sea ast = runDep $ go [] ast where
go bound ast = case ast of
V s
| elem s bound -> pure ast
| member s defs -> unlessAmbiguous s $ addDep s *> pure ast
| member s $ typedAsts $ _thisNeat sea -> unlessAmbiguous s $ pure ast
| True -> case findImportSym sea s of
[] -> badDep $ "missing: " ++ s
[(im, t)] -> pure $ E $ Link im s t
_ -> badDep $ "ambiguous: " ++ s
A x y -> A <$> go bound x <*> go bound y
L s t -> L s <$> go (s:bound) t
_ -> pure ast
unlessAmbiguous s f = case findImportSym sea s of
[] -> f
_ -> badDep $ "ambiguous: " ++ s
defs = fromList $ topDefs $ _thisNeat sea
findImportSym sea s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s $ typedAsts n | (im, n) <- importedNeats sea s]
findPrec sea s = if s == ":" then (5, RAssoc) else either (const defPrec) id $ findAmong opFixity (visible sea) s
where
defPrec = (9, LAssoc)
findCon sea = findAmong dataCons $ visible sea
findField sea f = case [(con, fields) | dc <- dataCons <$> visible sea f, (_, cons) <- toAscList dc, Constr con fields <- cons, (f', _) <- fields, f == f'] of
[] -> error $ "no such field: " ++ f
h:_ -> h
typeOfMethod sea = fmap fst . findAmong typedAsts (visible sea)
findSigs sea = \s -> case mlookup s $ _mergedSigs sea of
Nothing -> error $ "missing class: " ++ s
Just [sigs] -> sigs
_ -> error $ "ambiguous class: " ++ s
findInstances sea = maybe [] id . (`mlookup` _mergedInstances sea)
importedNeats sea s@(h:_) = [(im, n) | im <- imps, let n = _neatTab sea ! im, h == '{' || isExportOf s n] where
imps = moduleImports $ _thisNeat sea
visible sea s = _thisNeat sea : (snd <$> importedNeats sea s)
isExportOf s neat = case moduleExports neat of
Nothing -> True
Just es -> elem s es
findAmong fun viz s = case concat $ maybe [] (:[]) . mlookup s . fun <$> viz s of
[] -> Left $ "missing: " ++ s
[unique] -> Right unique
_ -> Left $ "ambiguous: " ++ s
slowUnionWith f x y = foldr go x $ toAscList y where go (k, v) m = insertWith f k v m
inferModule tab acc name = case mlookup name acc of
Nothing -> do
let
neat = tab ! name
imps = moduleImports neat
typed = typedAsts neat
acc' <- foldM (inferModule tab) acc imps
let searcher = searcherNew acc' neat
depdefs <- mapM (\(s, t) -> (s,) <$> patternCompile searcher t) $ topDefs neat
typed <- inferDefs searcher depdefs (topDecls neat) typed
typed <- inferTypeclasses searcher (instances neat) typed
Right $ insert name neat { typedAsts = typed } acc'
Just _ -> Right acc
untangle s = do
tab <- insert "#" neatPrim <$> (parseProgram s >>= tabulateModules)
foldM (inferModule tab) Tip $ keys tab
neatPrim = foldr (\(a, b) -> addAdt a b []) neatEmpty { typedAsts = fromList prims } primAdts
-- Export lists.
module Main where
import Base
import Map
import Ast
import RTS
import Typer
import Kiselyov
import System
hide_prelude_here' = hide_prelude_here'
dumpWith dumper s = case untangle s of
Left err -> err
Right tab -> foldr ($) [] $ map (\(name, neat) -> ("module "++) . (name++) . ('\n':) . (foldr (.) id $ dumper neat)) $ toAscList tab
dumpLambs neat = map (\(s, t) -> (s++) . (" = "++) . shows t . ('\n':)) $ second snd <$> toAscList (typedAsts neat)
dumpTypes neat = map (\(s, q) -> (s++) . (" :: "++) . shows q . ('\n':)) $ second fst <$> toAscList (typedAsts neat)
dumpRawCombs neat = map go combs where
rawCombs = optim . nolam . snd <$> typedAsts neat
combs = toAscList $ rawCombs
go (s, t) = (s++) . (" = "++) . shows t . (";\n"++)
dumpCombs neat = map go combs where
rawCombs = optim . nolam . snd <$> typedAsts neat
combs = toAscList $ rewriteCombs rawCombs <$> rawCombs
go (s, t) = (s++) . (" = "++) . shows t . (";\n"++)
main = getArgs >>= \case
"comb":_ -> interact $ dumpWith dumpCombs
"rawcomb":_ -> interact $ dumpWith dumpRawCombs
"lamb":_ -> interact $ dumpWith dumpLambs
"type":_ -> interact $ dumpWith dumpTypes
"wasm":opts -> interact \s -> either id id $ untangle s >>= compileWith "1<<22" libcWasm ("no-main":opts)
"warts":opts -> interact $ either id (warts opts) . untangle
_ -> interact \s -> either id id $ untangle s >>= compile
Precisely
Before proceeding, we crudely implement arbitrary-precision integers to enable some cool demos. It also exercises our code that handles typeclasses.
Unlike standard Haskell, we add a Ring typeclass rather than Num. The (+)
and (*) operators should be reserved for rings, and there are uses for rings
that are not also instances of the Num typeclass. For example, Gaussian
integers are great for indexing a 2D rectangular board, especially when we want
to rotate by a right angle (multiplication by i) or talk about the cardinal
directions (which correspond the units).
The integers are the initial
ring, so we treat the integer constant n as fromInteger n; if this results
in ambiguity, then we drop fromInteger.
Thus the laws that we know to be true in our bones, such as a*(b + c) = a*b, will never lead us astray. We must explicitly write
a*cfromIntegral to,
say, map a Word32 to a Word64. Other languages convert silently, and wind
up defying our algebraic intuition.
To represent an integer, we use a list of Word32 numbers, plus a boolean to
represent its sign. For GHC compatibility we call the function
integerSignList instead of pattern matching on Integer values.
We implement schoolbook algorithms for basic arithmetic, which is straightforward except for division. I realized that when doing long division by hand, to find the next digit of the divisor, I pick something that seems reasonable via a method that seems partly subconscious! How can we possibly code this?
Luckily, there is a simple algorithm that makes good guesses. See Knuth, The Art of Computer Programming.
We rename div and mod to quot and rem, then introduce wrappers for
div and mod. Now our divisions behave correctly, though it is sad that
div and mod need more instructions. (FORTRAN set an unfortunate precedent
of truncating division to zero, ultimately
forcing languages like C and
WebAssembly and even hardware to conform.)
Our treatment of integer literals causes a bootstrapping issue. Suppose a
literal "0" is to be converted to an Int. Then our compiler applies the Int
edition of fromInteger to the Integer 0, which involves a call to mpView,
whose implementation needs the Int 0. If we simply code this as 0, then we
wind up with a circular definition, because our compiler would insert another
fromInteger call. We work around this with a definition that bypasses
overloading by returning ord '\0'.
-- Assumes overloaded integer literals, lexical negation.
-- Bits.
module Base where
infixr 9 .
infixr 8 ^
infixl 7 * , /, `div` , `mod` , `quot`, `rem`
infixr 6 <>
infixl 6 + , -
infixr 5 ++
infixl 4 <*> , <$> , <* , *>
infix 4 == , /= , <= , < , >= , >
infixl 3 && , <|>
infixl 2 ||
infixl 1 >> , >>=
infixr 1 =<<
infixr 0 $
class Semigroup a where
(<>) :: a -> a -> a
class Monoid a where
mempty :: a
mconcat :: [a] -> a
mconcat = foldr (<>) mempty
instance Monoid [a] where
mempty = []
instance Semigroup [a] where
(<>) = (++)
class Functor f where fmap :: (a -> b) -> f a -> f b
class Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
(<$>) = fmap
liftA2 f x y = f <$> x <*> y
(>>) f g = f >>= \_ -> g
(=<<) = flip (>>=)
instance Functor ((->) r) where fmap = (.)
instance Applicative ((->) r) where
pure = const
f <*> x = \r -> f r (x r)
instance Monad ((->) r) where
return = const
x >>= f = \r -> f (x r) r
class Eq a where (==) :: a -> a -> Bool
instance Eq () where () == () = True
instance Eq Bool where
True == True = True
False == False = True
_ == _ = False
instance (Eq a, Eq b) => Eq (a, b) where
(a1, b1) == (a2, b2) = a1 == a2 && b1 == b2
instance Eq a => Eq [a] where
xs == ys = case xs of
[] -> case ys of
[] -> True
_ -> False
x:xt -> case ys of
[] -> False
y:yt -> x == y && xt == yt
instance Eq Int where (==) = intEq
instance Eq Char where (==) = charEq
($) f x = f x
id x = x
const x y = x
flip f x y = f y x
(&) x f = f x
class Ord a where
(<=) :: a -> a -> Bool
x <= y = case compare x y of
LT -> True
EQ -> True
GT -> False
compare :: a -> a -> Ordering
compare x y = if x <= y then if y == x then EQ else LT else GT
instance Ord Int where (<=) = intLE
instance Ord Char where (<=) = charLE
data Ordering = LT | GT | EQ deriving (Eq, Show)
instance Ord a => Ord [a] where
xs <= ys = case xs of
[] -> True
x:xt -> case ys of
[] -> False
y:yt -> if x <= y then if y <= x then xt <= yt else True else False
compare xs ys = case xs of
[] -> case ys of
[] -> EQ
_ -> LT
x:xt -> case ys of
[] -> GT
y:yt -> if x <= y then if y <= x then compare xt yt else LT else GT
data Maybe a = Nothing | Just a deriving (Eq, Show)
data Either a b = Left a | Right b deriving (Eq, Show)
fst (x, y) = x
snd (x, y) = y
uncurry f (x, y) = f x y
first f (x, y) = (f x, y)
second f (x, y) = (x, f y)
bool a b c = if c then b else a
not a = if a then False else True
x /= y = not $ x == y
(.) f g x = f (g x)
(||) f g = if f then True else g
(&&) f g = if f then g else False
take :: Int -> [a] -> [a]
take 0 xs = []
take _ [] = []
take n (h:t) = h : take (n - 1) t
drop :: Int -> [a] -> [a]
drop n xs | n <= 0 = xs
drop _ [] = []
drop n (_:xs) = drop (n-1) xs
splitAt n xs = (take n xs, drop n xs)
maybe n j m = case m of Nothing -> n; Just x -> j x
instance Functor Maybe where fmap f = maybe Nothing (Just . f)
instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf
instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf
instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x
foldr c n = \case [] -> n; h:t -> c h $ foldr c n t
length :: [a] -> Int
length = foldr (\_ n -> n + 1) 0
mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure [])
mapM_ f = foldr ((>>) . f) (pure ())
forM = flip mapM
sequence = mapM id
replicateM = (sequence .) . replicate
foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0
when x y = if x then y else pure ()
unless x y = if x then pure () else y
error = primitiveError
undefined = error "undefined"
foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l
foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a
foldl1 f (h:t) = foldl f h t
scanl f q ls = q : (case ls of
[] -> []
x:xs -> scanl f (f q x) xs)
scanl1 f (x:xs) = scanl f x xs
scanl1 _ [] = []
elem k xs = foldr (\x t -> x == k || t) False xs
notElem = (not .) . elem
find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs
(++) = flip (foldr (:))
concat = foldr (++) []
map = flip (foldr . ((:) .)) []
head (h:_) = h
tail (_:t) = t
(!!) :: [a] -> Int -> a
xs!!0 = head xs
xs!!n = tail xs!!(n - 1)
replicate :: Int -> a -> [a]
replicate 0 _ = []
replicate n x = x : replicate (n - 1) x
repeat x = x : repeat x
cycle = concat . repeat
null [] = True
null _ = False
reverse = foldl (flip (:)) []
dropWhile _ [] = []
dropWhile p xs@(x:xt)
| p x = dropWhile p xt
| True = xs
span _ [] = ([], [])
span p xs@(x:xt)
| p x = first (x:) $ span p xt
| True = ([],xs)
break p = span (not . p)
isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160]
words s = case dropWhile isSpace s of
"" -> []
s' -> w : words s'' where (w, s'') = break isSpace s'
lines "" = []
lines s | (l, s') <- break (== '\n') s = l : case s' of
[] -> []
_:s'' -> lines s''
instance Functor [] where fmap = map
instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f
instance Monad [] where return = (:[]); (>>=) = flip concatMap
instance Alternative [] where empty = [] ; (<|>) = (++)
concatMap = (concat .) . map
lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing
filter p = foldr (\x -> bool id (x:) $ p x) []
filterM p = foldr (\x -> liftA2 (bool id (x:)) $ p x) $ pure []
union xs ys = xs ++ filter (`notElem` xs) (nub ys)
nub l = go l [] where
go [] _ = []
go (y:yt) xs
| elem y xs = go yt xs
| otherwise = y : go yt (y:xs)
partition p = foldr go ([], []) where
go x (ts, fs)
| p x = (x:ts, fs)
| otherwise = (ts, x:fs)
intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs
xs \\ ys = filter (not . (`elem` ys)) xs
last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt
init (x:xt) = case xt of [] -> []; _ -> x : init xt
intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt
intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt)
all f = and . map f
any f = or . map f
and = foldr (&&) True
or = foldr (||) False
zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt
zip = zipWith (,)
unzip [] = ([], [])
unzip ((a, b):rest) = (a:at, b:bt) where (at, bt) = unzip rest
transpose [] = []
transpose ([] : xss) = transpose xss
transpose ((x:xs) : xss) = (x : [h | (h:_) <- xss]) : transpose (xs : [ t | (_:t) <- xss])
data State s a = State (s -> (a, s))
runState (State f) = f
instance Functor (State s) where fmap f = \(State h) -> State (first f . h)
instance Applicative (State s) where
pure a = State (a,)
(State f) <*> (State x) = State \s -> let (g, s') = f s in first g $ x s'
instance Monad (State s) where
return a = State (a,)
(State h) >>= f = State $ uncurry (runState . f) . h
evalState m = fst . runState m
execState m = snd . runState m
get = State \s -> (s, s)
put n = State \s -> ((), n)
modify f = put =<< f <$> get
either l r e = case e of Left x -> l x; Right x -> r x
instance Functor (Either a) where fmap f e = either Left (Right . f) e
instance Applicative (Either a) where
pure = Right
ef <*> ex = case ef of
Left s -> Left s
Right f -> either Left (Right . f) ex
instance Monad (Either a) where
return = Right
ex >>= f = either Left f ex
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
asum = foldr (<|>) empty
(*>) = liftA2 \x y -> y
(<*) = liftA2 \x y -> x
many p = liftA2 (:) p (many p) <|> pure []
some p = liftA2 (:) p (many p)
sepBy1 p sep = liftA2 (:) p (many (sep *> p))
sepBy p sep = sepBy1 p sep <|> pure []
between x y p = x *> (p <* y)
showParen b f = if b then ('(':) . f . (')':) else f
iterate f x = x : iterate f (f x)
takeWhile _ [] = []
takeWhile p xs@(x:xt)
| p x = x : takeWhile p xt
| True = []
dropWhileEnd p = foldr (\x xs -> if p x && null xs then [] else x : xs) []
a ^ b = case b of
0 -> 1
1 -> a
_ -> case r of
0 -> h2
1 -> h2*a
where
(q, r) = divMod b 2
h = a^q
h2 = h*h
class Enum a where
succ :: a -> a
pred :: a -> a
toEnum :: Int -> a
fromEnum :: a -> Int
enumFrom :: a -> [a]
enumFromTo :: a -> a -> [a]
enumFromThen :: a -> a -> [a]
enumFromThenTo :: a -> a -> a -> [a]
succ = toEnum . (+ 1) . fromEnum
pred = toEnum . (- 1) . fromEnum
enumFrom x = map toEnum [fromEnum x ..]
enumFromThen x y = map toEnum [fromEnum x, fromEnum y ..]
enumFromTo x y = map toEnum [fromEnum x .. fromEnum y]
enumFromThenTo x1 x2 y = map toEnum [fromEnum x1, fromEnum x2 .. fromEnum y]
instance Enum Int where
succ = (+ 1)
pred = (- 1)
toEnum = id
fromEnum = id
enumFrom = iterate succ
enumFromThen x y = iterate ((y - x)+) x
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
enumFromThenTo x y lim = takeWhile ((if y < x then (<=) else (>=)) lim) $ enumFromThen x y
instance Enum Bool where
toEnum 0 = False
toEnum 1 = True
fromEnum False = 0
fromEnum True = 1
instance Enum Char where
toEnum = chr
fromEnum = ord
instance Enum Word where
toEnum = wordFromInt
fromEnum = intFromWord
instance Enum Rational where
toEnum = (% 1) . toEnum
fromEnum (a :% b) = fromEnum $ a `div` b
succ = (+ 1)
pred = (- 1)
instance Enum Double where
toEnum = doubleFromInt
fromEnum = intFromDouble
succ = (+ 1)
pred = (- 1)
enumFrom = iterate (1.0+)
enumFromThen x y = (x+) . ((y - x)*) . doubleFromInt <$> [0..]
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
enumFromThenTo x y lim = takeWhile ((if y < x then (<=) else (>=)) lim) $ enumFromThen x y
fromIntegral = fromInteger . toInteger
class Ring a where
(+) :: a -> a -> a
(-) :: a -> a -> a
(*) :: a -> a -> a
fromInteger :: Integer -> a
negate :: a -> a
negate = (0 -)
class Field a where
recip :: a -> a
recip = (1 /)
(/) :: a -> a -> a
a / b = a * recip b
fromRational :: Rational -> a
fromRational (a:%b) = fromInteger a / fromInteger b
class Integral a where
div :: a -> a -> a
mod :: a -> a -> a
quot :: a -> a -> a
rem :: a -> a -> a
toInteger :: a -> Integer
divMod :: a -> a -> (a, a)
divMod a b = (q, a - b*q) where q = div a b
quotRem :: a -> a -> (a, a)
quotRem a b = (q, a - b*q) where q = quot a b
instance Ring Int where
(+) = intAdd
(-) = intSub
(*) = intMul
fromInteger = intFromWord . fromInteger
instance Integral Int where
div = intDiv
mod = intMod
quot = intQuot
rem = intRem
toInteger x
| 0 <= x = Integer True $ if x == 0 then [] else [wordFromInt x]
| True = Integer False [wordFromInt -x]
zeroWord = wordFromInt $ ord '\0'
instance Ring Word where
(+) = wordAdd
(-) = wordSub
(*) = wordMul
fromInteger (Integer xsgn xs) = (if xsgn then id else wordSub zeroWord) case xs of
[] -> zeroWord
(x:_) -> x
instance Integral Word where
div = wordDiv
mod = wordMod
quot = wordQuot
rem = wordRem
toInteger x = Integer True $ if x == 0 then [] else [x]
instance Eq Word where (==) = wordEq
instance Ord Word where (<=) = wordLE
instance Ring Word64 where
(+) = uuAdd
(-) = uuSub
(*) = uuMul
fromInteger (Integer xsgn xs) = (if xsgn then id else negate) $ uuPad x .|. shiftL (uuPad y) 32 where
(x, xt) = mpView xs
(y, _) = mpView xt
instance Eq Word64 where (==) = uuEq
showsUU_ n
| 0 == n = id
| True = showsUU_ (n`div`10) . (chr (48+(fromIntegral $ n`mod`10)):)
instance Show Word64 where
showsPrec _ n
| 0 == n = ('0':)
| True = showsUU_ n
instance Ord Word64 where (<=) = uuLE
instance Integral Word64 where
div = uuDiv
mod = uuMod
quot = uuDiv
rem = uuMod
toInteger x = mpCanon True [wordFromUU x, wordFromUU $ shiftR x 32]
-- Multiprecision arithmetic.
data Integer = Integer Bool [Word] deriving Eq
instance Ring Integer where
Integer xsgn xs + Integer ysgn ys
| xsgn == ysgn = Integer xsgn $ mpAdd xs ys
| True = case mpCompare xs ys of
LT -> mpCanon ysgn $ mpSub ys xs
EQ -> Integer True []
_ -> mpCanon xsgn $ mpSub xs ys
Integer xsgn xs - Integer ysgn ys
| xsgn /= ysgn = Integer xsgn $ mpAdd xs ys
| True = case mpCompare xs ys of
LT -> mpCanon (not ysgn) $ mpSub ys xs
EQ -> Integer True []
_ -> mpCanon xsgn $ mpSub xs ys
Integer xsgn xs * Integer ysgn ys
| null xs || null ys = Integer True []
| True = Integer (xsgn == ysgn) $ mpMul xs ys
fromInteger = id
instance Integral Integer where
div (Integer xsgn xs) (Integer ysgn ys) = if xsgn == ysgn
then Integer True qs
else case rs of
[] -> mpCanon0 False qs
_ -> mpCanon0 False $ mpAdd qs [1]
where (qs, rs) = mpDivMod xs ys
mod (Integer xsgn xs) (Integer ysgn ys) = if xsgn == ysgn
then mpCanon0 xsgn rs
else mpCanon ysgn $ mpSub ys rs
where rs = snd $ mpDivMod xs ys
quot (Integer xsgn xs) (Integer ysgn ys) = mpCanon0 (xsgn == ysgn) $ fst $ mpDivMod xs ys
rem (Integer xsgn xs) (Integer ysgn ys) = mpCanon0 xsgn $ snd $ mpDivMod xs ys
toInteger = id
instance Ord Integer where
compare (Integer xsgn xs) (Integer ysgn ys)
| xsgn = if ysgn then mpCompare xs ys else GT
| True = if ysgn then LT else mpCompare ys xs
instance Enum Integer where
succ = (+ Integer True [1])
pred = (+ Integer False [1])
toEnum = toInteger
fromEnum = fromInteger
enumFrom = iterate succ
enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo
mpView [] = (0, [])
mpView (x:xt) = (x, xt)
mpCanon sgn xs = mpCanon0 sgn $ reverse $ dropWhile (0 ==) $ reverse xs
mpCanon0 sgn xs = case xs of
[] -> Integer True []
_ -> Integer sgn xs
mpCompare [] [] = EQ
mpCompare [] _ = LT
mpCompare _ [] = GT
mpCompare (x:xt) (y:yt) = case mpCompare xt yt of
EQ -> compare x y
o -> o
mpAdd xs ys = mpAdc xs ys 0 where
mpAdc [] [] c
| c == 0 = []
| otherwise = [1]
mpAdc xs ys c = wordFromUU n : mpAdc xt yt c' where
(x, xt) = mpView xs
(y, yt) = mpView ys
c' = shiftR n 32
n = uuPad x + uuPad y + c
mpSub xs ys = fst $ mpSbb xs ys 0
mpSbb = go where
go [] [] b = ([], b)
go xs ys b = first (wordFromUU n:) $ go xt yt b' where
(x, xt) = mpView xs
(y, yt) = mpView ys
b' = 1 - shiftR n 32
n = shiftL 1 32 + uuPad x - uuPad y - b
mpMulWord x ys = mpMulWord' x ys 0 where
mpMulWord' _ [] c = if c == 0 then [] else [wordFromUU c]
mpMulWord' x (y:yt) c = lo:mpMulWord' x yt hi where
lo = wordFromUU n
hi = shiftR n 32
n = uuPad x * uuPad y + c
mpMul [] _ = []
mpMul (x:xt) ys = case mpMulWord x ys of
[] -> []
z:zs -> z:mpAdd zs (mpMul xt ys)
mpDivModWord xs y = first (reverse . dropWhile (0 ==)) $ go 0 $ reverse xs where
go r [] = ([], r)
go n (x:xt) = first (q:) $ go r xt where
q = wordFromUU $ (uuPad x + shiftL (uuPad n) 32) `div` uuPad y
r = x - q*y -- Only lower bits matter.
mpDivMod xs ys = first (reverse . dropWhile (== 0)) $ go us where
s = mpDivScale $ last ys
us = mpMulWord s (xs ++ [0])
vs = mpMulWord s ys
(v1:vt) = reverse vs
vlen = length vs
go us | ulen <= vlen = ([], fst $ mpDivModWord us s)
| True = first (q:) $ go $ lsbs ++ init ds
where
ulen = length us
(u0:u1:ut) = reverse us
(lsbs, msbs) = splitAt (ulen - vlen - 1) us
quu = (uuPad u1 + shiftL (uuPad u0) 32) `div` uuPad v1
q0 = if shiftL 1 32 <= quu then -1 else wordFromUU quu
(q, ds) = foldr const undefined [(q, ds) | q <- iterate (- 1) q0, let (ds, bor) = mpSbb msbs (mpMulWord q vs) 0, bor == 0]
mpDivScale n
| n1 == 0 = 1
| otherwise = wordFromUU $ shiftL 1 32 `div` uuPad n1
where n1 = succ n
mpBase _ [] = ('0':)
mpBase b xs = go xs where
go [] = id
go xs = go q . shows r where (q, r) = mpDivModWord xs b
instance Show Integer where showsPrec _ (Integer xsgn xs) = (if xsgn then id else ('-':)) . mpBase 10 xs
instance (Ord a, Ord b, Eq a, Eq b) => Ord (a, b) where
(a1, b1) <= (a2, b2) = case compare a1 a2 of
LT -> True
EQ -> b1 <= b2
GT -> False
compare (a1, b1) (a2, b2) = case compare a1 a2 of
LT -> LT
EQ -> compare b1 b2
GT -> GT
a < b = a <= b && a /= b
a > b = b <= a && a /= b
(>=) = flip(<=)
instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y)
instance Monad IO where return = ioPure ; (>>=) = ioBind
instance Functor IO where fmap f x = ioPure f <*> x
class Show a where
showsPrec :: Int -> a -> String -> String
showsPrec _ x = (show x++)
show :: a -> String
show x = shows x ""
showList :: [a] -> String -> String
showList = showList__ shows
shows = showsPrec 0
showList__ _ [] s = "[]" ++ s
showList__ showx (x:xs) s = '[' : showx x (showl xs)
where
showl [] = ']' : s
showl (y:ys) = ',' : showx y (showl ys)
showInt__ n
| 0 == n = id
| True = showInt__ (n`div`10) . (chr (48+n`mod`10):)
instance Show () where show () = "()"
instance Show Bool where
show True = "True"
show False = "False"
instance Show a => Show [a] where showsPrec _ = showList
instance Show Int where
showsPrec _ n
| 0 == n = ('0':)
| 1 <= n = showInt__ n
| 2 * n == 0 = ("-2147483648"++)
| True = ('-':) . showInt__ (0 - n)
showWord_ n
| 0 == n = id
| True = showWord_ (n`div`10) . (chr (48+(intFromWord $ n`mod`10)):)
instance Show Word where
showsPrec _ n
| 0 == n = ('0':)
| True = showWord_ n
showLitChar__ '\n' = ("\\n"++)
showLitChar__ '\\' = ("\\\\"++)
showLitChar__ c
| n < 32 || n > 127 = ('\\':) . protectDecEsc (shows n)
| otherwise = (c:)
where n = ord c
protectDecEsc f s
| (c:_) <- s, '0' <= c, c <= '9' = f $ ("\\&"++) s
| otherwise = f s
instance Show Char where
showsPrec _ '\'' = ("'\\''"++)
showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':)
showList s = ('"':) . foldr (.) id (map go s) . ('"':) where
go '"' = ("\\\""++)
go c = showLitChar__ c
instance (Show a, Show b) => Show (a, b) where
showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b
integerSignList (Integer xsgn xs) f = f xsgn xs
unwords [] = ""
unwords ws = foldr1 (\w s -> w ++ ' ':s) ws
unlines = concatMap (++"\n")
abs x = if 0 <= x then x else -x
signum x | 0 == x = 0
| 0 <= x = 1
| otherwise = -1
otherwise = True
sum = foldr (+) 0
product = foldr (*) 1
max a b = if a <= b then b else a
min a b = if a <= b then a else b
gcd x y = gcd' (abs x) (abs y) where
gcd' a 0 = a
gcd' a b = gcd' b (a `rem` b)
lcm = \cases
_ 0 -> 0
0 _ -> 0
x y -> abs ((x `quot` (gcd x y)) * y)
readNatural = foldl (\n d -> toInteger 10*n + toInteger (ord d - ord '0')) (toInteger 0)
readInteger ('-':t) = -(readNatural t)
readInteger s = readNatural s
infixl 7 %
data Rational = Integer :% Integer deriving Eq
numerator (p :% _) = p
denominator (_ :% q) = q
x % y = reduce_ (x * signum y) (abs y)
reduce_ x y = (x `quot` d) :% (y `quot` d) where d = gcd x y
instance Ord Rational where (a :% b) <= (c :% d) = a*d <= b*c
instance Ring Rational where
(a :% b) + (c :% d) = reduce_ (a*d + b*c) (b*d)
(a :% b) - (c :% d) = reduce_ (a*d - b*c) (b*d)
(a :% b) * (c :% d) = reduce_ (a*c) (b*d)
fromInteger n = n :% 1
instance Show Rational where
showsPrec _ (a :% b) = shows a . (" % "++) . shows b
instance Field Rational where recip (x :% y) = (y * signum x) :% abs x
instance Ring Double where
(+) = doubleAdd
(-) = doubleSub
(*) = doubleMul
fromInteger n = integerSignList n \sgn ws -> case reverse ws of
[] -> doubleFromInt 0
x:xt -> (if sgn then id else negate) let
dx = doubleFromWord x
sh = foldr1 (*) $ replicate 32 $ doubleFromInt 2
in case xt of
[] -> dx
y:yt -> foldr (const (sh*)) (dx*sh + doubleFromWord y) yt
instance Eq Double where (==) = doubleEq
instance Ord Double where (<=) = doubleLE
instance Show Double where
showsPrec _ d = case compare d 0 of
EQ -> ('0':)
LT -> ('-':) . shows -d
GT
| d >= 9999999.5 -> big 7 10000000
| d >= 0.0000995 -> dotty d
| otherwise -> let
(as, bs) = span (>= d) $ iterate (0.1*) 1
in dotty (d / head bs) . ("e-"++) . shows (length as)
where
dotty norm = let
n = intFromDouble $ 0.0000005 + norm
in shows n . ('.':) . (tail (show $ 1000000 + intFromDouble (0.5 + 1000000 * (norm - doubleFromInt n)))++)
big e b
| 10.0*b > d = dotty (d / b) . ('e':) . shows e
| otherwise = big (e + 1::Int) (10*b)
instance Field Double where (/) = doubleDiv
floor = intFromDouble . doubleFloor
ceiling x = 0 - floor (0 - x)
class Bits a where
xor :: a -> a -> a
(.&.) :: a -> a -> a
(.|.) :: a -> a -> a
shiftL :: a -> Int -> a
shiftR :: a -> Int -> a
rotateL :: a -> Int -> a
rotateR :: a -> Int -> a
complement :: a -> a
complement x = -1 - x
instance Bits Int where
xor = intXor
(.&.) = intAnd
(.|.) = intOr
shiftL n i = intShl n i
shiftR n i = intShr n i
rotateL n i = intOr (intShr n $ 32 - i) (intShl n i)
rotateR n i = intOr (intShr n i) (intShl n $ 32 - i)
instance Bits Word where
xor = wordXor
(.&.) = wordAnd
(.|.) = wordOr
shiftL n i = wordShl n $ fromIntegral i
shiftR n i = wordShr n $ fromIntegral i
rotateL n i = wordOr (wordShr n $ 32 - i) (wordShl n i)
rotateR n i = wordOr (wordShr n i) (wordShl n $ 32 - i)
instance Bits Word64 where
xor = uuXor
(.&.) = uuAnd
(.|.) = uuOr
shiftL n i = uuShl n $ fromIntegral i
shiftR n i = uuShr n $ fromIntegral i
rotateL n i = uuOr (uuShr n $ 64 - i) (uuShl n i)
rotateR n i = uuOr (uuShr n i) (uuShl n $ 64 - i)
Wasm to Haskell?
It’d be nice to write wasm ourselves, but for expedience, we rely on Clang to compile our runtime system to a wasm binary which we manipulate. WebAssembly turns out to be pleasantly malleable. A binary breaks up into independent sections, and we can add, delete, or modify sections before stitching them together again.
We write a tool that just does enough wasm parsing to print the sections of a
wasm file we want in the form of a Haskell module, and run this on the output
of crossly warts to create WartsBytes.hs.
module Main where
import Base
import System
data Charser a = Charser
{ getCharser :: String -> Either String (a, String) }
instance Functor Charser where fmap f (Charser x) = Charser $ fmap (first f) . x
instance Applicative Charser where
pure a = Charser $ \s -> Right (a, s)
f <*> x = Charser \inp -> do
(fun, t) <- getCharser f inp
(arg, u) <- getCharser x t
pure (fun arg, u)
instance Monad Charser where
Charser f >>= g = Charser $ (good =<<) . f
where good (r, t) = getCharser (g r) t
return = pure
bad :: String -> Charser a
bad = Charser . const . Left
headerAndVersion :: String
headerAndVersion = "\0asm\x1\0\0\0"
eof :: Charser Bool
eof = Charser \s -> Right (null s, s)
next :: Charser Int
next = Charser \case
[] -> Left "unexpected EOF"
h:t -> Right (ord h, t)
sat f = Charser \case
h:t | f h -> Right (h, t)
_ -> Left "unsat"
remainder :: Charser String
remainder = Charser \s -> Right (s, "")
varuint7 = next
varuint32 = varuint
varuint :: Charser Int
varuint = unleb 1 0
-- varuint = fromIntegral <$> unleb 1 0
-- unleb :: Integer -> Integer -> Charser Integer
unleb m acc = do
-- d <- fromIntegral <$> next
d <- next
if d > 127 then unleb (m * 128) $ (d - 128) * m + acc else pure $ d*m + acc
sections = eof >>= \b -> if b then pure [] else do
n <- varuint7
s <- vec (chr <$> next)
((n, s):) <$> sections
wasm = do
s <- replicateM 8 (chr <$> next)
if s /= headerAndVersion then bad "bad header or version" else sections
hexDigit n | n < 10 = chr $ n + ord '0'
| True = chr $ n - 10 + ord 'a'
xxd = \case
"" -> ""
h:t -> let n = ord h in hexDigit (div n 16) : hexDigit (mod n 16) : xxd t
replicateM = (mapM id .) . replicate
vec f = varuint >>= (`replicateM` f)
search00type xs = do
fts <- maybe (Left "missing section 1") Right $ lookup 1 xs
ios <- fst <$> getCharser go fts
maybe (Left "missing (0, 0) functype") Right $ lookup (0, 0) $ zip ios [0..]
where
go = vec $ do
sat (== '\x60')
inCount <- varuint
replicateM inCount next
outCount <- varuint
replicateM outCount next
pure (inCount, outCount)
searchExport needle xs = do
exs <- maybe (Left "missing section 7") Right $ lookup 7 xs
maybe (Left "not found") Right =<< asum . fst <$> getCharser go exs
where
go = vec $ do
s <- vec $ chr <$> next
next
n <- varuint
pure $ if s == needle then Just n else Nothing
allFunCount xs = do
impCount <- maybe (Right 0) countImps $ lookup 2 xs
funCount <- maybe (Right 0) countFuns $ lookup 3 xs
pure $ impCount + funCount
where
countImps imps = length . fst <$> getCharser goImps imps
goImps = vec $ do
vec next
vec next
sat (== '\0')
varuint
pure ()
countFuns funs = fst <$> getCharser varuint funs
main = do
s <- getContents
case getCharser wasm s of
Left e -> putStrLn $ "parse error: " ++ e
Right (xs, []) -> do
putStr "module WartsBytes where\nimport Base\nwartsBytes = "
print $ second xxd <$> filter (not . (`elem` [0, 6]) . fst) xs
putStrLn $ either error (("allFunCount = "++) . show) $ allFunCount xs
putStrLn $ either error (("funType00Idx = "++) . show) $ search00type xs
putStrLn $ either error (("reduceFunIdx = "++) . show) $ searchExport "reduce" xs
_ -> error "unreachable"
The RTS includes generate code that wraps foreign imports. It can only be used with programs that declare the same foreign imports.
Webby
We build a compiler that goes directly from Haskell to wasm. The code does little more than dumping the runtime system wasm binary along with bytes describing the initial contents of the heap.
For now we look for the main function in the Main module and export it as
the go function; export declarations are ignored.
-- In-browser compiler.
module Main where
import Base
import Ast
import Map
import Typer
import RTS
import System
import WartsBytes
data StrLen = StrLen { _str :: String -> String, _len :: Int }
instance Monoid StrLen where
mempty = StrLen id 0
instance Semigroup StrLen where
(StrLen s1 n1) <> (StrLen s2 n2) = StrLen (s1 . s2) (n1 + n2)
hexValue d
| d <= '9' = ord d - ord '0'
| d <= 'F' = 10 + ord d - ord 'A'
| d <= 'f' = 10 + ord d - ord 'a'
unxxd s = StrLen (go s) (length s `div` 2) where
go s = case s of
[] -> id
(d1:d0:rest) -> ((chr $ hexValue d1 * 16 + hexValue d0):) . go rest
main = do
putStr "DEBUG\n"
interact $ either id id . toWasm
toWasm s = do
objList <- toAscList <$> objectify s
let
ffis = foldr (\(k, v) m -> insertWith (error $ "duplicate import: " ++ k) k v m) Tip $ concatMap (toAscList . ffiImports . _neat . snd) objList
ffiMap = fromList $ zip (keys ffis) [0..]
offobjs = snd $ foldr (\(name, obj) (n, m) -> (n + length (_mem obj), insert name (n, obj) m)) (0, Tip) objList
(ffeMap, memFun) = foldr (layout offobjs ffiMap) (Tip, id) objList
ffes = toAscList $ fst <$> ffeMap
mem = memFun []
go (n, x)
-- Function section: for each export, declare a function of type () -> ()..
| n == 3 = leb n <> extendSection x (replicate (length ffes) $ unxxd "01")
-- Export section: add each export.
| n == 7 = leb n <> extendSection x (zipWith encodeExport (fst <$> ffes) [allFunCount..])
-- Code section: for nth export, define a function calling rts_reduce([512 + 4*n])
| n == 10 = leb n <> extendSection x (callRoot <$> [0..length ffes - 1])
| True = leb n <> leb (_len s) <> s where s = unxxd x
roots = encodeData rootBase $ littleEndian $ (snd <$> ffes) ++ [0]
prog = encodeData heapBase $ littleEndian $ length mem : mem
pure $ _str (unxxd "0061736d01000000" <> mconcat (map go wartsBytes)
-- Data section:
-- 512 : null-terminated roots array
-- 1048576 - 4: hp
-- 1048576: initial heap contents
<> leb 11 <> extendSection "00" [roots, prog]) ""
extendSection x xs = encodeSection (k + length xs) $ unxxd s <> mconcat xs
where (k, s) = splitLeb x
splitLeb x = go x 1 0 where
go (d1:d0:t) m acc
| n < 128 = (acc + n*m, t)
| True = go t (m*128) $ acc + (n - 128)*m
where
n = hexValue d1 * 16 + hexValue d0
encodeExport s n = encodeString s <> unxxd "00" <> leb n
encodeString s = let n = length s in leb n <> StrLen (s++) n
littleEndian ns = StrLen (foldr (.) id $ go 4 <$> ns) $ 4 * length ns where
go k n
| k == 0 = id
| True = (chr (n `mod` 256):) . go (k - 1) (n `div` 256)
rootBase = unxxd "004180040b" -- sleb 512 = 8004
heapBase = unxxd "0041fcff3f0b" -- sleb (1048576 - 4) = fcff3f
-- 0 locals.
-- i32.const 0; i32.load 512 + 4*n; call $REDUCE; end;
callRoot n = leb (_len s) <> s where
s = unxxd "0041002802" <> slebPos (512 + 4*n) <> unxxd "10"
<> leb reduceFunIdx <> unxxd "0b"
encodeData addr s = addr <> leb (_len s) <> s
encodeSection k s = let n = leb k in leb (_len n + _len s) <> n <> s
leb n
| n <= 127 = StrLen (chr n:) 1
| True = StrLen (chr (128 + n `mod` 128):) 1 <> leb (n `div` 128)
slebPos n
| n <= 63 = StrLen (chr n:) 1
| True = StrLen (chr (128 + n `mod` 128):) 1 <> slebPos (n `div` 128)
webby.wasm
To build a browser-based compiler, we essentially run the previous compiler on itself, thus producing a wasm binary that can translate Haskell directly into wasm.
One change is needed: we swap System.hs for SystemWasm.hs. The
Linux version declares foreign imports for those in our runtime system for
Linux. The wasm version declares foreign imports for functions provided by
the host environment.
-- For wasm environments providing Haskell-like getchar, putchar, eof. module System where import Base foreign import ccall "putchar" putChar :: Char -> IO () foreign import ccall "getchar" getChar :: IO Char foreign import ccall "eof" isEOFInt :: IO Int isEOF = (0 /=) <$> isEOFInt putStr = mapM_ putChar putStrLn = (>> putChar '\n') . putStr print = putStrLn . show getContents = isEOF >>= \b -> if b then pure [] else getChar >>= \c -> (c:) <$> getContents interact f = getContents >>= putStr . f
Messy
By now, we’re painfully aware of the effects of rash decisions made while writing our earlier compilers. Compilation time has grown rapidly. Parts of the code are difficult to extend.
I’m cleaning up the precisely compiler first, and intend to backport the
changes later, so the journey will look smoother. Until then, the evolution
of our compiler will appear jumpy.
Examples:
-
Rewriting let expressions as lambda terms should happen after type-checking, so that type annotations are easier to handle.
-
Some library-like functions like
overFreeshould be scrapped because they’re only used once. -
I renamed
betatofillas it really fills holes in a context, that is, it substitutes without caring about variable capture. We get away with this because we unshadow variables during type-checking except for join point variables.
Introducing features in a different order also smooths the journey. For
example, I originally tried hash consing just before the virtually compiler,
which slowed compilation and whose code changed markedly as syntax features
were added. I pushed it to a far later compiler for faster bootstrapping and
more stable code. Another example is the Show typeclass, which I originally
added in a surprisingly recent compiler.
Built-in primitives started in their own # module, then were replaced by code
that pre-defined them for every module. But I think I was right the first time,
and switched it back. Hopefully there are no traces of my misadventure left.
At the same time, I’m introducing more chaos as I forge ahead with new experiments, especially those involving the web. The above only chronicles the start of adventures in the browser. A fuller account:
-
The
crossly wasmcommand modifies the output C so Clang can compile it to wasm: it adds a minimal bumpmallocimplementation, and uses attributes to declare exports. -
The
crossly wartscommand also modifies the output C so Clang can compile to wasm, though it avoidsmallocso it can prepopulate the heap. At the time, I didn’t want to allocate a heap and then copy the initial state. It only generates code for FFIs and the RTS, and assumes a tool likewebbywill supply the initial heap contents. -
The
Impcompiler is likewebbybut with an extra hack to request the source of dependent import modules. -
The
replyfamily of REPLs introduced more complications. To avoid compilingBaseevery time, we define a CBOR-based object file format that contains the bytecode for each definition, along with symbols, types, typeclasses, fixities, type aliases, and so on. -
For the REPL, we add a
scratchpadbuffer where the interpreter writes new bytecode. For definitions, we write the number of new symbols and callvmgcroot()to add roots so they survive GC. There’s also a tagging mechanism, where the least significant bit indicates the rest of the word is the index of a previous definition. We do all this because GC could occur while compiling new definitions; thescratchpadis untouched by GC. -
Related are
introspect.c, which lets us examine the heap from the REPL, and a reduction function that is aware of "gas": a feature that allows us to run a program for a limited number of steps, and resume execution later if it has yet to finish.
Designing the object file format leads us to take a closer look at the many phases of compilation:
-
During parsing, we compile
datadeclarations, generating ASTs for their Scott-encoding and lookup tables for field names. We also genreate instances forderivingclauses. -
During parsing, we generate selector code for each typeclass declaration.
-
During parsing, we compile each FFI import to an
Fcombinator, adding a lambda abstraction for each argument. -
Immediately after parsing, we find all exported symbols. The sole purpose of the
type2Consfield is to find all exported constructors in a declaration likeFoo(..). -
We apply type aliases to canonicalize all type names.
-
We topologically sort the modules, giving us an order which we can compile them. Recall our compilers expect the input to be the concatenation of all source modules, except for
imp.wasm, which has a way of requesting more modules. -
We rewrite expressions according to fixity declarations, and rewrite pattern matches as lambda terms.
-
At last, we compile top-level definitions to combinators, reconciling them with any type annotations. Life was simpler with our early compilers, where this was the only task we performed.
-
It is only now we can compile typeclass instance definitions, as they may depend on top-level definitions. (In the previous step, there are dependencies going the other way, but we know the purported types of all instances, which is the only information needed.)
Our data structures freely mix information collected during parsing with information generated by compilation.