Client-side compiler
Our next goal is a browser-based edition of our compiler.
Crossly
We first add a few features related to WebAssembly. Firstly, the wasm command compiles Haskell to C intended to be compiled to wasm. The resulting binary assumes the host environment supplies a few functions such as env.getchar. Secondly, the warts command prints the RTS generated by the compiler, also as C code intended to be compiled to wasm. We later use this to build compilers that can go directly from Haskell to wasm.
As usual, a bunch of other changes come along for the ride. Recall we require a fixity declaration to precede the use of its corresponding operator, which forces us to concatenate module sources in a particular order. We remove this wart by adding a new phase. Once done, not only may we paste together modules in any order, but we may also declare fixities anywhere within a module.
During parsing, operators have the same precedence. When a chain of two or more appear in a row, we abuse the syntax tree to store them in a right-associative list, for example: [1 + 2, * 3, - 4, + 5].
For patterns, we use the list field of a PatCon value; a made-up data constructor "{+" indicates the beginning of such a list. Expressions are clumsier; we bookend chains with the made-up basic combinators "{+" and "+}", and fashion a list out of A and V nodes.
By the time we call patternCompile, we have access to all modules. During this phase, we traverse the syntax tree, and we re-associate each specially marked infix chain now that we can look up the fixities of all operators.
The algorithm is conceptually straightforward. Starting from the first binary infix expression, that is, two operands and one operator, for each operator and operand we add on the right, we walk down the right spine of the current syntax tree until we reach a node of higher precedence; leaf nodes are considered to have maximum precedence. Then we insert the operator and operand at this point. We also check for illegal infix operator conflicts.
The code is messy due to a couple of wrinkles. Firstly, we have two distinct ad hoc representations of lists for holding infix chains. Secondly, we temporarily mark operands with more ad hoc conventions to avoid descending too far when reshaping syntax trees. For example, in the expression 1 + (2 + 3) * 4, the subexpression (2 + 3) is atomic.
We only allow top-level fixity declarations. We could add support for scoped fixity declarations with yet more ad hoc encodings that we later use to create scoped fixity lookup tables that override the global ones.
We do some housekeeping. Given a Neat, type inference had produced a tuple of a particular type that contained the data needed by the next phase. We change it to produce a new Neat with an updated typedAsts field, so there’s one fewer data type to occupy our thoughts and APIs. We no longer need to pick out specific fields to pass to the next phase, as we simply pass everything.
We take a first stab at top-level type declarations. We treat them similarly to default typeclass methods, in that during type inference, we trust the symbol has its annotated type, and only afterwards that we verify the annotated type matches the inferred type. It’s more complex because we must process an entire strongly connected component at a time.
Adding modules has made a mess of our various functions for looking up data constructors, top-level variables, typeclasses, and so on. We reorganize them a little to standardize the logic for searching through the list of imports. This makes it easier to add support for lists of export symbols.
-- Separate fixity phase. -- Module exports. module Ast where import Base import Map data Type = TC String | TV String | TAp Type Type deriving Eq arr a b = TAp (TAp (TC "->") a) b data Extra = Basic String | Const Int | ChrCon Char | StrCon String | Link String String Qual data Pat = PatLit Extra | PatVar String (Maybe Pat) | PatCon String [Pat] data Ast = E Extra | V String | A Ast Ast | L String Ast | Pa [([Pat], Ast)] | Ca Ast [(Pat, Ast)] | Proof Pred data Constr = Constr String [(String, Type)] data ModExport = ExportVar String | ExportCon String [String] data Pred = Pred String Type deriving Eq data Qual = Qual [Pred] Type instance Show Type where showsPrec _ = \case TC s -> (s++) TV s -> (s++) TAp (TAp (TC "->") a) b -> showParen True $ shows a . (" -> "++) . shows b TAp a b -> showParen True $ shows a . (' ':) . shows b instance Show Pred where showsPrec _ (Pred s t) = (s++) . (' ':) . shows t . (" => "++) instance Show Qual where showsPrec _ (Qual ps t) = foldr (.) id (map shows ps) . shows t instance Show Extra where showsPrec _ = \case Basic s -> (s++) Const i -> shows i ChrCon c -> shows c StrCon s -> shows s Link im s _ -> (im++) . ('.':) . (s++) instance Show Pat where showsPrec _ = \case PatLit e -> shows e PatVar s mp -> (s++) . maybe id ((('@':) .) . shows) mp PatCon s ps -> (s++) . ("TODO"++) showVar s@(h:_) = showParen (elem h ":!#$%&*+./<=>?@\\^|-~") (s++) instance Show Ast where showsPrec prec = \case E e -> shows e V s -> showVar s A x y -> showParen (1 <= prec) $ shows x . (' ':) . showsPrec 1 y L s t -> showParen True $ ('\\':) . (s++) . (" -> "++) . shows t Pa vsts -> ('\\':) . showParen True (foldr (.) id $ intersperse (';':) $ map (\(vs, t) -> foldr (.) id (intersperse (' ':) $ map (showParen True . shows) vs) . (" -> "++) . shows t) vsts) Ca x as -> ("case "++) . shows x . (" of {"++) . foldr (.) id (intersperse (',':) $ map (\(p, a) -> shows p . (" -> "++) . shows a) as) Proof p -> ("{Proof "++) . shows p . ("}"++) showType = shows -- for Unify. data Instance = Instance -- Type, e.g. Int for Eq Int. Type -- Dictionary name, e.g. "{Eq Int}" String -- Context. [Pred] -- Method definitions (Map String Ast) data Tycl = Tycl [String] [Instance] data Assoc = NAssoc | LAssoc | RAssoc deriving Eq data Neat = Neat { typeclasses :: Map String Tycl , topDefs :: [(String, Ast)] , topDecls :: Map String Qual -- | Typed ASTs, ready for compilation, including ADTs and methods, -- e.g. (==), (Eq a => a -> a -> Bool, select-==) , typedAsts :: Map String (Qual, Ast) , dataCons :: Map String [Constr] , type2Cons :: Map String [String] , ffiImports :: Map String Type , ffiExports :: Map String String , moduleImports :: [String] , moduleExports :: Maybe [String] , opFixity :: Map String (Int, Assoc) } neatEmpty = Neat Tip [] Tip Tip Tip Tip Tip Tip [] Nothing Tip patVars = \case PatLit _ -> [] PatVar s m -> s : maybe [] patVars m PatCon _ args -> concat $ patVars <$> args fvPro bound expr = case expr of V s | not (elem s bound) -> [s] A x y -> fvPro bound x `union` fvPro bound y L s t -> fvPro (s:bound) t Pa vsts -> foldr union [] $ map (\(vs, t) -> fvPro (concatMap patVars vs ++ bound) t) vsts Ca x as -> fvPro bound x `union` fvPro bound (Pa $ first (:[]) <$> as) _ -> [] overFree s f t = case t of E _ -> t V s' -> if s == s' then f t else t A x y -> A (overFree s f x) (overFree s f y) L s' t' -> if s == s' then t else L s' $ overFree s f t' beta s t x = overFree s (const t) x typeVars = \case TC _ -> [] TV v -> [v] TAp x y -> typeVars x `union` typeVars y depthFirstSearch = (foldl .) \relation st@(visited, sequence) vertex -> if vertex `elem` visited then st else second (vertex:) $ depthFirstSearch relation (vertex:visited, sequence) (relation vertex) spanningSearch = (foldl .) \relation st@(visited, setSequence) vertex -> if vertex `elem` visited then st else second ((:setSequence) . (vertex:)) $ depthFirstSearch relation (vertex:visited, []) (relation vertex) scc ins outs = spanning . depthFirst where depthFirst = snd . depthFirstSearch outs ([], []) spanning = snd . spanningSearch ins ([], [])
-- Separate fixity phase. -- Export lists. module Parser where import Base import Ast import Map -- Parser. data ParserState = ParserState { readme :: [(Char, (Int, Int))] , landin :: String , indents :: [Int] } data Parser a = Parser { getParser :: ParserState -> Either String (a, ParserState) } instance Functor Parser where fmap f x = pure f <*> x instance Applicative Parser where pure x = Parser \inp -> Right (x, inp) (Parser f) <*> (Parser x) = Parser \inp -> do (fun, t) <- f inp (arg, u) <- x t pure (fun arg, u) instance Monad Parser where return = pure (Parser x) >>= f = Parser \inp -> do (a, t) <- x inp getParser (f a) t instance Alternative Parser where empty = bad "" x <|> y = Parser \inp -> either (const $ getParser y inp) Right $ getParser x inp parse f str = getParser f $ ParserState (rowcol str (1, 1)) [] [] where rowcol s rc = case s of [] -> [] h:t -> (h, rc) : rowcol t (advanceRC (ord h) rc) advanceRC n (r, c) | n `elem` [10, 11, 12, 13] = (r + 1, 1) | n == 9 = (r, (c + 8)`mod`8) | True = (r, c + 1) indentOf pasta = case readme pasta of [] -> 1 (_, (_, c)):_ -> c ins c pasta = pasta { landin = c:landin pasta } angle n pasta = case indents pasta of m:ms | m == n -> ins ';' pasta | n + 1 <= m -> ins '}' $ angle n pasta { indents = ms } _ -> pasta curly n pasta = case indents pasta of m:ms | m + 1 <= n -> ins '{' pasta { indents = n:m:ms } [] | 1 <= n -> ins '{' pasta { indents = [n] } _ -> ins '{' . ins '}' $ angle n pasta sat f = Parser \pasta -> case landin pasta of c:t -> if f c then Right (c, pasta { landin = t }) else Left "unsat" [] -> case readme pasta of [] -> case indents pasta of [] -> Left "EOF" m:ms | m /= 0 && f '}' -> Right ('}', pasta { indents = ms }) _ -> Left "unsat" (h, _):t | f h -> let p' = pasta { readme = t } in case h of '}' -> case indents pasta of 0:ms -> Right (h, p' { indents = ms }) _ -> Left "unsat" '{' -> Right (h, p' { indents = 0:indents p' }) _ -> Right (h, p') _ -> Left "unsat" char c = sat (c ==) rawSat f = Parser \pasta -> case readme pasta of [] -> Left "EOF" (h, _):t -> if f h then Right (h, pasta { readme = t }) else Left "unsat" eof = Parser \pasta -> case pasta of ParserState [] [] _ -> Right ((), pasta) _ -> badpos pasta "want eof" comment = rawSat ('-' ==) *> some (rawSat ('-' ==)) *> (rawSat isNewline <|> rawSat (not . isSymbol) *> many (rawSat $ not . isNewline) *> rawSat isNewline) *> pure True spaces = isNewline <$> rawSat isSpace whitespace = Parser \pasta -> case landin pasta of [] -> do (offside, pasta') <- getParser (or <$> many (spaces <|> comment)) pasta if offside then Right ((), angle (indentOf pasta') pasta') else Right ((), pasta') _ -> Right ((), pasta) hexValue d | d <= '9' = ord d - ord '0' | d <= 'F' = 10 + ord d - ord 'A' | d <= 'f' = 10 + ord d - ord 'a' isNewline c = ord c `elem` [10, 11, 12, 13] isSymbol = (`elem` "!#$%&*+./<=>?@\\^|-~:") small = sat \x -> ((x <= 'z') && ('a' <= x)) || (x == '_') large = sat \x -> (x <= 'Z') && ('A' <= x) hexit = sat \x -> (x <= '9') && ('0' <= x) || (x <= 'F') && ('A' <= x) || (x <= 'f') && ('a' <= x) digit = sat \x -> (x <= '9') && ('0' <= x) decimal = foldl (\n d -> 10*n + ord d - ord '0') 0 <$> some digit hexadecimal = foldl (\n d -> 16*n + hexValue d) 0 <$> some hexit escape = char '\\' *> (sat (`elem` "'\"\\") <|> char 'n' *> pure '\n' <|> char '0' *> pure '\0' <|> char 'x' *> (chr <$> hexadecimal)) tokOne delim = escape <|> rawSat (delim /=) tokChar = between (char '\'') (char '\'') (tokOne '\'') quoteStr = between (char '"') (char '"') $ many $ many (char '\\' *> char '&') *> tokOne '"' quasiquoteStr = char '[' *> char 'r' *> char '|' *> quasiquoteBody quasiquoteBody = (char '|' *> char ']' *> pure []) <|> (:) <$> rawSat (const True) <*> quasiquoteBody tokStr = quoteStr <|> quasiquoteStr integer = char '0' *> (char 'x' <|> char 'X') *> hexadecimal <|> decimal literal = lexeme $ Const <$> integer <|> ChrCon <$> tokChar <|> StrCon <$> tokStr varish = lexeme $ liftA2 (:) small $ many (small <|> large <|> digit <|> char '\'') bad s = Parser \pasta -> badpos pasta s badpos pasta s = Left $ loc $ ": " ++ s where loc = case readme pasta of [] -> ("EOF"++) (_, (r, c)):_ -> ("row "++) . shows r . (" col "++) . shows c varId = do s <- varish if elem s ["export", "case", "class", "data", "default", "deriving", "do", "else", "foreign", "if", "import", "in", "infix", "infixl", "infixr", "instance", "let", "module", "newtype", "of", "then", "type", "where", "_"] then bad $ "reserved: " ++ s else pure s varSymish = lexeme $ (:) <$> sat (\c -> isSymbol c && c /= ':') <*> many (sat isSymbol) varSym = lexeme $ do s <- varSymish if elem s ["..", "=", "\\", "|", "<-", "->", "@", "~", "=>"] then bad $ "reserved: " ++ s else pure s conId = lexeme $ liftA2 (:) large $ many (small <|> large <|> digit <|> char '\'') conSymish = lexeme $ liftA2 (:) (char ':') $ many $ sat isSymbol conSym = do s <- conSymish if elem s [":", "::"] then bad $ "reserved: " ++ s else pure s special c = lexeme $ sat (c ==) comma = special ',' semicolon = special ';' lParen = special '(' rParen = special ')' lBrace = special '{' rBrace = special '}' lSquare = special '[' rSquare = special ']' backquote = special '`' lexeme f = f <* whitespace lexemePrelude = whitespace *> Parser \pasta -> case getParser (res "module" <|> (:[]) <$> char '{') pasta of Left _ -> Right ((), curly (indentOf pasta) pasta) Right _ -> Right ((), pasta) curlyCheck = Parser \pasta -> case readme pasta of [] -> Right ((), curly 0 pasta) ('{', _):_ -> Right ((), pasta) (_, (_, col)):_ -> Right ((), curly col pasta) conOf (Constr s _) = s specialCase (h:_) = '{':conOf h mkCase t cs = insert (specialCase cs) ( Qual [] $ arr t $ foldr arr (TV "case") $ map (\(Constr _ sts) -> foldr arr (TV "case") $ snd <$> sts) cs , E $ Basic "I") mkStrs = snd . foldl (\(s, l) u -> ('@':s, s:l)) ("@", []) scottEncode _ ":" _ = E $ Basic "CONS" scottEncode vs s ts = foldr L (foldl (\a b -> A a (V b)) (V s) ts) (ts ++ vs) scottConstr t cs (Constr s sts) = foldr (.) (insertWith (error $ "constructor conflict: " ++ s) s (Qual [] $ foldr arr t ts , scottEncode (map conOf cs) s $ mkStrs ts) ) [insertWith (error $ "field conflict: " ++ field) field (Qual [] $ t `arr` ft, L s $ foldl A (V s) $ inj $ proj field) | (field, ft) <- sts, field /= ""] where ts = snd <$> sts proj fd = foldr L (V fd) $ fst <$> sts inj x = map (\(Constr s' _) -> if s' == s then x else V "undefined") cs mkAdtDefs t cs = foldr (.) (mkCase t cs) $ scottConstr t cs <$> cs mkFFIHelper n t acc = case t of TC s -> acc TAp (TC "IO") _ -> acc TAp (TAp (TC "->") x) y -> L (show n) $ mkFFIHelper (n + 1) y $ A (V $ show n) acc updateDcs cs dcs = foldr (\(Constr s _) m -> insert s cs m) dcs cs addAdt t cs ders neat = foldr derive neat' ders where neat' = neat { typedAsts = mkAdtDefs t cs $ typedAsts neat , dataCons = updateDcs cs $ dataCons neat , type2Cons = insert (typeName t) (concatMap cnames cs) $ type2Cons neat } typeName = \case TAp x _ -> typeName x TC c -> c cnames (Constr s sts) = s : concatMap (\(s, _) -> if s == "" then [] else [s]) sts derive "Eq" = addInstance "Eq" (mkPreds "Eq") t [("==", L "lhs" $ L "rhs" $ Ca (V "lhs") $ map eqCase cs )] derive "Show" = addInstance "Show" (mkPreds "Show") t [("showsPrec", L "prec" $ L "x" $ Ca (V "x") $ map showCase cs )] derive der = error $ "bad deriving: " ++ der prec0 = A (V "ord") (E $ ChrCon '\0') showCase (Constr con args) = let as = show <$> [1..length args] in (PatCon con (mkPatVar "" <$> as), case args of [] -> L "s" $ A (A (V "++") (E $ StrCon con)) (V "s") _ -> case con of ':':_ -> A (A (V "showParen") $ V "True") $ foldr1 (\f g -> A (A (V ".") f) g) [ A (A (V "showsPrec") prec0) (V "1") , L "s" $ A (A (V "++") (E $ StrCon $ ' ':con++" ")) (V "s") , A (A (V "showsPrec") prec0) (V "2") ] _ -> A (A (V "showParen") $ A (A (V "<=") prec0) $ V "prec") $ A (A (V ".") $ A (V "++") (E $ StrCon con)) $ foldr (\f g -> A (A (V ".") f) g) (L "x" $ V "x") $ map (\a -> A (A (V ".") (A (V ":") (E $ ChrCon ' '))) $ A (A (V "showsPrec") prec0) (V a)) as ) mkPreds classId = Pred classId . TV <$> typeVars t mkPatVar pre s = PatVar (pre ++ s) Nothing eqCase (Constr con args) = let as = show <$> [1..length args] in (PatCon con (mkPatVar "l" <$> as), Ca (V "rhs") [ (PatCon con (mkPatVar "r" <$> as), foldr (\x y -> (A (A (V "&&") x) y)) (V "True") $ map (\n -> A (A (V "==") (V $ "l" ++ n)) (V $ "r" ++ n)) as) , (PatVar "_" Nothing, V "False")]) emptyTycl = Tycl [] [] addClass classId v (sigs, defs) neat = if null ms then neat { typeclasses = insert classId (Tycl (keys sigs) is) tycl , typedAsts = selectors $ typedAsts neat , topDefs = defaults ++ topDefs neat } else error $ "duplicate class: " ++ classId where vars = take (size sigs) $ show <$> [0..] selectors = foldr (.) id $ zipWith (\var (s, Qual ps t) -> insertWith (error $ "method conflict: " ++ s) s (Qual (Pred classId v:ps) t, L "@" $ A (V "@") $ foldr L (V var) vars)) vars $ toAscList sigs defaults = map (\(s, t) -> if member s sigs then ("{default}" ++ s, t) else error $ "bad default method: " ++ s) $ toAscList defs tycl = typeclasses neat Tycl ms is = maybe emptyTycl id $ mlookup classId tycl addInstance classId ps ty ds neat = neat { typeclasses = insert classId (Tycl ms $ Instance ty name ps (fromList ds):is) tycl } where tycl = typeclasses neat Tycl ms is = maybe emptyTycl id $ mlookup classId tycl name = '{':classId ++ (' ':shows ty "}") addTopDecl (s, t) neat = neat { topDecls = insert s t $ topDecls neat } addForeignImport foreignname ourname t neat = neat { typedAsts = insertWith (error $ "import conflict: " ++ ourname) ourname (Qual [] t, mkFFIHelper 0 t $ A (E $ Basic "F") $ E $ Link "{foreign}" foreignname $ Qual [] t) $ typedAsts neat , ffiImports = insertWith (error $ "duplicate import: " ++ foreignname) foreignname t $ ffiImports neat } addForeignExport e f neat = neat { ffiExports = insertWith (error $ "duplicate export: " ++ e) e f $ ffiExports neat } addDefs ds neat = neat { topDefs = ds ++ topDefs neat } addImport im neat = neat { moduleImports = im:moduleImports neat } addFixities os prec neat = neat { opFixity = foldr (\o tab -> insert o prec tab) (opFixity neat) os } parseErrorRule = Parser \pasta -> case indents pasta of m:ms | m /= 0 -> Right ('}', pasta { indents = ms }) _ -> badpos pasta "missing }" res w = do s <- varish <|> conSymish <|> varSymish when (s /= w) $ bad $ "want \"" ++ w ++ "\"" when (elem w ["let", "where", "do", "of"]) $ curlyCheck pure w paren = between lParen rParen braceSep f = between lBrace (rBrace <|> parseErrorRule) $ foldr ($) [] <$> sepBy ((:) <$> f <|> pure id) semicolon maybeFix s x = if elem s $ fvPro [] x then A (V "fix") (L s x) else x nonemptyTails [] = [] nonemptyTails xs@(x:xt) = xs : nonemptyTails xt addLets ls x = foldr triangle x components where vs = fst <$> ls ios = foldr (\(s, dsts) (ins, outs) -> (foldr (\dst -> insertWith union dst [s]) ins dsts, insertWith union s dsts outs)) (Tip, Tip) $ map (\(s, t) -> (s, intersect (fvPro [] t) vs)) ls components = scc (\k -> maybe [] id $ mlookup k $ fst ios) (\k -> maybe [] id $ mlookup k $ snd ios) vs triangle names expr = let tnames = nonemptyTails names insLams vs t = foldr L t vs appem vs = foldl1 A $ V <$> vs suball expr = foldl A (foldr L expr $ init names) $ appem <$> init tnames redef tns expr = foldr L (suball expr) tns in foldr (\(x:xt) t -> A (L x t) $ maybeFix x $ redef xt $ maybe undefined id $ lookup x ls) (suball expr) tnames qconop = conSym <|> res ":" <|> between backquote backquote conId qconsym = conSym <|> res ":" op = qconsym <|> varSym <|> between backquote backquote (conId <|> varId) con = conId <|> paren qconsym var = varId <|> paren varSym tycon = do s <- conId pure $ if s == "String" then TAp (TC "[]") (TC "Char") else TC s aType = lParen *> ( rParen *> pure (TC "()") <|> (foldr1 (TAp . TAp (TC ",")) <$> sepBy1 _type comma) <* rParen) <|> tycon <|> TV <$> varId <|> (lSquare *> (rSquare *> pure (TC "[]") <|> TAp (TC "[]") <$> (_type <* rSquare))) bType = foldl1 TAp <$> some aType _type = foldr1 arr <$> sepBy bType (res "->") fixityDecl w a = do res w n <- lexeme integer os <- sepBy op comma pure $ addFixities os (n, a) fixity = fixityDecl "infix" NAssoc <|> fixityDecl "infixl" LAssoc <|> fixityDecl "infixr" RAssoc cDecls = first fromList . second fromList . foldr ($) ([], []) <$> braceSep cDecl cDecl = first . (:) <$> genDecl <|> second . (++) <$> defSemi genDecl = (,) <$> var <* res "::" <*> (Qual <$> (scontext <* res "=>" <|> pure []) <*> _type) classDecl = res "class" *> (addClass <$> conId <*> (TV <$> varId) <*> (res "where" *> cDecls)) simpleClass = Pred <$> conId <*> _type scontext = (:[]) <$> simpleClass <|> paren (sepBy simpleClass comma) instDecl = res "instance" *> ((\ps cl ty defs -> addInstance cl ps ty defs) <$> (scontext <* res "=>" <|> pure []) <*> conId <*> _type <*> (res "where" *> braceDef)) letin = addLets <$> between (res "let") (res "in") braceDef <*> expr ifthenelse = (\a b c -> A (A (A (V "if") a) b) c) <$> (res "if" *> expr) <*> (res "then" *> expr) <*> (res "else" *> expr) listify = foldr (\h t -> A (A (V ":") h) t) (V "[]") alts = braceSep $ (,) <$> pat <*> guards "->" cas = Ca <$> between (res "case") (res "of") expr <*> alts lamCase = res "case" *> curlyCheck *> (L "\\case" . Ca (V "\\case") <$> alts) lam = res "\\" *> (lamCase <|> liftA2 onePat (some apat) (res "->" *> expr)) flipPairize y x = A (A (V ",") x) y moreCommas = foldr1 (A . A (V ",")) <$> sepBy1 expr comma thenComma = comma *> ((flipPairize <$> moreCommas) <|> pure (A (V ","))) parenExpr = (&) <$> expr <*> (((\v a -> A (V v) a) <$> op) <|> thenComma <|> pure id) rightSect = ((\v a -> L "@" $ A (A (V v) $ V "@") a) <$> (op <|> (:"") <$> comma)) <*> expr section = lParen *> (parenExpr <* rParen <|> rightSect <* rParen <|> rParen *> pure (V "()")) maybePureUnit = maybe (V "pure" `A` V "()") id stmt = (\p x -> Just . A (V ">>=" `A` x) . onePat [p] . maybePureUnit) <$> pat <*> (res "<-" *> expr) <|> (\x -> Just . maybe x (\y -> (V ">>=" `A` x) `A` (L "_" y))) <$> expr <|> (\ds -> Just . addLets ds . maybePureUnit) <$> (res "let" *> braceDef) doblock = res "do" *> (maybePureUnit . foldr ($) Nothing <$> braceSep stmt) compQual = (\p xs e -> A (A (V "concatMap") $ onePat [p] e) xs) <$> pat <*> (res "<-" *> expr) <|> (\b e -> A (A (A (V "if") b) e) $ V "[]") <$> expr <|> addLets <$> (res "let" *> braceDef) sqExpr = between lSquare rSquare $ ((&) <$> expr <*> ( res ".." *> ( (\hi lo -> (A (A (V "enumFromTo") lo) hi)) <$> expr <|> pure (A (V "enumFrom")) ) <|> res "|" *> ((. A (V "pure")) . foldr (.) id <$> sepBy1 compQual comma) <|> (\t h -> listify (h:t)) <$> many (comma *> expr) ) ) <|> pure (V "[]") fbind = A <$> (E . StrCon <$> var) <*> (res "=" *> expr) mayUpdate v = (do fbs <- between lBrace rBrace $ sepBy1 fbind comma pure $ A (E $ Basic "{=") $ foldr A (E $ Basic "=}") $ v:fbs ) <|> pure v atom = ifthenelse <|> doblock <|> letin <|> sqExpr <|> section <|> cas <|> lam <|> (paren comma *> pure (V ",")) <|> ((V <$> (con <|> var)) >>= mayUpdate) <|> E <$> literal aexp = foldl1 A <$> some atom chain a = \case [] -> a A f b:rest -> case rest of [] -> A (A f a) b _ -> A (E $ Basic "{+") $ A (A (A f a) b) $ foldr A (E $ Basic "+}") rest _ -> error "unreachable" expr = chain <$> aexp <*> many (A <$> (V <$> op) <*> aexp) gcon = conId <|> paren (qconsym <|> (:"") <$> comma) <|> lSquare *> rSquare *> pure "[]" apat = PatVar <$> var <*> (res "@" *> (Just <$> apat) <|> pure Nothing) <|> flip PatVar Nothing <$> (res "_" *> pure "_") <|> flip PatCon [] <$> gcon <|> PatLit <$> literal <|> foldr (\h t -> PatCon ":" [h, t]) (PatCon "[]" []) <$> between lSquare rSquare (sepBy pat comma) <|> paren (foldr1 pairPat <$> sepBy1 pat comma <|> pure (PatCon "()" [])) where pairPat x y = PatCon "," [x, y] patChain a = \case [] -> a PatCon f [b]:rest -> case rest of [] -> PatCon f [a, b] _ -> PatCon "{+" $ PatCon f [a, b] : rest _ -> error "unreachable" patAtom = PatCon <$> gcon <*> many apat <|> apat pat = patChain <$> patAtom <*> many (PatCon <$> qconop <*> ((:[]) <$> patAtom)) maybeWhere p = (&) <$> p <*> (res "where" *> (addLets <$> braceDef) <|> pure id) guards s = maybeWhere $ res s *> expr <|> foldr ($) (V "pjoin#") <$> some ((\x y -> case x of V "True" -> \_ -> y _ -> A (A (A (V "if") x) y) ) <$> (res "|" *> expr) <*> (res s *> expr)) onePat vs x = Pa [(vs, x)] opDef x f y rhs = [(f, onePat [x, y] rhs)] leftyPat p expr = case pvars of [] -> [] (h:t) -> let gen = '@':h in (gen, expr):map (\v -> (v, Ca (V gen) [(p, V v)])) pvars where pvars = filter (/= "_") $ patVars p def = liftA2 (\l r -> [(l, r)]) var (liftA2 onePat (many apat) $ guards "=") <|> (pat >>= \x -> opDef x <$> varSym <*> pat <*> guards "=" <|> leftyPat x <$> guards "=") coalesce = \case [] -> [] h@(s, x):t -> case t of [] -> [h] (s', x'):t' -> let f (Pa vsts) (Pa vsts') = Pa $ vsts ++ vsts' f _ _ = error "bad multidef" in if s == s' then coalesce $ (s, f x x'):t' else h:coalesce t defSemi = coalesce . concat <$> sepBy1 def semicolon braceDef = concat <$> braceSep defSemi simpleType c vs = foldl TAp (TC c) (map TV vs) conop = conSym <|> between backquote backquote conId fieldDecl = (\vs t -> map (, t) vs) <$> sepBy1 var comma <*> (res "::" *> _type) constr = (\x c y -> Constr c [("", x), ("", y)]) <$> aType <*> conop <*> aType <|> Constr <$> conId <*> ( concat <$> between lBrace rBrace (fieldDecl `sepBy` comma) <|> map ("",) <$> many aType) dclass = conId _deriving = (res "deriving" *> ((:[]) <$> dclass <|> paren (dclass `sepBy` comma))) <|> pure [] adt = addAdt <$> between (res "data") (res "=") (simpleType <$> conId <*> many varId) <*> sepBy constr (res "|") <*> _deriving impDecl = addImport <$> (res "import" *> conId) topdecls = braceSep $ adt <|> classDecl <|> addTopDecl <$> genDecl <|> instDecl <|> res "foreign" *> ( res "import" *> var *> (addForeignImport <$> lexeme tokStr <*> var <*> (res "::" *> _type)) <|> res "export" *> var *> (addForeignExport <$> lexeme tokStr <*> var) ) <|> addDefs <$> defSemi <|> fixity <|> impDecl export_ = ExportVar <$> varId <|> ExportCon <$> conId <*> ( paren ((:[]) <$> res ".." <|> sepBy (var <|> con) comma) <|> pure [] ) exports = Just <$> paren (export_ `sepBy` comma) <|> pure Nothing haskell = between lexemePrelude eof $ some do (moduleName, exs) <- mayModule (moduleName,) . (exs,) <$> topdecls mayModule = res "module" *> ((,) <$> conId <*> exports <* res "where") <|> pure ("Main", Nothing) parseProgram s = fmap fst $ parse haskell s
-- Separate fixity phase. -- Export lists. -- Detect missing instances. -- Top-level type annotations. module Typer where import Base import Map import Ast import Parser import Unify freeCount v expr = case expr of E _ -> 0 V s -> if s == v then 1 else 0 A x y -> freeCount v x + freeCount v y L w t -> if v == w then 0 else freeCount v t app01 s x = case freeCount s x of 0 -> const x 1 -> flip (beta s) x _ -> A $ L s x optiApp t = case t of A (L s x) y -> app01 s (optiApp x) (optiApp y) A x y -> A (optiApp x) (optiApp y) L s x -> L s (optiApp x) _ -> t -- Pattern compiler. singleOut s cs = \scrutinee x -> foldl A (A (V $ specialCase cs) scrutinee) $ map (\(Constr s' ts) -> if s == s' then x else foldr L (V "pjoin#") $ map (const "_") ts) cs patEq lit b x y = A (A (A (V "if") (A (A (V "==") (E lit)) b)) x) y unpat searcher as t = case as of [] -> pure t a:at -> get >>= \n -> put (n + 1) >> let freshv = shows n "#" in L freshv <$> let go p x = case p of PatLit lit -> unpat searcher at $ patEq lit (V freshv) x $ V "pjoin#" PatVar s m -> maybe (unpat searcher at) (\p1 x1 -> go p1 x1) m $ beta s (V freshv) x PatCon con args -> case findCon searcher con of Left e -> error e Right cons -> unpat searcher args x >>= \y -> unpat searcher at $ singleOut con cons (V freshv) y in go a t unpatTop searcher als x = case als of [] -> pure x (a, l):alt -> let go p t = case p of PatLit lit -> unpatTop searcher alt $ patEq lit (V l) t $ V "pjoin#" PatVar s m -> maybe (unpatTop searcher alt) go m $ beta s (V l) t PatCon con args -> case findCon searcher con of Left e -> error e Right cons -> unpat searcher args t >>= \y -> unpatTop searcher alt $ singleOut con cons (V l) y in go a x rewritePats' searcher asxs ls = case asxs of [] -> pure $ V "fail#" (as, t):asxt -> unpatTop searcher (zip as ls) t >>= \y -> A (L "pjoin#" y) <$> rewritePats' searcher asxt ls rewritePats searcher vsxs@((vs0, _):_) = get >>= \n -> let ls = map (`shows` "#") $ take (length vs0) [n..] in put (n + length ls) >> flip (foldr L) ls <$> rewritePats' searcher vsxs ls classifyAlt v x = case v of PatLit lit -> Left $ patEq lit (V "of") x PatVar s m -> maybe (Left . A . L "pjoin#") classifyAlt m $ A (L s x) $ V "of" PatCon con args -> Right (insertWith (flip (.)) con ((args, x):)) genCase searcher tab = if size tab == 0 then id else A . L "cjoin#" $ let firstC = case toAscList tab of ((con, _):_) -> con cs = either error id $ findCon searcher firstC in foldl A (A (V $ specialCase cs) (V "of")) $ map (\(Constr s ts) -> case mlookup s tab of Nothing -> foldr L (V "cjoin#") $ const "_" <$> ts Just f -> Pa $ f [(const (PatVar "_" Nothing) <$> ts, V "cjoin#")] ) cs updateCaseSt searcher (acc, tab) alt = case alt of Left f -> (acc . genCase searcher tab . f, Tip) Right upd -> (acc, upd tab) rewriteCase searcher as = acc . genCase searcher tab $ V "fail#" where (acc, tab) = foldl (updateCaseSt searcher) (id, Tip) $ uncurry classifyAlt <$> as resolveFieldBinds searcher t = go t where go t = case t of E _ -> t V _ -> t A (E (Basic "{=")) (A rawExpr fbsAst) -> let expr = go rawExpr fromAst t = case t of A (A (E (StrCon f)) body) rest -> (f, go body):fromAst rest E (Basic "=}") -> [] fbs@((firstField, _):_) = fromAst fbsAst (con, fields) = findField searcher firstField cs = either error id $ findCon searcher con newValue = foldl A (V con) [maybe (V $ "[old]"++f) id $ lookup f fbs | (f, _) <- fields] initValue = foldl A expr [maybe (V "undefined") id $ lookup f fbs | (f, _) <- fields] updater = foldr L newValue $ ("[old]"++) . fst <$> fields inj x = map (\(Constr con' _) -> if con' == con then x else V "undefined") cs allPresent = all (`elem` (fst <$> fields)) $ fst <$> fbs isCon = case expr of V (h:_) -> 'A' <= h && h <= 'Z' _ -> False in if allPresent then if isCon then initValue else foldl A (A (V $ specialCase cs) expr) $ inj updater else error "bad fields in update" A x y -> A (go x) (go y) L s x -> L s $ go x fixFixity searcher t = case t of E _ -> pure t V _ -> pure t A (E (Basic "{+")) ch -> infixer searcher =<< go ch A x y -> A <$> go x <*> go y L s b -> L s <$> go b Pa vsxs -> Pa <$> mapM (\(ps, a) -> (,) <$> mapM pgo ps <*> go a) vsxs Ca x as -> Ca <$> go x <*> mapM (\(p, a) -> (,) <$> pgo p <*> go a) as where go = fixFixity searcher pgo = pure . patFixFixity searcher infixer searcher (A (A (A s x) y) t) = go seed t where seed = A (A s $ protect x) $ protect y protect t = A (E (Basic "!")) t unprotectAll = \case A (E (Basic "!")) x -> unprotectAll x A a b -> A (unprotectAll a) (unprotectAll b) t -> t go acc t = case t of E (Basic "+}") -> pure $ unprotectAll acc A (A (V s) z) rest -> go (rebase s (protect z) acc) rest _ -> error "unreachable" rebase s z = \case A (A (V s') x) y -> let stay = A (A (V s) $ A (A (V s') x) y) z down = A (A (V s') x) $ rebase s z y in extendChain searcher stay down s s' x -> A (A (V s) x) z patFixFixity searcher p = case p of PatLit _ -> p PatVar s m -> PatVar s $ go <$> m PatCon "{+" args -> patFixer searcher args PatCon con args -> PatCon con $ go <$> args where go = patFixFixity searcher patFixer searcher (PatCon f [a, b]:rest) = unprotectAll $ foldr rebase seed rest where seed = PatCon f [protect a, protect b] protect x = PatCon "!" [x] unprotectAll = \case PatCon "!" [x] -> unprotectAll x PatCon con args -> PatCon con $ unprotectAll <$> args p -> p rebase sz@(PatCon s [z]) = \case PatCon s' [x, y] -> let stay = PatCon s [PatCon s' [x, y], z] down = PatCon s' [x, rebase sz y] in extendChain searcher stay down s s' x -> PatCon s [x, z] extendChain searcher stay down s s' = if prec <= prec' then if prec == prec' then if assoc == assoc' then case assoc of LAssoc -> stay RAssoc -> down NAssoc -> error $ "adjacent NAssoc: " ++ s ++ " vs " ++ s' else error $ "assoc mismatch: " ++ s ++ " vs " ++ s' else stay else down where (prec, assoc) = either (const (9, LAssoc)) id $ findPrec searcher s (prec', assoc') = either (const (9, LAssoc)) id $ findPrec searcher s' secondM f (a, b) = (a,) <$> f b patternCompile searcher t = astLink searcher $ optiApp $ resolveFieldBinds searcher $ evalState (go $ either error id $ fixFixity searcher t) 0 where go t = case t of E _ -> pure t V _ -> pure t A x y -> liftA2 A (go x) (go y) L s x -> L s <$> go x Pa vsxs -> mapM (secondM go) vsxs >>= rewritePats searcher Ca x as -> liftA2 A (L "of" . rewriteCase searcher <$> mapM (secondM go) as >>= go) (go x) -- Type inference. instantiate' t n tab = case t of TC s -> ((t, n), tab) TV s -> case lookup s tab of Nothing -> let va = TV $ show n in ((va, n + 1), (s, va):tab) Just v -> ((v, n), tab) TAp x y -> let ((t1, n1), tab1) = instantiate' x n tab ((t2, n2), tab2) = instantiate' y n1 tab1 in ((TAp t1 t2, n2), tab2) instantiatePred (Pred s t) ((out, n), tab) = first (first ((:out) . Pred s)) (instantiate' t n tab) instantiate (Qual ps t) n = first (Qual ps1) $ fst $ instantiate' t n1 tab where ((ps1, n1), tab) = foldr instantiatePred (([], n), []) ps proofApply sub a = case a of Proof (Pred cl ty) -> Proof (Pred cl $ apply sub ty) A x y -> A (proofApply sub x) (proofApply sub y) L s t -> L s $ proofApply sub t _ -> a typeAstSub sub (t, a) = (apply sub t, proofApply sub a) infer msg typed loc ast csn@(cs, n) = case ast of E x -> Right $ case x of Basic bug -> error bug Const n -> ((TC "Int", E $ ChrCon $ chr n), csn) ChrCon _ -> ((TC "Char", ast), csn) StrCon _ -> ((TAp (TC "[]") (TC "Char"), ast), csn) Link im s q -> insta q V s -> maybe (Left $ "undefined: " ++ s) Right $ either (\t -> ((t, ast), csn)) insta <$> lookup s loc <|> insta . fst <$> mlookup s typed A x y -> rec loc x (cs, n + 1) >>= \((tx, ax), csn1) -> rec loc y csn1 >>= \((ty, ay), (cs2, n2)) -> unifyMsg msg tx (arr ty va) cs2 >>= \cs -> Right ((va, A ax ay), (cs, n2)) L s x -> first (\(t, a) -> (arr va t, L s a)) <$> rec ((s, Left va):loc) x (cs, n + 1) where rec = infer msg typed va = TV $ show n insta ty = ((ty1, foldl A ast (map Proof preds)), (cs, n1)) where (Qual preds ty1, n1) = instantiate ty n findInstance searcher qn@(q, n) p@(Pred cl ty) insts = case insts of [] -> case ty of TV _ -> let v = '*':show n in Right (((p, v):q, n + 1), V v) _ -> Left $ "no instance: " ++ show p (modName, Instance h name ps _):rest -> case match h ty of Nothing -> findInstance searcher qn p rest Just subs -> foldM (\(qn1, t) (Pred cl1 ty1) -> second (A t) <$> findProof searcher (Pred cl1 $ apply subs ty1) qn1) (qn, if modName == "" then V name else E $ Link modName name undefined) ps findProof searcher pred@(Pred classId t) psn@(ps, n) = case lookup pred ps of Nothing -> findInstance searcher psn pred $ findTypeclass searcher classId Just s -> Right (psn, V s) prove searcher psn a = case a of Proof pred -> findProof searcher pred psn A x y -> prove searcher psn x >>= \(psn1, x1) -> second (A x1) <$> prove searcher psn1 y L s t -> second (L s) <$> prove searcher psn t _ -> Right (psn, a) data Dep a = Dep ([String] -> Either String ([String], a)) instance Functor Dep where fmap f = \(Dep mf) -> Dep \g -> do (g', x) <- mf g pure (g', f x) instance Applicative Dep where pure x = Dep \g -> Right (g, x) (Dep mf) <*> (Dep mx) = Dep \g -> do (g', f) <- mf g (g'', x) <- mx g' pure (g'', f x) addDep s = Dep \deps -> Right (if s `elem` deps then deps else s : deps, ()) badDep s = Dep $ const $ Left s runDep (Dep f) = f [] unifyMsg s a b c = either (Left . (s++) . (": "++)) Right $ unify a b c forFree cond f bound t = case t of E _ -> t V s -> if (not $ s `elem` bound) && cond s then f t else t A x y -> A (rec bound x) (rec bound y) L s t' -> L s $ rec (s:bound) t' where rec = forFree cond f inferno searcher decls typed defmap syms = let anno s = maybe (Left $ TV $ ' ':s) Right $ mlookup s decls loc = zip syms $ anno <$> syms principal ((acc, preds), (subs, n)) s = do expr <- maybe (Left $ "missing: " ++ s) Right (mlookup s defmap) ((t, a), (ms, n)) <- infer s typed loc expr (subs, n) case mlookup s decls of Nothing -> do soln <- unifyMsg s (TV (' ':s)) t ms Right (((s, (t, a)):acc, preds), (soln, n)) Just qAnno -> do let (Qual pAnno tAnno, n1) = instantiate qAnno n soln <- maybe (Left $ s ++ ": match failed: " ++ show qAnno ++ " vs " ++ show (apply ms t)) Right $ match (apply ms t) tAnno Right (((s, (t, a)):acc, pAnno ++ preds), (soln @@ ms, n1)) gatherPreds (acc, psn) (s, (t, a)) = do (psn, a) <- prove searcher psn a pure ((s, (t, a)):acc, psn) in do ((stas, preds), (soln, _)) <- foldM principal (([], []), ([], 0)) syms let ps = zip preds $ ("anno*"++) . show <$> [0..] (stas, (ps, _)) <- foldM gatherPreds ([], (ps, 0)) $ second (typeAstSub soln) <$> stas let preds = fst <$> ps dicts = snd <$> ps applyDicts (s, (t, a)) = (s, (Qual preds t, foldr L (forFree (`elem` syms) (\t -> foldl A t $ V <$> dicts) [] a) dicts)) pure $ map applyDicts stas inferDefs searcher defs decls typed = do let insertUnique m (s, (_, t)) = if isBuiltIn s then Left $ "reserved: " ++ s else case mlookup s m of Nothing -> Right $ insert s t m _ -> Left $ "duplicate: " ++ s addEdges (sym, (deps, _)) (ins, outs) = (foldr (\dep es -> insertWith union dep [sym] es) ins deps, insertWith union sym deps outs) graph = foldr addEdges (Tip, Tip) defs defmap <- foldM insertUnique Tip defs let ins k = maybe [] id $ mlookup k $ fst graph outs k = maybe [] id $ mlookup k $ snd graph inferComponent typed syms = foldr (uncurry insert) typed <$> inferno searcher decls typed defmap syms foldM inferComponent typed $ scc ins outs $ keys defmap dictVars ps n = (zip ps $ map (('*':) . show) [n..], n + length ps) inferTypeclasses searcher ienv typed = foldM perClass typed $ toAscList ienv where perClass typed (classId, Tycl sigs insts) = foldM perInstance typed insts where perInstance typed (Instance ty name ps idefs) = do let dvs = map snd $ fst $ dictVars ps 0 perMethod s = do let Just rawExpr = mlookup s idefs <|> pure (V $ "{default}" ++ s) expr <- snd <$> patternCompile searcher rawExpr (ta, (sub, n)) <- either (Left . (name++) . (" "++) . (s++) . (": "++)) Right $ infer s typed [] expr ([], 0) qc <- typeOfMethod searcher s let (tx, ax) = typeAstSub sub ta -- e.g. qc = Eq a => a -> a -> Bool -- We instantiate: Eq a1 => a1 -> a1 -> Bool. (Qual [Pred _ headT] tc, n1) = instantiate qc n -- Mix the predicates `ps` with the type of `headT`, applying a -- substitution such as (a1, [a]) so the variable names match. -- e.g. Eq a => [a] -> [a] -> Bool Just subc = match headT ty (Qual ps2 t2, n2) = instantiate (Qual ps $ apply subc tc) n1 case match tx t2 of Nothing -> Left "class/instance type conflict" Just subx -> do ((ps3, _), tr) <- prove searcher (dictVars ps2 0) (proofApply subx ax) if length ps2 /= length ps3 then Left $ ("want context: "++) . (foldr (.) id $ shows . fst <$> ps3) $ name else pure tr ms <- mapM perMethod sigs pure $ insert name (Qual [] $ TC "DICTIONARY", flip (foldr L) dvs $ L "@" $ foldl A (V "@") ms) typed primAdts = [ (TC "()", [Constr "()" []]) , (TC "Bool", [Constr "True" [], Constr "False" []]) , (TAp (TC "[]") (TV "a"), [Constr "[]" [], Constr ":" $ map ("",) [TV "a", TAp (TC "[]") (TV "a")]]) , (TAp (TAp (TC ",") (TV "a")) (TV "b"), [Constr "," $ map ("",) [TV "a", TV "b"]]) ] prims = let ro = E . Basic dyad s = TC s `arr` (TC s `arr` TC s) wordy = foldr arr (TAp (TAp (TC ",") (TC "Word")) (TC "Word")) [TC "Word", TC "Word", TC "Word", TC "Word"] bin s = A (ro "Q") (ro s) in map (second (first $ Qual [])) $ [ ("intEq", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "EQ")) , ("intLE", (arr (TC "Int") (arr (TC "Int") (TC "Bool")), bin "LE")) , ("wordLE", (arr (TC "Word") (arr (TC "Word") (TC "Bool")), bin "U_LE")) , ("wordEq", (arr (TC "Word") (arr (TC "Word") (TC "Bool")), bin "EQ")) , ("charEq", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "EQ")) , ("charLE", (arr (TC "Char") (arr (TC "Char") (TC "Bool")), bin "LE")) , ("fix", (arr (arr (TV "a") (TV "a")) (TV "a"), ro "Y")) , ("if", (arr (TC "Bool") $ arr (TV "a") $ arr (TV "a") (TV "a"), ro "I")) , ("intFromWord", (arr (TC "Word") (TC "Int"), ro "I")) , ("wordFromInt", (arr (TC "Int") (TC "Word"), ro "I")) , ("chr", (arr (TC "Int") (TC "Char"), ro "I")) , ("ord", (arr (TC "Char") (TC "Int"), ro "I")) , ("ioBind", (arr (TAp (TC "IO") (TV "a")) (arr (arr (TV "a") (TAp (TC "IO") (TV "b"))) (TAp (TC "IO") (TV "b"))), ro "C")) , ("ioPure", (arr (TV "a") (TAp (TC "IO") (TV "a")), A (A (ro "B") (ro "C")) (ro "T"))) , ("primitiveError", (arr (TAp (TC "[]") (TC "Char")) (TV "a"), ro "ERR")) , ("newIORef", (arr (TV "a") (TAp (TC "IO") (TAp (TC "IORef") (TV "a"))), A (A (ro "B") (ro "C")) (A (A (ro "B") (ro "T")) (ro "REF")))) , ("readIORef", (arr (TAp (TC "IORef") (TV "a")) (TAp (TC "IO") (TV "a")), A (ro "T") (ro "READREF"))) , ("writeIORef", (arr (TAp (TC "IORef") (TV "a")) (arr (TV "a") (TAp (TC "IO") (TC "()"))), A (A (ro "R") (ro "WRITEREF")) (ro "B"))) , ("exitSuccess", (TAp (TC "IO") (TV "a"), ro "END")) , ("unsafePerformIO", (arr (TAp (TC "IO") (TV "a")) (TV "a"), A (A (ro "C") (A (ro "T") (ro "END"))) (ro "K"))) , ("fail#", (TV "a", A (V "unsafePerformIO") (V "exitSuccess"))) , ("word64Add", (wordy, A (ro "QQ") (ro "DADD"))) , ("word64Sub", (wordy, A (ro "QQ") (ro "DSUB"))) , ("word64Mul", (wordy, A (ro "QQ") (ro "DMUL"))) , ("word64Div", (wordy, A (ro "QQ") (ro "DDIV"))) , ("word64Mod", (wordy, A (ro "QQ") (ro "DMOD"))) ] ++ map (\(s, v) -> (s, (dyad "Int", bin v))) [ ("intAdd", "ADD") , ("intSub", "SUB") , ("intMul", "MUL") , ("intDiv", "DIV") , ("intMod", "MOD") , ("intQuot", "QUOT") , ("intRem", "REM") ] ++ map (\(s, v) -> (s, (dyad "Word", bin v))) [ ("wordAdd", "ADD") , ("wordSub", "SUB") , ("wordMul", "MUL") , ("wordDiv", "U_DIV") , ("wordMod", "U_MOD") , ("wordQuot", "U_DIV") , ("wordRem", "U_MOD") ] neatNew = foldr (\(a, b) -> addAdt a b []) neatEmpty { typedAsts = fromList prims } primAdts isBuiltIn s = member s $ typedAsts neatNew tabulateModules mods = foldM ins Tip =<< mapM go mods where go (name, (mexs, prog)) = (name,) <$> maybe Right processExports mexs (foldr ($) neatNew prog) ins tab (k, v) = case mlookup k tab of Nothing -> Right $ insert k v tab Just _ -> Left $ "duplicate module: " ++ k processExports exs neat = do mes <- Just . concat <$> mapM (processExport neat) exs pure neat { moduleExports = mes } processExport neat = \case ExportVar v -> case lookup v $ topDefs neat of Nothing -> Left $ "bad export " ++ v Just _ -> Right [v] ExportCon c ns -> case mlookup c $ type2Cons neat of Just cnames | ns == [".."] -> Right cnames | null delta -> Right ns | True -> Left $ "bad exports: " ++ show delta where delta = [n | n <- ns, not $ elem n cnames] Nothing -> case mlookup c $ typeclasses neat of Nothing -> Left $ "bad export " ++ c Just (Tycl methodNames _) | ns == [".."] -> Right methodNames | null delta -> Right ns | True -> Left $ "bad exports: " ++ show delta where delta = [n | n <- ns, not $ elem n methodNames] data Searcher = Searcher { astLink :: Ast -> Either String ([String], Ast) , findPrec :: String -> Either String (Int, Assoc) , findCon :: String -> Either String [Constr] , findField :: String -> (String, [(String, Type)]) , typeOfMethod :: String -> Either String Qual , findTypeclass :: String -> [(String, Instance)] } isExportOf s neat = case moduleExports neat of Nothing -> True Just es -> elem s es findAmong fun viz s = case concat $ maybe [] (:[]) . mlookup s . fun <$> viz s of [] -> Left $ "missing: " ++ s [unique] -> Right unique _ -> Left $ "ambiguous: " ++ s searcherNew tab neat ienv = Searcher { astLink = astLink' , findPrec = \s -> if s == ":" then Right (5, RAssoc) else findAmong opFixity visible s , findCon = findAmong dataCons visible , findField = findField' , typeOfMethod = fmap fst . findAmong typedAsts visible , findTypeclass = \s -> concat [maybe [] (\(Tycl _ is) -> (im,) <$> is) $ mlookup s $ classes im | im <- "":imps] } where findImportSym s = concat [maybe [] (\(t, _) -> [(im, t)]) $ mlookup s $ typedAsts n | (im, n) <- importedNeats s] importedNeats s@(h:_) = if isBuiltIn s then [] else [(im, n) | im <- imps, let n = tab ! im, h == '{' || isExportOf s n] visible s = neat : (snd <$> importedNeats s) classes im = if im == "" then ienv else typeclasses $ tab ! im findField' f = case [(con, fields) | dc <- dataCons <$> visible f, (_, cons) <- toAscList dc, Constr con fields <- cons, (f', _) <- fields, f == f'] of [] -> error $ "no such field: " ++ f h:_ -> h imps = moduleImports neat defs = fromList $ topDefs neat astLink' ast = runDep $ go [] ast where go bound ast = case ast of V s | elem s bound -> pure ast | isBuiltIn s -> pure ast | member s $ typedAsts neat -> unlessAmbiguous s $ pure ast | member s defs -> unlessAmbiguous s $ addDep s *> pure ast | True -> case findImportSym s of [] -> badDep $ "missing: " ++ s [(im, t)] -> pure $ E $ Link im s t _ -> badDep $ "ambiguous: " ++ s A x y -> A <$> go bound x <*> go bound y L s t -> L s <$> go (s:bound) t _ -> pure ast unlessAmbiguous s f = case findImportSym s of [] -> f _ -> badDep $ "ambiguous: " ++ s inferModule tab acc name = case mlookup name acc of Nothing -> do let neat = tab ! name imps = moduleImports neat typed = typedAsts neat fillSigs (cl, Tycl sigs is) = (cl,) $ case sigs of [] -> Tycl (findSigs cl) is _ -> Tycl sigs is findSigs cl = maybe (error $ "no sigs: " ++ cl) id $ find (not . null) [maybe [] (\(Tycl sigs _) -> sigs) $ mlookup cl $ typeclasses (tab ! im) | im <- imps] ienv = fromList $ fillSigs <$> toAscList (typeclasses neat) genDefaultMethod qcs (classId, s) = case mlookup defName qcs of Nothing -> Right $ insert defName (q, V "fail#") qcs Just (Qual ps t, _) -> case match t t0 of Nothing -> Left $ "bad default method type: " ++ s _ -> case ps of [Pred cl _] | cl == classId -> Right qcs _ -> Left $ "bad default method constraints: " ++ show (Qual ps0 t0) where defName = "{default}" ++ s (q@(Qual ps0 t0), _) = qcs ! s acc' <- foldM (inferModule tab) acc imps let searcher = searcherNew acc' neat ienv depdefs <- mapM (\(s, t) -> (s,) <$> patternCompile searcher t) $ topDefs neat typed <- inferDefs searcher depdefs (topDecls neat) typed typed <- inferTypeclasses searcher ienv typed typed <- foldM genDefaultMethod typed [(classId, sig) | (classId, Tycl sigs _) <- toAscList $ typeclasses neat, sig <- sigs] Right $ insert name neat { typedAsts = typed } acc' Just _ -> Right acc untangle s = do tab <- parseProgram s >>= tabulateModules foldM (inferModule tab) Tip $ keys tab
-- Separate fixity phase. -- Export lists. module Main where import Base import Map import Ast import RTS import Typer import Kiselyov import System hide_prelude_here' = hide_prelude_here' dumpWith dumper s = case untangle s of Left err -> err Right tab -> foldr ($) [] $ map (\(name, neat) -> ("module "++) . (name++) . ('\n':) . (foldr (.) id $ dumper neat)) $ toAscList tab dumpLambs neat = map (\(s, t) -> (s++) . (" = "++) . shows t . ('\n':)) $ second snd <$> toAscList (typedAsts neat) dumpTypes neat = map (\(s, q) -> (s++) . (" :: "++) . shows q . ('\n':)) $ second fst <$> toAscList (typedAsts neat) dumpCombs neat = map go $ optiComb $ second snd <$> toAscList (typedAsts neat) where go (s, t) = (s++) . (" = "++) . shows t . (";\n"++) main = getArgs >>= \case "comb":_ -> interact $ dumpWith dumpCombs "lamb":_ -> interact $ dumpWith dumpLambs "type":_ -> interact $ dumpWith dumpTypes "wasm":opts -> interact \s -> either id id $ untangle s >>= compileWith "1<<22" libcWasm ("no-main":opts) "warts":opts -> interact $ either id (warts opts) . untangle _ -> interact \s -> either id id $ untangle s >>= compile
Precisely
Before proceeding, we crudely implement arbitrary-precision integers to enable some cool demos. It also exercises our code that handles typeclasses.
Unlike standard Haskell, we add a Ring typeclass rather than Num. The (+) and (*) operators should be reserved for rings, and there are uses for rings that are not also instances of the Num typeclass. For example, Gaussian integers are great for indexing a 2D rectangular board, especially when we want to rotate by a right angle (multiplication by i) or talk about the cardinal directions (which correspond the units).
The integers are the initial ring, so we treat the integer constant n as fromInteger n; if this results in ambiguity, then we drop fromInteger.
Thus the laws that we know to be true in our bones, such as a*(b + c) = a*b + a*c, will never lead us astray. We must explicitly write fromIntegral to, say, map a Word32 to a Word64. Other languages convert silently, and wind up defying our algebraic intuition.
To represent an integer, we use a list of Word32 numbers, plus a boolean to represent its sign. For GHC compatibility we call the function integerSignList instead of pattern matching on Integer values.
We implement schoolbook algorithms for basic arithmetic, which is straightforward except for division. I realized that when doing long division by hand, to find the next digit of the divisor, I pick something that seems reasonable via a method that seems partly subconscious! How can we possibly code this?
Luckily, there is a simple algorithm that makes good guesses. See Knuth, The Art of Computer Programming.
We rename div and mod to quot and rem, then introduce wrappers for div and mod. Now our divisions behave correctly, though it is sad that div and mod need more instructions. (FORTRAN set an unfortunate precedent of truncating division to zero, ultimately forcing languages like C and WebAssembly and even hardware to conform.)
Our treatment of integer literals causes a bootstrapping issue. Suppose a literal "0" is to be converted to an Int. Then our compiler applies the Int edition of fromInteger to the Integer 0, which involves a call to mpView, whose implementation needs the Int 0. If we simply code this as 0, then we wind up with a circular definition, because our compiler would insert another fromInteger call. We work around this with a definition that bypasses overloading by returning ord '\0'.
-- Ring, Integral, Integer. module Base where infixr 9 . infixr 8 ^ infixl 7 * , `div` , `mod` infixr 6 <> infixl 6 + , - infixr 5 ++ infixl 4 <*> , <$> , <* , *> infix 4 == , /= , <= , < infixl 3 && , <|> infixl 2 || infixl 1 >> , >>= infixr 1 =<< infixr 0 $ class Monoid a where mempty :: a (<>) :: a -> a -> a mconcat :: [a] -> a mconcat = foldr (<>) mempty instance Monoid [a] where mempty = [] (<>) = (++) class Functor f where fmap :: (a -> b) -> f a -> f b class Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (<$>) = fmap liftA2 f x y = f <$> x <*> y (>>) f g = f >>= \_ -> g (=<<) = flip (>>=) class Eq a where (==) :: a -> a -> Bool instance Eq () where () == () = True instance Eq Bool where True == True = True False == False = True _ == _ = False instance (Eq a, Eq b) => Eq (a, b) where (a1, b1) == (a2, b2) = a1 == a2 && b1 == b2 instance Eq a => Eq [a] where xs == ys = case xs of [] -> case ys of [] -> True _ -> False x:xt -> case ys of [] -> False y:yt -> x == y && xt == yt instance Eq Int where (==) = intEq instance Eq Char where (==) = charEq ($) f x = f x id x = x const x y = x flip f x y = f y x (&) x f = f x class Ord a where (<=) :: a -> a -> Bool x <= y = case compare x y of LT -> True EQ -> True GT -> False compare :: a -> a -> Ordering compare x y = if x <= y then if y <= x then EQ else LT else GT instance Ord Int where (<=) = intLE instance Ord Char where (<=) = charLE data Ordering = LT | GT | EQ deriving (Eq, Show) instance Ord a => Ord [a] where xs <= ys = case xs of [] -> True x:xt -> case ys of [] -> False y:yt -> case compare x y of LT -> True GT -> False EQ -> xt <= yt data Maybe a = Nothing | Just a deriving (Eq, Show) data Either a b = Left a | Right b deriving (Eq, Show) fst (x, y) = x snd (x, y) = y uncurry f (x, y) = f x y first f (x, y) = (f x, y) second f (x, y) = (x, f y) bool a b c = if c then b else a not a = if a then False else True x /= y = not $ x == y (.) f g x = f (g x) (||) f g = if f then True else g (&&) f g = if f then g else False take 0 xs = [] take _ [] = [] take n (h:t) = h : take (n - 1) t drop n xs | n <= 0 = xs drop _ [] = [] drop n (_:xs) = drop (n-1) xs splitAt n xs = (take n xs, drop n xs) maybe n j m = case m of Nothing -> n; Just x -> j x instance Functor Maybe where fmap f = maybe Nothing (Just . f) instance Applicative Maybe where pure = Just ; mf <*> mx = maybe Nothing (\f -> maybe Nothing (Just . f) mx) mf instance Monad Maybe where return = Just ; mf >>= mg = maybe Nothing mg mf instance Alternative Maybe where empty = Nothing ; x <|> y = maybe y Just x foldr c n = \case [] -> n; h:t -> c h $ foldr c n t length :: [a] -> Int length = foldr (\_ n -> n + 1) 0 mapM f = foldr (\a rest -> liftA2 (:) (f a) rest) (pure []) mapM_ f = foldr ((>>) . f) (pure ()) forM = flip mapM sequence = mapM id replicateM = (sequence .) . replicate foldM f z0 xs = foldr (\x k z -> f z x >>= k) pure xs z0 when x y = if x then y else pure () unless x y = if x then pure () else y error = primitiveError undefined = error "undefined" foldr1 c l@(h:t) = maybe undefined id $ foldr (\x m -> Just $ maybe x (c x) m) Nothing l foldl f a bs = foldr (\b g x -> g (f x b)) (\x -> x) bs a foldl1 f (h:t) = foldl f h t elem k xs = foldr (\x t -> x == k || t) False xs find f xs = foldr (\x t -> if f x then Just x else t) Nothing xs (++) = flip (foldr (:)) concat = foldr (++) [] map = flip (foldr . ((:) .)) [] head (h:_) = h tail (_:t) = t xs!!0 = head xs xs!!n = tail xs!!(n - 1) replicate 0 _ = [] replicate n x = x : replicate (n - 1) x repeat x = x : repeat x cycle = concat . repeat null [] = True null _ = False reverse = foldl (flip (:)) [] dropWhile _ [] = [] dropWhile p xs@(x:xt) | p x = dropWhile p xt | True = xs span _ [] = ([], []) span p xs@(x:xt) | p x = first (x:) $ span p xt | True = ([],xs) break p = span (not . p) isSpace c = elem (ord c) [32, 9, 10, 11, 12, 13, 160] words s = case dropWhile isSpace s of "" -> [] s' -> w : words s'' where (w, s'') = break isSpace s' instance Functor [] where fmap = map instance Applicative [] where pure = (:[]); f <*> x = concatMap (<$> x) f instance Monad [] where return = (:[]); (>>=) = flip concatMap instance Alternative [] where empty = [] ; (<|>) = (++) concatMap = (concat .) . map lookup s = foldr (\(k, v) t -> if s == k then Just v else t) Nothing filter f = foldr (\x xs -> if f x then x:xs else xs) [] union xs ys = foldr (\y acc -> (if elem y acc then id else (y:)) acc) xs ys intersect xs ys = filter (\x -> maybe False (\_ -> True) $ find (x ==) ys) xs last (x:xt) = go x xt where go x xt = case xt of [] -> x; y:yt -> go y yt init (x:xt) = case xt of [] -> []; _ -> x : init xt intercalate sep = \case [] -> []; x:xt -> x ++ concatMap (sep ++) xt intersperse sep = \case [] -> []; x:xt -> x : foldr ($) [] (((sep:) .) . (:) <$> xt) all f = foldr (&&) True . map f any f = foldr (||) False . map f and = foldr (&&) True or = foldr (||) False zipWith f xs ys = case xs of [] -> []; x:xt -> case ys of [] -> []; y:yt -> f x y : zipWith f xt yt zip = zipWith (,) unzip [] = ([], []) unzip ((a, b):rest) = (a:at, b:bt) where (at, bt) = unzip rest transpose [] = [] transpose ([] : xss) = transpose xss transpose ((x:xs) : xss) = (x : [h | (h:_) <- xss]) : transpose (xs : [ t | (_:t) <- xss]) data State s a = State (s -> (a, s)) runState (State f) = f instance Functor (State s) where fmap f = \(State h) -> State (first f . h) instance Applicative (State s) where pure a = State (a,) (State f) <*> (State x) = State \s -> let (g, s') = f s in first g $ x s' instance Monad (State s) where return a = State (a,) (State h) >>= f = State $ uncurry (runState . f) . h evalState m s = fst $ runState m s get = State \s -> (s, s) put n = State \s -> ((), n) either l r e = case e of Left x -> l x; Right x -> r x instance Functor (Either a) where fmap f e = either Left (Right . f) e instance Applicative (Either a) where pure = Right ef <*> ex = case ef of Left s -> Left s Right f -> either Left (Right . f) ex instance Monad (Either a) where return = Right ex >>= f = either Left f ex class Alternative f where empty :: f a (<|>) :: f a -> f a -> f a asum = foldr (<|>) empty (*>) = liftA2 \x y -> y (<*) = liftA2 \x y -> x many p = liftA2 (:) p (many p) <|> pure [] some p = liftA2 (:) p (many p) sepBy1 p sep = liftA2 (:) p (many (sep *> p)) sepBy p sep = sepBy1 p sep <|> pure [] between x y p = x *> (p <* y) showParen b f = if b then ('(':) . f . (')':) else f iterate f x = x : iterate f (f x) takeWhile _ [] = [] takeWhile p xs@(x:xt) | p x = x : takeWhile p xt | True = [] divMod a b = (q, a - b*q) where q = div a b a ^ b = case b of 0 -> 1 1 -> a _ -> case r of 0 -> h2 1 -> h2*a where (q, r) = divMod b 2 h = a^q h2 = h*h class Enum a where succ :: a -> a pred :: a -> a toEnum :: Int -> a fromEnum :: a -> Int enumFrom :: a -> [a] enumFromTo :: a -> a -> [a] instance Enum Int where succ = (+1) pred = (+(0-1)) toEnum = id fromEnum = id enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo instance Enum Char where succ = chr . (+1) . ord pred = chr . (+(0-1)) . ord toEnum = chr fromEnum = ord enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo fromIntegral = fromInteger . toInteger class Ring a where (+) :: a -> a -> a (-) :: a -> a -> a (*) :: a -> a -> a fromInteger :: Integer -> a class Integral a where div :: a -> a -> a mod :: a -> a -> a quot :: a -> a -> a rem :: a -> a -> a toInteger :: a -> Integer -- TODO: divMod, quotRem instance Ring Int where (+) = intAdd (-) = intSub (*) = intMul -- TODO: Negative case. fromInteger (Integer xsgn xs) = intFromWord $ fst $ mpView xs instance Integral Int where div = intDiv mod = intMod quot = intQuot rem = intRem toInteger x | 0 <= x = Integer True $ if x == 0 then [] else [wordFromInt x] | True = Integer False [wordFromInt $ 0 - x] instance Ring Word where (+) = wordAdd (-) = wordSub (*) = wordMul -- TODO: Negative case. fromInteger (Integer xsgn xs) = fst $ mpView xs instance Integral Word where div = wordDiv mod = wordMod quot = wordQuot rem = wordRem toInteger x = Integer True $ if x == zeroWord then [] else [x] instance Eq Word where (==) = wordEq instance Ord Word where (<=) = wordLE data Word64 = Word64 Word Word deriving Eq instance Ring Word64 where Word64 a b + Word64 c d = uncurry Word64 $ word64Add a b c d Word64 a b - Word64 c d = uncurry Word64 $ word64Sub a b c d Word64 a b * Word64 c d = uncurry Word64 $ word64Mul a b c d -- TODO: Negative case. fromInteger (Integer xsgn xs) = Word64 x y where (x, xt) = mpView xs (y, _) = mpView xt instance Ord Word64 where Word64 a b <= Word64 c d | b == d = a <= c | True = b <= d -- Multiprecision arithmetic. data Integer = Integer Bool [Word] deriving Eq instance Ring Integer where Integer xsgn xs + Integer ysgn ys | xsgn == ysgn = Integer xsgn $ mpAdd xs ys | True = case mpCompare xs ys of LT -> mpCanon ysgn $ mpSub ys xs _ -> mpCanon xsgn $ mpSub xs ys Integer xsgn xs - Integer ysgn ys | xsgn /= ysgn = Integer xsgn $ mpAdd xs ys | True = case mpCompare xs ys of LT -> mpCanon (not ysgn) $ mpSub ys xs _ -> mpCanon xsgn $ mpSub xs ys Integer xsgn xs * Integer ysgn ys = Integer (xsgn == ysgn) $ mpMul xs ys fromInteger = id instance Integral Integer where -- TODO: Trucate `quot` towards zero. div (Integer xsgn xs) (Integer ysgn ys) = mpCanon0 (xsgn == ysgn) $ fst $ mpDivMod xs ys mod (Integer xsgn xs) (Integer ysgn ys) = mpCanon0 ysgn $ snd $ mpDivMod xs ys quot (Integer xsgn xs) (Integer ysgn ys) = mpCanon0 (xsgn == ysgn) $ fst $ mpDivMod xs ys rem (Integer xsgn xs) (Integer ysgn ys) = mpCanon0 ysgn $ snd $ mpDivMod xs ys toInteger = id instance Ord Integer where compare (Integer xsgn xs) (Integer ysgn ys) | xsgn = if ysgn then mpCompare xs ys else GT | True = if ysgn then LT else mpCompare ys xs instance Enum Integer where succ = (+ Integer True [oneWord]) pred = (+ Integer False [oneWord]) toEnum = toInteger fromEnum = fromInteger enumFrom = iterate succ enumFromTo lo hi = takeWhile (<= hi) $ enumFrom lo zeroWord = wordFromInt $ ord '\0' oneWord = wordFromInt 1 mpView [] = (zeroWord, []) mpView (x:xt) = (x, xt) mpCanon sgn xs = mpCanon0 sgn $ reverse $ dropWhile (zeroWord ==) $ reverse xs mpCanon0 sgn xs = case xs of [] -> Integer True [] _ -> Integer sgn xs mpCompare [] [] = EQ mpCompare [] _ = LT mpCompare _ [] = GT mpCompare (x:xt) (y:yt) = case mpCompare xt yt of EQ -> compare x y o -> o mpAdc [] [] c = ([], c) mpAdc xs ys c = first (lo:) $ mpAdc xt yt hi where (x, xt) = mpView xs (y, yt) = mpView ys (lo,hi) = uncurry (word64Add c zeroWord) $ word64Add x zeroWord y zeroWord mpAdd xs ys | c == zeroWord = zs | True = zs ++ [c] where (zs, c) = mpAdc xs ys zeroWord mpSub xs ys = fst $ mpSbb xs ys zeroWord mpSbb xs ys b = go xs ys b where go [] [] b = ([], b) go xs ys b = first (lo:) $ go xt yt $ oneWord - hi where (x, xt) = mpView xs (y, yt) = mpView ys (lo,hi) = uncurry word64Sub (word64Sub x oneWord y zeroWord) b zeroWord mpMulWord _ [] c = if c == zeroWord then [] else [c] mpMulWord x (y:yt) c = lo:mpMulWord x yt hi where (lo, hi) = uncurry (word64Add c zeroWord) $ word64Mul x zeroWord y zeroWord mpMul [] _ = [] mpMul (x:xt) ys = case mpMulWord x ys zeroWord of [] -> [] z:zs -> z:mpAdd zs (mpMul xt ys) mpDivModWord xs y = first (reverse . dropWhile (zeroWord ==)) $ go zeroWord $ reverse xs where go r [] = ([], r) go n (x:xt) = first (q:) $ go r xt where q = fst $ word64Div x n y zeroWord r = fst $ word64Mod x n y zeroWord mpDivMod xs ys = first (reverse . dropWhile (== zeroWord)) $ go us where s = mpDivScale $ last ys us = mpMulWord s (xs ++ [zeroWord]) zeroWord vs = mpMulWord s ys zeroWord (v1:vt) = reverse vs vlen = length vs go us | ulen <= vlen = ([], fst $ mpDivModWord us s) | True = first (q:) $ go $ lsbs ++ init ds where ulen = length us (u0:u1:ut) = reverse us (lsbs, msbs) = splitAt (ulen - vlen - 1) us (ql, qh) = word64Div u1 u0 v1 zeroWord q0 = if oneWord <= qh then (zeroWord-oneWord) else ql (q, ds) = foldr const undefined [(q, ds) | q <- iterate (- oneWord) q0, let (ds, bor) = mpSbb msbs (mpMulWord q vs zeroWord) zeroWord, bor == zeroWord] mpDivScale n = fst $ word64Div zeroWord oneWord (n + oneWord) zeroWord mpBase _ [] = ('0':) mpBase b xs = go xs where go [] = id go xs = go q . shows r where (q, r) = mpDivModWord xs b instance Show Integer where showsPrec _ (Integer xsgn xs) = (if xsgn then id else ('-':)) . mpBase (wordFromInt 10) xs instance (Ord a, Ord b) => Ord (a, b) where (a1, b1) <= (a2, b2) = a1 <= a2 && (not (a2 <= a1) || b1 <= b2) a < b = a <= b && a /= b a > b = b <= a && a /= b (>=) = flip(<=) instance Applicative IO where pure = ioPure ; (<*>) f x = ioBind f \g -> ioBind x \y -> ioPure (g y) instance Monad IO where return = ioPure ; (>>=) = ioBind instance Functor IO where fmap f x = ioPure f <*> x class Show a where showsPrec :: Int -> a -> String -> String showsPrec _ x = (show x++) show :: a -> String show x = shows x "" showList :: [a] -> String -> String showList = showList__ shows shows = showsPrec 0 showList__ _ [] s = "[]" ++ s showList__ showx (x:xs) s = '[' : showx x (showl xs) where showl [] = ']' : s showl (y:ys) = ',' : showx y (showl ys) showInt__ n | 0 == n = id | True = showInt__ (n`div`10) . (chr (48+n`mod`10):) instance Show () where show () = "()" instance Show Bool where show True = "True" show False = "False" instance Show a => Show [a] where showsPrec _ = showList instance Show Int where showsPrec _ n | 0 == n = ('0':) | 1 <= n = showInt__ n | True = ('-':) . showInt__ (0 - n) -- Fails for INT_MIN. showWord_ n | zeroWord == n = id | True = showWord_ (n`div`wordFromInt 10) . (chr (48+(intFromWord $ n`mod`wordFromInt 10)):) instance Show Word where showsPrec _ n | zeroWord == n = ('0':) | True = showWord_ n showLitChar__ '\n' = ("\\n"++) showLitChar__ '\\' = ("\\\\"++) showLitChar__ c = (c:) instance Show Char where showsPrec _ '\'' = ("'\\''"++) showsPrec _ c = ('\'':) . showLitChar__ c . ('\'':) showList s = ('"':) . foldr (.) id (map go s) . ('"':) where go '"' = ("\\\""++) go c = showLitChar__ c instance (Show a, Show b) => Show (a, b) where showsPrec _ (a, b) = showParen True $ shows a . (',':) . shows b integerSignList (Integer xsgn xs) f = f xsgn xs unwords [] = "" unwords ws = foldr1 (\w s -> w ++ ' ':s) ws unlines = concatMap (++"\n") abs x = if 0 <= x then x else 0 - x otherwise = True sum = foldr (+) 0 product = foldr (*) 1
Wasm to Haskell?
It’d be nice to write wasm ourselves, but for expedience, we rely on Clang to compile our runtime system to a wasm binary which we manipulate. WebAssembly turns out to be pleasantly malleable. A binary breaks up into independent sections, and we can add, delete, or modify sections before stitching them together again.
We write a tool that just does enough wasm parsing to print the sections of a wasm file we want in the form of a Haskell module, and run this on the output of crossly warts to create WartsBytes.hs.
module Main where import Base import System data Charser a = Charser { getCharser :: String -> Either String (a, String) } instance Functor Charser where fmap f (Charser x) = Charser $ fmap (first f) . x instance Applicative Charser where pure a = Charser $ \s -> Right (a, s) f <*> x = Charser \inp -> do (fun, t) <- getCharser f inp (arg, u) <- getCharser x t pure (fun arg, u) instance Monad Charser where Charser f >>= g = Charser $ (good =<<) . f where good (r, t) = getCharser (g r) t return = pure bad :: String -> Charser a bad = Charser . const . Left headerAndVersion :: String headerAndVersion = "\0asm\x1\0\0\0" eof :: Charser Bool eof = Charser \s -> Right (null s, s) next :: Charser Int next = Charser \case [] -> Left "unexpected EOF" h:t -> Right (ord h, t) sat f = Charser \case h:t | f h -> Right (h, t) _ -> Left "unsat" remainder :: Charser String remainder = Charser \s -> Right (s, "") varuint7 = next varuint32 = varuint varuint :: Charser Int varuint = unleb 1 0 -- varuint = fromIntegral <$> unleb 1 0 -- unleb :: Integer -> Integer -> Charser Integer unleb m acc = do -- d <- fromIntegral <$> next d <- next if d > 127 then unleb (m * 128) $ (d - 128) * m + acc else pure $ d*m + acc sections = eof >>= \b -> if b then pure [] else do n <- varuint7 s <- vec (chr <$> next) ((n, s):) <$> sections wasm = do s <- replicateM 8 (chr <$> next) if s /= headerAndVersion then bad "bad header or version" else sections hexDigit n | n < 10 = chr $ n + ord '0' | True = chr $ n - 10 + ord 'a' xxd = \case "" -> "" h:t -> let n = ord h in hexDigit (div n 16) : hexDigit (mod n 16) : xxd t replicateM = (mapM id .) . replicate vec f = varuint >>= (`replicateM` f) search00type xs = do fts <- maybe (Left "missing section 1") Right $ lookup 1 xs ios <- fst <$> getCharser go fts maybe (Left "missing (0, 0) functype") Right $ lookup (0, 0) $ zip ios [0..] where go = vec $ do sat (== '\x60') inCount <- varuint replicateM inCount next outCount <- varuint replicateM outCount next pure (inCount, outCount) searchExport needle xs = do exs <- maybe (Left "missing section 7") Right $ lookup 7 xs maybe (Left "not found") Right =<< asum . fst <$> getCharser go exs where go = vec $ do s <- vec $ chr <$> next next n <- varuint pure $ if s == "reduce" then Just n else Nothing allFunCount xs = do impCount <- maybe (Right 0) countImps $ lookup 2 xs funCount <- maybe (Right 0) countFuns $ lookup 3 xs pure $ impCount + funCount where countImps imps = length . fst <$> getCharser goImps imps goImps = vec $ do vec next vec next sat (== '\0') varuint pure () countFuns funs = fst <$> getCharser varuint funs main = do s <- getContents case getCharser wasm s of Left e -> putStrLn $ "parse error: " ++ e Right (xs, []) -> do putStr "module WartsBytes where\nimport Base\nwartsBytes = " print $ second xxd <$> filter (not . (`elem` [0, 6]) . fst) xs putStrLn $ either error (("allFunCount = "++) . show) $ allFunCount xs putStrLn $ either error (("funType00Idx = "++) . show) $ search00type xs putStrLn $ either error (("reduceFunIdx = "++) . show) $ searchExport "reduce" xs _ -> error "unreachable"
The RTS includes generate code that wraps foreign imports. It can only be used with programs that declare the same foreign imports.
Webby
We build a compiler that goes directly from Haskell to wasm. The code does little more than dumping the runtime system wasm binary along with bytes describing the initial contents of the heap.
For now we look for the main function in the Main module and export it as the go function; export declarations are ignored.
-- In-browser compiler. module Main where import Base import Ast import Map import Typer import RTS import System import WartsBytes data StrLen = StrLen { _str :: String -> String, _len :: Int } instance Monoid StrLen where mempty = StrLen id 0 (StrLen s1 n1) <> (StrLen s2 n2) = StrLen (s1 . s2) (n1 + n2) hexValue d | d <= '9' = ord d - ord '0' | d <= 'F' = 10 + ord d - ord 'A' | d <= 'f' = 10 + ord d - ord 'a' unxxd s = StrLen (go s) (length s `div` 2) where go s = case s of [] -> id (d1:d0:rest) -> ((chr $ hexValue d1 * 16 + hexValue d0):) . go rest main = interact toWasm toWasm s = case untangle s of Left err -> err Right mods -> let ffis = foldr (\(k, v) m -> insertWith (error $ "duplicate import: " ++ k) k v m) Tip $ concatMap (toAscList . ffiImports) $ elems mods ffes1 = foldr (\(expName, v) m -> insertWith (error $ "duplicate export: " ++ expName) expName v m) Tip [ (expName, addr) | (modName, neat) <- toAscList mods , (expName, ourName) <- toAscList $ ffiExports neat , let addr = maybe (error $ "missing: " ++ ourName) id $ mlookup ourName $ bigmap ! modName ] -- Assume they have type IO (). mainExport = case mlookup "main" ffes1 of Nothing -> maybe [] (:[]) $ do mod <- mlookup "Main" bigmap addr <- mlookup "main" mod pure ("main", addr) _ -> [] ffes = mainExport ++ toAscList ffes1 (bigmap, mem) = codegen ffis mods go (n, x) -- Function section: for each export, declare a function of type () -> ().. | n == 3 = leb n <> extendSection x (replicate (length ffes) $ unxxd "01") -- Export section: add each export. | n == 7 = leb n <> extendSection x (zipWith encodeExport (fst <$> ffes) [allFunCount..]) -- Code section: for nth export, define a function calling rts_reduce([512 + 4*n]) | n == 10 = leb n <> extendSection x (callRoot <$> [0..length ffes - 1]) | True = leb n <> leb (_len s) <> s where s = unxxd x roots = encodeData rootBase $ littleEndian $ (snd <$> ffes) ++ [0] prog = encodeData heapBase $ littleEndian $ length mem : mem wasm = _str (unxxd "0061736d01000000" <> mconcat (map go wartsBytes) -- Data section: -- 512 : null-terminated roots array -- 1048576 - 4: hp -- 1048576: initial heap contents <> leb 11 <> extendSection "00" [roots, prog]) "" in wasm extendSection x xs = encodeSection (k + length xs) $ unxxd s <> mconcat xs where (k, s) = splitLeb x splitLeb x = go x 1 0 where go (d1:d0:t) m acc | n < 128 = (acc + n*m, t) | True = go t (m*128) $ acc + (n - 128)*m where n = hexValue d1 * 16 + hexValue d0 encodeExport s n = encodeString s <> unxxd "00" <> leb n encodeString s = let n = length s in leb n <> StrLen (s++) n littleEndian ns = StrLen (foldr (.) id $ go 4 <$> ns) $ 4 * length ns where go k n | k == 0 = id | True = (chr (n `mod` 256):) . go (k - 1) (n `div` 256) rootBase = unxxd "004180040b" -- sleb 512 = 8004 heapBase = unxxd "0041fcff3f0b" -- sleb (1048576 - 4) = fcff3f -- 0 locals. -- i32.const 0; i32.load 512 + 4*n; call $REDUCE; end; callRoot n = leb (_len s) <> s where s = unxxd "0041002802" <> slebPos (512 + 4*n) <> unxxd "10" <> leb reduceFunIdx <> unxxd "0b" encodeData addr s = addr <> leb (_len s) <> s encodeSection k s = let n = leb k in leb (_len n + _len s) <> n <> s leb n | n <= 127 = StrLen (chr n:) 1 | True = StrLen (chr (128 + n `mod` 128):) 1 <> leb (n `div` 128) slebPos n | n <= 63 = StrLen (chr n:) 1 | True = StrLen (chr (128 + n `mod` 128):) 1 <> slebPos (n `div` 128)
webby.wasm
To build a browser-based compiler, we essentially run the previous compiler on itself, thus producing a wasm binary that can translate Haskell directly into wasm.
One change is needed: we swap System1.hs for SystemWasm.hs. The Linux version declares foreign imports for those in our runtime system for Linux. The wasm version declares foreign imports for functions provided by the host environment.
-- For wasm environments providing Haskell-like getchar, putchar, eof. module System where import Base foreign import ccall "putchar" putChar :: Char -> IO () foreign import ccall "getchar" getChar :: IO Char foreign import ccall "eof" isEOFInt :: IO Int isEOF = (0 /=) <$> isEOFInt putStr = mapM_ putChar putStrLn = (>> putChar '\n') . putStr print = putStrLn . show getContents = isEOF >>= \b -> if b then pure [] else getChar >>= \c -> (c:) <$> getContents interact f = getContents >>= putStr . f
Ben Lynn blynn@cs.stanford.edu 💡