Background
Booster currently uses [(Term, Term)] (a Haskell linked list) as the key-value pair storage for KMapF:
data TermF t = ... | KMapF KMapDefinition [(t, t)] (Maybe t) | ...
This design has several performance bottlenecks when maps grow in size. This issue provides a detailed analysis of the root causes and proposes four possible approaches for discussion.
Problem Analysis
1. Core operations have excessive complexity
All key-based lookup operations use O(n) linear scans — lookup, findIndex, partition, and elem on the pairs list:
|
mapLookupHook :: BuiltinFunction |
|
mapLookupHook args |
|
| [KMap _ pairs _mbRest, key] <- args = |
|
-- if the key is not found, return Nothing (no result), |
|
-- regardless of whether the key _could_ still be there. |
|
pure $ lookup key pairs |
|
| [_other, _] <- args = |
|
pure Nothing -- not an internalised map, maybe a function call |
|
| otherwise = |
|
arityError "MAP.lookup" 2 args |
|
mapUpdateHook :: BuiltinFunction |
|
mapUpdateHook args |
|
| [KMap def pairs mbRest, key, newValue] <- args = do |
|
case findIndex ((== key) . fst) pairs of |
|
Just idx -> |
|
-- key was found (syntactically), update pairs list |
|
let newPairs = take idx pairs <> ((key, newValue) : drop (idx + 1) pairs) |
|
in pure $ Just $ KMap def newPairs mbRest |
|
Nothing -- key could be in unevaluated or opaque part |
|
| Just _ <- mbRest -> |
|
pure Nothing -- have opaque part, no result |
|
| any (not . isConstructorLike_ . fst) pairs -> |
|
pure Nothing -- have unevaluated keys, no result |
|
| not $ isConstructorLike_ key -> |
|
pure Nothing -- unevaluated update key, no result |
|
| otherwise -> -- key certain to be absent, no rest: add pair |
|
pure $ Just $ KMap def ((key, newValue) : pairs) Nothing |
|
| [_other, _, _] <- args = |
|
pure Nothing -- not an internalised map, maybe a function call |
|
| otherwise = |
|
arityError "MAP.update" 3 args |
|
mapRemoveHook :: BuiltinFunction |
|
mapRemoveHook args |
|
| [m@(KMap def pairs mbRest), key] <- args = do |
|
case findIndex ((== key) . fst) pairs of |
|
Just idx -> |
|
-- key was found (syntactically), remove it |
|
let newPairs = take idx pairs <> drop (idx + 1) pairs |
|
in pure $ Just $ KMap def newPairs mbRest |
|
Nothing -- key could be in unevaluated or opaque part |
|
| Just _ <- mbRest -> |
|
pure Nothing -- have opaque part, no result |
|
| any (not . isConstructorLike_ . fst) pairs -> |
|
pure Nothing -- have unevaluated keys, no result |
|
| not $ isConstructorLike_ key -> |
|
pure Nothing -- remove key unevaluated, no result |
|
| otherwise -> -- key certain to be absent, no rest: map unchanged |
|
pure $ Just m |
|
| [_other, _] <- args = |
|
pure Nothing -- not an internalised map, maybe a function call |
|
| otherwise = |
|
arityError "MAP.remove" 2 args |
|
mapInKeysHook :: BuiltinFunction |
|
mapInKeysHook args |
|
| [key, KMap _ pairs mbRest] <- args = do |
|
-- only consider evaluated keys, return Nothing if any unevaluated ones are present |
|
let (eval'edKeys, uneval'edKeys) = |
|
partition isConstructorLike_ (map fst pairs) |
|
case (key `elem` eval'edKeys, key `elem` uneval'edKeys) of |
|
(True, _) -> |
|
-- constructor-like (evaluated) key is present |
|
pure $ Just $ boolTerm True |
|
(False, True) -> |
|
-- syntactically-equal unevaluated key is present |
|
pure $ Just $ boolTerm True |
|
(False, False) |
|
| Nothing <- mbRest -- no opaque rest |
|
, isConstructorLike_ key -- key to search is evaluated |
|
, null uneval'edKeys -> -- no keys unevaluated |
|
pure $ Just $ boolTerm False |
|
| otherwise -> -- key could be present once evaluated |
|
pure Nothing |
|
| [_, _other] <- args = do |
|
-- other `shouldHaveSort` "SortMap" |
|
pure Nothing -- not an internalised map, maybe a function call |
|
| otherwise = |
|
arityError "MAP.in_keys" 2 args |
2. Smart constructor unconditionally sorts and deduplicates
Every construction of a KMap via the smart constructor calls sortAndDeduplicate — O(n log n) with expensive Ord. This is triggered after every substitution and evaluation, even when keys have not changed:
|
pattern KMap :: KMapDefinition -> [(Term, Term)] -> Maybe Term -> Term |
|
pattern KMap def keyVals rest <- Term _ (KMapF def keyVals rest) |
|
where |
|
KMap def keyVals rest = |
|
let argAttributes = |
|
case (keyVals, rest) of |
|
([], Nothing) -> mempty |
|
([], Just s) -> getAttributes s |
|
(_ : _, Nothing) -> foldl1' (<>) $ concatMap (\(k, v) -> [getAttributes k, getAttributes v]) keyVals |
|
(_ : _, Just r) -> |
|
foldl' (<>) (getAttributes r) $ concatMap (\(k, v) -> [getAttributes k, getAttributes v]) keyVals |
|
(keyVals', rest') = case rest of |
|
Just (KMap def' kvs r) | def' == def -> (kvs, r) |
|
r -> ([], r) |
|
newKeyVals = sortAndDeduplicate $ keyVals ++ keyVals' |
|
newRest = rest' |
|
in case (newKeyVals, newRest) of |
|
([], Just r) -> r |
|
_ -> |
|
Term |
|
argAttributes |
|
{ hash = |
|
Hashable.hash |
|
( "KMap" :: ByteString |
|
, def |
|
, map (\(k, v) -> (hash $ getAttributes k, hash $ getAttributes v)) newKeyVals |
|
, hash . getAttributes <$> newRest |
|
) |
|
} |
|
$ KMapF def newKeyVals newRest |
|
sortAndDeduplicate :: Ord a => [a] -> [a] |
|
sortAndDeduplicate = Set.toAscList . Set.fromList |
3. Ord Term is not optimized
Ord Term is auto-derived, comparing TermAttributes fields in declaration order. The first field compared is variables :: Set Variable (O(k) cost), while the cached hash :: Int (O(1)) is only the third field. In contrast, Eq Term is already manually optimized with hash-first comparison:
|
data TermAttributes = TermAttributes |
|
{ variables :: !(Set Variable) |
|
, isEvaluated :: !Bool |
|
-- ^ false for function calls, true for |
|
-- variables, recursive through AndTerm |
|
, hash :: !Int |
|
, isConstructorLike :: !Bool |
|
-- ^ Means that logic equality is the same as syntactic equality. |
|
-- True for domain values and constructor symbols (recursive |
|
-- through arg.s), recursive through AndTerm. |
|
, canBeEvaluated :: !Bool |
|
-- ^ false for function calls, variables, and AndTerms |
|
} |
|
deriving stock (Eq, Ord, Show, Generic, Data, Lift) |
|
deriving anyclass (NFData, Hashable) |
|
instance Eq Term where |
|
Term TermAttributes{hash = hash1} t1f == Term TermAttributes{hash = hash2} t2f = |
|
hash1 == hash2 && t1f == t2f -- compare directly to cater for collisions |
4. Matching repeatedly creates temporary data structure conversions
matchMaps converts [(Term, Term)] to a temporary Data.Map for intersection/difference on every match. The subject map is converted and discarded each time — pure O(n log n) overhead:
|
let patternKeyVals = map (first (Substitution.substituteInTerm st.mSubstitution)) patKeyVals |
|
-- check for duplicate keys |
|
checkDuplicateKeys patternKeyVals patRest |
|
checkDuplicateKeys subjKeyVals subjRest |
|
|
|
let patMap = Map.fromList patternKeyVals |
|
subjMap = Map.fromList subjKeyVals |
|
-- handles syntactically identical keys in pattern and subject |
|
commonMap = Map.intersectionWith (,) patMap subjMap |
|
patExtra = patMap `Map.withoutKeys` Map.keysSet commonMap |
|
subjExtra = subjMap `Map.withoutKeys` Map.keysSet commonMap |
|
|
|
rest <- |
|
matchRemainderMaps |
|
(toRemainderMap (Map.toList patExtra) patRest) |
|
(toRemainderMap (Map.toList subjExtra) subjRest) |
|
enqueueRegularProblems $ (Seq.fromList $ Map.elems commonMap) >< rest |
5. No distinction between concrete and symbolic keys
Builtin operations and matching frequently check isConstructorLike_ and call partition, indicating concrete and symbolic keys are semantically different. The current storage does not reflect this, requiring runtime checks every time:
|
toRemainderMap :: [(Term, Term)] -> Maybe Term -> RemainderMap |
|
toRemainderMap kvs rest = |
|
let (conc, sym) = partitionConcreteKeys kvs |
|
in foldr |
|
(uncurry ConstructorKey) |
|
(Rest $ foldr (uncurry OtherKey) (maybe EmptyMap Remainder rest) sym) |
|
conc |
|
where |
|
partitionConcreteKeys :: [(Term, Term)] -> ([(Term, Term)], [(Term, Term)]) |
|
partitionConcreteKeys = partition (\(Term attrs _, _) -> attrs.isConstructorLike) |
Proposed Approaches
Approach A: Optimize Ord Term + avoid unnecessary sorting
Keep [(t, t)] unchanged, make two optimizations:
A1. Hand-write Ord Term with hash-first comparison:
instance Ord Term where
compare (Term a1 t1) (Term a2 t2) =
compare a1.hash a2.hash <> compare t1 t2
A2. After substitution/evaluation, detect whether keys have changed; skip sortAndDeduplicate if unchanged.
| Pros |
Cons |
| Smallest change, lowest risk |
Single key operations remain O(n) linear scan |
Does not affect TermF's Functor/Foldable/Traversable |
Matching still requires temporary Map.fromList conversion |
| Beneficial regardless of which other approach is adopted |
Does not change asymptotic complexity, only optimizes constant factors |
Approach B: Data.Map Term Term + optimized Ord Term
Replace [(t, t)] with Data.Map Term Term, with hand-written Ord Term (hash-first).
| Pros |
Cons |
| Single key operations drop from O(n) to O(log n) |
Requires hand-written Functor/Foldable/Traversable for TermF |
| Matching needs no temporary conversion |
When keys change, still requires O(n log n) rebuild |
| Built-in sort and dedup, smart constructor simplifies |
Does not distinguish concrete/symbolic keys |
| Output determinism (maintains sorted order) |
Relies on Ord Term, still slower than O(1) hash even when optimized |
intersection/difference and other set operations directly available |
|
Regarding the TermF Functor constraint: currently [(t,t)] allows derived fmap to automatically apply to both keys and values, while Data.Map's fmap only applies to values. After investigation, only 3 functions implicitly rely on the derived instance to process KMapF keys: modifyVariablesInT, checkTermSymbols, and getCell — none are on performance hot paths (substitution, matching, and evaluation all manually pattern match). Hand-writing Functor/Foldable/Traversable instances resolves this.
Approach C: Data.HashMap Term Term
Leverage Term's existing cached hash (Hashable Term is O(1)), replacing [(t, t)] with Data.HashMap Term Term.
| Pros |
Cons |
| Single key operations O(1) amortized |
Output non-determinism — HashMap.toList order is unstable, affecting test golden files, serialization consistency, debugging |
| Construction O(n), no sorting needed |
No efficient isSubmapOfBy |
| Fully utilizes cached hash |
Does not distinguish concrete/symbolic keys |
| Rebuild after substitution/evaluation is O(n) |
Externalise needs sorting for deterministic output → additional O(n log n) |
Requires hand-written Functor/Foldable/Traversable for TermF |
Worst case O(n) on hash collision |
Approach D: Separate concrete / symbolic storage
Following the pattern of the old kore backend's NormalizedAc, separate key-value pairs by key type:
KMapF KMapDefinition
(HashMap Term Term) -- concrete keys (constructorLike = True, no variables)
[(Term, Term)] -- symbolic keys (contain variables or function calls, typically 0-2)
(Maybe Term) -- opaque rest
Reference implementation in the old kore backend:
|
data NormalizedAc (collection :: Type -> Type -> Type) key child = NormalizedAc |
|
{ elementsWithVariables :: [Element collection child] |
|
-- ^ Non-concrete elements of the structure. |
|
-- These would be of sorts @(Int, String)@ for a map from @Int@ to @String@. |
|
, concreteElements :: HashMap key (Value collection child) |
|
-- ^ Concrete elements of the structure. |
|
-- These would be of sorts @(Int, String)@ for a map from @Int@ to @String@. |
|
, opaque :: [child] |
|
-- ^ Unoptimized (i.e. non-element) parts of the structure. |
|
} |
| Pros |
Cons |
| Concrete key lookup O(1) |
Largest amount of changes |
| Substitution can skip concrete keys entirely (no variables) |
Requires hand-written Functor/Foldable/Traversable for TermF |
| Matching needs no temporary conversion or partition |
Every operation must handle both parts separately |
isConstructorLike_ runtime checks eliminated |
After substitution, symbolic keys may become concrete, requiring reclassification |
Production-proven in kore's NormalizedAc |
Need to ensure consistent criteria for concrete key determination |
| Perfectly matches the actual concrete/symbolic distinction in K semantics |
Externalise needs sorting concrete keys for deterministic output |
Full Pipeline Comparison Summary
n = map size, p = pattern pairs (typically 1-3), n_s = symbolic pairs (typically 0-2).
Common case: subject map has all concrete keys, pattern has a few symbolic keys.
| Stage |
Current |
A (Ord opt) |
B (Map) |
C (HashMap) |
D (Separate) |
| Internalise |
O(n log n) expensive Ord |
O(n log n) fast Ord |
O(n log n) fast Ord |
O(n) |
O(n) |
| Matching: conversion |
O(n log n) expensive Ord |
O(n log n) fast Ord |
O(0) |
O(0) |
O(0) |
| Matching: intersect/diff |
O(p + n) |
O(p + n) |
O(p + n) |
O(min(p,n)) |
O(p) |
| Matching: partition |
O(p + n) |
O(p + n) |
O(p + n) |
O(p + n) |
O(0) |
| Substitution: iterate |
O(n) all |
O(n) all |
O(n) all |
O(n) all |
O(n) value + O(n_s) key |
| Substitution: rebuild |
O(n log n) unconditional |
O(0) common case |
O(n) map value |
O(n) map value |
O(n_s) |
| Evaluation: iterate |
O(n) key+value |
O(n) key+value |
O(n) key+value |
O(n) key+value |
O(n) value only |
| Evaluation: rebuild |
O(n log n) unconditional |
O(0) common case |
O(n) map value |
O(n) map value |
O(0) key unchanged |
| Single key lookup |
O(n) |
O(n) |
O(log n) |
O(1) |
O(1) |
| Output determinism |
Yes |
Yes |
Yes |
No (needs extra sort) |
Needs sort for concrete keys |
| Per-rule total |
~O(n log n) × 3 expensive Ord |
~O(n log n) × 2 fast Ord |
~O(n) × 3 |
~O(n) × 3 |
~O(n) + O(p) |
Notes
- KSet has similar issues (
[t] storage + sortAndDeduplicate on every construction) and could benefit from the same optimization approach.
- KList using
[t] is reasonable (ordered, no dedup needed), though LIST.get and other random access operations could benefit from Data.Seq if needed.
- Approach A's
Ord Term optimization is beneficial regardless of which other approach is adopted.
Background
Booster currently uses
[(Term, Term)](a Haskell linked list) as the key-value pair storage forKMapF:This design has several performance bottlenecks when maps grow in size. This issue provides a detailed analysis of the root causes and proposes four possible approaches for discussion.
Problem Analysis
1. Core operations have excessive complexity
All key-based lookup operations use O(n) linear scans —
lookup,findIndex,partition, andelemon the pairs list:haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 150 to 159 in c7e22cb
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 48 to 68 in c7e22cb
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 118 to 138 in c7e22cb
haskell-backend/booster/library/Booster/Builtin/MAP.hs
Lines 182 to 206 in c7e22cb
2. Smart constructor unconditionally sorts and deduplicates
Every construction of a
KMapvia the smart constructor callssortAndDeduplicate— O(n log n) with expensiveOrd. This is triggered after every substitution and evaluation, even when keys have not changed:haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 580 to 609 in c7e22cb
haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 168 to 169 in c7e22cb
3.
Ord Termis not optimizedOrd Termis auto-derived, comparingTermAttributesfields in declaration order. The first field compared isvariables :: Set Variable(O(k) cost), while the cachedhash :: Int(O(1)) is only the third field. In contrast,Eq Termis already manually optimized with hash-first comparison:haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 121 to 135 in c7e22cb
haskell-backend/booster/library/Booster/Pattern/Base.hs
Lines 155 to 157 in c7e22cb
4. Matching repeatedly creates temporary data structure conversions
matchMapsconverts[(Term, Term)]to a temporaryData.Mapfor intersection/difference on every match. The subject map is converted and discarded each time — pure O(n log n) overhead:haskell-backend/booster/library/Booster/Pattern/Match.hs
Lines 699 to 715 in c7e22cb
5. No distinction between concrete and symbolic keys
Builtin operations and matching frequently check
isConstructorLike_and callpartition, indicating concrete and symbolic keys are semantically different. The current storage does not reflect this, requiring runtime checks every time:haskell-backend/booster/library/Booster/Pattern/Match.hs
Lines 626 to 635 in c7e22cb
Proposed Approaches
Approach A: Optimize
Ord Term+ avoid unnecessary sortingKeep
[(t, t)]unchanged, make two optimizations:A1. Hand-write
Ord Termwith hash-first comparison:A2. After substitution/evaluation, detect whether keys have changed; skip
sortAndDeduplicateif unchanged.TermF's Functor/Foldable/TraversableMap.fromListconversionApproach B:
Data.Map Term Term+ optimizedOrd TermReplace
[(t, t)]withData.Map Term Term, with hand-writtenOrd Term(hash-first).TermFOrd Term, still slower than O(1) hash even when optimizedintersection/differenceand other set operations directly availableRegarding the
TermFFunctor constraint: currently[(t,t)]allows derivedfmapto automatically apply to both keys and values, whileData.Map'sfmaponly applies to values. After investigation, only 3 functions implicitly rely on the derived instance to process KMapF keys:modifyVariablesInT,checkTermSymbols, andgetCell— none are on performance hot paths (substitution, matching, and evaluation all manually pattern match). Hand-writing Functor/Foldable/Traversable instances resolves this.Approach C:
Data.HashMap Term TermLeverage
Term's existing cached hash (Hashable Termis O(1)), replacing[(t, t)]withData.HashMap Term Term.HashMap.toListorder is unstable, affecting test golden files, serialization consistency, debuggingisSubmapOfByTermFApproach D: Separate concrete / symbolic storage
Following the pattern of the old kore backend's
NormalizedAc, separate key-value pairs by key type:Reference implementation in the old kore backend:
haskell-backend/kore/src/Kore/Internal/NormalizedAc.hs
Lines 169 to 178 in c7e22cb
TermFisConstructorLike_runtime checks eliminatedNormalizedAcFull Pipeline Comparison Summary
n = map size, p = pattern pairs (typically 1-3), n_s = symbolic pairs (typically 0-2).
Common case: subject map has all concrete keys, pattern has a few symbolic keys.
Notes
[t]storage +sortAndDeduplicateon every construction) and could benefit from the same optimization approach.[t]is reasonable (ordered, no dedup needed), thoughLIST.getand other random access operations could benefit fromData.Seqif needed.Ord Termoptimization is beneficial regardless of which other approach is adopted.