Egglog

date
reference
    'Better Together: Unifying Datalog and Equality Saturation'
tags#paper, #resource

Datalog

% Facts
parent(alice, bob).
parent(bob, charlie).

% Rule
grandparent(X, Z) :- parent(X, Y), parent(Y, Z).

% Query
?- grandparent(alice, charlie).

Advanced Features

  1. efficient incremental execution
    • recompute results when the underlying data changes, without starting from scratch.
    • When facts are added or removed, the system tracks which derived facts are affected and only recomputes those portions of the analysis.
  2. cooperating analyses
    • allows multiple different analyses to share information and mutually refine each other’s results within the same Datalog program.
  3. lattice-based reasoning
    • lattice: a partially ordered set in which every pair of elements has a unique supremum and a unique infimum
    • Instead of facts being simply present or absent, they can have values from a partially ordered set where:
      • There’s a “bottom” element (least precise)
      • There’s a “top” element (most precise)
      • Values can be compared and combined using lattice operations (join/meet)

Equality saturation (EqSat)

Core concept: EqSat works by

  1. representing programs as e-graphs
  2. saturating the e-graph by repeatedly applying rewrite rules until no new equalities can be found
  3. extracting the best equivalent program according to some cost function

Benefits:

Applications:

Example: consider optimizing (x + 1) * (x + 1)

  1. saturation discovers:
    • original: (x + 1) * (x + 1)
    • commutativity: (1 + x) * (x + 1)
    • CSE: let t = x + 1 in t * t
    • power: (x + 1)²
  2. extraction chooses the cheapest form based on target architecture
    • maybe CSE form if multiplication is expensive

Phase-ordering problem in traditional term rewriting:

E-graph (equivalence graph)

Basic structure:

Properties:

Operations:

Exponential compression:

Efficient queries:

vs. other representations

Applications: e-graphs appears in

Fixpoint reasoning framework

: a computational approach for solving problems where you repeatedly apply rules or transformations until you reach a stable state (the fixpoint) where no further changes occur

Egglog

(relation edge (i64 i64))
(relation path (i64 i64))
(rule ((edge x y))
((path xy)))
(rule ((path x y)(edge y z))
((path x z)))
(edge 1 2)
(edge 2 3)
(edge 3 4)
(run)
(check (path 1 4));; succeeds
(function edge (i64 i64) i64)
(function path (i64 i64) i64 :merge (min old new))
(rule ((= (edge x y) len))
((set (path x y)len)))
(rule ((=(path x y)xy)(=(edge y z)yz))
((set (path x z)(+xy yz))))
(set (edge 1 2) 10)
(set (edge 2 3) 10)
(set (edge 1 3) 30)
(run)
(check (path 1 3));;prints "20"
(sort Node)
(function mk (i64)Node)
(relation edge (Node Node))
(relation path (Node Node))
(rule ((edge x y))
((path x y)))
(rule ((path x y) (edge y z))
((path x z)))
(edge(mk 1)(mk 2))
(edge (mk 2)(mk 3))
(edge (mk 5)(mk 6))
(union (mk 3)(mk 5))
(run)
(check (edge (mk 3)(mk 6)))
(check (path (mk 1)(mk 6)))
  1. basic equality saturation
(datatype Math
(Num i64)
(Var String)
(Add Math Math)
(Mu1 Math Math))
;;expr1 =2*(x+ 3)
(define expr1(Mul(Num 2)(Add (Var"x")(Num 3))))
;;expr2=6+2*x
(define expr2(Add (Num 6)(Mul (Num 2)(Var "x"))))
(rewrite (Add a b)(Add b a))
(rewrite (Mul a(Add b c))(Add (Mul a b)(Mul a c)))
(rewrite (Add (Num a)(Num b))(Num(+a b)))
(rewrite (Mul (Num a)(Num b))(Num (* a b)))
(run)
(check (= expr1 expr2))