# logic – How does the DPLL algorithm work?

## logic – How does the DPLL algorithm work?

The DPLL is essentially a **backtracking algorithm**, and thats the main idea behind the recursive calls.

The algorithm is building solution while trying assignments, you have a partial solution which might prove successful or not-successful as you go on. The geniusity of the algorithm is how to build the partial solution.

First, lets define what a **unit clause** is:

A unit clause is a clause which has exactly one literal which is still unassigned, and the other (assigned) literals – are all assigned to false.

The importance of this clause is that if the current assignment is valid – you can determine what is the value of the variable which is in the unassigned literal – because the literal must be true.

For example: If we have a formula:

```
(x1 / x2 / x3) / (~x1 / ~x4 / x5) / ( ~x1 / x4)
```

And we already assigned:

```
x1=true, x4=true
```

Then `(~x1 / ~x4 / x5)`

is a unit clause, because you must assign `x5=true`

in order to satisfy this clause in the current partial assignment.

**The basic idea of the algorithm is:**

- Guess a variable
- Find all unit clauses created from the last assignment and assign the needed value
- Iteratively retry step 2 until there is no change (found transitive closure)
- If the current assignment cannot yield true for all clauses – fold back from recursion and retry a different assignment
- If it can – guess another variable (recursively invoke and return to 1)

**Termination:**

- There is nowhere to go back to and change a guess (no solution)
- All clauses are satisfied (there is a solution, and the algorithm found it)

You can also have a look at these lecture slides for more information and an example.

**Usage example and importance:**

The DPLL, though 50 years old – is still the basis for most SAT solvers.

SAT Solvers are very useful for solving hard problems, one example in software verification – where you represent your model as a set of formulas, and the condition you want to verify – and invoke the SAT solver over it. Although exponential worst case – the average case is fast enough and is widely used in the industry (mainly for verifying Hardware components)

I will note that the technique used in DPLL is a common technique used in proofs in complexity theory, where you *guess* a partial assignment to things, and then try to *fill in* the rest. For more references or inspiration as to why DPLL works the way it does, you might try reading some of the complexity theoretic material surrounding SAT (in any good textbook on complexity theory).

Using DPLL off the shelf actually leads to a pretty crappy solution, and there are a few key tricks that you can play to do much better! Along with Amits answer, I will give some practical references for understanding how realistic DPLL works:

- If we have a formula with many variables
`{x1,...,xn}`

, you will find that the DPLL algorithm will terminate much more quickly (in the case that the formula*is*satisfiable) depending on*which*variable you choose. You will also find that choosing*correctly*is (obviously) more helpful. - There are multiple techniques to help you do this, called variable selection heuristics.
- There are also a number of optimizations to be made in the representation of the formula so that you can
*quickly*propagate decisions and saturate clauses, notably the two watched literals technique. - The
*real*breakthrough in SAT is based on*clause learning*. Whenever you get stuck, you create a new clause to add to your database, which will preclude you from searching the useless areas of your space. There has been a lot of research on the best strategies for including learned clauses: which should be included, and when? - MiniSat is a realistic and highly optimized SAT solver. I found that the Original MiniSat paper was a real eye opener in figuring out how to do really optimal SAT solving. Its really a great read, and highly recommended if you have any interest in learning more about the implementation of dependable SAT solvers.

So, at its core, SAT is a very important problem from a theoretical perspective (first NP complete reduction via Karp, interesting and tedious constructive technique that any complexity book will introduce), but also has very *practical* applications in model checking and software verification. If you are interested in a classic example of how to solve an NP complete problem very quickly, give a look to the implementation of industrial strength SAT solvers, its fun!