Home page: Sudoku solver in Objective Caml

Ok, last Sunday (2005-11-13), my father (hi, dad!) told me about Sudoku. And yesterday, Jon Harrop announced a very simple Sudoku solver on the OCaml mailing list.

This gave me the motivation to give it a try. I wrote a simple yet quite efficient solver for 9x9, 16x16 and 25x25 grids.

For those who don't know, Objective Caml (OCaml) a modern general-purpose and multi-paradigm programming language (functional, imperative, object-oriented) with a strong type system. It is also quite efficient.

For those who don't know - but I guess nobody doesn't know - Sudoku is a popular kind of pure-logic puzzles. I'm sure you can find information about Sudoku yourself!

## Getting the code

- Download. Version 20051128.

To compile the solver, you need a working installation of the Objective Caml compiler.

## Using the solver

Here is how to run the solver:

./sudoku [options...] < puzzle

Here is a description of the possible options.

The ** [-rank n]** option choose the size of the grid.
Possible values for sudoku grids are: 9 (grid 9x9), 15 (grid 15x15, with 5x3
boxes), 16 (grid 16x16), or 25 (grid 25x25). Default is 9.
It is also possible to use the solver for QWH puzzles
(as sudoku, without boxes constraints). You must
use the additional

**option, and n can be anything between 1 and 30.**

`[-magic]`
The ** [-only n]** option tells the solver to stop when n solutions
have been found. If you know that the puzzle has at most one
solution, you can use [-only 1]. Default is n=0: the solver
displays all the solutions.

The ** [-digit s]** option allows to choose the set of digits.
The length of the string s must the value n in the

**option. By default, the digits are taken from the string 123456789ABCDEFGHIJKLMNOPQRSTU.**

`-rank n`
The ** [-strat n]** option chooses a solving strategy.
The legal values for n are 1,2,3,4 (default:3). See below for
a description of the strategies.

The ** [-bench]** option runs the solver in benchmark mode.
Instead of reading and solving a single puzzle, the input
is assumed to contain many puzzles, which are solved in sequence.
It is possible to restrict the range of puzzles which are
used with the

**option (n and m are the number of the first and last puzzles in the input, where the numbering starts from 1).**

`[-range n-m]`
The ** [-timer]** option display the computation time for each
puzzle. The

**option display the number of solutions for each puzzle. The**

`[-count]`**option turns off the display of solutions.**

`[-quiet]`
The ** [-minimize]** option tells the solver to minimize
the puzzle. Instead of finding solutions, the solver will try
to remove clues as long as it is possible so that the resulting
puzzle still has a single solution (it is assumed that the input
puzzle has a single solution). A trace of the computation is printed.

The ** [-logic]** option tells the solver to run
in ``logic'' mode: no branching occurs. The solver is thus incomplete.
It prints the puzzle obtained after applying all the logic rules.

The file puzzle contains the description of one puzzle or many
puzzles (for the ** -bench** mode).
Characters 0 (zero) and . (dot) are interpreted as an unknown cell.
The digit string specify which characters are digits.
All other characters are ignored. However, the end-of-line characters
may only appear at the end of a full line of the puzzle.
In the

**mode, puzzles must be separated by an end-of-line character.**

`-bench`
The solver displays the solutions on its standard output, separated
by a line ** ---**.

## The solver's strategy

The solver is actually quite simple. It maintains in a matrix all the possible digits for each cell. When the value for a cell is discovered, the corresponding digit is removed from the possible choices for all the other cells one the same line, columns, or box. Also, when a digit can be placed in only one cell of a region (line, column, or box), the solver place the digit. These rules are applied until a fix-point is reached.

Now, there is some dose of case disjunction. For each cell in turn, we try all the possible choices and see what happens (applying the two propagation rules explained above). If one choice yield a contradiction (no remaining choice for a cell, or no possible position for a digit in a region), we can remove this choice from the set of possible digits for the current cell. We take the union of all the grids for each non-contradictory choice. This gives a new constraint on the possible choices. Let's consider an example. For the cell x, we have three choices 1,2,3 and for cell y, we have three choice 4,5,6. Imagine that if we set x to 1 (resp. 2), we discover that cell y can be only be 4 (resp. 5), and that if we set x to 3, we get a contradiction. This restrict the possible choices for x to 1,2 and the possible choices for y to 4,5.

We also iterate this process until a fix point is reached.
To continue, we will choose a cell and try in turn all the possible
choices. How to choose this cell? Well, in the process described
above, we already tried all the choices for each cell. We try
to choose the best cell to branch. This choice depends on the
** [-strat n]** option. If n=2, we choose a cell
with a minimum branching factor (minimum number of choices).
Similarly for n=1, but in this case, we don't even do
the case disjunction mentionned above.
If n=3, we choose a cell which will minimize further computation time
with some ad-hoc formulas to approximate this computation time (see
the code). For n=4, we take the lexicographic ordering of
the two previous criterion (first, we try to minimize the branching
factor, and in case of equality, we use the estimation
on computation time).

Globally, the solver explores several branches. When a branch yields a contradiction (no possible choice of digit for a cell), it is simply killed.

And that's all! The solver was able to solve all the 9x9, 16x16 and 25x25 puzzles I could find (often in a fraction of seconds, sometimes around the minute for the most complex 25x25). Please send me difficult puzzles!

## Minimal big puzzles

A minimal puzzle is a puzzle with the following properties:

- The puzzle has a single solution.
- If any of the clues is dropped, the resuling puzzle has more than one solution.

The ** [-minimize]** option of the solver produces minimal
sudokus or QWHs from puzzles with a single solution.

I used the solver to generate some minimal 25x25 sudokus, starting from this list by Guenter Sterten. You can also find some minimal 30x30 QWHs.

Home page: Sudoku solver in Objective Caml