This gave me the motivation to give it a try. I wrote a simple yet quite efficient solver for 9x9, 16x16 and 25x25 grids.
For those who don't know, Objective Caml (OCaml) a modern general-purpose and multi-paradigm programming language (functional, imperative, object-oriented) with a strong type system. It is also quite efficient.
For those who don't know - but I guess nobody doesn't know - Sudoku is a popular kind of pure-logic puzzles. I'm sure you can find information about Sudoku yourself!
- Download. Version 20051128.
To compile the solver, you need a working installation of the Objective Caml compiler.
Here is how to run the solver:
./sudoku [options...] < puzzle
Here is a description of the possible options.
The [-rank n] option choose the size of the grid. Possible values for sudoku grids are: 9 (grid 9x9), 15 (grid 15x15, with 5x3 boxes), 16 (grid 16x16), or 25 (grid 25x25). Default is 9. It is also possible to use the solver for QWH puzzles (as sudoku, without boxes constraints). You must use the additional [-magic] option, and n can be anything between 1 and 30.
The [-only n] option tells the solver to stop when n solutions have been found. If you know that the puzzle has at most one solution, you can use [-only 1]. Default is n=0: the solver displays all the solutions.
The [-digit s] option allows to choose the set of digits. The length of the string s must the value n in the -rank n option. By default, the digits are taken from the string 123456789ABCDEFGHIJKLMNOPQRSTU.
The [-strat n] option chooses a solving strategy. The legal values for n are 1,2,3,4 (default:3). See below for a description of the strategies.
The [-bench] option runs the solver in benchmark mode. Instead of reading and solving a single puzzle, the input is assumed to contain many puzzles, which are solved in sequence. It is possible to restrict the range of puzzles which are used with the [-range n-m] option (n and m are the number of the first and last puzzles in the input, where the numbering starts from 1).
The [-timer] option display the computation time for each puzzle. The [-count] option display the number of solutions for each puzzle. The [-quiet] option turns off the display of solutions.
The [-minimize] option tells the solver to minimize the puzzle. Instead of finding solutions, the solver will try to remove clues as long as it is possible so that the resulting puzzle still has a single solution (it is assumed that the input puzzle has a single solution). A trace of the computation is printed.
The [-logic] option tells the solver to run in ``logic'' mode: no branching occurs. The solver is thus incomplete. It prints the puzzle obtained after applying all the logic rules.
The file puzzle contains the description of one puzzle or many puzzles (for the -bench mode). Characters 0 (zero) and . (dot) are interpreted as an unknown cell. The digit string specify which characters are digits. All other characters are ignored. However, the end-of-line characters may only appear at the end of a full line of the puzzle. In the -bench mode, puzzles must be separated by an end-of-line character.
The solver displays the solutions on its standard output, separated by a line ---.
The solver is actually quite simple. It maintains in a matrix all the possible digits for each cell. When the value for a cell is discovered, the corresponding digit is removed from the possible choices for all the other cells one the same line, columns, or box. Also, when a digit can be placed in only one cell of a region (line, column, or box), the solver place the digit. These rules are applied until a fix-point is reached.
Now, there is some dose of case disjunction. For each cell in turn, we try all the possible choices and see what happens (applying the two propagation rules explained above). If one choice yield a contradiction (no remaining choice for a cell, or no possible position for a digit in a region), we can remove this choice from the set of possible digits for the current cell. We take the union of all the grids for each non-contradictory choice. This gives a new constraint on the possible choices. Let's consider an example. For the cell x, we have three choices 1,2,3 and for cell y, we have three choice 4,5,6. Imagine that if we set x to 1 (resp. 2), we discover that cell y can be only be 4 (resp. 5), and that if we set x to 3, we get a contradiction. This restrict the possible choices for x to 1,2 and the possible choices for y to 4,5.
We also iterate this process until a fix point is reached. To continue, we will choose a cell and try in turn all the possible choices. How to choose this cell? Well, in the process described above, we already tried all the choices for each cell. We try to choose the best cell to branch. This choice depends on the [-strat n] option. If n=2, we choose a cell with a minimum branching factor (minimum number of choices). Similarly for n=1, but in this case, we don't even do the case disjunction mentionned above. If n=3, we choose a cell which will minimize further computation time with some ad-hoc formulas to approximate this computation time (see the code). For n=4, we take the lexicographic ordering of the two previous criterion (first, we try to minimize the branching factor, and in case of equality, we use the estimation on computation time).
Globally, the solver explores several branches. When a branch yields a contradiction (no possible choice of digit for a cell), it is simply killed.
And that's all! The solver was able to solve all the 9x9, 16x16 and 25x25 puzzles I could find (often in a fraction of seconds, sometimes around the minute for the most complex 25x25). Please send me difficult puzzles!
A minimal puzzle is a puzzle with the following properties:
- The puzzle has a single solution.
- If any of the clues is dropped, the resuling puzzle has more than one solution.
The [-minimize] option of the solver produces minimal sudokus or QWHs from puzzles with a single solution.