SEclone tagging basics


Tagging basics



Starting with “aligned pair exclusion” solving technique, the program use the “tagging” process of my solver adjusted to Sudoku Explainer specifications.


The first adjustment to comply with SE is that no group of candidates exist (at least not directly), so we only consider single candidates in the tagging.


In the tagging process, all candidates linked thru a strong link chain are forming a “layer”.


A layer is reduced to 2 “tags”, one for each set of candidates simultaneously “in” or “out” of the solution.


These tags are binary indicators (printed as lower case upper case letters eg a;A)  such as a=A^1.


Tags 0;1 are unused and show untagged candidates.


Tagging core process


The core of the process in the tagging procedure is the table of  “primary weak links.

Primary weak links are shown in 2 ways:


A table storing one occurrence of each couple of tags for which a weak link has been found;


A bit field table in which the primary link field is shown as an implication.

  If “a” - “b”  then a -> B and b-> A

  In the bit field dedicated  to “a”, “B” is set to 1

  In the bit field dedicated to “b”, “A” is set to 1.


The primary table is expanded to find all chains in a very simple way


If “x” -> “y” then “x” -> “bit field x”  | “ bit field y”


Usually, a small number of iterations gives the final table where all loop and discontinued loop can be found.


Extending the specifications


To add new properties, as in our example induced locked pairs, induced alignment …, the general way is to make a “vicinity analysis”, detecting all local conditions creating the property and generating a new set of weak links.


In my solver, I do this partially, and I also generate new strong links with groups. In SE clone, only the generation of new weak links will be authorized.


Nested chains is quite another story and will be considered in due time


Sets of candidates and derived weak links


The next feature common to SE and to my solver is the use of sets of candidates.

In SE, a set of candidates can be


All candidates of a cell

All candidates of a digit in a row; column; box.


They are shown in the program as a set of tags.


Such a set (choice in my program) has 2 main uses


If a tag is in conflict with all tags of a set, then that tag is not valid


If a tag “a” is in conflict with all tags of a set but one “b”, then we have “a” -> “b”, a new implication, also called  a new derived weak link if shown as “a” - “B”.


Several iterations can be made in derivation till the moment where nothing more appear.

If no more derivation comes, the process is locked.


Cloning SE, we will have to play on several parameters to be in the right condition of a specific step.


Typical lay out for a step in the search of chains


The typical process will be


Collect all strong links (and “equivalence links” if relevant  x <-> y )

Switch to tagging

Collect weak links and expand as required in the context

  (here some derivations should be made in general)

Look for results


What takes place in layers


A problem, to clone Sudoku Explainer using tagging comes from the layers.


Generally speaking, part of the chains are included in specific layers,

Sometimes, especially when the solving path is close to the end, a layer has all the chain included in it.


Both cases have to be seen carefully, especially to come to the length as defined in SE rating.


It will be easy to detect if eliminations or “nice loops” have been found. It will be slightly more difficult to compute the length.


When we reach a quasi BUG situation, we usually have a very small  number of layers, but many loop in the main ones. Even if an action has been proven, the process to identify the shortest loop must still be written with care.


Here is also a processing time issue, especially in “Y” mode. To overcome the difficulty, a reasonable limit has been set in the overall authorized length.


Computing the length of a chain/loop


No surprise, we need 2 interleaved process


One coming from the tags giving the mains bridges between layers

One to fill the gap within a layer.


The first process is not needed if all takes place within a layer, the second one is not needed in “Y” mode since we have no layer of more than one strong link.


As we store only one bridge (weak link) per couple of tags.  The code had to be revisited to find the shortest path. This must be done with care knowing that at the end of each game we can end with many bi value cells that could lead to  a very high number of paths .


As far as performances are concerned, out of the problem mentioned earlier at the end of a game, there is no key issue looking for the shortest chain. We are in a situation where eliminations have been established, so, at best,  .5/1000 of the possible starts from one specific candidate.