Chunk critical loop split in chunks. 
Chunk critical loop
The best way to understand the phase 1 process is to study first the critical loop. Upstream task are heavily oriented in the preparation of the chunk loop.
struct G17CHUNK{ V256_UAS xv[G17CHKX], yv[G17CHKY]; V256_GUAS xgv[G17CHKX], * ygv; // ygv pointer to the sub table in indexstep int nxc, nyc, ix, iy; XY_EXPAND tyec[G17CHKY], txec[G17CHKX];// current tables void GoChunk(); };
void G17CHUNK::GoChunk() {// elementary 'X' x 'Y' ychunk is 256x256 uint16_t * limit_store; uint16_t tstore[G17CHX * G17CHY]; // storing XY passing filter {//=========Find all XY passing first filter for the chunk register uint16_t * pstore = tstore; register int iy, store, ix; register BF128 * tvx = xv[0].v; for (ix = 0; ix < nxc; ix++, tvx += NVUAS128) { store = ix << 8; //8 low bits for iy register uint64_t Rx = txec[ix].stacks.u64; BF128 vx = tvx[0], vx2 = tvx[1], vx3 = tvx[2], vx4 = tvx[3]; register BF128 * tvy = yv[0].v; for (iy = 0; iy < nyc; iy++,tvy+= NVUAS128) { BF128 vy = tvy[0]; if ((vx & vy).isNotEmpty())continue; if (indexstep.n128vua > 1) { vy = tvy[1]; if ((vx2 & vy).isNotEmpty()) continue; if (indexstep.n128vua > 2) { vy = tvy[2]; if ((vx3 & vy).isNotEmpty()) continue; if (indexstep.n128vua > 3) { vy = tvy[3]; if ((vx4 & vy).isNotEmpty()) continue; } } } register uint64_t Ry = Rx + tyec[iy].stacks.u64; if ((Ry & 0xff) > 6) continue; if ((Ry & 0xff0000) > 0x60000) continue; if ((Ry & 0xff00000000) > 0x600000000) continue; *pstore++ = store  iy;// (ix<<8)  iy } }
limit_store = pstore; }
{//=========== send such XY to next step in the process To see later , } }
The primary filter is expected to clear more than 99% of the XY. To do this with a good cache effect, the fist step check a G17CHY x G17CHY matrix
Vector of still valid UAs have been prepared for each X,Y and moved in xv;yv. During the expansion of band1 and band2, the count per stack has been done. The relevant part of theses tables has been moved in tyec, txec.
The double loop test each XY
BF128 vy = tvy[0]; if ((vx & vy).isNotEmpty())continue;
These 2 instructions, testing the 128 smallest UAs should clear most of the XY. Very often, the next 384 UAs don’t exist.
The last test is the stack limit control. Each stack has the count (a 16 bits field), part of a 64 bits field. Adding X count and Y count gives the global count and each stack is checked against the limit of 6 given.
If the XY can not be cleared, the ix,iy are stored for the next step.
The matrix size can easily be changed. Tests give good results with a 256 x 256 matrix.
Exit of the critical loop
Again, we start from the downstream process to explain what must be done upstream. Having passed he first filter, we have
an XY with the expansion data (given, bit field of the given, count per stack) A possible list of more bands 1+2 UAs in FIFO tables The socket vectors of the X and of the Y.
The table of XY built in the chunk is scanned and the downstream process called for each XY still there.
The next task will be to do the cheapest controls before the brute force validity check of XY;
The simplest and likely most efficient control is to verify that recent bands 1+2 UAs found during the validation of previous XY in the same step are hit. As we have no vector here, the check is a classical scan of the FIFO tables against the XY pattern. The FIFO table (another idea of blue) works more or less as a vicinity mode. To have a chance to keep smaller UAs on a longer time, 2 different tables are built, one for smaller UAs, one for bigger UAs.
Other filters are using the valid sockets.
The vectors prepared at the step level combined to the socket vectors tell us which socket are already valid. A socket produce compulsory GUA2s GUA3s GUA4s GUA6s if a socket UA is not hit.
Only GUA2s GUA3s are of interest here, and the constraint is studied for each band 3.
Having the GUA2s GUA3s already valid for a band, The band is discarded if the total count of given required by the GUA2s GUA3s structure is >6 (6 given in band3). We can also see here if the count per stack pass the 6 clues limit for one of the stacks.
If all bands 3 are discarded, the XY is not valid.
Counting the minimum number of given for a set of GUA2s GUA3s
We are here analyzing a specific band. All “already valid” sockets are known, At the very beginning, socket giving GUA2s and GUA3s in the band have been identified. So we very fast go from the sockets list to the GUA2s GUA3s list of the band 3.
By construction, mini rows are disjoints. Adding the minimum of clues per mini row will give the minimum of given for the set of GUA2s GUA3s.
A GUA3s in minirow where we have GUA2s is redundant; A GUA3 alone is counted for one clue. A minirow with one pair is counted for one clue. A minirow with 3 pairs is counted for 2 clues. A minirow with 2 pairs is counted for one clue. The common cell hit both GUA2s
If the minimum count is 6, the band is said critical. All clues belong to the GUA2s GUA3s and in the minirows with 2 pairs, the common cell can be assigned.
If the count in stack (XY plus GUA2s GUA3s) is 6, the stack is critical. All clues in this stack are in the GUA2s GUA3s and in minirows with 2 pairs, the common pair can be assigned.
First check of stacks
In the code, the stack control with GUA2s GUA3s is done before the validation of the XY by the brute force in stacks with a minimum of 5 XY clues. The underlying idea is that this can limit the calls to the expensive brute force. More tests could tell whether it is worth doing this (may be depending on the number of bands 3).
Brute force validation for the XY
This is the final task in the phase 1. The brute force called send back a UA if the puzzle has multiple solutions. If the new UA is small enough, the FIFO table for small UAs is updated, and the UA is sent to the Add process of the primary table.
The brute force here is designed to produce a small UA if possible, assuming that the result will be negative.. The first guess is to set to “true” the digit with the highest count of candidates. If this fails (only one solution), the smallest count is used to build all possible “one digit” solutions and each one is used as guess (very similar to the last step in the global brute force).
