Overview
This article covers two classes of square root algorithms suited for microcontrollers: the classical iterative method (similar to long division) and Newton's Method. The focus is on implementations that avoid hardware multiplication — essential for 8-bit PICs.
Decimal Square Root (Long Division Method)
The classical algorithm works digit by digit, similar to long division:
- Starting at the decimal point, pair off the digits
- Find the largest perfect square less than or equal to the leftmost pair. Its square root is the first digit of the result
- Bring down the next pair; multiply current result by 20; find next digit d such that
(20·current + d)·d ≤ remainder - Subtract and repeat
Example: \(\sqrt{31415.92653}\)
1 7 7. 2 ...
-----------------
) 03 14 15.92 65
-1
--
2 14 (1×20=20, find d: (20+7)×7=189 ≤ 214) → d=7
-1 89
------
25 15 (17×20=340, find d: (340+7)×7=2429 ≤ 2515) → d=7
-24 29
------
86 92 (177×20=3540, find d: (3540+2)×2=7084 ≤ 8692) → d=2
-70 84
------
16 08 ... → √31415.92653 ≈ 177.2...
Binary Square Root — Restoring Method
The binary algorithm builds the result one bit at a time. At each step, it appends 01 to the developed root and subtracts. If the remainder is positive, the new bit is 1; if negative, the bit is 0 and the remainder is restored:
Example: \(\sqrt{\mathtt{01011111}_2}\)
1 0 0 1.
-----------------
) 01 01 11 11.00
-1
---- → positive: bit = 1
00 01
-1 01 (developed root "1", append 01)
-----
11 00 → negative: bit = 0, restore
+1 01
------
00 01 11
-10 01 (developed root "10", append 01)
---------
11 11 10 → negative: bit = 0, restore
+10 01
---------
01 11 11
-1 00 01 (developed root "100", append 01)
----------
0 11 10 → positive: bit = 1 → root = 1001₂ = 9
Iterative Algorithm (Squaring Method)
Build the result bit by bit from the MSB, squaring at each step:
unsigned char isqrt(unsigned int N) {
unsigned int x = 0, j;
for(j = 1 << 7; j > 0; j >>= 1) {
x = x + j;
if(x * x > N)
x = x - j; // Undo if overshot
}
return (unsigned char)x;
}
This works well for processors with hardware multiply. For PICs without multiply, we can avoid squaring by exploiting the polynomial expansion. For an 8-bit number \(y\):
Since each bit \(b_i \in \{0,1\}\), we have \(b_i^2 = b_i\), and the iterative update becomes:
Optimized algorithm without multiply
s1 = 1 << 14; // Start with MSB pair
s2 = 0;
N = input;
do {
if((s1 + s2) <= N) {
N = N - s2 - s1;
s2 = s2 | (s1 << 1);
}
s1 = s1 >> 1;
N = N << 1; // Roll left (handle overflow bit!)
} while(s1 > (1 << 6));
return (s2 >> 8); // Result in upper byte
Newton's Method
Newton's iterative formula for square roots:
Starting from an initial guess \(x_0\), this converges quadratically (doubles correct digits each iteration). It requires hardware division, making it ideal for processors with FPUs (like a Pentium) but impractical for most 8-bit microcontrollers.
• PIC/AVR (no hardware multiply): Use the optimized shift-and-subtract iterative method
• ARM Cortex-M (hardware multiply): Use the squaring iterative method
• x86/x64 with FPU: Use Newton's method or the hardware FSQRT instruction