Abstract
We propose a new digit-recurrence decimal square root (DSR) design and provide its ASIC implementation. The interim square root digits are in \([-5, 5]\). The proposed architecture generally follows that of a previous radix-10 divider, but it provides novel solutions to a few DSR-specific challenges. For example, an intricate error analysis shows that only four (out of sixteen) digits of the partial square root are sufficient to estimate the partial remainders required for the more complicated square root digit selection. The design is about 10% faster and occupies 28% less area than the previously reported ASIC digit-recurrence decimal square rooter.
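To make the idea of a radix-10 digit recurrence with digits in \([-5, 5]\) concrete, the following is a minimal software sketch, not the hardware design described in the abstract. It assumes an operand in \([0.01, 1)\), an initial partial root of 1/2, and a digit selection based on exact full-precision comparisons, whereas the paper selects digits from a truncated estimate. The names `decimal_sqrt_digits`, `contains_root`, and `RHO` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Redundancy factor a/(r-1) for radix r = 10 and digit set [-5, 5].
RHO = Fraction(5, 9)


def contains_root(x, s, half_width):
    """True if s - half_width <= sqrt(x) <= s + half_width (exact rational test)."""
    lo, hi = s - half_width, s + half_width
    return hi >= 0 and hi * hi >= x and (lo <= 0 or lo * lo <= x)


def decimal_sqrt_digits(x, n):
    """Return (digits, s, w) after n radix-10 steps with digits in [-5, 5].

    s is the partial root S[n]; w is the scaled residual 10**n * (x - s*s).
    Assumption (not from the source): S[0] = 1/2 and x in [0.01, 1).
    """
    x = Fraction(x)
    if not (Fraction(1, 100) <= x < 1):
        raise ValueError("x must lie in [0.01, 1)")
    s = Fraction(1, 2)   # assumed initial partial root S[0]
    w = x - s * s        # w[0] = x - S[0]^2
    digits = []
    for j in range(1, n + 1):
        u = Fraction(1, 10 ** j)  # weight of the j-th digit
        # Pick the first digit that keeps sqrt(x) within RHO*u of the new
        # partial root, so the remaining digits can still reach it.
        d = next(d for d in range(-5, 6)
                 if contains_root(x, s + d * u, RHO * u))
        # Residual recurrence: w[j] = 10*w[j-1] - 2*S[j-1]*d - d^2 * 10^-j
        w = 10 * w - 2 * s * d - d * d * u
        s += d * u
        digits.append(d)
    return digits, s, w
```

Calling `decimal_sqrt_digits(Fraction(2, 100), 16)` returns sixteen signed digits, the partial root as an exact fraction, and the final scaled residual. Exact rationals are used so that the recurrence can be checked against its definition without rounding effects; a hardware design would instead keep the residual in a redundant form and select each digit from a few leading digits.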
Cite this article
Hosseiny, A., Jaberipur, G. Decimal Square Root: Algorithm and Hardware Implementation. Circuits Syst Signal Process 35, 4195–4219 (2016). https://doi.org/10.1007/s00034-015-0215-1
DOI: https://doi.org/10.1007/s00034-015-0215-1