A New Rounding Method Based on Parallel Remainder Estimation for Goldschmidt and Newton-Raphson Algorithms

Newton-Raphson and Goldschmidt algorithms can be sped up by using variable latency hardware architectures for rounding division, square root and their reciprocals. A new approach based on a rounding method with remainder estimate calculated concurrently with the algorithm was proposed in [5]. This paper presents an study of the hardware implementation of this approach and shows that does not suppose additional latency and avoids conventional remainder calculation most of the times. By using a CMOS 90 nm technology library different hardware architectures are presented. The results show that the expected performance improvements are obtained with reasonable increments in area (up to 5.6%), critical path (up to 6.7%) and better power performance (up to -24%).