Euro project adds up to faster, truer maths unit

Richard Ball
A European project co-ordinated by Newcastle upon Tyne University aims to produce an arithmetic unit that is more accurate and twice as fast as existing designs.
The concept is stunningly simple. As every schoolchild knows, multiplying and dividing numbers can be easily achieved by adding or subtracting their logarithms.
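The schoolroom trick can be sketched in a few lines of Python. This is only an illustration: the helper names (`to_log`, `from_log`, `log_mul`, `log_div`) are invented here, base-2 logs are assumed, and only positive values are handled, since the article does not say how the project represents sign or zero.

```python
import math

def to_log(x: float) -> float:
    """Convert a positive value to its base-2 logarithm (assumed base)."""
    return math.log2(x)

def from_log(i: float) -> float:
    """Convert a base-2 logarithm back to an ordinary value."""
    return 2.0 ** i

def log_mul(i: float, j: float) -> float:
    """Multiply two values held as logs: a single addition."""
    return i + j

def log_div(i: float, j: float) -> float:
    """Divide two values held as logs: a single subtraction."""
    return i - j

x, y = 12.0, 3.0
i, j = to_log(x), to_log(y)
product = from_log(log_mul(i, j))   # approximately 36.0
quotient = from_log(log_div(i, j))  # approximately 4.0
```

Once the operands are in log form, the multiply and the divide each cost one add or subtract, which is where the speed-up comes from.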
Expressing a 32-bit floating point value as its logarithm turns it into a 32-bit fixed point number and makes the maths easy. It reduces the computation time by a factor of five for multiplies and 15 for divides.
Why Bother?
What’s so great about a mere doubling of arithmetic speed? The inherent growth rate of the industry says performance will double within 18 months anyway.
Well, looking at it from another angle, for a specific application the clock speed can be halved. This has a corresponding effect on the power consumption, and improves electromagnetic compatibility to boot.
Offer any mobile phone manufacturer a painless way of doubling talk time and they’ll snap your hand off. At the shoulder.  
So why aren’t all arithmetic units, microprocessors and DSPs organised this way? Unfortunately, while logarithms make multiplying and dividing a breeze, addition and subtraction (when the original numbers are expressed as logs) are very tricky.
The obvious answer is to do adds and subtracts in floating point and multiplies and divides in logs. Indeed, if the data could be converted to and from logarithms on the fly, as and when needed, the problem would be solved.
But this would take far too much time. Therefore when data for the processor comes into the system, from an A/D converter for example, it has to be stored as a logarithm, never as a floating point number. And so additions and subtractions must be done in logarithms.
When two numbers, x and y, are held as base-2 logs, i = log2(x) and j = log2(y), the log of their sum is log2(2^i + 2^j), which can't be calculated directly on a simple processor.
The project team has solved this problem by recourse to some clever maths techniques.
The expression log2(2^i + 2^j) can be rewritten as i + log2(1 + 2^(j-i)). The second term is a non-linear function of the single value j-i, which can be evaluated by storing its values in a look-up table (LUT).
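The rewrite is easy to check numerically. The sketch below evaluates the non-linear term directly with `math.log2` rather than a table, purely to show that the identity holds. The function name `log_add` and the swap that keeps j-i at or below zero are assumptions made here, not details given by the project.

```python
import math

def log_add(i: float, j: float) -> float:
    """Return log2(2**i + 2**j) computed as i + log2(1 + 2**(j - i))."""
    # Assumption: put the larger operand first so that j - i <= 0 and
    # 2**(j - i) stays in the range (0, 1].
    if j > i:
        i, j = j, i
    return i + math.log2(1.0 + 2.0 ** (j - i))

# 5 + 3 = 8, and log2(8) = 3
i, j = math.log2(5.0), math.log2(3.0)
total_log = log_add(i, j)   # approximately 3.0
```

Because only the difference j-i enters the non-linear term, a table indexed by that one value is enough to cover every pair of operands.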
However, for 32-bit numbers, the LUT would be enormous, so a small LUT is used, with interpolation between entries using the Taylor approximation. This adds errors, so the team has developed a brand new algorithm that runs in parallel with the LUT and calculates the error in the Taylor approximation, which is then subtracted out.
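A rough sketch of the small-table idea follows. It is not the project's algorithm: the table spacing `STEP`, the cut-off `R_MIN`, the first-order Taylor step and all function names are assumptions chosen for illustration, and the team's parallel error-correction scheme is not reproduced because the article gives no details of it.

```python
import math

STEP = 1.0 / 64    # hypothetical spacing between table entries
R_MIN = -32.0      # hypothetical cut-off; below this the term is treated as 0

def f(r: float) -> float:
    """Exact non-linear term log2(1 + 2**r)."""
    return math.log2(1.0 + 2.0 ** r)

def df(r: float) -> float:
    """Derivative of f with respect to r."""
    t = 2.0 ** r
    return t / (1.0 + t)

# Small table of (value, slope) pairs at r = 0, -STEP, -2*STEP, ..., R_MIN.
N_ENTRIES = int(-R_MIN / STEP) + 1
TABLE = [(f(-k * STEP), df(-k * STEP)) for k in range(N_ENTRIES)]

def f_lut(r: float) -> float:
    """Approximate f(r) for r <= 0 from the nearest entry plus a Taylor step."""
    if r <= R_MIN:
        return 0.0
    k = int(round(-r / STEP))          # index of the nearest table point
    value, slope = TABLE[k]
    return value + (r - (-k * STEP)) * slope

def log_add_lut(i: float, j: float) -> float:
    """Approximate log2(2**i + 2**j) using the small table."""
    if j > i:
        i, j = j, i                    # keep j - i <= 0
    return i + f_lut(j - i)

approx = log_add_lut(math.log2(5.0), math.log2(3.0))
residual = 3.0 - approx                # error left by the interpolation
```

With this spacing the table holds about two thousand entries instead of one per possible input, at the cost of a small residual error. That residual is the kind of error the team's parallel algorithm is said to calculate and subtract out.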
This process of addition or subtraction while in logs takes the same amount of time as it would in floating point. In typical algorithms, the acceleration of the multiplies and divides results in an overall doubling of speed.
What about accuracy?
When arithmetic is performed on floating point numbers, each operation, such as the common multiply-accumulate, can introduce a rounding error of up to half a bit. Over the course of a long computation, a Fourier transform for instance, these errors can reach several bits.
Fixed point numbers on the other hand do not suffer from such errors. And, hey presto, the log of a 32-bit floating point number is fixed point. Newcastle claims the improvements in accuracy can be seen during simulations when 3D graphics data is being computed.
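The fixed point argument can be illustrated with integers. In the sketch below each log is stored as an integer with a hypothetical 23 fractional bits (the project's actual 32-bit format is not given in the article), so multiplies and divides become exact integer adds and subtracts. The names `encode`, `decode` and `chain` are invented for this example.

```python
import math

FRAC_BITS = 23  # hypothetical number of fractional bits in the fixed point log

def encode(x: float) -> int:
    """Store log2(x) as a fixed point integer (rounded once, on entry)."""
    return round(math.log2(x) * (1 << FRAC_BITS))

def decode(i: int) -> float:
    """Convert a fixed point log back to an ordinary value."""
    return 2.0 ** (i / (1 << FRAC_BITS))

def chain(values: list[float]) -> int:
    """Multiply by every value, then divide by every value, all in logs."""
    acc = 0                       # log2(1) = 0
    for v in values:
        acc += encode(v)          # multiply: exact integer add
    for v in reversed(values):
        acc -= encode(v)          # divide: exact integer subtract
    return acc

result = chain([3.0, 7.5, 0.1, 1234.5])   # returns exactly 0, i.e. log2(1)
```

The only rounding here happens once, when a value is first converted to its log. The multiplies and divides that follow add nothing further, which is the behaviour the article attributes to fixed point.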
