Computer Arithmetic: Algorithms and Hardware Designs by Behrooz Parhami. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, without the prior permission of Oxford University Press.
Department of Electrical and Computer Engineering, University of California, Santa Barbara
New York Oxford
OXFORD UNIVERSITY PRESS 2000
Oxford University Press
Oxford New York
Athens Auckland Bangkok Bogota Buenos Aires Calcutta
Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris Sao Paulo Singapore Taipei Tokyo Toronto Warsaw
198 Madison Avenue, New York, New York 10016 http://www.oup-usa.org
Oxford is a registered trademark of Oxford University Press.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press.
Library of Congress Cataloging-in-Publication Data
Parhami, Behrooz.
Computer arithmetic: algorithms and hardware designs / Behrooz Parhami.
p. cm.
Includes bibliographical references and index. ISBN 0-19-512583-5 (cloth)
28 Past, Present, and Future
28.1 Historical Perspective
28.2 An Early High-Performance Machine
28.3 A Modern Vector Supercomputer
28.4 Digital Signal Processors
28.5 A Widely Used Microprocessor
28.6 Trends and Future Outlook
Problems
References
Index
PREFACE
THE CONTEXT OF COMPUTER ARITHMETIC
Advances in computer architecture over the past two decades have allowed the performance of digital computer hardware to continue its exponential growth, despite increasing technological difficulty in speed improvement at the circuit level. This phenomenal rate of growth, which is expected to continue in the near future, would not have been possible without theoretical insights, experimental research, and tool-building efforts that have helped transform computer architecture from an art into one of the most quantitative branches of computer science and engineering. Better understanding of the various forms of concurrency and the development of a reasonably efficient and user-friendly programming model have been key enablers of this success story.
The downside of exponentially rising processor performance is an unprecedented increase in hardware and software complexity. The trend toward greater complexity is not only at odds with testability and certifiability but also hampers adaptability, performance tuning, and evaluation of the various trade-offs, all of which contribute to soaring development costs. A key challenge facing current and future computer designers is to reverse this trend by removing layer after layer of complexity, opting instead for clean, robust, and easily certifiable designs, while continuing to try to devise novel methods for gaining performance and ease-of-use benefits from simpler circuits that can be readily adapted to application requirements.
In the computer designers' quest for user-friendliness, compactness, simplicity, high performance, low cost, and low power, computer arithmetic plays a key role. It is one of the oldest subfields of computer architecture. The bulk of hardware in early digital computers resided in accumulator and other arithmetic/logic circuits. Thus, first-generation computer designers were motivated to simplify and share hardware to the extent possible and to carry out detailed cost-performance analyses before proposing a design. Many of the ingenious design methods that we use today have their roots in the bulky, power-hungry machines of 30-50 years ago.
In fact, computer arithmetic has been so successful that it has, at times, become transparent.
Arithmetic circuits are no longer dominant in terms of complexity; registers, memory and memory management, instruction issue logic, and pipeline control have become the dominant consumers of chip area in today's processors. Correctness and high performance of arithmetic circuits are routinely expected, and episodes such as the Intel Pentium division bug are indeed rare.
The preceding context is changing for several reasons. First, at very high clock rates, the interfaces between arithmetic circuits and the rest of the processor become critical. Arithmetic units can no longer be designed and verified in isolation. Rather, an integrated design optimization is required, which makes the development even more complex and costly. Second, optimizing arithmetic circuits to meet design goals by taking advantage of the strengths of new
technologies, and making them tolerant to the weaknesses, requires a reexamination of existing design paradigms. Finally, incorporation of higher-level arithmetic primitives into hardware makes the design, optimization, and verification efforts highly complex and interrelated.
This is why computer arithmetic is alive and well today. Designers and researchers in this area produce novel structures with amazing regularity. Carry-lookahead adders are a case in point. We used to think, in the not so distant past, that we knew all there was to know about carry-lookahead fast adders. Yet, new designs, improvements, and optimizations are still appearing. The ANSI/IEEE standard floating-point format has removed many of the concerns with compatibility and error control in floating-point computations, thus resulting in new designs and products with mass-market appeal. Given the arithmetic-intensive nature of many novel application areas (such as encryption, error checking, and multimedia), computer arithmetic will continue to thrive for years to come.
THE GOALS AND STRUCTURE OF THIS BOOK
The field of computer arithmetic has matured to the point that a dozen or so texts and reference books have been published. Some of these books that cover computer arithmetic in general (as opposed to special aspects or advanced/unconventional methods) are listed at the end of the preface. Each of these books has its unique strengths and has contributed to the formation and fruition of the field. The current text, Computer Arithmetic: Algorithms and Hardware Designs, is an outgrowth of lecture notes the author developed and refined over many years. Here are the most important features of this text in comparison to the listed books:
Division of material into lecture-size chapters. In my approach to teaching, a lecture is a more or less self-contained module with links to past lectures and pointers to what will transpire in the future. Each lecture must have a theme or title and must proceed from motivation, to details, to conclusion. In designing the text, I strived to divide the material into chapters, each of which is suitable for one lecture (1-2 hours). A short lecture can cover the first few subsections, while a longer lecture can deal with variations, peripheral ideas, or more advanced material near the end of the chapter. To make the structure hierarchical, as opposed to flat or linear, lectures are grouped into seven parts, each composed of four lectures and covering one aspect of the field (Fig. P.1).
Emphasis on both the underlying theory and actual hardware designs. The ability to cope with complexity requires both a deep knowledge of the theoretical underpinnings of computer arithmetic and examples of designs that help us understand the theory. Such designs also provide building blocks for synthesis as well as reference points for cost-performance comparisons. This viewpoint is reflected in, for example, the detailed coverage of redundant number representations and associated arithmetic algorithms (Chapter 3) that later lead to a better understanding of various multiplier designs and on-line arithmetic. Another example can be found in Chapter 22, where CORDIC algorithms are introduced from the more intuitive geometric viewpoint.
Linking computer arithmetic to other subfields of computing. Computer arithmetic is nourished by, and in turn nourishes, other subfields of computer architecture and technology. Examples of such links abound. The design of carry-lookahead adders became much more systematic once it was realized that the carry computation is a special case of parallel prefix computation that had been extensively studied by researchers in parallel computing. Arithmetic for and by neural networks is an area that is still being
Part I: Number Representation
1. Numbers and Arithmetic
2. Representing Signed Numbers
3. Redundant Number Systems
4. Residue Number Systems

Part II: Addition/Subtraction
5. Basic Addition and Counting
6. Carry-Lookahead Adders
7. Variations in Fast Adders
8. Multioperand Addition

Part III: Multiplication
9. Basic Multiplication Schemes
10. High-Radix Multipliers
11. Tree and Array Multipliers
12. Variations in Multipliers

Part IV: Division
13. Basic Division Schemes
14. High-Radix Dividers
15. Variations in Dividers
16. Division by Convergence

Part V: Real Arithmetic
17. Floating-Point Representations
18. Floating-Point Operations
19. Errors and Error Control
20. Precise and Certifiable Arithmetic

Part VI: Function Evaluation
21. Square-Rooting Methods
22. The CORDIC Algorithms
23. Variations in Function Evaluation
24. Arithmetic by Table Lookup

Part VII: Implementation Topics
25. High-Throughput Arithmetic
26. Low-Power Arithmetic
27. Fault-Tolerant Arithmetic
28. Past, Present, and Future

Fig. P.1 The structure of this book in parts and chapters.

explored. The residue number system has provided an invaluable tool for researchers interested in complexity theory and the limits of fast arithmetic, as well as to the designers of fault-tolerant digital systems.
Wide coverage of important topics. The text covers virtually all important algorithmic and hardware design topics in computer arithmetic, thus providing a balanced and complete view of the field. Coverage of unconventional number representation methods (Chapters 3 and 4), arithmetic by table lookup (Chapter 24), which is becoming increasingly important, multiplication and division by constants (Chapters 9 and 13), errors and certifiable arithmetic (Chapters 19 and 20), and the topics in Part VII (Chapters 25-28) do not all appear in other textbooks.
Unified and consistent notation and terminology throughout the text. Every effort is made to use consistent notation and terminology throughout the text. For example, r always stands for the number representation radix and s for the remainder in division or square-rooting. While other authors have done this in the basic parts of their texts, many tend to cover more advanced research topics by simply borrowing the notation and terminology from the reference source. Such an approach has the advantage of making the transition between reading the text and the original reference source easier, but it is utterly confusing to the majority of the students, who rely on the text and do not consult the original references except, perhaps, to write a research paper.
SUMMARY OF TOPICS
The seven parts of this book, each composed of four chapters, were written with the following goals.
Part I sets the stage, gives a taste of what is to come, and provides a detailed perspective on the various ways of representing fixed-point numbers. Included are detailed discussions of signed numbers, redundant representations, and residue number systems.
Part II covers addition and subtraction, which form the most basic arithmetic building blocks and are often used in implementing other arithmetic operations. Included in the discussions are addition of a constant (counting), various methods for designing fast adders, and multioperand addition.
Part III deals exclusively with multiplication, beginning with the basic shift/add algorithms and moving on to high-radix, tree, array, bit-serial, modular, and a variety of other multipliers. The special case of squaring is also discussed.
Part IV covers division algorithms and their hardware implementations, beginning with the basic shift/subtract algorithms and moving on to high-radix, prescaled, modular, array, and convergence dividers.
Part V deals with real number arithmetic, including various methods for representing real numbers, floating-point arithmetic, errors in representation and computation, and methods for high-precision and certifiable arithmetic.
Part VI covers function evaluation, beginning with the important special case of square-rooting and moving on to CORDIC algorithms, followed by general convergence and approximation methods, including the use of lookup tables.
Part VII deals with broad design and implementation topics, including pipelining, low-power arithmetic, and fault tolerance. This part concludes by providing historical perspective and examples of arithmetic units in real computers.
POINTERS ON HOW TO USE THE BOOK
For classroom use, the topics in each chapter of this text can be covered in a lecture lasting 1-2 hours. In my own teaching, I have used the chapters primarily for 1.5-hour lectures, twice a week, in a 10-week quarter, omitting or combining some chapters to fit the material into 18-20 lectures. But the modular structure of the text lends itself to other lecture formats, self-study, or review of the field by practitioners. In the latter two cases, readers can view each chapter as a study unit (for one week, say) rather than as a lecture. Ideally, all topics in each chapter should be covered before the reader moves to the next chapter. However, if fewer lecture hours are available, some of the subsections located at the end of chapters can be omitted or introduced only in terms of motivations and key results.
Problems of varying complexities, from straightforward numerical examples or exercises to more demanding studies or miniprojects, are supplied for each chapter. These problems form an integral part of the book: they were not added as afterthoughts to make the book more attractive for use as a text. A total of 464 problems are included (15-18 per chapter). Assuming that two lectures are given per week, either weekly or biweekly homework can be assigned, with each assignment having the specific coverage of the respective half-part (two chapters) or full-part (four chapters) as its "title."
An instructor's manual, with problem solutions and enlarged versions of the diagrams and tables, suitable for reproduction as transparencies, is planned. The author's detailed syllabus for the course ECE 252B at UCSB is available at:
A simulator for numerical experimentation with various arithmetic algorithms is available at:
http://www.ecs.umass.edu/ece.koren/arith.simulator courtesy of Pofessor Israel Koren.
References to classical papers in computer arithmetic, key design ideas, and important state-of-the-art research contributions are listed at the end of each chapter. These references provide good starting points for in-depth studies or for term papers or projects. A large number of classical papers and important contributions in computer arithmetic have been reprinted in two volumes [Swar90].
New ideas in the field of computer arithmetic appear in papers presented at biennial conferences, known as ARITH-n, held in odd-numbered years [Arit]. Other conferences of interest include the Asilomar Conference on Signals, Systems, and Computers [Asil], International Conference on Circuits and Systems [ICCS], Midwest Symposium on Circuits and Systems [MSCS], and International Conference on Computer Design [ICCD]. Relevant journals include IEEE Transactions on Computers [TrCo], particularly its special issues on computer arithmetic, IEEE Transactions on Circuits and Systems [TrCS], Computers & Mathematics with Applications [CoMa], IEE Proceedings: Computers and Digital Techniques [PrCD], IEEE Transactions on VLSI Systems [TrVL], and Journal of VLSI Signal Processing [JVSP].
ACKNOWLEDGMENTS
Computer Arithmetic: Algorithms and Hardware Designs is an outgrowth of lecture notes the author used for the graduate course "ECE 252B: Computer Arithmetic" at the University of California, Santa Barbara, and, in rudimentary forms, at several other institutions prior to 1988. The text has benefited greatly from keen observations, curiosity, and encouragement of my many students in these courses. A sincere thanks to all of them!
REFERENCES
[Arit] International Symposium on Computer Arithmetic, sponsored by the IEEE Computer Society. This series began with a one-day workshop in 1969 and was subsequently held in 1972, 1975, 1978, and in odd-numbered years since 1981. The 13th symposium in the series, ARITH-13, was held on July 6-9, 1997, in Asilomar, California. ARITH-14 was held April 14-16, 1999, in Adelaide, Australia.
[Asil] Asilomar Conference on Signals, Systems, and Computers, sponsored annually by IEEE and held on the Asilomar Conference Grounds in Pacific Grove, California. The 32nd conference in this series was held on November 1-4, 1998.
[Cava84] Cavanagh, J. J. F., Digital Computer Arithmetic: Design and Implementation, McGraw-Hill, 1984.
[CoMa] Computers & Mathematics with Applications, journal published by Pergamon Press.
[Flor63] Flores, I., The Logic of Computer Arithmetic, Prentice-Hall, 1963.
[Gosl80] Gosling, J. B., Design of Arithmetic Units for Digital Computers, Macmillan, 1980.
[Hwan79] Hwang, K., Computer Arithmetic: Principles, Architecture, and Design, Wiley, 1979.
[ICCD] International Conference on Computer Design, sponsored annually by the IEEE Computer Society. ICCD-98 was held on October 4-7, 1998, in Austin, Texas.
[ICCS] International Conference on Circuits and Systems, sponsored annually by the IEEE Circuits and Systems Society. The latest in this series was held on May 31-June 3, 1998, in Monterey, California.
[JVSP] J. VLSI Signal Processing, journal published by Kluwer Academic Publishers.
[Knut97] Knuth, D. E., The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed., Addison-Wesley, 1997. (The widely used second edition, published in 1981, is cited in Parts V and VI.)
[Kore93] Koren, I., Computer Arithmetic Algorithms, Prentice-Hall, 1993.
[Kuli81] Kulisch, U. W., and W. L. Miranker, Computer Arithmetic in Theory and Practice, Academic Press, 1981.
[MSCS] Midwest Symposium on Circuits and Systems, sponsored annually by the IEEE Circuits and Systems Society. This series of symposia began in 1955, with the 41st in the series held on August 9-12, 1998, in Notre Dame, Indiana.
[Omon94] Omondi, A. R., Computer Arithmetic Systems: Algorithms, Architecture and Implementations, Prentice-Hall, 1994.
[PrCD] IEE Proceedings: Computers and Digital Techniques, journal published by the Institution of Electrical Engineers, United Kingdom.
[Rich55] Richards, R. K., Arithmetic Operations in Digital Computers, Van Nostrand, 1955.
[Scot85] Scott, N. R., Computer Number Systems and Arithmetic, Prentice-Hall, 1985.
[Stei71] Stein, M. L., and W. D. Munro, Introduction to Machine Arithmetic, Addison-Wesley, 1971.
[Swar90] Swartzlander, E. E., Jr., Computer Arithmetic, Vols. I and II, IEEE Computer Society Press, 1990.
[TrCo] IEEE Trans. Computers, journal published by the IEEE Computer Society. Occasionally entire special issues or sections are devoted to computer arithmetic (e.g.: Vol. 19, No. 8, August 1970; Vol. 22, No. 6, June 1973; Vol. 26, No. 7, July 1977; Vol. 32, No. 4, April 1983; Vol. 39, No. 8, August 1990; Vol. 41, No. 8, August 1992; Vol. 43, No. 8, August 1994; Vol. 47, No. 7, July 1998).
[TrCS] IEEE Trans. Circuits and Systems II: Analog and Digital Signal Processing, journal published by IEEE.
[TrVL] IEEE Trans. Very Large Scale Integration (VLSI) Systems, journal published jointly by the IEEE Circuits and Systems Society, Computer Society, and Solid-State Circuits Council.
[Wase82] Waser, S., and M. J. Flynn, Introduction to Arithmetic for Digital Systems Designers, Holt, Rinehart, & Winston, 1982.
[Wino80] Winograd, S., Arithmetic Complexity of Computations, SIAM, 1980.
COMPUTER ARITHMETIC
PART I
NUMBER REPRESENTATION
Number representation is arguably the most important topic in computer arithmetic. In justifying this claim, it suffices to note that several important classes of number representations were discovered, or rescued from obscurity, by computer designers in their quest for simpler and faster circuits. Furthermore, the choice of number representation affects the implementation cost and delay of all arithmetic operations. We thus begin our study of computer arithmetic by reviewing conventional and exotic representation methods for integers. Conventional methods are of course used extensively. Some of the unconventional methods have been applied to special-purpose digital systems or in the intermediate steps of arithmetic hardware implementations where they are often invisible to computer users. This part consists of the following four chapters:
Chapter 1 Numbers and Arithmetic
Chapter 2 Representing Signed Numbers
Chapter 3 Redundant Number Systems
Chapter 4 Residue Number Systems
Chapter 1
NUMBERS AND ARITHMETIC
This chapter motivates the reader, sets the context in which the material in the rest of the book is presented, and reviews positional representations of fixed-point numbers. The chapter ends with a review of methods for number radix conversion and a preview of other number representation methods to be covered. Chapter topics include:
1.1 What Is Computer Arithmetic?
1.2 A Motivating Example
1.3 Numbers and Their Encodings
1.4 Fixed-Radix Positional Number Systems
1.5 Number Radix Conversion
1.6 Classes of Number Representations
1.1 WHAT IS COMPUTER ARITHMETIC?
A sequence of events, begun in late 1994 and extending into 1995, embarrassed the world's largest computer chip manufacturer and put the normally dry subject of computer arithmetic on the front pages of major newspapers. The events were rooted in the work of Thomas Nicely, a mathematician at Lynchburg College in Virginia, who is interested in twin primes (consecutive odd numbers such as 29 and 31 that are both prime). Nicely's work involves the distribution of twin primes and, particularly, the sum of their reciprocals S = 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/29 + 1/31 + ... + 1/p + 1/(p + 2) + .... While it is known that the infinite sum S has a finite value, no one knows what the value is.
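To get a feel for how slowly S grows, the series can be summed numerically. The sketch below is my own illustration, not part of the book; the function names are mine. It accumulates 1/p + 1/(p + 2) over twin primes up to a bound, starting with the pair (5, 7) as the series above does.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small bounds."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def twin_prime_reciprocal_sum(limit):
    """Partial sum of 1/p + 1/(p+2) over twin primes p, p+2
    with p >= 5 and p + 2 <= limit."""
    total = 0.0
    for p in range(5, limit - 1, 2):
        if is_prime(p) and is_prime(p + 2):
            total += 1.0 / p + 1.0 / (p + 2)
    return total
```

Summing the pairs (5, 7), (11, 13), (17, 19), and (29, 31) already gives about 0.689; the remaining terms, though infinite in number, add up to only a modest further amount.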
Nicely was using several different computers for his work and in March 1994 added a machine based on the Intel Pentium processor to his collection. Soon he began noticing inconsistencies in his calculations and was able to trace them back to the values computed for 1/p and 1/(p + 2) on the Pentium processor. At first, he suspected his own programs, the compiler, and the operating system, but by October, he became convinced that the Intel Pentium chip was at fault. This suspicion was confirmed by several other researchers following a barrage of e-mail exchanges and postings on the Internet.
The diagnosis finally came from Tim Coe, an engineer at Vitesse Semiconductor. Coe built a model of the Pentium's floating-point division hardware based on the radix-4 SRT algorithm and came up with an example that produces the worst-case error. Using double-precision floating-point computation, the ratio c = 4 195 835/3 145 727 = 1.333 820 44... is computed as 1.333 739 06 on the Pentium. This latter result is accurate to only 14 bits; the error is even larger than that of single-precision floating-point and more than 10 orders of magnitude worse than what is expected of double-precision computation [Mole95].
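The magnitude of this error is easy to verify with ordinary double-precision arithmetic. The following quick check is my own sketch, not part of the original account:

```python
from math import log2

correct = 4195835 / 3145727   # correctly rounded double-precision quotient
pentium = 1.33373906          # value produced by the flawed Pentium divider
error = abs(correct - pentium)

# Count roughly how many significant bits of the result are correct;
# double precision should deliver 53.
accurate_bits = int(-log2(error / correct))
```

Running this confirms an absolute error near 8 × 10^-5 and about 14 accurate bits, in agreement with the figures quoted above.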
The rest, as they say, is history. Intel at first dismissed the severity of the problem and admitted only a "subtle flaw," with a probability of 1 in 9 billion, or once in 27,000 years for the average spreadsheet user, of leading to computational errors. It nevertheless published a "white paper" that described the bug and its potential consequences and announced a replacement policy for the defective chips based on "customer need"; that is, customers had to show that they were doing a lot of mathematical calculations to get a free replacement. Under heavy criticism from customers, manufacturers using the Pentium chip in their products, and the on-line community, Intel later revised its policy to no-questions-asked replacement.
Whereas supercomputing, microchips, computer networks, advanced applications (particularly chess-playing programs), and many other aspects of computer technology have made the news regularly in recent years, the Intel Pentium bug was the first instance of arithmetic (or anything inside the CPU for that matter) becoming front-page news. While this can be interpreted as a sign of pedantic dryness, it is more likely an indicator of stunning technological success. Glaring software failures have come to be routine events in our information-based society, but hardware bugs are rare and newsworthy.
Having read the foregoing account, you may wonder what the radix-4 SRT division algorithm is and how it can lead to such problems. Well, that's the whole point of this introduction! You need computer arithmetic to understand the rest of the story. Computer arithmetic is a subfield of digital computer organization. It deals with the hardware realization of arithmetic functions to support various computer architectures as well as with arithmetic algorithms for firmware or software implementation. A major thrust of digital computer arithmetic is the design of hardware algorithms and circuits to enhance the speed of numeric operations. Thus much of what is presented here complements the architectural and algorithmic speedup techniques studied in the context of high-performance computer architecture and parallel processing.
A majority of our discussions relate to the design of top-of-the-line CPUs with high-performance parallel arithmetic circuits. However, we will at times also deal with slow bit-serial designs for embedded applications, where implementation cost and I/O pin limitations are of prime concern. It would be a mistake, though, to conclude that computer arithmetic is useful only to computer designers. We will see shortly that you can use scientific calculators more effectively and write programs that are more accurate and/or more efficient after a study of computer arithmetic. You will be able to render informed judgment when faced with the problem of choosing a digital signal processor (DSP) chip for your project. And, of course, you will know what exactly went wrong in the Pentium.
Figure 1.1 depicts the scope of computer arithmetic. On the hardware side, the focus is on implementing the four basic arithmetic operations (five, if you count square-rooting), as well as commonly used computations such as exponentials, logarithms, and trigonometric functions. For this, we need to develop algorithms, translate them to hardware structures, and choose from among multiple implementations based on cost-performance criteria. Since the exact computations to be carried out by the general-purpose hardware are not known a priori, benchmarking is used to predict the overall system performance for typical operation mixes and to make various design decisions.
On the software side, the primitive functions are given (e.g., in the form of a hardware chip such as the Pentium processor or a software tool such as Mathematica), and the task is
Hardware (our focus in this book): design of efficient digital circuits for primitive and other arithmetic operations such as +, -, ×, ÷, √, log, sin, and cos; fast primitive operations like +, -, ×, ÷, √; benchmarking.

Software: arithmetic tailored to application areas such as digital filtering, image processing, and radar tracking.

Fig. 1.1 The scope of computer arithmetic.
to synthesize cost-effective algorithms, with desirable error characteristics, to solve various problems of interest. These topics are covered in numerical analysis and computational science courses and textbooks and are thus mostly outside the scope of this book.
Within the hardware realm, we will be dealing with both general-purpose arithmetic/logic units (ALUs), of the type found in many commercially available processors, and special-purpose structures for solving specific application problems. The differences in the two areas are minor as far as the arithmetic algorithms are concerned. However, in view of the specific technological constraints, production volumes, and performance criteria, hardware implementations tend to be quite different. General-purpose processor chips that are mass-produced have highly optimized custom designs. Implementations of low-volume, special-purpose systems, on the other hand, typically rely on semicustom and off-the-shelf components. However, when critical and strict requirements, such as extreme speed, very low power consumption, and miniature size, preclude the use of semicustom or off-the-shelf components, the much higher cost of a custom design may be justified even for a special-purpose system.
1.2 A MOTIVATING EXAMPLE
Use a calculator that has the square-root, square, and exponentiation (x^y) functions to perform the following computations. I have given the numerical results obtained with my (10+2)-digit scientific calculator. You may obtain slightly different values.
First, compute "the 1024th root of 2" in the following two ways:
u = sqrt(sqrt(... sqrt(2) ...)) [square root taken 10 times] = 1.000 677 131
v = 2^(1/1024) = 1.000 677 131
Save both u and v in memory, if possible. If you can't store u and v, simply recompute them when needed. Now, perform the following two equivalent computations based on u:
x = (((u^2)^2)...)^2 [squared 10 times] and x' = u^1024
and, similarly, the two equivalent computations y and y' based on v. The four different values obtained for x, x', y, and y', in lieu of 2, hint that perhaps v and u are not really the same value. Let's compute their difference:
w = v - u = 1 × 10^-11
Why isn't w equal to zero? The reason is that even though u and v are displayed identically, they in fact have different internal representations. Most calculators have hidden or guard digits (mine has two) to provide a higher degree of accuracy and to reduce the effect of accumulated errors when long computation sequences are performed.
Let's see if we can determine the hidden digits for the u and v values above. Here is one way:
(u - 1) × 1000 = 0.677 130 680   [hidden digits: ... 68]
(v - 1) × 1000 = 0.677 130 690   [hidden digits: ... 69]
This explains why w is not zero, which in turn tells us why u^1024 ≠ v^1024. The following simple analysis might be helpful in this regard.
v^1024 = (u + 10^-11)^1024 ≈ u^1024 + 1024 × 10^-11 × u^1023 ≈ u^1024 + 2 × 10^-8
The difference between v^1024 and u^1024 is in good agreement with the result of the preceding analysis. The difference between (((u^2)^2)...)^2 and u^1024 exists because the former is computed through repeated multiplications while the latter uses the built-in exponentiation routine of the calculator, which is likely to be less precise.
Despite the discrepancies, the results of the foregoing computations are remarkably precise.
The values of u and v agree to 11 decimal digits, while those of x, x', y, and y' are identical to eight digits. This is better than single-precision, floating-point arithmetic on the most elaborate and expensive computers. Do we have a right to expect more from a calculator that costs $20 or less? Ease of use is, of course, a different matter from speed or precision. For a detailed exposition of some deficiencies in current calculators, and a refreshingly new design approach, see [Thim95].
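If you have no such calculator at hand, the experiment can be mimicked in software by limiting the working precision. The sketch below is my own construction, not from the book; it uses Python's decimal module with 12 significant digits to play the role of a 10-digit calculator with two guard digits.

```python
from decimal import Decimal, getcontext

getcontext().prec = 12        # 10 display digits + 2 guard digits

u = Decimal(2)
for _ in range(10):           # u = square root taken 10 times: 2^(1/1024)
    u = u.sqrt()

v = Decimal(2) ** (Decimal(1) / Decimal(1024))   # direct exponentiation

x = u
for _ in range(10):           # x = (((u^2)^2)...)^2, squared 10 times
    x = x * x

# Because every intermediate result is rounded to 12 digits,
# x only approximates the exact value 2.
error = abs(x - 2)
```

As on the calculator, u and v print identically to the displayed digits, yet the rounded intermediate results keep x from returning exactly to 2.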
1.3 NUMBERS AND THEIR ENCODINGS
Number representation methods have advanced in parallel with the evolution of language. The oldest method for representing numbers consisted of the use of stones or sticks. Gradually, as
larger numbers were needed, it became difficult to represent them or develop a feeling for their magnitudes. More importantly, comparing large numbers was quite cumbersome. Grouping the stones or sticks (e.g., representing the number 27 by 5 groups of 5 sticks plus 2 single sticks) was only a temporary cure. It was the use of different stones or sticks for representing groups of 5, 10, etc. that produced the first major breakthrough.
The latter method gradually evolved into a symbolic form whereby special symbols were used to denote larger units. A familiar example is the Roman numeral system. The units of this system are 1, 5, 10, 50, 100, 500, 1000, 10 000, and 100 000, denoted by the symbols I, V, X, L, C, D, M, ((I)), and (((I))), respectively. A number is represented by a string of these symbols, arranged in descending order of values from left to right. To shorten some of the cumbersome representations, allowance is made to count a symbol as representing a negative value if it is to the left of a larger symbol. For example, IX is used instead of VIIII to denote the number 9, and LD is used instead of CCCCL to represent the number 450.
Clearly, the Roman numeral system is not suitable for representing very large numbers.
Furthermore, it is difficult to do arithmetic on numbers represented with this notation. The positional system of number representation was first used by the Chinese. In this method, the value represented by each symbol depends not only on its shape but also on its position relative to other symbols. Our conventional method of representing numbers is based on a positional system.
For example, in the number 222, each of the "2" digits represents a different value. The leftmost 2 represents 200. The middle 2 represents 20. Finally, the rightmost 2 is worth 2 units. The representation of time intervals in terms of days, hours, minutes, and seconds (i.e., as four-element vectors) is another example of the positional system. For instance, in the vector T = 5 5 5 5, the leftmost element denotes 5 days, the second from the left represents 5 hours, the third element stands for 5 minutes, and the rightmost element denotes 5 seconds.
If, in a positional number system, the unit corresponding to each position is a constant multiple of the unit for its right neighboring position, the conventional fixed-radix positional system is obtained. The decimal number system we use daily is a positional number system with 10 as its constant radix. The representation of time intervals, as just discussed, provides an example of a mixed-radix positional system for which the radix is the vector R = · 24 60 60, where the radix element for the leftmost (days) position is immaterial.
The method used to represent numbers affects not just the ease of reading and understanding numbers but also the complexity of arithmetic algorithms used for computing with numbers. The popularity of positional number systems is in part due to the availability of simple and elegant algorithms for performing arithmetic on such numbers. We will see in subsequent chapters that other representations provide advantages over the positional representation in terms of certain arithmetic operations or the needs of particular application areas. However, these systems are of limited use precisely because they do not support universally simple arithmetic.
In digital systems, numbers are encoded by means of binary digits or bits. Suppose you have 4 bits to represent numbers. There are 16 possible codes. You are free to assign the 16 codes to numbers as you please. However, since number representation has significant effects on algorithm and circuit complexity, only some of the wide range of possibilities have found applications.
To simplify arithmetic operations, including the required checking for singularities or special cases, the assignment of codes to numbers must be done in a logical and systematic manner. For example, if you assign codes to 2 and 3 but not to 5, then adding 2 and 3 will cause an "overflow" (yields an unrepresentable value) in your number system.
Figure 1.2 shows some examples of assignments of 4-bit codes to numbers. The first choice is to interpret the 4-bit patterns as 4-bit binary numbers, leading to the representation of natural numbers in the range [0, 15]. The signed-magnitude scheme results in integers in the range [−7, 7] being represented, with 0 having two representations (viz., ±0). The 3-plus-1 fixed-point number system (3 whole bits, 1 fractional bit) gives us numbers from 0 to 7.5 in increments of 0.5. Viewing the 4-bit codes as signed fractions gives us a range of [−0.875, +0.875] or [−1, +0.875], depending on whether we use signed-magnitude or 2's-complement representation.
The 2-plus-2 unsigned floating-point number system in Fig. 1.2, with its 2-bit exponent e in the range [−2, 1] and 2-bit integer significand s in [0, 3], can represent certain values s × 2^e in [0, 6]. In this system, 0.00 has four representations, 0.50, 1.00, and 2.00 have two representations each, and 0.25, 0.75, 1.50, 3.00, 4.00, and 6.00 are uniquely represented. The 2-plus-2 logarithmic number system, which represents a number by approximating its 2-plus-2, fixed-point, base-2 logarithm, completes the choices shown in Fig. 1.2.
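The interpretations in Fig. 1.2 can be tabulated programmatically. The sketch below is ours; in particular, the offset of −2 used to obtain the 2-bit exponent is an assumption consistent with the stated range [−2, 1], not a detail given in the text:

```python
# Interpret each 4-bit code under several of the schemes in Fig. 1.2.
fp_values = []
for code in range(16):
    unsigned = code                                   # natural number in [0, 15]
    magnitude = code & 7                              # low 3 bits
    sign_mag = -magnitude if code & 8 else magnitude  # integers in [-7, 7]
    fixed_3p1 = code / 2                              # 3 whole bits, 1 fractional bit
    e = (code >> 2) - 2                               # 2-bit exponent in [-2, 1] (offset assumed)
    s = code & 3                                      # 2-bit significand in [0, 3]
    fp_values.append(s * 2.0 ** e)                    # floating-point value s * 2^e

distinct = sorted(set(fp_values))
# ten distinct values: 0, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6
```

Counting the duplicates reproduces the redundancy pattern stated in the text: four codes for 0, two each for 0.5, 1, and 2.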
1.4 FIXED-RADIX POSITIONAL NUMBER SYSTEMS
A conventional fixed-radix, fixed-point positional number system is usually based on a positive integer radix (base) r and an implicit digit set {0, 1, ..., r − 1}. Each unsigned number is represented by a digit vector of length k + l, with k digits for the whole part and l digits for the fractional part. By convention, the digit vector x_{k−1} x_{k−2} ... x_1 x_0 . x_{−1} x_{−2} ... x_{−l} represents the value:
(x_{k−1} x_{k−2} ... x_1 x_0 . x_{−1} x_{−2} ... x_{−l})_r = Σ_{i=−l}^{k−1} x_i r^i
One can easily generalize to arbitrary radices (not necessarily integer, positive, or constant) and digit sets of arbitrary size or composition. In what follows, we restrict our attention to digit sets composed of consecutive integers, since digit sets of other types complicate arithmetic and have no redeeming property. Thus, we denote our digit set by {−α, −α + 1, ..., β − 1, β} = [−α, β].
The following examples demonstrate the wide range of possibilities in selecting the radix and digit set.
Fig. 1.2 Some of the possible ways of assigning 16 distinct codes to represent numbers.
• Example 1.1 Balanced ternary number system: r = 3, digit set = [-1, 1].
• Example 1.2 Negative-radix number systems: radix -r, digit set = [0, r - 1].
(... x_5 x_4 x_3 x_2 x_1 x_0 . x_{−1} x_{−2} x_{−3} ...)_{−r}
= (... 0 x_4 0 x_2 0 x_0 . 0 x_{−2} 0 x_{−4} 0 x_{−6} ...)_r − (... x_5 0 x_3 0 x_1 0 . x_{−1} 0 x_{−3} 0 x_{−5} 0 ...)_r

where the first term collects the even-indexed digits and the second term collects the odd-indexed digits. The special case with r = −2 and digit set [0, 1] is known as the negabinary number system.
• Example 1.3 Nonredundant signed-digit number systems: digit set [−α, r − 1 − α] for radix r. As an example, one can use the digit set [−4, 5] for r = 10. We denote a negative digit by preceding it with a minus sign, as usual, or by using a hyphen as a left superscript when the minus sign could be mistaken for subtraction. For example,
(3 −1 5)ten represents the decimal number 295 = 300 − 10 + 5
(−3 1 5)ten represents the decimal number −285 = −300 + 10 + 5
• Example 1.4 Redundant signed-digit number systems: digit set [−α, β] with α + β ≥ r for radix r. One can use the digit set [−7, 7], say, for r = 10. In such redundant number systems, certain values may have multiple representations. For example, here are some representations for the decimal number 295:
(3 −1 5)ten = (3 0 −5)ten = (1 −7 0 −5)ten
We will study redundant representations in detail in Chapter 3.
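The digit vectors in Examples 1.1–1.4 can all be evaluated by the same Horner-style accumulation; the following sketch (the function name is ours) works for any radix and any signed digit set:

```python
def vec_value(digits, r):
    """Evaluate a whole-number digit vector (most-significant digit first)
    in radix r; digits may be negative, and r may itself be negative."""
    v = 0
    for d in digits:
        v = v * r + d     # Horner's rule: shift by one position, add digit
    return v

# Example 1.3: nonredundant signed digits, r = 10, digit set [-4, 5]
p = vec_value([3, -1, 5], 10)        # 300 - 10 + 5 = 295
n = vec_value([-3, 1, 5], 10)       # -300 + 10 + 5 = -285

# Example 1.4: the redundant digit set [-7, 7] yields alternative
# representations of 295, such as (3 0 -5) and (1 -7 0 -5)
alt = vec_value([1, -7, 0, -5], 10)
```

The same routine handles Example 1.2's negabinary system, e.g. (1101)−two = −8 + 4 + 1 = −3.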
• Example 1.5 Fractional radix number systems: r = 0.1 with digit set [0, 9].
• Example 1.6 Irrational radix number systems: r = √2 with digit set [0, 1].

(... x_5 x_4 x_3 x_2 x_1 x_0 . x_{−1} x_{−2} x_{−3} x_{−4} x_{−5} x_{−6} ...)_{√2} = Σ x_i (√2)^i
These examples illustrate the generality of our definition by introducing negative, fractional, and irrational radices and by using both nonredundant or minimal (r different digit values) and redundant (more than r digit values) digit sets in the common case of positive integer radices. We can go even further and make the radix an imaginary or complex number.
• Example 1.7 Complex-radix number systems: the quater-imaginary number system uses r = 2j, where j = √−1, and the digit set [0, 3].
It is easy to see that any complex number can be represented in the quater-imaginary number system of Example 1.7, with the advantage that ordinary addition (with a slightly modified carry rule) and multiplication can be used for complex-number computations. The modified carry rule is that a carry of -1 (a borrow actually) goes two positions to the left when the position sum, or digit total in a given position, exceeds 3.
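Evaluation in the quater-imaginary system follows the same Horner pattern, using Python's built-in complex arithmetic; the sketch and its sample digit vectors are ours, chosen for illustration:

```python
def qi_value(digits):
    """Evaluate whole-part digits (most-significant first) in radix 2j."""
    v = 0 + 0j
    for d in digits:
        v = v * 2j + d    # shift one position in radix 2j, then add digit
    return v

# (1 0 3) in radix 2j: 1*(2j)^2 + 0*(2j) + 3 = -4 + 3 = -1,
# so negative real numbers need no explicit sign information
minus_one = qi_value([1, 0, 3])

# (1 1) in radix 2j: 2j + 1, a genuinely complex value
one_plus_2j = qi_value([1, 1])
```

Note that a single digit string encodes both the real and imaginary parts, which is what makes ordinary (carry-modified) addition usable for complex arithmetic.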
In radix r, with the standard digit set [0, r − 1], the number of digits needed to represent the natural numbers in [0, max] is:

k = ⌊log_r max⌋ + 1 = ⌈log_r(max + 1)⌉

Note that the number of different values represented is max + 1.
With fixed-point representation using k whole and l fractional digits, we have:

max = r^k − r^{−l} = r^k − ulp
We will find the term ulp, for unit in least (significant) position, quite useful in describing certain arithmetic concepts without distinguishing between integers and fixed-point representations that include fractional parts. For integers, ulp = 1.
Specification of time intervals in terms of weeks, days, hours, minutes, seconds, and milliseconds is an example of mixed-radix representation. Given the two-part radix vector ... r_3 r_2 r_1 r_0; r_{−1} r_{−2} ... defining the mixed radix, the two-part digit vector ... x_3 x_2 x_1 x_0; x_{−1} x_{−2} ... represents the value
... x_3 r_2 r_1 r_0 + x_2 r_1 r_0 + x_1 r_0 + x_0 + x_{−1}/r_{−1} + x_{−2}/(r_{−1} r_{−2}) + ...
In the time interval example, the mixed radix is ... 7, 24, 60, 60; 1000 ... and the digit vector 3, 2, 9, 22, 57; 492 (3 weeks, 2 days, 9 hours, 22 minutes, 57 seconds, and 492 milliseconds) represents
(3 × 7 × 24 × 60 × 60) + (2 × 24 × 60 × 60) + (9 × 60 × 60) + (22 × 60) + 57 + 492/1000 = 2 020 977.492 seconds
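The mixed-radix evaluation above follows Horner's rule; a small sketch (function name ours) for the whole part plus the millisecond fraction:

```python
def mixed_whole(digits, radices):
    """Evaluate mixed-radix whole-part digits (most-significant first).
    radices[i] links digits[i] to the position on its right, e.g.
    [7, 24, 60, 60] for weeks/days/hours/minutes/seconds."""
    v = digits[0]
    for d, r in zip(digits[1:], radices):
        v = v * r + d     # scale up by the local radix, then add the digit
    return v

# 3 weeks, 2 days, 9 hours, 22 minutes, 57 seconds, 492 milliseconds
seconds = mixed_whole([3, 2, 9, 22, 57], [7, 24, 60, 60]) + 492 / 1000
```

This reproduces the 2 020 977.492-second total computed term by term in the text.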
In Chapter 4, we will see that mixed-radix representation plays an important role in dealing with values represented in residue number systems.
1.5 NUMBER RADIX CONVERSION
Assuming that the unsigned value u has exact representations in radices r and R, we can write:

u = w.v
  = (x_{k−1} x_{k−2} ... x_1 x_0 . x_{−1} x_{−2} ... x_{−l})_r
  = (X_{K−1} X_{K−2} ... X_1 X_0 . X_{−1} X_{−2} ... X_{−L})_R
If an exact representation does not exist in one or both of the radices, the foregoing equalities will be approximate.
The radix conversion problem is defined as follows:

Given   r, the old radix,
        R, the new radix, and
        the x_i's (digits in the radix-r representation of u),
Find    the X_i's (digits in the radix-R representation of u).

In the rest of this section, we will describe two methods for radix conversion based on doing the arithmetic in the old radix r or in the new radix R. We will also present a shortcut method, involving very little computation, that is applicable when the old and new radices are powers of the same number (e.g., 8 and 16, which are both powers of 2).
Note that in converting u from radix r to radix R, where rand R are positive integers, we can convert the whole and fractional parts separately. This is because an integer (fraction) is an integer (fraction), independent of the number representation radix.
Doing the arithmetic in the old radix r
We use this method when radix-r arithmetic is more familiar or efficient. The method is useful, for example, when we do manual computations and the old radix is r = 10. The procedures for converting the whole and fractional parts, along with their justifications or proofs, are given below.
Converting the whole part w
Procedure: Repeatedly divide the integer w = (x_{k−1} x_{k−2} ... x_1 x_0)_r by the radix-r representation of R. The remainders are the X_i's, with X_0 generated first.
Justification: (X_{K−1} X_{K−2} ... X_1 X_0)_R − (X_0)_R is divisible by R. Therefore, X_0 is the remainder of dividing the integer w = (x_{k−1} x_{k−2} ... x_1 x_0)_r by the radix-r representation of R.
Example: (105)ten = (?)five. Repeatedly divide by 5:
Quotient    Remainder
105
 21         0
  4         1
  0         4
From the above, we conclude that (105)ten = (410)five.
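The repeated-division procedure can be sketched directly (function name ours); the machine's native integer arithmetic plays the role of radix-r arithmetic:

```python
def to_radix(w, R):
    """Convert a natural number w to its radix-R digit list
    (most-significant digit first) by repeated division."""
    digits = []
    while True:
        w, d = divmod(w, R)
        digits.append(d)          # remainders emerge X_0 first
        if w == 0:
            break
    return digits[::-1]           # reverse into conventional digit order

# the worked example: (105)ten = (410)five
```

Calling `to_radix(105, 5)` reproduces the table above, with the remainders 0, 1, 4 read off in reverse.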
Converting the fractional part v
Procedure: Repeatedly multiply the fraction v = (.x_{−1} x_{−2} ... x_{−l})_r by the radix-r representation of R. In each step, remove the whole part before multiplying again. The whole parts obtained are the X_i's, with X_{−1} generated first.
Example: (105.486)ten = (410.?)five. Repeatedly multiply by 5:
Whole part    Fraction
              .486
2             .430
2             .150
0             .750
3             .750
3             .750
From the above, we conclude that (105.486)ten ≈ (410.22033)five.
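The repeated-multiplication procedure for fractions can be sketched the same way (function name ours); an exact Fraction input sidesteps binary floating-point noise:

```python
from fractions import Fraction

def frac_to_radix(v, R, ndigits):
    """Convert a fraction 0 <= v < 1 to ndigits radix-R digits by repeated
    multiplication; the result is in general only an approximation."""
    digits = []
    for _ in range(ndigits):
        v *= R
        whole = int(v)            # the whole part is the next digit,
        digits.append(whole)      # generated X_-1 first
        v -= whole                # remove it before multiplying again
    return digits

# (.486)ten ~ (.22033)five, matching the table above
digits = frac_to_radix(Fraction(486, 1000), 5, 5)
```

Unlike the whole-part conversion, this loop need not terminate with an exact result; it is simply cut off after the desired number of digits.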
Doing the arithmetic in the new radix R
We use this method when radix-R arithmetic is more familiar or efficient. The method is useful, for example, when we manually convert numbers to radix 10. Again, the whole and fractional parts are converted separately.
Converting the whole part w
Procedure: Use repeated multiplications by r followed by additions according to the formula ((...((x_{k−1} r + x_{k−2}) r + x_{k−3}) r + ···) r + x_1) r + x_0.
Justification: The given formula is the well-known Horner's method (or rule), first presented in the early nineteenth century, for the evaluation of the (k − 1)th-degree polynomial x_{k−1} r^{k−1} + x_{k−2} r^{k−2} + ··· + x_1 r + x_0 [Knut97].
Example: (410)five = (?)ten

((4 × 5) + 1) × 5 + 0 = 105  ⇒  (410)five = (105)ten
Converting the fractional part v
Procedure: Convert the integer r^l × (0.v) and then divide by r^l in the new radix.
From the above, we conclude that (410.22033)five = (105.48576)ten.
Note: Horner's method works here as well but is generally less practical. The digits of the fractional part are processed from right to left and the multiplication operation is replaced with division. Figure 1.3 shows how Horner's method can be applied to the preceding example.
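Both halves of the arithmetic-in-the-new-radix method can be sketched together (function name ours); the Fraction type stands in for exact radix-10 arithmetic, so the scaled-integer trick for the fractional part introduces no rounding:

```python
from fractions import Fraction

def to_decimal(whole, frac, r):
    """Convert whole and fractional digit lists (most-significant first)
    from radix r, doing all the arithmetic in the new radix."""
    w = 0
    for d in whole:               # Horner's rule for the whole part
        w = w * r + d
    f = 0
    for d in frac:                # form the integer r^l * (0.v) ...
        f = f * r + d
    return w + Fraction(f, r ** len(frac))   # ... then divide by r^l

x = to_decimal([4, 1, 0], [2, 2, 0, 3, 3], 5)   # (410.22033)five
```

Here the fraction converts exactly: (.22033)five = 1518/3125 = .48576, giving 105.48576 as in the text.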
Shortcut method for r = b^g and R = b^G

In the special case when the old and new radices are integral powers of a common base b, that is, r = b^g and R = b^G, one can convert from radix r to radix b and then from radix b to radix R. Both these conversions are quite simple and require virtually no computation.
Fig. 1.3 Horner's rule used to convert (.22033)five to decimal.
To convert from the old radix r = b^g to radix b, simply convert each radix-r digit individually into a g-digit radix-b number and then juxtapose the resulting g-digit numbers.

To convert from radix b to the new radix R = b^G, form G-digit groups of the radix-b digits starting from the radix point (to the left and to the right). Then convert the G-digit radix-b number of each group into a single radix-R digit and juxtapose the resulting digits.
• Example 1.8 (2 301.302)four = (?)eight

We have 4 = 2^2 and 8 = 2^3. Thus, conversion through the intermediate radix 2 is used. Each radix-4 digit is independently replaced by a 2-bit radix-2 number. This is followed by 3-bit groupings of the resulting binary digits to find the radix-8 digits:

(2 301.302)four = (10 11 00 01 . 11 00 10)two = (10 110 001 . 110 010)two = (261.62)eight
Clearly, when g = 1 (G = 1), the first (second) step of the shortcut conversion procedure is eliminated. This corresponds to the special case of R = r^G (r = R^g). For example, conversions between radix 2 and radix 8 or 16 belong to these special cases.
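The two-step shortcut can be sketched for whole parts (function name and interface are ours; fractional parts would group to the right of the radix point in the same way):

```python
def shortcut(digits, g, G, b=2):
    """Radix-(b^g) to radix-(b^G) conversion of a whole number through
    the common base b, per the two-step shortcut procedure."""
    base_b = []
    for d in digits:                       # step 1: g base-b digits per digit
        grp = []
        for _ in range(g):
            d, rem = divmod(d, b)
            grp.append(rem)
        base_b += grp[::-1]
    base_b = [0] * (-len(base_b) % G) + base_b   # left-pad to full G-digit groups
    out = []
    for i in range(0, len(base_b), G):     # step 2: each group -> one radix-b^G digit
        v = 0
        for dig in base_b[i:i + G]:
            v = v * b + dig
        out.append(v)
    while len(out) > 1 and out[0] == 0:    # drop leading zero digits
        out.pop(0)
    return out

# whole part of Example 1.8: (2301)four = (261)eight
```

Running the conversion in both directions, 4 → 8 and 8 → 4, returns the original digit vector, confirming that no information is lost.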
1.6 CLASSES OF NUMBER REPRESENTATIONS
In Sections 1.4 and 1.5, we considered the representation of unsigned fixed-point numbers using fixed-radix number systems, with standard and nonstandard digit sets, as well as methods for converting between such representations with standard digit sets. In digital computations, we also deal with signed fixed-point numbers as well as signed and unsigned real values. Additionally, we may use unconventional representations for the purpose of speeding up arithmetic operations or increasing their accuracy. Understanding different ways of representing numbers, including their relative cost-performance benefits and conversions between various representations, is an important prerequisite for designing efficient arithmetic algorithms or circuits.
In the next three chapters, we will review techniques for representing fixed-point numbers, beginning with conventional methods and then moving on to some unconventional representations.
Signed fixed-point numbers, including various ways of representing and handling the sign information, are covered in Chapter 2. Signed-magnitude, biased, and complement representations (including both 1's- and 2's-complement) are covered in some detail.
The signed-digit number systems of Chapter 3 can also be viewed as methods for representing signed numbers, although their primary significance lies in the redundancy that allows addition without carry propagation. The material in Chapter 3 is essential for understanding several speedup methods in multiplication, division, and function evaluation.
Chapter 4 introduces residue number systems (for representing unsigned or signed integers) that allow some arithmetic operations to be performed in a truly parallel fashion at very high speed. Unfortunately, the difficulty of division and certain other arithmetic operations renders these number systems unsuitable for general applications. In Chapter 4, we also use residue representations to explore the limits of fast arithmetic.
Representation of real numbers can take different forms. Examples include slash number systems (for representing rational numbers), logarithmic number systems (for representing real values), and of course, floating-point numbers that constitute the primary noninteger data format in modern digital systems. These representations are discussed in Chapter 17 (introductory chapter of Part V), immediately before we deal with algorithms, hardware implementations, and error analyses for real-number arithmetic.
By combining features from two or more of the aforementioned "pure" representations, we can obtain many hybrid schemes. Examples include hybrid binary/signed-digit (see Section 3.4), hybrid residue/binary (see Section 4.5), hybrid logarithmic/signed-digit (see Section 17.6), and hybrid floating-point/logarithmic (see Problem 17.16) representations.
PROBLEMS
1.1 Arithmetic algorithms Consider the integral I_n = ∫₀¹ x^n e^{−x} dx, which has the exact solution n![1 − (1/e) Σ_{i=0}^{n} 1/i!]. The integral can also be computed based on the recurrence equation I_n = n I_{n−1} − 1/e, with I_0 = 1 − 1/e.
a. Prove that the recurrence equation is correct.
b. Use a calculator or write a program to compute the values of I_j for 1 ≤ j ≤ 20.
c. Repeat part b with a different calculator or with a different precision in your program.
d. Compare your results to the exact value I_20 = 0.018 350 468 and explain any difference.
1.2 Arithmetic algorithms Consider the sequence {u_i} defined by the recurrence u_{i+1} = i u_i − i, with u_1 = e.

a. Use a calculator or write a program to determine the values of u_i for 1 ≤ i ≤ 25.
b. Repeat part a with a different calculator or with a different precision in your program.
c. Explain the results.
1.3 Arithmetic algorithms Consider the sequence {a_i} defined by the recurrence a_{i+2} = 111 − 1130/a_{i+1} + 3000/(a_{i+1} a_i), with a_0 = 11/2 and a_1 = 61/11. The exact limit of this sequence is 6; but on any real machine, a different limit is obtained. Use a calculator or write a program to determine the values of a_i for 2 ≤ i ≤ 25. What limit do you seem to be getting? Explain the outcome.
1.4 Positional representation of the integers
a. Prove that an unsigned binary integer x is a power of 2 if and only if the bitwise logical AND of x and x − 1 is 0.
b. Prove that an unsigned radix-3 integer x = (x_{k−1} x_{k−2} ... x_1 x_0)three is even if and only if Σ_{i=0}^{k−1} x_i is even.
c. Prove that an unsigned binary integer x = (x_{k−1} x_{k−2} ... x_1 x_0)two is divisible by 3 if and only if Σ_{even i} x_i − Σ_{odd i} x_i is a multiple of 3.
d. Generalize the statements of parts b and c to obtain rules for divisibility of radix-r integers by r − 1 and r + 1.
1.5 Unconventional radices
a. Convert the negabinary number (0001 1111 0010 1101)−two to radix 16 (hexadecimal).
b. Repeat part a for radix -16 (negahexadecimal).
c. Derive a procedure for converting numbers from radix r to radix -r and vice versa.
1.6 Unconventional radices Consider the number x whose representation in radix −r (with r a positive integer) is the (2k + 1)-element all-1s vector.
a. Find the value of x in terms of k and r.
b. Represent -x in radix -r (negation or sign change).
c. Represent x in the positive radix r.
d. Represent -x in the positive radix r.
1.7 Unconventional radices Let θ be a number in the negative radix −r whose digits are all r − 1. Show that −θ is represented by a vector of all 2s, except for its most- and least-significant digits, which are 1s.
1.8 Unconventional radices Consider a fixed-radix positional number system with the digit set [−2, 2] and the imaginary radix r = 2j (j = √−1).
a. Describe a simple procedure to determine whether a number thus represented is real.
b. Show that all integers are representable and that some integers have multiple representations.
c. Can this system represent any complex number with integral real and imaginary parts?
d. Describe simple procedures for finding the representations of a - bj and 4(a + bj), given the representation of a + bj.
1.9 Unconventional radices Consider the radix r = −1 + j (j = √−1) with the digit set [0, 1].
a. Express the complex number -49 + j in this number system.
b. Devise a procedure for determining whether a given bit string represents a real number.
c. Show that any natural number is representable with this number system.
1.10 Number radix conversion
a. Convert the following octal (radix-8) numbers to hexadecimal (radix-16) notation: 12, 5655, 2 550 276, 76 545 336, 3 726 755
b. Represent (48A.C2)sixteen and (192.837)ten in radices 2, 8, 10, 12, and 16.
c. Outline procedures for converting an unsigned radix-r number, using the standard digit set [0, r − 1], into radices 1/r, √r, and jr (j = √−1), using the same digit set.
1.11 Number radix conversion Consider a fixed-point, radix-4 number system in which a number x is represented with k whole and l fractional digits.
a. Assuming the use of the standard radix-4 digit set [0, 3] and radix-8 digit set [0, 7], determine K and L, the numbers of whole and fractional digits in the radix-8 representation of x, as functions of k and l.
b. Repeat part a for the more general case in which the radix-4 and radix-8 digit sets are [−α, β] and [−2α, 2β], respectively, with α ≥ 0 and β ≥ 0.
1.12 Number radix conversion Dr. N. E. Patent, a frequent contributor to scientific journals, claims to have invented a simple logic circuit for conversion of numbers from radix 2 to radix 10. The novelty of this circuit is that it can convert arbitrarily long numbers. The binary number is input one bit at a time. The decimal output will emerge one digit at a time after a fixed initial delay that is independent of the length of the input number. Evaluate this claim using only the information given.
1.13 Fixed-point number representation Consider a fixed-point, radix-3 number system, using the digit set [−1, 1], in which numbers are represented with k integer digits and l fractional digits as: d_{k−1} d_{k−2} ... d_1 d_0 . d_{−1} d_{−2} ... d_{−l}
a. Determine the range of numbers represented as a function of k and l.
b. Given that each radix-3 digit needs a 2-bit encoding, compute the representation efficiency of this number system relative to the binary representation.
c. Outline a carry-free procedure for converting one of the above radix-3 numbers to an equivalent radix-3 number using the redundant digit set [0, 3]. By a carry-free procedure, we mean a procedure that determines each digit of the new representation locally from a few neighboring digits of the original representation, so that the speed of the circuit is independent of the length of the original number.
1.14 Number radix conversion Discuss the design of a hardware number radix converter that receives its radix-r input digit-serially and produces the radix-R output (R > r) in the same manner. Multiple conversions are to be performed continuously; that is, once the last digit of one number has been input, the presentation of the second number can begin with no time gap [Parh92].
1.15 Decimal-to-binary conversion Consider a 2k-bit register, the upper half of which holds a decimal number, with each digit encoded as a 4-bit binary number (binary-coded decimal or BCD). Show that repeating the following steps k times will yield the binary equivalent of the decimal number in the lower half of the 2k-bit register: Shift the 2k-bit register one bit to the right; independently subtract 3 units from each 4-bit segment of the upper half whose binary value equals or exceeds 8 (there are k/4 such 4-bit segments).
1.16 Design of comparators An h-bit comparator is a circuit with two h-bit unsigned binary inputs, x and y, and two binary outputs designating the conditions x < y and x > y. Sometimes a third output corresponding to x = y is also provided, but we do not need it for this problem.
a. Present the design of a 4-bit comparator.
b. Show how five 4-bit comparators can be cascaded to compare two 16-bit numbers.
c. Show how a three-level tree of 4-bit comparators can be used to compare two 28-bit numbers. Try to use as few 4-bit comparator blocks as possible.
d. Generalize the result of part b to derive a synthesis method for large comparators built from a cascaded chain of smaller comparators.
e. Generalize the result of part c to derive a synthesis method for large comparators built from a tree of smaller comparators.
REFERENCES
[Knut97] Knuth, D. E., The Art of Computer Programming, 3rd ed., Vol. 2: Seminumerical Algorithms, Addison-Wesley, 1997.
[Mole95] Moler, C., "A Tale of Two Numbers," SIAM News, Vol. 28, No. 1, pp. 1, 16, 1995.

[Parh92] Parhami, B., "Systolic Number Radix Converters," Computer J., Vol. 35, No. 4, pp. 405-409, August 1992.
[Scot85] Scott, N. R., Computer Number Systems and Arithmetic, Prentice-Hall, 1985.
[Thim95] Thimbleby, H., "A New Calculator and Why It Is Necessary," Computer J., Vol. 38, No. 6, pp. 418-433, 1995.
Chapter 2
REPRESENTING SIGNED NUMBERS
This chapter deals with the representation of signed fixed-point numbers by providing an attached sign bit, adding a fixed bias to all numbers, complementing negative values, attaching signs to digit positions, or using signed digits. In view of its importance in the design of fast arithmetic algorithms and hardware, representing signed fixed-point numbers by means of signed digits is further explored in Chapter 3. Chapter topics include:
2.1 Signed-Magnitude Representation
2.2 Biased Representations
2.3 Complement Representations
2.4 Two's- and 1's-Complement Numbers
2.5 Direct and Indirect Signed Arithmetic
2.6 Using Signed Positions or Signed Digits
2.1 SIGNED-MAGNITUDE REPRESENTATION
The natural numbers 0, 1, 2, ..., max can be represented as fixed-point numbers without fractional parts (refer to Section 1.4). In radix r, the number k of digits needed for representing the natural numbers up to max is

k = ⌊log_r max⌋ + 1 = ⌈log_r(max + 1)⌉

Conversely, with k digits, one can represent the values 0 through r^k − 1, inclusive; that is, the interval [0, r^k − 1] = [0, r^k) of natural numbers.
Natural numbers are often referred to as "unsigned integers," which form a special data type in many programming languages and computer instruction sets. The advantage of using this data type as opposed to "integers" when the quantities of interest are known to be nonnegative is that a larger representation range can be obtained (e.g., maximum value of 255, rather than 127, with 8 bits).
One way to represent both positive and negative integers is to use "signed magnitudes," or the sign-and-magnitude format, in which one bit is devoted to sign. The common convention is
to let 1 denote a negative sign and 0 a positive sign. In the case of radix-2 numbers with a total length of k bits, k − 1 bits will be available to represent the magnitude or absolute value of the number. The range of k-bit signed-magnitude binary numbers is thus [−(2^{k−1} − 1), 2^{k−1} − 1]. Figure 2.1 depicts the assignment of values to bit patterns for a 4-bit signed-magnitude format.
Advantages of signed-magnitude representation include its intuitive appeal, conceptual simplicity, symmetric range, and simple negation (sign change) by flipping or inverting the sign bit. The primary disadvantage is that addition of numbers with unlike signs (subtraction) must be handled differently from that of same-sign operands.
The hardware implementation of an adder for signed-magnitude numbers either involves a magnitude comparator and a separate subtractor circuit or else is based on the use of complement representation (see Section 2.3) internally within the arithmetic/logic unit (ALU). In the latter approach, a negative operand is complemented at the ALU's input, the computation is done by means of complement representation, and the result is complemented, if necessary, to produce the signed-magnitude output. Because the pre- and postcomplementation steps add to the computation delay, it is better to use the complement representation throughout.
Besides the aforementioned extra delay in addition and subtraction, signed-magnitude representation allows two representations for 0, leading to the need for special care in number comparisons or added overhead for detecting -0 and changing it to +0. This drawback, however, is unavoidable in any radix-2 number representation system with symmetric range.
Figure 2.2 shows the hardware implementation of signed-magnitude addition using selective pre- and postcomplementation. The control circuit receives as inputs the operation to be performed (0 = add, 1 = subtract), the signs of the two operands x and y, the carry-out of the adder, and the sign of the addition result. It produces signals for the adder's carry-in, complementation of x, complementation of the addition result, and the sign of the result. Note that complementation hardware is provided only for the x operand. This is because x - y can be obtained by first computing y - x and then changing the sign of the result. You will understand this design much better after we have covered complement representations of negative numbers in Sections 2.3 and 2.4.
Fig. 2.1 A 4-bit signed-magnitude number representation system for integers.
Fig. 2.2 Adding signed-magnitude numbers using pre-complementation and postcomplementation.
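The scheme of Fig. 2.2 can be modeled behaviorally. The sketch below is our reconstruction (not the gate-level circuit; the word width and the function interface are assumptions): a negative operand is complemented on the way in, the sum is formed modulo 2^k, and a negative result is complemented on the way out.

```python
def sm_add_sub(sub, sx, mx, sy, my, k=8):
    """Add/subtract (sign, magnitude) operands via internal 2's-complement
    arithmetic; sub=1 requests x - y. Returns (sign, magnitude).
    Magnitudes are assumed small enough that no overflow occurs."""
    def to_compl(s, m):
        return (-m if s else m) % (1 << k)   # precomplement a negative input
    ey = sy ^ sub                            # subtraction flips y's sign
    t = (to_compl(sx, mx) + to_compl(ey, my)) % (1 << k)
    if t >= 1 << (k - 1):                    # negative sum: postcomplement
        return 1, (1 << k) - t
    return 0, t

# 5 + (-3) = +2;  5 - 7 = -2
```

The pre/postcomplementation steps are exactly the extra delay the text warns about when signed-magnitude operands are processed by a complement-based ALU.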
2.2 BIASED REPRESENTATIONS
One way to deal with signed numbers is to devise a representation or coding scheme that converts signed numbers into unsigned numbers. For example, the biased representation is based on adding a positive value bias to all numbers, allowing us to represent the integers from −bias to max − bias using the unsigned values 0 to max. Such a representation is sometimes referred to as "excess-bias" (e.g., excess-3 or excess-128) coding. We will see in Chapter 17 that biased representation is used to encode the exponent part of a floating-point number.
Figure 2.3 shows how signed integers in the range [−8, +7] can be encoded as unsigned values 0 through 15 by using a bias of 8. With k-bit representations and a bias of 2^{k−1}, the leftmost bit indicates the sign of the value represented (0 = negative, 1 = positive). Note that this is the opposite of the commonly used convention for number signs. With a bias of 2^{k−1} or 2^{k−1} − 1, the range of represented integers is almost symmetric.
Biased representation does not lend itself to simple arithmetic algorithms. Addition and subtraction become somewhat more complicated because one must subtract or add the bias from/to the result of a normal add/subtract operation, since:
x + y + bias = (x + bias) + (y + bias) - bias
x - y + bias = (x + bias) - (y + bias) + bias
With k-bit numbers and a bias of 2^(k-1), adding or subtracting the bias amounts to complementing the leftmost bit. Thus, the extra complexity in addition or subtraction is negligible.
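The leftmost-bit trick can be checked with a small sketch (function names are ours), for k = 4 and bias = 2^(k-1) = 8, with arithmetic taken modulo 2^k:

```python
K = 4
BIAS = 1 << (K - 1)   # 2^(k-1) = 8
MASK = (1 << K) - 1   # arithmetic is modulo 2^k

def biased_add(xb, yb):
    """Add two excess-8 codes: (x + bias) + (y + bias) - bias.
    The bias correction is done by flipping the leftmost bit
    of the ordinary mod-2^k sum."""
    return ((xb + yb) & MASK) ^ BIAS

# -3 (code 5) plus +5 (code 13) should give +2 (code 10)
print(biased_add(5, 13))  # 10
```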
Multiplication and division become significantly more difficult if these operations are to be performed directly on biased numbers. For this reason, the practical use of biased representation is limited to the exponent parts of floating-point numbers, which are never multiplied or divided.
Fig. 2.3 A 4-bit biased integer number representation system with a bias of 8.
2.3 COMPLEMENT REPRESENTATIONS
In a complement number representation system, a suitably large complementation constant M is selected and the negative value -x is represented as the unsigned value M - x. Figure 2.4 depicts the encodings used for positive and negative values and the arbitrary boundary between the two regions.
To represent integers in the range [-N, +P] unambiguously, the complementation constant M must satisfy M >= N + P + 1. This is justified by noting that to prevent overlap between the
Fig. 2.4 Complement representation of signed integers.
representations of positive and negative values in Figure 2.4, we must have M - N > P. The choice of M = N + P + 1 yields maximum coding efficiency, since no code will go to waste.
In a complement system with the complementation constant M and the number representation range [-N, +P], addition is done by adding the respective unsigned representations (modulo M). The addition process is thus always the same, independent of the number signs. This is easily understood if we note that in modulo-M arithmetic, adding M - 1 (say) is the same as subtracting 1. Table 2.1 shows the addition rules for complement representations, along with conditions that lead to overflow.
Subtraction can be performed by complementing the subtrahend and then performing addition. Thus, assuming that a selective complementer is available, addition and subtraction become essentially the same operation, and this is the primary advantage of complement representations.
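These rules can be exercised with a short sketch (the names are ours), using M = 16 and the integer range [-8, +7], so that N = 8 and P = 7:

```python
# Complement arithmetic with an arbitrary complementation constant M.
M, N, P = 16, 8, 7

def encode(x):
    """Signed value -> unsigned representation: -x is stored as M - x."""
    return x % M

def decode(u):
    """Unsigned representation -> signed value."""
    return u if u <= P else u - M

def add(u, v):
    """Addition is mod-M regardless of the operand signs."""
    return (u + v) % M

def subtract(u, v):
    """Subtraction = complement the subtrahend, then add."""
    return add(u, (M - v) % M)

print(decode(add(encode(-3), encode(5))))      # 2
print(decode(subtract(encode(2), encode(6))))  # -4
```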
Complement representation can be used for fixed-point numbers that have a fractional part.
The only difference is that consecutive values in the circular representation of Fig. 2.4 will be separated by ulp instead of by 1. As a decimal example, given the complementation constant M = 12.000 and a fixed-point number range of [-6.000, +5.999], the fixed-point number -3.258 has the complement representation 12.000 - 3.258 = 8.742.
We note that two auxiliary operations are required for complement representations to be effective: complementation or change of sign (computing M - x) and computation of residues mod M. If finding M - x requires subtraction and finding residues mod M implies division, then complement representation becomes quite inefficient. Thus M must be selected such that these two operations are simplified. Two choices allow just this for fixed-point radix-r arithmetic with k whole digits and l fractional digits:
Radix complement:                     M = r^k
Digit or diminished-radix complement: M = r^k - ulp
For radix-complement representations, modulo-M reduction is done by ignoring the carry-out from digit position k - 1 in a (k + l)-digit radix-r addition. For digit-complement representations, computing the complement of x (i.e., M - x) is done by simply replacing each digit x_i by r - 1 - x_i. This is particularly easy if r is a power of 2. Complementation with M = r^k and mod-M reduction with M = r^k - ulp are similarly simple. You should be able to supply the details for radix r after reading Section 2.4, which deals with the important special case of r = 2.
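The two auxiliary operations can be sketched at the digit level (function names are ours; digits are MSB-first lists of whole digits, so ulp = 1):

```python
def digit_complement(digits, r):
    """Diminished-radix complement (M = r^k - ulp): replace each digit d by r-1-d."""
    return [r - 1 - d for d in digits]

def radix_complement(digits, r):
    """Radix complement (M = r^k): digit-complement, then add ulp = 1.
    A carry out of the leftmost digit is dropped (mod-M reduction)."""
    result = digit_complement(digits, r)
    carry = 1
    for i in range(len(result) - 1, -1, -1):
        carry, result[i] = divmod(result[i] + carry, r)
    return result

print(digit_complement([3, 2, 5, 8], 10))  # [6, 7, 4, 1]  (9999 - 3258)
print(radix_complement([3, 2, 5, 8], 10))  # [6, 7, 4, 2]  (10000 - 3258)
```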
TABLE 2.1
Addition in a complement number system with the complementation constant M and range [-N, +P].

Desired       Computation to be   Correct result            Overflow
operation     performed mod M     with no overflow          condition
(+x) + (+y)   x + y               x + y                     x + y > P
(+x) + (-y)   x + (M - y)         x - y if y <= x;          N/A
                                  M - (y - x) if y > x
(-x) + (+y)   (M - x) + y         y - x if x <= y;          N/A
                                  M - (x - y) if x > y
(-x) + (-y)   (M - x) + (M - y)   M - (x + y)               x + y > N
2.4 TWO'S- AND 1'S-COMPLEMENT NUMBERS
In the special case of r = 2, the radix-complement representation that corresponds to M = 2^k is known as two's complement. Figure 2.5 shows the 4-bit, 2's-complement integer system (k = 4, l = 0, M = 2^4 = 16) and the meanings of the 16 representations allowed with 4 bits. The boundary between positive and negative values is drawn approximately in the middle to make the range roughly symmetric and to allow simple sign detection (the leftmost bit is the sign).
The 2's complement of a number x can be found via bitwise complementation of x and the addition of ulp:
2^k - x = [(2^k - ulp) - x] + ulp = x^compl + ulp
Note that the binary representation of 2^k - ulp consists of all 1s, making (2^k - ulp) - x equivalent to the bitwise complement of x, denoted x^compl. Whereas finding the bitwise complement of x is easy, adding ulp to the result is a slow process, since in the worst case it involves full carry propagation. We will see later how this addition of ulp can usually be avoided.
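In code, for k = 4 whole bits and ulp = 1 (a sketch with our own names):

```python
K = 4
MASK = (1 << K) - 1  # 2^k - ulp: the all-1s pattern

def twos_complement(x):
    """2^k - x, computed as bitwise complement plus ulp, taken mod 2^k."""
    return ((x ^ MASK) + 1) & MASK

print(format(twos_complement(0b0110), '04b'))  # 1010, i.e., -6
```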
To add numbers modulo 2^k, we simply drop a carry-out of 1 produced by position k - 1. Since this carry is worth 2^k units, dropping it is equivalent to reducing the magnitude of the result by 2^k.
The range of representable numbers in a 2's-complement number system with k whole bits is from -2^(k-1) to 2^(k-1) - ulp.
Because of this slightly asymmetric range, complementation can lead to overflow! Thus, if complementation is done as a separate sign change operation, it must include overflow detection.
Fig. 2.5 A 4-bit, 2's-complement number representation system for integers.
However, we will see later that complementation needed to convert subtraction into addition requires no special provision.
The name "2's complement" actually comes from the special case of k = 1 that leads to the complementation constant M = 2. In this case, represented numbers have one whole bit, which acts as the sign, and l fractional bits. Thus, fractional values in the range [-1, 1 - ulp] are represented in such a fractional 2's-complement number system.
The digit or diminished-radix complement representation is known as one's complement in the special case of r = 2. The complementation constant in this case is M = 2^k - ulp. For example, Fig. 2.6 shows the 4-bit, 1's-complement integer system (k = 4, l = 0, M = 2^4 - 1 = 15) and the meanings of the 16 representations allowed with 4 bits. The boundary between positive and negative values is again drawn approximately in the middle to make the range symmetric and to allow simple sign detection (the leftmost bit is the sign).
Note that compared to the 2's-complement representation of Fig. 2.5, the representation for -8 has been eliminated and instead an alternate code has been assigned to 0 (technically, -0). This may somewhat complicate 0 detection in that both the all-0s and the all-1s patterns represent 0. The arithmetic circuits can be designed such that the all-1s pattern is detected and automatically converted to the all-0s pattern. Keeping -0 intact does not cause problems in computations, however, since all computations are modulo 15. For example, adding +1 (0001) to -0 (1111) will yield the correct result of +1 (0001) when the addition is done modulo 15.
The 1's complement of a number x can be found by bitwise complementation:

(2^k - ulp) - x = x^compl
To add numbers modulo 2^k - ulp, we simply drop a carry-out of 1 produced by position k - 1 and simultaneously insert a carry-in of 1 into position -l. Since the dropped carry is worth 2^k units and the inserted carry is worth ulp, the combined effect is to reduce the magnitude of the result by 2^k - ulp. In terms of hardware, the carry-out of our (k + l)-bit adder should be directly connected to its carry-in; this is known as end-around carry.
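The end-around carry rule can be simulated in a few lines (names are ours), for k = 4 whole bits and ulp = 1:

```python
K = 4
M = (1 << K) - 1  # complementation constant 2^k - ulp = 15

def ones_complement_add(u, v):
    """Add two k-bit 1's-complement patterns with end-around carry:
    a carry-out of position k-1 is dropped and reinserted at the right."""
    s = u + v
    if s > M:
        s = (s & M) + 1
    return s

# +1 (0001) plus -0 (1111) gives +1 (0001), since arithmetic is modulo 15
print(format(ones_complement_add(0b0001, 0b1111), '04b'))  # 0001
```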
Fig. 2.6 A 4-bit, 1's-complement number representation system for integers.
The foregoing scheme properly handles any sum that equals or exceeds 2^k. When the sum is 2^k - ulp, however, the carry-out will be zero and modular reduction is not accomplished. As suggested earlier, such an all-1s result can be interpreted as an alternate representation of 0 that is either kept intact (making 0 detection more difficult) or is automatically converted by hardware to +0.
The range of representable numbers in a 1's-complement number system with k whole bits is from -(2^(k-1) - ulp) to 2^(k-1) - ulp.
This symmetric range is one of the advantages of 1 's-complement number representation.
Table 2.2 presents a brief comparison of radix- and digit-complement number representation systems for radix r. We might conclude from Table 2.2 that each of the two complement representation schemes has some advantages and disadvantages with respect to the other, making them equally desirable. However, since complementation is often performed for converting subtraction to addition, the addition of ulp required in the case of 2's-complement numbers can be accomplished by providing a carry-in of 1 into the least significant, or (-l)th, position of the adder. Figure 2.7 shows the required elements for a 2's-complement adder/subtractor. With the complementation disadvantage mitigated in this way, 2's-complement representation has become the favored choice in virtually all modern digital systems.
Interestingly, the arrangement shown in Fig. 2.7 also removes the disadvantage of asymmetric range. If the operand y is -2^(k-1), represented in 2's complement as 1 followed by all 0s, its complementation does not lead to overflow. This is because the 2's complement of y is essentially represented in two parts: y^compl, which represents 2^(k-1) - 1, and c_in, which represents 1.
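The adder/subtractor arrangement of Fig. 2.7 can be sketched as follows (our own names; k = 4, ulp = 1). The Sub/Add control both selects the complementation of y and serves as the carry-in:

```python
K = 4
MASK = (1 << K) - 1

def add_sub(x, y, sub):
    """For subtraction (sub = 1), y is selectively bitwise-complemented
    and the Sub/Add control doubles as the adder's carry-in, supplying
    the missing ulp of the 2's complement."""
    y_eff = y ^ (MASK if sub else 0)
    return (x + y_eff + sub) & MASK

print(format(add_sub(0b0010, 0b0110, 1), '04b'))  # 1100, i.e., 2 - 6 = -4
```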
Occasionally we need to extend the number of digits in an operand to make it the same length as another operand. For example, if a 16-bit number is to be added to a 32-bit number, the former is first converted to 32-bit format, with the two 32-bit numbers then added using a 32-bit adder. Unsigned or signed-magnitude fixed-point binary numbers can be extended from the left (whole part) or the right (fractional part) by simply padding them with 0s. This type of range or precision extension is only slightly more difficult for 2's- and 1's-complement numbers.
Given a 2's-complement number x_(k-1) x_(k-2) ... x_1 x_0 . x_(-1) x_(-2) ... x_(-l), extension can be achieved from the left by replicating the sign bit (sign extension) and from the right by padding with 0s.
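Sign extension of the whole part can be sketched directly on bit patterns (function name is ours):

```python
def sign_extend(x, k, k_new):
    """Widen a k-bit 2's-complement pattern to k_new bits by replicating
    the sign bit on the left (fractional extension would pad 0s on the right)."""
    sign = (x >> (k - 1)) & 1
    if sign:
        x |= ((1 << (k_new - k)) - 1) << k  # prepend 1s for negatives
    return x

print(format(sign_extend(0b1010, 4, 8), '08b'))  # 11111010, still -6
print(format(sign_extend(0b0101, 4, 8), '08b'))  # 00000101, still +5
```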
TABLE 2.2
Comparing radix- and digit-complement number representation systems

Feature/Property    Radix complement           Digit complement
Symmetry (P = N?)   Possible for odd r         Possible for even r
                    (radices of practical
                    interest are even)
Unique zero?        Yes                        No
Complementation     Complement all digits      Complement all digits
                    and add ulp
Mod-M addition      Drop the carry-out         End-around carry
Fig. 2.7 Adder/subtractor architecture for 2's-complement numbers (Sub/Add control: 0 for addition, 1 for subtraction).
To justify the foregoing rule, note that when the number of whole (fractional) digits is increased from k (l) to k' (l'), the complementation constant increases from M = 2^k to M' = 2^k'. Hence, the difference of the two complementation constants, M' - M = 2^k' - 2^k, must be added to the representation of any negative number. This difference is a binary integer consisting of k' - k 1s followed by k 0s; hence the need for sign extension.
A 1's-complement number must be sign-extended from both ends; that is, the sign bit is replicated on the left and also used to pad the number on the right.
Justifying this rule for 1's-complement numbers is left as an exercise.
2.5 DIRECT AND INDIRECT SIGNED ARITHMETIC
In the preceding pages, we dealt with the addition and subtraction of signed numbers for a variety of number representation schemes (signed-magnitude, biased, complement). In all these cases, signed numbers were handled directly by the addition/subtraction hardware (direct signed arithmetic), consistent with our desire to avoid using separate addition and subtraction units.
For some arithmetic operations, it may be desirable to restrict the hardware to unsigned operands, thus necessitating indirect signed arithmetic. Basically, the operands are converted to unsigned values, a tentative result is obtained based on these unsigned values, and finally the necessary adjustments are made to find the result corresponding to the original signed operands. Figure 2.8 depicts the direct and indirect approaches to signed arithmetic.
Indirect signed arithmetic can be performed, for example, for multiplication or division of signed numbers, although we will see in Parts III and IV that direct algorithms are also available for this purpose. The process is trivial for signed-magnitude numbers. If x and y are biased numbers, then both the sign removal and adjustment steps involve addition/subtraction. If x and y are complement numbers, these steps involve selective complementation.

Fig. 2.8 Direct versus indirect operation on signed numbers.
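The indirect scheme for complement numbers can be sketched for multiplication (the names and the 8-bit width are our assumptions, not the book's): signs are removed by selective complementation, the unsigned magnitudes are multiplied, and the sign of the double-width product is adjusted.

```python
K = 8
M = 1 << K

def indirect_multiply(x, y):
    """Indirect signed multiplication of k-bit 2's-complement values:
    remove the signs, multiply the unsigned magnitudes, then adjust
    the sign of the 2k-bit product."""
    sx, sy = x >> (K - 1), y >> (K - 1)   # sign removal ...
    mag_x = (M - x) % M if sx else x      # ... via selective complementation
    mag_y = (M - y) % M if sy else y
    p = mag_x * mag_y                     # unsigned core operation
    if sx != sy:                          # adjustment step
        p = (M * M - p) % (M * M)
    return p

# (-3) * 5 = -15, i.e., 2^16 - 15 = 65521 in 16-bit 2's complement
print(indirect_multiply(M - 3, 5))  # 65521
```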
This type of preprocessing for operands, and postprocessing for computation results, is useful not only for dealing with signed values but also in the case of unacceptable or inconvenient operand values. For example, in computing sin x, the operand can be brought to within [0, π/2] by taking advantage of identities such as sin(-x) = -sin x and sin(2π + x) = sin(π - x) = sin x. Chapter 22 contains examples of such transformations. As a second example, some division algorithms become more efficient when the divisor is in a certain range (e.g., close to 1). In this case, the dividend and divisor can be scaled by the same factor in a preprocessing step to bring the divisor within the desired range (see Section 15.3).
2.6 USING SIGNED POSITIONS OR SIGNED DIGITS
The value of a 2's-complement number can be found by using the standard binary-to-decimal conversion process, except that the weight of the most significant bit (sign position) is taken to be negative. Figure 2.9 shows an example 8-bit, 2's-complement number converted to decimal by considering its sign bit to have the negative weight -2^7.
x  = ( 1    0    1    0    0    1    1    0  )two's-compl
     -2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
     -128      + 32           + 4  + 2        = -90

Check:
-x = ( 0    1    0    1    1    0    1    0  )two
      2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
          + 64      + 16 + 8       + 2        = 90

Fig. 2.9 Interpreting a 2's-complement number as having a negatively weighted most significant digit.
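The evaluation rule of Fig. 2.9 can be written as a one-liner (function name is ours; bits are given MSB first):

```python
def twos_complement_value(bits):
    """Evaluate an MSB-first 2's-complement bit vector by giving the
    sign position the negative weight -2^(k-1)."""
    k = len(bits)
    return -bits[0] * 2**(k - 1) + sum(
        b * 2**(k - 2 - i) for i, b in enumerate(bits[1:]))

print(twos_complement_value([1, 0, 1, 0, 0, 1, 1, 0]))  # -90
```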
This very important property of 2's-complement systems is used to advantage in many algorithms that deal directly with signed numbers. The property is formally expressed as follows:
x = (x_(k-1) x_(k-2) ... x_1 x_0 . x_(-1) x_(-2) ... x_(-l))two's-compl
  = -x_(k-1) 2^(k-1) + Σ_(i=-l..k-2) x_i 2^i
The proof is quite simple if we consider the two cases of x_(k-1) = 0 and x_(k-1) = 1 separately.
Developing the corresponding interpretation for 1 's-complement numbers is left as an exercise.
A simple generalization of the notion above immediately suggests itself [Kore81]. Let us assign negative weights to an arbitrary subset of the k + l positions in a radix-r number and positive weights to the rest of the positions. A vector

Λ = (λ_(k-1) λ_(k-2) ... λ_1 λ_0 . λ_(-1) λ_(-2) ... λ_(-l))

with elements λ_i in {-1, 1}, can be used to specify the signs associated with the various positions. With these conventions, the value represented by the digit vector X of length k + l is:

(x_(k-1) x_(k-2) ... x_1 x_0 . x_(-1) x_(-2) ... x_(-l))_(r,Λ) = Σ_(i=-l..k-1) λ_i x_i r^i
Note that the scheme above covers unsigned radix-r, 2's-complement, and negative-radix number systems as special cases:

Λ = ( 1  1  1 ...  1  1)    Positive radix
Λ = (-1  1  1 ...  1  1)    Two's complement
Λ = (... -1  1  -1  1)      Negative radix

We can take one more step in the direction of generality and postulate that instead of a single sign vector Λ being associated with the digit positions in the number system (i.e., with all numbers represented), a separate sign vector is defined for each number. Thus, the digits are viewed as having signed values:
x_i = λ_i |x_i|, with λ_i in {-1, 1}
Here, λ_i is the sign and |x_i| is the magnitude of the ith digit. In fact, once we begin to view the digits as signed values, there is no reason to limit ourselves to signed-magnitude representation of the digit values. Any type of coding, including biased or complement representation, can be used for the digits. Furthermore, the range of digit values need not be symmetric. We have already covered some examples of such signed-digit number systems in Section 1.4 (see Examples 1.1, 1.3, and 1.4).
Basically, any set [-α, β] of r or more consecutive integers that includes 0 can be used as the digit set for radix r. If exactly r digit values are used, then the number system is irredundant and offers a unique representation for each value within its range. On the other hand, if more than r digit values are used, ρ = α + β + 1 - r represents the redundancy index of the number system and some values will have multiple representations. In Chapter 3, we will see that such redundant representations can eliminate the propagation of carries in addition and thus allow us to implement truly parallel fast adders.
As an example of nonredundant signed-digit representations, consider a radix-4 number system with the digit set [-1, 2]. A k-digit number of this type can represent any integer from -(4^k - 1)/3 to 2(4^k - 1)/3. Given a standard radix-4 integer using the digit set [0, 3], it can be converted to the preceding representation by simply rewriting each digit of 3 as -1 + 4, where the second term becomes a carry of 1 that propagates leftward. Figure 2.10 shows a numerical example. Note that the result may require k + 1 digits.
The conversion process of Fig. 2.10 stops when there remains no digit with value 3 that needs to be rewritten. The reverse conversion is similarly done by rewriting any digit of -1 as 3 with a borrow of 1 (carry of -1).
More generally, to convert between digit sets, each old digit value is rewritten as a valid new digit value and an appropriate transfer (carry or borrow) into the next higher digit position. Because these transfers can propagate, the conversion process is essentially a digit-serial one, beginning with the least significant digit.
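The digit-serial conversion just described can be sketched as follows (function name is ours; digits are an MSB-first list). The carry absorbed at each step may itself create a digit of 3, which is why the process is serial:

```python
def convert_digit_set(digits):
    """Convert a radix-4 number (MSB-first digits in [0, 3]) to the digit
    set [-1, 2]: rewrite 3 (or a 3/4 formed by an incoming carry) as
    -1 + 4 (or 0 + 4), propagating carries from the least significant digit."""
    result, carry = [], 0
    for d in reversed(digits):
        d += carry
        carry = 1 if d >= 3 else 0
        result.append(d - 4 * carry)
    if carry:
        result.append(1)  # the result may need k + 1 digits
    return result[::-1]

print(convert_digit_set([3, 1, 2, 0, 2, 3]))  # [1, -1, 1, 2, 1, -1, -1]
```

The output digits all lie in [-1, 2], and the represented value is unchanged.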
As an example of redundant signed-digit representations, consider a radix-4 number system with the digit set [-2,2]. A k-digit number of this type can represent any integer from
 3   1   2   0   2   3        Original digits in [0, 3]
-1   1   2   0   2  -1        Rewritten digits in [-1, 2]
 1   0   0   0   0   1        Transfer digits in [0, 1]
 1  -1   1   2   0   3  -1    Sum digits in [-1, 3]
 1  -1   1   2   0  -1  -1    Rewritten digits in [-1, 2]
 0   0   0   0   1   0   0    Transfer digits in [0, 1]
 1  -1   1   2   1  -1  -1    Sum digits in [-1, 3]

Fig. 2.10 Converting a standard radix-4 integer to a radix-4 integer with the nonstandard digit set [-1, 2].
3 2 0 2 3 Original digits in [0, 3]
I I I I I
-1 1 -2 0 -2 -1 Interim digits in [-2, 1]
/ / / / / /
0 0 Transfer digits in [0, 1]
-1 2 -2 -1 -1 Sum digits in [-2, 2] Fig. 2.11 Converting a standard radix-4 integer to a radix-4 integer with the nonstandard digit set [-2,2],
-2(4^k - 1)/3 to 2(4^k - 1)/3. Given a standard radix-4 number using the digit set [0, 3], it can be converted to the preceding representation by simply rewriting each digit of 3 as -1 + 4 and each digit of 2 as -2 + 4, where the second term in each case becomes a carry of 1 that propagates leftward. Figure 2.11 shows a numerical example.
In this case, the transfers do not propagate, since each transfer of 1 can be absorbed by the next higher position, which has a digit value in [-2, 1], forming a final result digit in [-2, 2]. The conversion process from conventional radix-4 to the preceding redundant representation is thus carry-free. The reverse process, however, remains digit-serial.
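The carry-free conversion can be sketched as two independent, parallel steps (function name is ours; digits are an MSB-first list): compute all interim digits and transfers at once, then absorb each transfer one position to the left.

```python
def convert_carry_free(digits):
    """Convert radix-4 digits in [0, 3] (MSB first) to the redundant digit
    set [-2, 2]: rewrite 2 as -2 + 4 and 3 as -1 + 4. Interim digits lie
    in [-2, 1], so each incoming transfer of 1 is absorbed with no further
    propagation: the conversion is carry-free."""
    interim = [d - 4 if d >= 2 else d for d in digits]
    transfers = [1 if d >= 2 else 0 for d in digits]
    result = [0] + interim              # room for one extra digit
    for i, t in enumerate(transfers):
        result[i] += t                  # absorb transfer one position left
    return result if result[0] else result[1:]

print(convert_carry_free([3, 1, 2, 0, 2, 3]))  # [1, -1, 2, -2, 1, -1, -1]
```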
PROBLEMS

2.1 Signed-magnitude representation Design the control circuit of Fig. 2.2 so that signed-magnitude inputs are added correctly regardless of their signs. Include in your design a provision for overflow detection in the form of a fifth control circuit output.
2.2 Arithmetic on biased numbers Multiplication of biased numbers can be done in a direct or an indirect way.
a. Develop a direct multiplication algorithm for biased numbers. Hint: Use the identity xy + bias = (x + bias)(y + bias) - bias[(x + bias) + (y + bias) - bias] + bias.
b. Present an indirect multiplication algorithm for biased numbers.
c. Compare the algorithms of parts a and b with respect to delay and hardware implementation cost.
d. Repeat the comparison for part c in the special case of squaring a biased number.
2.3 Representation formats and conversions Consider the following five ways for representing integers in the range [-127, 127] within an 8-bit format: (a) signed-magnitude, (b) 2's complement, (c) 1's complement, (d) excess-127 code (where an integer x is encoded using the binary representation of x + 127), (e) excess-128 code. Pick one of the three more conventional and one of the two "excess" representations and describe conversion of numbers between the two formats in both directions.
2.4 Representation formats and conversions
a. Show conversion procedures from k-bit 2's-complement format to k-bit biased representation, with bias = 2^(k-1), and vice versa. Pay attention to possible exceptions.
b. Repeat part a for bias = 2^(k-1) - 1.
c. Repeat part a for 1's-complement format.
d. Repeat part b for 1's-complement format.
2.5 Complement representation of negative numbers Consider a k-bit integer radix-2 complement number representation system with the complementation constant M = 2^k. The range of integers represented is taken to be from -N to +P, with N + P + 1 = M. Determine all possible pairs of values for N and P (in terms of M) if the sign of the number is to be determined by:
a. Looking at the most significant bit only.
b. Inspecting the three most significant bits.
c. A single 4-input OR or AND gate.
d. A single 4-input NOR or NAND gate.
2.6 Complement representation of negative numbers Diminished radix complement was defined as being based on the complementation constant r^k - ulp. Study the implications of using an "augmented radix complement" system based on the complementation constant r^k + ulp.
2.7 One's- and 2's-complement number systems We discussed the procedures for extending the number of whole or fractional digits in a 1's- or 2's-complement number in Section 2.4. Discuss procedures for the reverse process of shrinking the number of digits (e.g., converting 32-bit numbers to 16 bits).
2.8 Interpreting 1's-complement numbers Prove that the value of the number (x_(k-1) x_(k-2) ... x_1 x_0 . x_(-1) x_(-2) ... x_(-l))1's-compl can be calculated from the formula -x_(k-1) (2^(k-1) - ulp) + Σ_(i=-l..k-2) x_i 2^i.
2.9 One's- and 2's-complement number systems
a. Prove that x - y = (x^c + y)^c, where the superscript "c" denotes any complementation scheme.
b. Find the difference between the two binary numbers 0010 and 0101 in two ways:
First by adding the 2's complement of 0101 to 0010, and then by using the equality of part a, where "c" denotes bitwise complementation. Compare the two methods with regard to their possible advantages and drawbacks.
2.10 Shifting of 1's- or 2's-complement numbers Left/right shifting is used to double/halve the magnitude of unsigned binary integers. How can we use shifting to accomplish the same for 1's- or 2's-complement numbers?
2.11 Arithmetic on 1's-complement numbers Discuss the effect of the end-around carry needed for 1's-complement addition on the worst-case carry propagation delay and the total addition time.
2.12 Range extension for complement numbers Prove that increasing the number of integer and fractional digits in one's-complement representation requires sign extension from both ends (i.e., positive numbers are extended with Os and negative numbers with Is at both ends).
2.13 Signed digits or digit positions
a. Present an algorithm for determining the sign of a number represented in a positional system with signed digit positions.
b. Repeat part a for signed-digit representations.
2.14 Signed digit positions Consider a positional radix-r integer number system with the associated position sign vector Λ = (λ_(k-1) λ_(k-2) ... λ_1 λ_0), λ_i in {-1, 1}. The additive inverse of a number x is the number -x.
a. Find the additive inverse of the k-digit integer Q all of whose digits are r - 1.
b. Derive a procedure for finding the additive inverse of an arbitrary number x.
c. Specialize the algorithm of part b to the case of 2's-complement numbers.
2.15 Generalizing 2's complement: 2-adic numbers Around the turn of the twentieth century, K. Hensel defined the class of p-adic numbers for a given prime p. Consider the class of 2-adic numbers with infinitely many digits to the left and a finite number of digits to the right of the binary point. An infinitely repeated pattern of digits is represented by writing down a single pattern (the period) within parentheses. Here are some example 2-adic representations using this notation:
We see that 7 and -7 have their standard 2's-complement forms, with infinitely many digits. The representations of 1/7 and -1/7, when multiplied by 7 and -7, respectively, using standard rules for multiplication, yield the representation of 1. Prove the following for 2-adic numbers:
a. Sign change of a 2-adic number is similar to 2's complementation.
b. The representation of a 2-adic number x is ultimately periodic if and only if x is rational.
c. The 2-adic representation of -1/(2n + 1) for n >= 0 is (σ), for some bit string σ, where the standard binary representation of 1/(2n + 1) is (0.σσσ...)two.
REFERENCES

[Aviz61] Avizienis, A., "Signed-Digit Number Representation for Fast Parallel Arithmetic," IRE Trans. Electronic Computers, Vol. 10, pp. 389-400, 1961.
[Gosl80] Gosling, J. B., Design of Arithmetic Units for Digital Computers, Macmillan, 1980.
[Knut97] Knuth, D. E., The Art of Computer Programming, 3rd ed., Vol. 2: Seminumerical Algorithms, Addison-Wesley, 1997.
[Kore81] Koren, I., and Y. Maliniak, "On Classes of Positive, Negative, and Imaginary Radix Number Systems," IEEE Trans. Computers, Vol. 30, No. 5, pp. 312-317, 1981.
[Korn94] Kornerup, P., "Digit-Set Conversions: Generalizations and Applications," IEEE Trans. Computers, Vol. 43, No. 8, pp. 622-629, 1994.
[Parh90] Parhami, B., "Generalized Signed-Digit Number Systems: A Unifying Framework for Redundant Number Representations," IEEE Trans. Computers, Vol. 39, No. 1, pp. 89-98, 1990.
[Parh98] Parhami, B., and S. Johansson, "A Number Representation Scheme with Carry-Free Rounding for Floating-Point Signal Processing Applications," Proc. Int'l. Conf. Signal and Image Processing, Las Vegas, Nevada, October 1998, pp. 90-92.
[Scot85] Scott, N. R., Computer Number Systems and Arithmetic, Prentice-Hall, 1985.
Chapter 3

REDUNDANT NUMBER SYSTEMS
This chapter deals with the representation of signed fixed-point numbers using a positive integer radix r and a redundant digit set composed of more than r digit values. After showing that such representations eliminate carry propagation, we cover variations in digit sets, addition algorithms, input/output conversions, and arithmetic support functions. Chapter topics include:
3.1 Coping with the Carry Problem
3.2 Redundancy in Computer Arithmetic
3.3 Digit Sets and Digit-Set Conversions
3.4 Generalized Signed-Digit Numbers
3.5 Carry-Free Addition Algorithms
3.6 Conversions and Support Functions
3.1 COPING WITH THE CARRY PROBLEM
Addition is a primary building block in implementing arithmetic operations. If addition is slow or expensive, all other operations suffer in speed or cost. Addition can be slow and/or expensive because:
a. With k-digit operands, one has to allow for O(k) worst-case carry-propagation stages in simple ripple-carry adder design.
b. The carry computation network is a major source of complexity and cost in the design of carry-lookahead and other fast adders.
The carry problem can be dealt with in several ways:
1. Limit carry propagation to within a small number of bits.
2. Detect the end of propagation rather than wait for worst-case time.
3. Speed up propagation via lookahead and other methods.
4. Ideal: Eliminate carry propagation altogether!
As examples of option 1, hybrid redundant and residue number system representations are covered in Section 3.4 and Chapter 4, respectively. Asynchronous adder design (option 2) is considered in Section 5.4. Speedup methods for carry propagation are covered in Chapters 6 and 7.
In the remainder of this chapter, we deal with option 4, focusing first on the question:
Can numbers be represented in such a way that addition does not involve carry propagation? We will see shortly that this is indeed possible. The resulting number representations can be used as the primary encoding scheme in the design of high-performance systems and are also useful in representing intermediate results in machines that use conventional number representation.
We begin with a decimal example (r = 10), assuming the standard digit set [0, 9]. Consider the addition of the following two decimal numbers without carry propagation. For this, we simply compute "position sums" and write them down in the corresponding columns. We can use the symbols A = 10, B = 11, C = 12, etc. for the extended digit values or simply represent them with two standard digits.
   5   7   8   2   4   9
+  6   2   9   3   8   9      Operand digits in [0, 9]
  11   9  17   5  12  18      Position sums in [0, 18]
So, if we allow the digit set [0, 18], the scheme works, but only for the first addition! Subsequent additions will cause problems.
Consider now adding two numbers in the radix-lO number system using the digit set [0, 18]. The sum of digits for each position is in [0,36], which can be decomposed into an interim sum in [0, 16] and a transfer digit in [0, 2]. In other words:
[0, 36] = 10 × [0, 2] + [0, 16]
Adding the interim sum and the incoming transfer digit yields a digit in [0, 18] and creates no new transfer. In interval notation, we have:
[0, 16] + [0, 2] = [0, 18]
Figure 3.1 shows an example addition.
So, even though we cannot do true carry-free addition (Fig. 3.2a), the next best thing, where carry propagates by only one position (Fig. 3.2b), is possible if we use the digit set [0, 18] in radix 10. We refer to this best possible scheme as "carry-free" addition. The key to the ability to do carry-free addition is the representational redundancy that provides multiple encodings for some numbers. Figure 3.2c shows that the single-stage propagation of transfers can be eliminated by a simple lookahead scheme; that is, instead of first computing the transfer into position i based on the digits x_(i-1) and y_(i-1) and then combining it with the interim sum, we can determine s_i directly from x_i, y_i, x_(i-1), and y_(i-1). This may make the adder logic somewhat more complex, but in general the result is higher speed.
In the decimal example of Fig. 3.1, the digit set [0, 18] was used to effect carry-free addition.
The 9 "digit" values 10 through 18 are redundant. However, we really do not need this much redundancy in a decimal number system for carry-free addition; the digit set [0, 11] will do. Our example addition (after converting the numbers to the new digit set) is shown in Fig. 3.3.
  11   9  17  10  12  18
+  6  12   9  10   8  18      Operand digits in [0, 18]
  17  21  26  20  20  36      Position sums in [0, 36]
   7  11  16   0  10  16      Interim sums in [0, 16]
 1   1   1   2   1   2        Transfer digits in [0, 2]
 1   8  12  18   1  12  16    Sum digits in [0, 18]

Fig. 3.1 Adding radix-10 numbers with the digit set [0, 18].

A natural question at this point is: How much redundancy in the digit set is needed to enable carry-free addition? For example, will the example addition of Fig. 3.3 work with the digit set [0, 10]? (Try it and see.) We will answer this question in Section 3.5.
3.2 REDUNDANCY IN COMPUTER ARITHMETIC
Redundancy is used extensively for speeding up arithmetic operations. The oldest example, first suggested in 1959 [Metz59], pertains to carry-save or stored-carry numbers using the radix-2 digit set [0, 2] for fast addition of a sequence of binary operands. Figure 3.4 provides an example, showing how the intermediate sum is kept in stored-carry format, allowing each subsequent addition to be performed in a carry-free manner.
Why is this scheme called carry-save or stored-carry? Figure 3.5 provides an explanation.
Let us use the 2-bit encoding
0: (0, 0)
1: (0, 1) or (1, 0)
2: (1, 1)
to represent the digit set [0, 2]. With this encoding, each stored-carry number is really composed of two binary numbers, one for each bit of the encoding. These two binary numbers can be added
Fig. 3.2 Ideal and practical carry-free addition schemes. (Part a, true carry-free addition, is impossible for a positional system with a fixed digit set.)
   11 10  7 11  3  8
+   7  2  9 10  9  8       Operand digits in [0, 11]
   18 12 16 21 12 16       Position sums in [0, 22]
    8  2  6  1  2  6       Interim sums in [0, 9]
  1  1  1  2  1  1         Transfer digits in [0, 2]
 1  9  3  8  2  3  6       Sum digits in [0, 11]
Fig. 3.3 Adding radix-10 numbers with the digit set [0, 11].
to an incoming binary number, producing two binary numbers composed of the sum bits kept in place and the carry bits shifted one position to the left. These sum and carry bits form the partial sum and can be stored in two registers for the next addition. Thus, the carries are "saved" or "stored" instead of being allowed to propagate.
Figure 3.5 shows that one stored-carry number and one standard binary number can be added to form a stored-carry sum in a single full-adder delay (2-4 gate levels, depending on the logic implementation of the full adder's outputs s = x ⊕ y ⊕ cin and cout = xy + x cin + y cin). This
[Figure 3.4 shows a worked example: two binary numbers are first added digitwise to give position sums in [0, 2]; a third binary number is then added, giving position sums in [0, 3] that are split into interim sums in [0, 1] and transfer digits in [0, 1]; absorbing the transfers restores digits in [0, 2], and the process repeats for the fourth operand, leaving the final sum in stored-carry form with digits in [0, 2].]
Fig. 3.4 Addition of four binary numbers, with the sum obtained in stored-carry form.
[Figure 3.5: an array of binary full adders; each position combines a digit in [0, 2], encoded as two bits, with an incoming binary digit, producing a digit in [0, 2] with the carry bit passed one position to the left.]
Fig. 3.5 Using an array of independent binary full adders to perform carry-save addition.
is significantly faster than standard carry-propagate addition to accumulate the sum of several binary numbers, even if a fast carry-lookahead adder is used for the latter. Of course, once the final sum has been obtained in stored-carry form, it may have to be converted to standard binary by using a carry-propagate adder to add the two components of the stored-carry number. The key point is that the carry-propagation delay occurs only once, at the very end, rather than in each addition step.
Since the carry-save addition scheme of Fig. 3.5 converts three binary numbers to two binary numbers with the same sum, it is sometimes referred to as a 3/2 reduction circuit or (3; 2) counter. The latter name reflects the essential function of a full adder: it counts the number of 1s among its three input bits and outputs the result as a 2-bit binary number. More on this in Chapter 8.
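The stored-carry idea is easy to simulate in software. The sketch below is not from the book and the names are illustrative; it keeps the running sum as two ordinary Python integers, the "sum bits" and the "carry bits", performs one full-adder step per operand, and does a single carry-propagate addition at the very end.

```python
def carry_save_step(s, c, x):
    """One carry-save (3:2 reduction) step on integer bit-vectors.

    Each bit position acts as an independent full adder: the sum bit is
    the XOR of the three inputs, and the carry bit (the majority) is
    "saved" by shifting it one position left instead of propagating it.
    """
    sum_bits = s ^ c ^ x
    carry_bits = ((s & c) | (s & x) | (c & x)) << 1
    return sum_bits, carry_bits

def add_all(numbers):
    """Accumulate a list of binary numbers in stored-carry form."""
    s, c = 0, 0
    for x in numbers:
        s, c = carry_save_step(s, c, x)
    return s + c   # one final carry-propagate addition
```

Note that the carry-propagation cost is paid exactly once, in the final `s + c`, regardless of how many operands were accumulated.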
Other examples of the use of redundant representations in computer arithmetic are found in fast multiplication and division schemes, where the multiplier or quotient is represented or produced in redundant form. More on these in Parts III and IV.
3.3 DIGIT SETS AND DIGIT-SET CONVERSIONS
Conventional radix-r numbers use the standard digit set [0, r - 1]. However, many other redundant and nonredundant digit sets are possible. A necessary condition is that the digit set contain at least r different digit values. If it contains more than r values, the number system is redundant.
Conversion of numbers between standard and other digit sets is quite simple and essentially entails a digit-serial process in which, beginning at the right end of the given number, each digit is rewritten as a valid digit in the new digit set and a transfer (carry or borrow) into the next higher digit position. This conversion process is essentially like carry propagation in that it must be done from right to left and, in the worst case, the most significant digit is affected by a "carry" coming from the least significant position. The following examples illustrate the process (see also the examples at the end of Section 2.6).
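The right-to-left rewriting process just described can be expressed compactly. In the sketch below (illustrative, not from the book), a digit d plus the incoming carry is rewritten as r·(carry out) + d' with d' in the target set [lo, hi]; choosing the carry as ⌊(d − lo)/r⌋ always lands the new digit in [lo, lo + r − 1], which is valid whenever hi − lo ≥ r − 1.

```python
def convert_digit_set(digits, r, lo, hi):
    """Digit-serial conversion of a radix-r number (MSD first) to
    digits in [lo, hi]; assumes hi - lo >= r - 1 (at least r values)."""
    assert hi - lo >= r - 1
    out, carry = [], 0
    for d in reversed(digits):                  # right to left, like a carry chain
        carry, rem = divmod(d + carry - lo, r)  # rewrite as r*carry + (rem + lo)
        out.append(rem + lo)
    while carry:                                # a carry out lengthens the number
        carry, rem = divmod(carry - lo, r)
        out.append(rem + lo)
    out.reverse()
    return out
```

This reproduces the examples that follow; for instance, converting 11 9 17 10 12 18 to the conventional digit set [0, 9] yields 1 2 0 8 1 3 8. With a redundant target set, other digit choices at each step are also valid, so the routine may produce a representation different from, but equivalent to, a hand-worked one.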
• Example 3.1 Convert the following radix-10 number with the digit set [0, 18] to one using the conventional digit set [0, 9].
11 9 17 10 12 18 Rewrite 18 as 10 (carry 1) + 8
11 9 17 10 13 8 13 = 10 (carry 1) + 3
11 9 17 11 3 8 11 = 10 (carry 1) + 1
11 9 18 1 3 8 18 = 10 (carry 1) + 8
11 10 8 1 3 8 10 = 10 (carry 1) + 0
12 0 8 1 3 8 12 = 10 (carry 1) + 2
1 2 0 8 1 3 8 Answer: all digits in [0, 9]
• Example 3.2 Convert the following radix-2 carry-save number to binary; that is, from digit set [0, 2] to digit set [0, 1].
1 2 0 2 0 Given carry-save number; rewrite 2 as 2 (carry 1) + 0
1 2 1 0 0 2 = 2 (carry 1) + 0
2 0 1 0 0 2 = 2 (carry 1) + 0
1 0 0 1 0 0 Answer: all digits in [0, 1]
Another way to accomplish the preceding conversion is to decompose the carry-save number into two numbers, both of which have 1s where the original number has a digit of 2. The sum of these two numbers is then the desired binary number.
  1 1 0 1 0     First number: "sum" bits
+ 0 1 0 1 0     Second number: "carry" bits
1 0 0 1 0 0     Sum of the two numbers
• Example 3.3 Digit values do not have to be positive. We reconsider Example 3.1 using the asymmetric target digit set [-6,5].
11 9 17 10 12 18 Rewrite 18 as 20 (carry 2) - 2
11 9 17 10 14 -2 14 = 10 (carry 1) + 4
11 9 17 11 4 -2 11 = 10 (carry 1) + 1
11 9 18 1 4 -2 18 = 20 (carry 2) - 2
11 11 -2 1 4 -2 11 = 10 (carry 1) + 1
12 1 -2 1 4 -2 12 = 10 (carry 1) + 2
1 2 1 -2 1 4 -2 Answer: all digits in [-6, 5]
On line 2 of this conversion, we could have rewritten 14 as 20 (carry 2) - 6, which would have led to a different, but equivalent, representation. In general, several representations may be possible with a redundant digit set.
 1  2  0  2  0      Given carry-save number
-1  0  0  0  0      Interim digits in [-1, 0]
 1  1  0  1  0      Transfer digits in [0, 1]
 1  0  0  1  0  0   Answer: all digits in [-1, 1]
• Example 3.4 If we change the target digit set of Example 3.2 from [0, 1] to [-1, 1], we can do the conversion digit-serially as before. However, carry-free conversion is possible for this example if we rewrite each 2 as 2 (carry 1) + 0 and each 1 as 2 (carry 1) - 1. The resulting interim digits in [-1, 0] can absorb an incoming carry of 1 with no further propagation.
3.4 GENERALIZED SIGNED-DIGIT NUMBERS
We have seen thus far that the digit set of a radix-r positional number system need not be the standard set [0, r - 1]. Using the digit set [-1, 1] for radix-2 numbers was proposed by E. Collignon as early as 1897 [Glas81]. Whether this was just a mathematical curiosity, or motivated by an application or advantage, is not known. In the early 1960s, Avizienis [Aviz61] defined the class of signed-digit number systems with symmetric digit sets [-α, α] and radix r > 2, where α is any integer in the range ⌊r/2⌋ + 1 ≤ α ≤ r - 1. These number systems allow at least 2⌊r/2⌋ + 3 digit values, instead of the minimum required r values, and are thus redundant.
More recently, redundant number systems with general, possibly asymmetric, digit sets of the form [-α, β] have been studied as tools for unifying all redundant number representations used in practice. This class is called "generalized signed-digit (GSD) representation" and differs from the ordinary signed-digit (OSD) representation of Avizienis in its more general digit set as well as the possibility of higher or lower redundancy.
Binary stored-carry numbers, with r = 2 and digit set [0, 2], offer a good example for the usefulness of asymmetric digit sets. Higher redundancy is exemplified by the digit set [-7, 7] in radix 4 or [0, 3] in radix 2. An example for lower redundancy is the binary signed-digit representation with r = 2 and digit set [-1, 1]. None of these is covered by OSD.
An important parameter of a GSD number system is its redundancy index, defined as ρ = α + β + 1 - r (i.e., the amount by which the size of its digit set exceeds the size r of a nonredundant digit set for radix r). Figure 3.6 presents a taxonomy of redundant and nonredundant positional number systems showing the names of some useful subclasses and their various relationships.
Any hardware implementation of GSD arithmetic requires the choice of a binary encoding scheme for the α + β + 1 digit values in the digit set [-α, β]. Multivalued logic realizations have been considered, but we limit our discussion here to binary logic and proceed to show the importance and implications of the encoding scheme chosen through some examples.
Consider, for example, the binary signed-digit (BSD) number system with r = 2 and the digit set [-1, 1]. One needs at least 2 bits to encode these three digit values. Figure 3.7 shows four of the many possible encodings that can be used.
[Figure 3.6 is a tree diagram: radix-r positional systems divide into nonredundant systems and generalized signed-digit (GSD) systems; GSD systems split into minimal (ρ = 1) and nonminimal (ρ ≥ 2) classes, each of which can be symmetric (α = β) or asymmetric. Named leaves include the conventional and nonredundant signed-digit systems, binary signed-digit (BSD or BSB), stored-carry (SC) and binary stored-carry (BSC), stored-carry-or-borrow (SCB) and BSCB, unsigned-digit redundant (UDR), and the ordinary signed-digit (OSD) systems of Avizienis, with minimally redundant (α = ⌊r/2⌋ + 1) and maximally redundant (α = r - 1) variants.]
Fig. 3.6 A taxonomy of redundant and nonredundant positional number systems.
With the (n, p) encoding, the code (1, 1) may be considered an alternate representation of 0 or else viewed as an invalid combination. Many implementations have shown that the (n, p) encoding tends to simplify the hardware and also increases the speed by reducing the number of gate levels [Parh88]. The 1-out-of-3 encoding requires more bits per number but allows the detection of some storage and processing errors.
Hybrid signed-digit representations [Phat94] came about from an attempt to strike a balance between algorithmic speed and implementation cost by introducing redundancy in selected positions only. For example, standard binary representation may be used with BSD digits allowed in every third position, as shown in the addition example of Fig. 3.8.
xi           1   -1    0   -1    0     Representation of +6
(s, v)      01   11   00   11   00     Sign and value encoding
2's-compl   01   11   00   11   00     2-bit 2's-complement
(n, p)      01   10   00   10   00     Negative and positive flags
(n, z, p)  001  100  010  100  010     1-out-of-3 encoding
Fig. 3.7 Four encodings for the BSD digit set [-1, 1].
[Figure 3.8 shows a worked addition of two 9-digit hybrid numbers in which every third position holds a BSD digit and the rest are standard binary; its rows give the operand digits xi and yi, the position sums pi, the interim sums wi, the transfer digits ti+1, and the sum digits si.]
Fig. 3.8 Example of addition for hybrid signed-digit numbers.
The addition algorithm depicted in Fig. 3.8 proceeds as follows. First one computes the position sums pi, which are in [0, 2] for standard binary positions and in [-2, 2] for BSD positions. The BSD position sums are then broken into an interim sum wi and a transfer ti+1, both in [-1, 1]. For the interim sum digit, the value 1 (-1) is chosen only if it is certain that the incoming transfer cannot be 1 (-1); that is, when the two binary operand digits in position i - 1 are (not) both 0s. The worst-case carry propagation spans a single group, beginning with a BSD digit that produces a transfer digit in [-1, 1] and ending with the next higher BSD position.
More generally, the group size can be g rather than 3. A larger group size reduces the hardware complexity (since the adder block in a BSD position is more complex than that in other positions) but adds to the carry-propagation delay in the worst case; hence, the hybrid scheme offers a trade-off between speed and cost.
Hybrid signed-digit representation with uniform spacing of BSD positions can be viewed as a special case of GSD systems. For the example of Fig. 3.8, arranging the numbers in 3-digit groups starting from the right end leads to a radix-8 GSD system with digit set [-4, 7]: that is, digit values from (-1 0 0)two to (1 1 1)two. So the hybrid scheme of Fig. 3.8 can be viewed as an implementation of (digit encoding for) this particular radix-8 GSD representation.
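A quick enumeration confirms the digit set claimed for the 3-digit groups: a BSD digit of weight 4 followed by two binary digits yields exactly the radix-8 digit values -4 through 7. (A sketch, not from the book.)

```python
# Each radix-8 digit is a group (a, b, c): a is the BSD digit in [-1, 1]
# with weight 4; b and c are standard binary digits with weights 2 and 1.
group_values = {4 * a + 2 * b + c
                for a in (-1, 0, 1) for b in (0, 1) for c in (0, 1)}
assert group_values == set(range(-4, 8))    # the digit set [-4, 7]
# Redundancy index: rho = alpha + beta + 1 - r = 4 + 7 + 1 - 8 = 4
```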
3.5 CARRY-FREE ADDITION ALGORITHMS
The GSD carry-free addition algorithm, corresponding to the scheme of Fig. 3.2b, is as follows:
Carry-free addition algorithm for GSD numbers
Compute the position sums pi = xi + yi.
Divide each pi into a transfer ti+1 and an interim sum wi = pi - r ti+1.
Add the incoming transfers to obtain the sum digits si = wi + ti.
Let us assume that the transfer digits ti are from the digit set [-λ, μ]. To ensure that the last step leads to no new transfer, the following condition must be satisfied:
-α + λ ≤ pi - r ti+1 ≤ β - μ

where the lower bound is the smallest interim sum that can absorb an incoming transfer of -λ without leaving [-α, β], and the upper bound is the largest interim sum that can absorb an incoming transfer of μ.
From the preceding inequalities, we can easily derive the conditions λ ≥ α/(r - 1) and μ ≥ β/(r - 1). Once λ and μ are known, we choose the transfer digit value by comparing the position sum pi against λ + μ + 2 constants Cj, -λ ≤ j ≤ μ + 1, with the transfer digit taken to be j if and only if Cj ≤ pi < Cj+1. Figure 3.9 represents the decision process graphically. Formulas giving possible values for these constants can be found in [Parh90]. Here, we describe a simple intuitive method for deriving these constants.
• Example 3.5 For r = 10 and digit set [-5, 9], we need λ ≥ 5/9 and μ ≥ 1. Choosing the minimal values for λ and μ, which minimize the hardware complexity, we find:
λmin = μmin = 1
(i.e., transfer digits are in [-1, 1])
6 ≤ C1 ≤ 9
-4 ≤ C0 ≤ -1
We next show how the allowable values for the comparison constants C1 and C0, shown above, are derived. The position sum pi is in [-10, 18]. We can set ti+1 to 1 for pi values as low as 6; for pi = 6, the resulting interim sum of -4 can absorb any incoming transfer in [-1, 1] without falling outside [-5, 9]. On the other hand, we must transfer 1 for pi values of 9 or more. Thus, for pi ≥ C1, where 6 ≤ C1 ≤ 9, we choose an outgoing transfer of 1. Similarly, for pi < C0, where -4 ≤ C0 ≤ -1, we choose an outgoing transfer of -1. In all other cases, the outgoing transfer is 0.
Assuming that the position sum pi is represented as a 6-bit, 2's-complement number abcdef, good choices for the comparison constants in the above ranges are C0 = -4 and C1 = 8. The logic expressions for the signals g1 and g-1 then become:
g1 = a'(b + c)      Generate a transfer of 1
g-1 = a(c' + d')    Generate a transfer of -1
An example addition is shown in Fig. 3.10.
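The adder of Example 3.5 can be sketched directly from the comparison constants C0 = -4 and C1 = 8 (code and names are illustrative, not from the book; digit lists are most-significant-digit first):

```python
def add_gsd_r10(x, y):
    """Carry-free GSD addition for r = 10, digit set [-5, 9].

    Transfer digit: 1 if p >= C1 = 8, -1 if p < C0 = -4, else 0;
    the interim sum p - 10*t then lies in [-4, 8] and can absorb
    any incoming transfer in [-1, 1].
    """
    n = len(x)
    p = [a + b for a, b in zip(x, y)]             # position sums in [-10, 18]
    t = [1 if pi >= 8 else (-1 if pi < -4 else 0) for pi in p]
    w = [pi - 10 * ti for pi, ti in zip(p, t)]    # interim sums in [-4, 8]
    return [t[0]] + [w[i] + (t[i + 1] if i + 1 < n else 0) for i in range(n)]
```

For the operands of Fig. 3.10 this returns the digits 1 0 3 8 7 -1, all within [-5, 9].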
It is proven in [Parh90] that the preceding carry-free addition algorithm is applicable to a redundant representation if and only if one of the following sets of conditions is satisfied:
a. r > 2, ρ ≥ 3
b. r > 2, ρ = 2, α ≠ 1, β ≠ 1
[Figure 3.9: the range of pi values is divided by the constants C-λ, C-λ+1, C-λ+2, ..., C0, C1, ..., Cμ, Cμ+1 into subranges; when pi falls in [Cj, Cj+1), the transfer digit ti+1 = j is chosen, for j running from -λ to μ.]
Fig. 3.9 Choosing the transfer digit ti+1 based on comparing the position sum pi to the comparison constants Cj.
In other words, the carry-free algorithm is not applicable for r = 2, or ρ = 1, or ρ = 2 with α = 1 or β = 1. In such cases, a limited-carry addition algorithm is available:
Limited-carry addition algorithm for GSD numbers
Compute the position sums pi = xi + yi.
Compare each pi to a constant to determine whether ei+1 = "low" or "high" (ei+1 is a binary range estimate for ti+1).
Given ei, divide each pi into a transfer ti+1 and an interim sum wi = pi - r ti+1.
Add the incoming transfers to obtain the sum digits si = wi + ti.
This "limited-carry" GSD addition algorithm is depicted in Fig. 3.11a; in an alternative implementation (Fig. 3.11b), the "transfer estimate" stage is replaced by another transfer generation/addition phase.
Even though Figs. 3.11a and 3.11b appear similar, they are quite different in terms of the internal designs of the square boxes in the top and middle rows. In both cases, however, the sum digit si depends on xi, yi, xi-1, yi-1, xi-2, and yi-2. Rather than wait for the limited transfer propagation from stage i - 2 to i, one can try to provide the necessary information directly from stage i - 2 to stage i. This leads to an implementation with parallel carries t(1)i+1 and t(2)i+2 from stage i, which is sometimes applicable (Fig. 3.11c).
• Example 3.6 Figure 3.12 depicts the use of carry estimates in limited-carry addition of radix-2 numbers with the digit set [-1, 1]. Here we have ρ = 1, λmin = 1, and μmin = 1. The "low" and "high" subranges for transfer digits are [-1, 0] and [0, 1], respectively, with a transfer ti+1 in "high" indicated if pi ≥ 0.
    3  -4   9  -2   8      xi in [-5, 9]
+   8  -4   9   8   1      yi in [-5, 9]
   11  -8  18   6   9      pi in [-10, 18]
    1   2   8   6  -1      wi in [-4, 8]
  1  -1   1   0   1        ti+1 in [-1, 1]
 1   0   3   8   7  -1     si in [-5, 9]
Fig. 3.10 Adding radix-10 numbers with the digit set [-5, 9].
Fig. 3.11 Some implementations for limited-carry addition.
• Example 3.7 Figure 3.13 shows another example of limited-carry addition with r = 2, digit set [0, 3], ρ = 2, λmin = 0, and μmin = 3, using carry estimates. The "low" and "high" subranges for transfer digits are [0, 2] and [1, 3], respectively, with a transfer ti+1 in "high" indicated if pi ≥ 4.
• Example 3.8 Figure 3.14 shows the same addition as in Example 3.7 (r = 2, digit set [0, 3], ρ = 2, λmin = 0, μmin = 3) using the repeated-carry scheme of Fig. 3.11b.
     1   -1    0   -1    0      xi in [-1, 1]
+    0   -1   -1    0    1      yi in [-1, 1]
     1   -2   -1   -1    1      pi in [-2, 2]
high  low  low  low high high   ei in {low: [-1, 0], high: [0, 1]}
     1    0    1   -1   -1      wi in [-1, 1]
  0   -1   -1    0    1         ti+1 in [-1, 1]
 0    0   -1    1    0   -1     si in [-1, 1]
Fig. 3.12 Limited-carry addition of radix-2 numbers with the digit set [-1, 1] by means of carry estimates. A position sum of -1 is kept intact when the incoming transfer is in [0, 1], whereas it is rewritten as 1 with a carry of -1 if the incoming transfer is in [-1, 0]. This scheme guarantees that the interim sum can absorb the incoming transfer, and thus -1 ≤ si ≤ 1.
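The estimate-based rules of this example can be coded as follows. This is an illustrative sketch, not from the book; the tie-breaking choices below are one valid set among several, since the redundant digit set permits more than one correct splitting.

```python
def add_bsd_limited_carry(x, y):
    """Limited-carry addition of BSD numbers (r = 2, digits in [-1, 1]).

    high[i] estimates the transfer INTO position i: "high" ([0, 1]) if the
    position sum to its right is nonnegative, "low" ([-1, 0]) otherwise.
    The interim sum is then chosen so it can absorb any transfer within
    the estimated range. Digit lists are most-significant-digit first.
    """
    n = len(x)
    p = [a + b for a, b in zip(x, y)]              # position sums in [-2, 2]
    high = [p[i + 1] >= 0 if i + 1 < n else True for i in range(n)]
    t, w = [0] * n, [0] * n
    for i in range(n):
        if high[i]:   # incoming transfer in [0, 1]: need w in [-1, 0]
            t[i] = 1 if p[i] > 0 else (-1 if p[i] == -2 else 0)
        else:         # incoming transfer in [-1, 0]: need w in [0, 1]
            t[i] = 1 if p[i] == 2 else (-1 if p[i] < 0 else 0)
        w[i] = p[i] - 2 * t[i]
    return [t[0]] + [w[i] + (t[i + 1] if i + 1 < n else 0) for i in range(n)]
```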
     1    1    3    1    2      xi in [0, 3]
+    0    0    2    2    1      yi in [0, 3]
     1    1    5    3    3      pi in [0, 6]
low  low high  low  low  low    ei in {low: [0, 2], high: [1, 3]}
     1   -1    1    1    1      wi in [-1, 1]
  0    1    2    1    1         ti+1 in [0, 3]
 0    2    1    2    2    1     si in [0, 3]
Fig. 3.13 Limited-carry addition of radix-2 numbers with the digit set [0, 3] by means of carry estimates. A position sum of 1 is kept intact when the incoming transfer is in [0, 2], whereas it is rewritten as -1 with a carry of 1 if the incoming transfer is in [1, 3].
• Example 3.9 Figure 3.15 shows the same addition as in Example 3.7 (r = 2, digit set [0, 3], ρ = 2, λmin = 0, μmin = 3) using the parallel-carries scheme of Fig. 3.11c.
Subtraction of GSD numbers is very similar to addition. With a symmetric digit set, one can simply invert the signs of all digits in the subtrahend y to obtain a representation of -y and then perform the addition x + (-y) using a carry-free or limited-carry algorithm as already discussed. Negation of a GSD number with an asymmetric digit set is somewhat more complicated, but can still be performed by means of a carry-free algorithm [Parh93]. This algorithm basically
    1   1   3   1   2      xi in [0, 3]
+   0   0   2   2   1      yi in [0, 3]
    1   1   5   3   3      pi in [0, 6]
    1   1   1   1   1      wi in [0, 1]
  0   0   2   1   1        ti+1 in [0, 3]
 0   1   3   2   2   1     si in [0, 4]
 0   1   1   0   0   1     wi in [0, 1]
 0  0   1   1   1   0      ti+1 in [0, 2]
 0   2   2   1   0   1     si in [0, 3]
Fig. 3.14 Limited-carry addition of radix-2 numbers with the digit set [0, 3] by means of the repeated-carry scheme.
    1   1   3   1   2      xi in [0, 3]
+   0   0   2   2   1      yi in [0, 3]
    1   1   5   3   3      pi in [0, 6]
    1   1   1   1   1      wi in [0, 1]
  0   0   0   1   1        t(1)i+1 in [0, 1]
 0  0   1   0   0          t(2)i+2 in [0, 1]
 0   2   1   2   2   1     si in [0, 3]
Fig. 3.15 Limited-carry addition of radix-2 numbers with the digit set [0, 3] by means of the parallel-carries scheme.
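The parallel-carries scheme is particularly simple for this digit set, because each position sum p in [0, 6] is just a 3-bit number: its bits are the interim sum and the two transfers. An illustrative sketch, not from the book (digit lists are most-significant-digit first):

```python
def add_sdc_parallel(x, y):
    """Parallel-carries addition for r = 2, digit set [0, 3] (as in Fig. 3.15).

    Each position sum p in [0, 6] splits as p = w + 2*t1 + 4*t2 with
    w, t1, t2 in [0, 1]; t1 goes one position left and t2 two positions
    left, so each sum digit w + t1_in + t2_in stays in [0, 3].
    """
    n = len(x)
    p = [a + b for a, b in zip(x, y)]
    s = [0] * (n + 2)                  # room for transfers out of the MSD
    for i in range(n):                 # result index i+2 aligns with operand i
        s[i + 2] += p[i] & 1           # interim sum bit, kept in place
        s[i + 1] += (p[i] >> 1) & 1    # weight-2 transfer, one position left
        s[i] += p[i] >> 2              # weight-4 transfer, two positions left
    return s
```

Because no position's output depends on any other position, all digit slices can operate fully in parallel; only the final conversion to standard binary (not shown) needs carry propagation.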
converts a radix-r number from the digit set [-β, α], which results from changing the signs of the individual digits of y, to the original digit set [-α, β]. Alternatively, a direct subtraction algorithm can be applied by first computing position differences in [-α - β, α + β], then forming interim differences and transfer digits. Details are omitted here.
3.6 CONVERSIONS AND SUPPORT FUNCTIONS
Since input numbers provided from the outside (machine or human interface) are in standard binary or decimal and outputs must be presented in the same way, conversions between binary or decimal and GSD representations are required.
• Example 3.10 Consider number conversions between standard binary and binary signed-digit representations. To convert from signed binary to BSD, we simply attach the common number sign to each digit, if the (s, v) code of Fig. 3.7 is to be used for the BSD digits. Otherwise, we need a simple digitwise converter from the (s, v) code to the desired code. To convert from BSD to signed binary, we separate the positive and negative digits into a positive and a negative binary number, respectively. A subtraction then yields the desired result. Here is an example:
1 -1 0 -1 0   BSD representation of +6
1  0 0  0 0   Positive part (1 digits)
0  1 0  1 0   Negative part (-1 digits)
0  0 1  1 0   Difference = conversion result
The positive and negative parts required above are particularly easy to obtain if the BSD number is represented using the (n, p) code of Fig. 3.7. The reader should be able to modify the process above for dealing with numbers, or deriving results, in 2's-complement format.
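With the positive and negative parts in hand, the conversion is a single subtraction. A minimal sketch with illustrative names, not from the book:

```python
def bsd_value(digits):
    """Value of a BSD number (digits in [-1, 1], MSD first), obtained by
    separating the 1 digits and the -1 digits and subtracting, as in
    Example 3.10."""
    pos = sum(2 ** i for i, d in enumerate(reversed(digits)) if d == 1)
    neg = sum(2 ** i for i, d in enumerate(reversed(digits)) if d == -1)
    return pos - neg      # e.g. 10000 - 01010 for the example above
```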
The conversion from redundant to nonredundant representation essentially involves carry propagation and is thus rather slow. Hopefully, however, we will not need conversions very often. Conversion is done at the input and output. Thus, if long sequences of computation are performed between input and output, the conversion overhead can become negligible.
Storage overhead (the larger number of bits that may be needed to represent a GSD digit compared to a standard digit in the same radix) used to be a major disadvantage of redundant representations. However, with advances in VLSI technology, this is no longer a major issue; though the increase in the number of pins for input and output may still be a factor.
In the rest of this section, we review some properties of GSD representations that are important for the implementation of arithmetic support functions: zero detection, sign test, and overflow handling [Parh93].
In a GSD number system, the integer 0 may have multiple representations. For example, the three-digit numbers 0 0 0 and -1 4 0 both represent 0 in radix 4. However, in the special case of α < r and β < r, zero is uniquely represented by the all-0s vector. So despite redundancy and multiple representations, comparison of numbers for equality can be simple in this common special case, since it involves subtracting them and detecting the all-0s pattern.
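The radix-4 instance just mentioned is easy to check; note that its digit set must contain both -1 and 4, so β ≥ 4 = r and the uniqueness condition β < r fails, which is exactly why zero gets a second representation. (An illustrative sketch, not from the book.)

```python
def value_r4(digits):
    """Value of a radix-4 digit string, MSD first."""
    v = 0
    for d in digits:
        v = 4 * v + d
    return v

# Two distinct three-digit representations of zero in radix 4:
assert value_r4([0, 0, 0]) == 0
assert value_r4([-1, 4, 0]) == 0      # -1*16 + 4*4 + 0
```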
Sign test, and thus any relational comparison (<, ≤, etc.), is more difficult. The sign of a GSD number in general depends on all its digits. Thus sign test is slow if done through signal propagation (ripple design) or expensive if done by a fast lookahead circuit (contrast this with the trivial sign test for signed-magnitude and 2's-complement representations). In the special case of α < r and β < r, the sign of a number is identical to the sign of its most significant nonzero digit. Even in this special case, determination of sign requires scanning of all digits in the worst case, a process that can be as slow as full carry propagation.
Overflow handling is also more difficult in GSD arithmetic. Consider the addition of two k-digit numbers, as shown in Fig. 3.16. Such an addition produces a transfer-out digit tk. Since tk is produced using the worst-case assumption about the as yet unknown tk-1, we can get an overflow indication (tk ≠ 0) even when the result can be represented with k digits. It is possible to perform a test to see whether the overflow is real and, if it is not, to obtain a k-digit representation for the true result. However, this test and conversion are fairly slow.
The difficulties with sign test and overflow detection can nullify some or all of the speed advantages of GSD number representations. This is why applications of GSD are presently limited to special-purpose systems or to internal number representations, which are subsequently converted to standard representation.
   xk-1 xk-2 ... x1 x0     GSD operands
+  yk-1 yk-2 ... y1 y0
   pk-1 pk-2 ... p1 p0     Position sums
   wk-1 wk-2 ... w1 w0     Interim sum digits
tk tk-1 ...  t2 t1         Transfer digits
   sk-1 sk-2 ... s1 s0     "Apparent" sum
Fig. 3.16 Overflow and its detection in GSD arithmetic.
PROBLEMS
3.1 Stored-carry and stored-borrow representations The radix-2 number systems using the digit sets [0, 2] and [-1, 1] are known as binary stored-carry and stored-borrow representations, respectively. The general radix-r stored-carry and stored-borrow representations are based on the digit sets [0, r] and [-1, r - 1], respectively.
a. Show that carry-free addition is impossible for stored-carry/borrow numbers.
b. Supply the details of limited-carry addition for radix-r stored-carry numbers.
c. Supply the details of limited-carry addition for radix-r stored-borrow numbers.
d. Compare the algorithms of parts b and c and discuss.
3.2 Stored-double-carry and stored-triple-carry representations The radix-4 number system using the digit set [0,4] is a stored-carry representation. Use the digit sets [0, 5] and [0, 6] to form the radix-4 stored-double-carry and stored-triple-carry number systems, respectively.
a. Find the relevant parameters for carry-free addition in the two systems (i.e., the range of transfer digits and the comparison constants). Where there is a choice, select the best value and justify your choice.
b. State the advantages (if any) of one system over the other.
3.3 Stored-carry-or-borrow representations The general radix-r stored-carry-or-borrow representations use the digit set [-1, r].
a. Show that carry-free addition is impossible for stored-carry-or-borrow numbers.
b. Develop a limited-carry addition algorithm for such radix-r numbers.
c. Compare the stored-carry-or-borrow representation to the stored-double-carry representation based on the digit set [0, r + 1] and discuss.
3.4 Addition with parallel carries
a. The redundant radix-2 representation with the digit set [0, 3], used in several examples in Section 3.5, is known as the binary stored-double-carry number system [Parh96]. Design a digit slice of a binary stored-double-carry adder based on the addition scheme of Fig. 3.15.
b. Repeat part a with the addition scheme of Fig. 3.13.
c. Repeat part a with the addition scheme of Fig. 3.14.
d. Compare the implementations of parts a-c with respect to speed and cost.
3.5 Addition with parallel or repeated carries
a. Develop addition algorithms similar to those discussed in Section 3.5 for binary stored-triple-carry number system using the digit set [0, 4].
b. Repeat part a for the binary stored-carry-or-borrow number system based on the digit set [-1, 2].
c. Develop a sign detection scheme for binary stored-carry-or-borrow numbers.
d. Can one use digit sets other than [0, 3], [0, 4], and [-1, 2] in radix-2 addition with parallel carries?
e. Repeat parts a-d for addition with repeated carries.
3.6 Nonredundant and redundant digit sets Consider a fixed-point, symmetric radix-3 number system, with k whole and l fractional digits, using the digit set [-1, 1].
a. Determine the range of numbers represented as a function of k and l.
b. What is the representation efficiency relative to binary representation, given that each radix-3 digit needs a 2-bit code?
c. Devise a carry-free procedure for converting a symmetric radix-3 positive number to an unsigned radix-3 number with the redundant digit set [0, 3].
d. What is the representation efficiency of the redundant number system of part c?
3.7 Digit-set and radix conversions Consider a fixed-point, radix-4 number system, with k whole and I fractional digits, using the digit set [-3, 3].
a. Determine the range of numbers represented as a function of k and l.
b. Devise a procedure for converting such a radix-4 number to a radix-8 number that uses the digit set [-7,7].
c. Specify the numbers K and L of integer and fractional digits in the new radix of part b as functions of k and l.
d. Devise a procedure for converting such a radix-4 number to a radix-4 number that uses the digit set [-2, 2].
3.8 Hybrid signed-digit representation Consider a hybrid radix-2 number representation system with the repeating pattern of two standard binary positions followed by one BSD position. The addition algorithm for this system is similar to that in Fig. 3.8. Show that this algorithm can be formulated as carry-free radix-8 GSD addition and derive its relevant parameters (range of transfer digits and comparison constants for transfer digit selection).
3.9 GSD representation of zero
a. Obtain necessary and sufficient conditions for zero to have a unique representation in a GSD number system.
b. Devise a 0 detection algorithm for cases in which 0 has multiple representations.
c. Design a hardware circuit for detecting 0 in an 8-digit radix-4 GSD representation using the digit set [-2, 4].
3.10 Imaginary-radix GSD representation Show that the imaginary-radix number system with r = 2j, where j = √-1, and digit set [-2, 2] lends itself to a limited-carry addition process. Define the process and derive its relevant parameters.
3.11 Negative-radix GSD representation Do you see any advantage to extending the definition of GSD representations to include the possibility of a negative radix r? Explain.
3.12 Mixed redundant-conventional arithmetic We have seen that BSD numbers cannot be added in a carry-free manner but that a limited-carry process can be applied to them.
a. Show that one can add a conventional binary number to a BSD number to obtain their BSD sum in a carry-free manner.
b. Supply the complete logic design for the carry-free adder of part a.
c. Compare your design to a carry-save adder and discuss.
52 Redundant Number Systems
3.13 Negation of GSD numbers One disadvantage of GSD representations with asymmetric digit sets is that negation (change of sign) becomes nontrivial. Show that negation of GSD numbers is always a carry-free process and derive a suitable algorithm for this purpose.
3.14 Digit-serial GSD arithmetic GSD representations allow fast carry-free or limited-carry parallel addition. GSD representations may seem less desirable for digit-serial addition because the simpler binary representation already allows very efficient bit-serial addition. Consider a radix-4 GSD representation using the digit set [-3, 3].
a. Show that two such GSD numbers can be added digit-serially beginning at the most significant end (MSD-first arithmetic).
b. Present a complete logic design for your digit-serial adder and determine its latency.
c. Do you see any advantage for MSD-first, as opposed to LSD-first, arithmetic?
3.15 BSD arithmetic Consider binary signed-digit numbers with digit set [-1, 1] and the 2-bit (n, p) encoding of the digits (see Fig. 3.7). The code (1, 1) never appears and can be used as a don't-care.
a. Design a fast sign detector for a 4-digit BSD input operand using full lookahead.
b. How can the design of part a be used for 16-digit inputs?
c. Design a single-digit BSD full adder producing the sum digit s_i and transfer t_(i+1).
3.16 Unsigned-digit redundant representations Consider the hex-digit decimal (HDD) number system with r = 10 and digit set [0, 15] for representing unsigned integers.
a. Find the relevant parameters for carry-free addition in this system.
b. Design an HDD adder using 4-bit binary adders and a simple postcorrection circuit.
3.17 Double-LSB 2's-complement numbers Consider k-bit 2's-complement numbers with an extra least significant bit attached to them [Parh98]. Show that such redundant numbers have symmetric range, allow for bitwise 2's-complementation, and can be added using a standard k-bit adder.
REFERENCES
[Aviz61] Avizienis, A., "Signed-Digit Number Representations for Fast Parallel Arithmetic," IRE Trans. Electronic Computers, Vol. 10, pp. 389-400, 1961.
[Glas81] Glaser, A., History of Binary and Other Nondecimal Numeration, rev. ed., Tomash Publishers, 1981.
[Korn94] Kornerup, P., "Digit-Set Conversions: Generalizations and Applications," IEEE Trans. Computers, Vol. 43, No. 8, pp. 622-629, 1994.
[Metz59] Metze, G., and J. E. Robertson, "Elimination of Carry Propagation in Digital Computers," Information Processing '59 (Proceedings of a UNESCO Conference), 1960, pp. 389-396.
[Parh88] Parhami, B., "Carry-Free Addition of Recoded Binary Signed-Digit Numbers," IEEE Trans. Computers, Vol. 37, No. 11, pp. 1470-1476, 1988.
[Parh90] Parhami, B., "Generalized Signed-Digit Number Systems: A Unifying Framework for Redundant Number Representations," IEEE Trans. Computers, Vol. 39, No. 1, pp. 89-98, 1990.
[Parh93] Parhami, B., "On the Implementation of Arithmetic Support Functions for Generalized Signed-Digit Number Systems," IEEE Trans. Computers, Vol. 42, No. 3, pp. 379-384, 1993.
[Parh96] Parhami, B., "Comments on 'High-Speed Area-Efficient Multiplier Design Using Multiple-Valued Current Mode Circuits,'" IEEE Trans. Computers, Vol. 45, No. 5, pp. 637-638, 1996.
[Parh98] Parhami, B., and S. Johansson, "A Number Representation Scheme with Carry-Free Rounding for Floating-Point Signal Processing Applications," Proc. Int'l Conf. Signal and Image Processing, Las Vegas, Nevada, October 1998, pp. 90-92.
[Phat94] Phatak, D. S., and I. Koren, "Hybrid Signed-Digit Number Systems: A Unified Framework for Redundant Number Representations with Bounded Carry Propagation Chains," IEEE Trans. Computers, Vol. 43, No. 8, pp. 880-891, 1994.
Chapter
4
RESIDUE NUMBER SYSTEMS
By converting arithmetic on large numbers to arithmetic on a collection of smaller numbers, residue number system (RNS) representations produce significant speedup for some classes of arithmetic-intensive algorithms in signal processing applications. Additionally, RNS arithmetic is a valuable tool for theoretical studies of the limits of fast arithmetic. In this chapter, we study RNS representations and arithmetic, along with their advantages and drawbacks. Chapter topics include:
4.1 RNS Representation and Arithmetic
4.2 Choosing the RNS Moduli
4.3 Encoding and Decoding of Numbers
4.4 Difficult RNS Arithmetic Operations
4.5 Redundant RNS Representations
4.6 Limits of Fast Arithmetic in RNS
4.1 RNS REPRESENTATION AND ARITHMETIC
What number has the remainders of 2, 3, and 2 when divided by the numbers 7, 5, and 3, respectively? This puzzle, written in the form of a verse by the Chinese scholar Sun Tsu more than 1500 years ago [Jenk93], is perhaps the first documented use of number representation using multiple residues. The puzzle essentially asks us to convert the coded representation (2 | 3 | 2) of a residue number system, based on the moduli (7 | 5 | 3), into standard decimal format.
In a residue number system (RNS), a number x is represented by the list of its residues with respect to k pairwise relatively prime moduli m_(k-1) > ... > m_1 > m_0. The residue x_i of x with respect to the ith modulus m_i is akin to a digit, and the entire k-residue representation of x can be viewed as a k-digit number, where the digit set for the ith position is [0, m_i - 1]. Notationally, we write

x_i = x mod m_i

and specify the RNS representation of x by enclosing the list of residues, or digits, in parentheses. For example,
x = (2 | 3 | 2)_RNS(7|5|3)
represents the puzzle given at the beginning of this section. The list of moduli can be deleted from the subscript when we have agreed on a default set. In many of the examples of this chapter, the following RNS is assumed:
RNS(8 | 7 | 5 | 3)    Default RNS for Chapter 4
The product M of the k pairwise relatively prime moduli is the number of different representable values in the RNS and is known as its dynamic range.
M = m_(k-1) × ... × m_1 × m_0
For example, M = 8 × 7 × 5 × 3 = 840 is the total number of distinct values that are representable in our chosen 4-modulus RNS. Because of the equality

(x + M) mod m_i = x mod m_i

the 840 available values can be used to represent numbers 0 through 839, -420 through +419, or any other interval of 840 consecutive integers. In effect, negative numbers are represented using a complement system with the complementation constant M.
Here are some example numbers in RNS(8 | 7 | 5 | 3):

(0 | 0 | 0 | 0)_RNS    represents 0 or 840 or ...
(1 | 1 | 1 | 1)_RNS    represents 1 or 841 or ...
(2 | 2 | 2 | 2)_RNS    represents 2 or 842 or ...
(0 | 1 | 3 | 2)_RNS    represents 8 or 848 or ...
(5 | 0 | 1 | 0)_RNS    represents 21 or 861 or ...
(0 | 1 | 4 | 1)_RNS    represents 64 or 904 or ...
(2 | 0 | 0 | 2)_RNS    represents -70 or 770 or ...
(7 | 6 | 4 | 2)_RNS    represents -1 or 839 or ...
Given the RNS representation of x, the representation of -x can be found by complementing each of the digits x_i with respect to its modulus m_i (0 digits are left unchanged). Thus, given that 21 = (5 | 0 | 1 | 0)_RNS, we find:

-21 = (8 - 5 | 0 | 5 - 1 | 0)_RNS = (3 | 0 | 4 | 0)_RNS
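As a quick check of this digit-wise rule, negation in RNS(8 | 7 | 5 | 3) can be sketched in a few lines of Python (the function and constant names are ours, for illustration only):

```python
# Negation in RNS(8 | 7 | 5 | 3): complement each nonzero residue x_i
# with respect to its modulus m_i; zero residues stay zero.
MODULI = (8, 7, 5, 3)

def to_rns(x):
    """Residue list of x, most significant modulus first."""
    return tuple(x % m for m in MODULI)

def rns_negate(digits):
    return tuple((m - d) % m for m, d in zip(MODULI, digits))

print(to_rns(21))               # (5, 0, 1, 0)
print(rns_negate(to_rns(21)))   # (3, 0, 4, 0), the representation of -21
```

Note that (3, 0, 4, 0) also represents 840 - 21 = 819, consistent with complementation relative to the constant M = 840.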
Any RNS can be viewed as a weighted representation. We will present a general method for determining the position weights (the Chinese remainder theorem) in Section 4.3. For RNS(8 | 7 | 5 | 3), the weights associated with the four positions are:

105    120    336    280
As an example, (1 | 2 | 4 | 0)_RNS represents the number:

((105 × 1) + (120 × 2) + (336 × 4) + (280 × 0)) mod 840 = 1689 mod 840 = 9
In practice, each residue must be represented or encoded in binary. For our example RNS, such a representation would require 11 bits (Fig. 4.1). To determine the number representation efficiency of our 4-modulus RNS, we note that 840 different values are being represented using 11 bits, compared to 2048 values possible with binary representation. Thus, the representational efficiency is
840/2048 ≈ 41%

Since log2 840 = 9.714, another way to quantify the representational efficiency is to note that in our example RNS, about 1.3 of the 11 bits go to waste.
As noted earlier, the sign of an RNS number can be changed by independently complementing each of its digits with respect to its modulus. Similarly, addition, subtraction, and multiplication can be performed by independently operating on each digit. The following examples for RNS(8171513) illustrate the process:
x + y:  (5 + 7) mod 8 = 4,  (5 + 6) mod 7 = 4,  etc.
x - y:  (5 - 7) mod 8 = 6,  (5 - 6) mod 7 = 6,  etc. (alternatively, find -y and add it to x)
x × y:  (5 × 7) mod 8 = 3,  (5 × 6) mod 7 = 2,  etc.
Figure 4.2 depicts the structure of an adder, subtractor, or multiplier for RNS arithmetic.
Since each digit is a relatively small number, these operations can be quite fast and simple in RNS. This speed and simplicity are the primary advantages of RNS arithmetic. In the case of addition, for example, carry propagation is limited to within a single residue (a few bits). Thus, RNS representation pretty much solves the carry propagation problem. As for multiplication, a 4 × 4 multiplier, for example, is considerably more than four times simpler than a 16 × 16 multiplier, besides being much faster. In fact, since the residues are small (say, 6 bits wide), it is quite feasible to implement addition, subtraction, and multiplication by direct table lookup. With 6-bit residues, each operation requires a 4K × 6 table, since the two operands form a 12-bit address. Thus, excluding division, a complete arithmetic unit module for one 6-bit residue can be implemented with 9 KB of memory.
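The digit-wise rules above are easy to mimic in software. The following Python sketch (our own helpers, not the book's notation) applies +, -, and × independently to each residue position:

```python
# Digit-wise RNS arithmetic in RNS(8 | 7 | 5 | 3): no carries ever
# propagate between residue positions.
MODULI = (8, 7, 5, 3)
M = 840

def to_rns(x):
    return tuple(x % m for m in MODULI)

def rns_add(a, b):
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def rns_sub(a, b):
    return tuple((ai - bi) % m for ai, bi, m in zip(a, b, MODULI))

def rns_mul(a, b):
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

x, y = to_rns(5), to_rns(55)     # x = (5|5|0|2), y = (7|6|0|1)
print(rns_add(x, y) == to_rns(60))            # True
print(rns_sub(x, y) == to_rns((5 - 55) % M))  # True
print(rns_mul(x, y) == to_rns(275))           # True
```

The operands 5 and 55 were chosen so that their mod-8 and mod-7 residues match the ones used in the text's examples.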
Unfortunately, however, what we gain in terms of the speed and simplicity of addition, subtraction, and multiplication can be more than nullified by the complexity of division and the difficulty of certain auxiliary operations such as sign test, magnitude comparison, and overflow detection. Given the numbers

(7 | 2 | 2 | 1)_RNS    and    (2 | 5 | 0 | 1)_RNS

we cannot easily tell their signs, determine which of the two is larger, or find out whether (1 | 0 | 2 | 2)_RNS represents their true sum as opposed to the residue of their sum modulo 840.
Fig. 4.1 Binary-coded number format for RNS(8 | 7 | 5 | 3): a 3-bit mod-8 residue, a 3-bit mod-7 residue, a 3-bit mod-5 residue, and a 2-bit mod-3 residue, for a total of 11 bits.
Fig. 4.2 The structure of an adder, subtractor, or multiplier for RNS(8 | 7 | 5 | 3): four independent units process the mod-8, mod-7, mod-5, and mod-3 residues of operands 1 and 2.
These difficulties have thus far limited the application of RNS representations to certain signal processing problems in which additions and multiplications are used either exclusively or predominantly and the results are within known ranges (e.g., digital filters, Fourier transforms). Developments in recent years [Hung94] have greatly lessened the penalty for division and sign detection and may lead to more widespread applications for RNS in the future. We discuss division and other "difficult" RNS operations in Section 4.4.
4.2 CHOOSING THE RNS MODULI
The set of the moduli chosen for RNS affects both the representational efficiency and the complexity of arithmetic algorithms. In general, we try to make the moduli as small as possible, since it is the magnitude of the largest modulus m_(k-1) that dictates the speed of arithmetic operations. We also often try to make all the moduli comparable in magnitude to the largest one, since with the computation speed already dictated by m_(k-1), there is usually no advantage in fragmenting the design of Fig. 4.2 through the use of very small moduli at the right end.
We illustrate the process of selecting the RNS moduli through an example. Let us assume that we want to represent unsigned integers in the range 0 to (100 000)_ten, requiring 17 bits with standard binary representation.
A simple strategy is to pick prime numbers in sequence until the dynamic range M becomes adequate. Thus, we pick m_0 = 2, m_1 = 3, m_2 = 5, etc. After we add m_5 = 13 to our list, the dynamic range becomes:
RNS(13 | 11 | 7 | 5 | 3 | 2)    M = 30 030
This range is not yet adequate, so we add m_6 = 17 to the list:
RNS(17 | 13 | 11 | 7 | 5 | 3 | 2)    M = 510 510
The dynamic range is now 5.1 times larger than needed, so we can remove the modulus 5 and still have adequate range:
RNS(17 | 13 | 11 | 7 | 3 | 2)    M = 102 102
With binary encoding of the six residues, the number of bits needed for encoding each number is:
5 + 4 + 4 + 3 + 2 + 1 = 19 bits
Now, since the speed of arithmetic operations is dictated by the 5-bit residues of the modulus 17, we can combine the pairs of moduli 2 and 13, and 3 and 7, with no speed penalty. This leads to:
RNS(26 | 21 | 17 | 11)    M = 102 102
This alternative RNS still needs 5 + 5 + 5 + 4 = 19 bits per operand, but has two fewer modules in the arithmetic unit.
Better results can be obtained if we proceed as above, but include powers of smaller primes before moving to larger primes. The chosen moduli will still be pairwise relatively prime, since powers of two distinct primes are relatively prime. For example, after including m_0 = 2 and m_1 = 3 in our list of moduli, we note that 2^2 is smaller than the next prime 5. So we replace m_0 = 2 with 2^2 to get:

RNS(2^2 | 3)    M = 12
This strategy is consistent with our desire to minimize the magnitude of the largest modulus.
Similarly, after we have included m_2 = 5 and m_3 = 7, we note that both 2^3 and 3^2 are smaller than the next prime 11. So the next three steps lead to:

RNS(2^3 | 3^2 | 7 | 5)              M = 2520
RNS(11 | 2^3 | 3^2 | 7 | 5)         M = 27 720
RNS(13 | 11 | 2^3 | 3^2 | 7 | 5)    M = 360 360
The dynamic range is now 3.6 times larger than needed, so we can replace the modulus 3^2 with 3 and then combine the pair 5 and 3 into the single modulus 15 to obtain:

RNS(15 | 13 | 11 | 2^3 | 7)    M = 120 120
The number of bits needed by this last RNS is
4 + 4 + 4 + 3 + 3 = 18 bits
which is better than our earlier result of 19 bits. The speed has also improved because the largest residue is now 4 bits wide instead of 5.
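The bookkeeping in this kind of moduli search is easy to mechanize. The Python sketch below (our own helper, not from the text) validates a candidate set by checking pairwise relative primality and reporting the dynamic range and total encoding width:

```python
# Validate a candidate RNS moduli set: the moduli must be pairwise
# relatively prime; report (range adequate?, M, total bits).
from math import gcd, prod

def check_moduli(moduli, needed_range):
    assert all(gcd(a, b) == 1
               for i, a in enumerate(moduli) for b in moduli[i + 1:])
    M = prod(moduli)                                  # dynamic range
    bits = sum((m - 1).bit_length() for m in moduli)  # residue widths
    return M >= needed_range, M, bits

print(check_moduli((15, 13, 11, 8, 7), 100_000))  # (True, 120120, 18)
print(check_moduli((26, 21, 17, 11), 100_000))    # (True, 102102, 19)
```

The two calls reproduce the 18-bit and 19-bit totals derived in the text for the two candidate systems.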
Other variations are possible. For example, given the simplicity of operations with power-of-2 moduli, we might want to backtrack and maximize the size of our even modulus within the 4-bit residue limit:

RNS(2^4 | 13 | 11 | 3^2 | 7 | 5)    M = 720 720
We can now remove 5 or 7 from the list of moduli, but the resulting RNS is in fact inferior to RNS(15 | 13 | 11 | 2^3 | 7). This might not be the case with other examples; thus, once we have converged on a feasible set of moduli, we should experiment with other sets that can be derived from it by increasing the power of the even modulus at hand.
The preceding strategy for selecting the RNS moduli is guaranteed to lead to the smallest possible number of bits for the largest modulus, thus maximizing the speed of RNS arithmetic. However, speed and cost do not just depend on the widths of the residues but also on the moduli chosen. For example, we have already noted that power-of-2 moduli simplify the required arithmetic operations, so that the modulus 16 might be better than the smaller modulus 13 (except, perhaps, with table-lookup implementation). Moduli of the form 2^a - 1 are also desirable and are referred to as low-cost moduli [Merr64], [Parh76]. From our discussion of addition of 1's-complement numbers in Section 2.4, we know that addition modulo 2^a - 1 can be performed using a standard a-bit binary adder with end-around carry.
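The end-around-carry trick can be illustrated with a short behavioral sketch in Python (our own naming, not a hardware description): a carry-out of the a-bit adder is worth 2^a, which is congruent to 1 mod (2^a - 1), so it is dropped and 1 is added back in.

```python
# Addition modulo 2^a - 1 via end-around carry: the carry-out of an
# a-bit adder is worth 2^a = 1 mod (2^a - 1).
def add_mod_2a_minus_1(x, y, a):
    s = x + y
    if s >> a:                        # carry-out of the a-bit adder
        s = (s & ((1 << a) - 1)) + 1  # drop the carry, add 1 back in
    return s

print(add_mod_2a_minus_1(25, 14, 5))   # 8, since 39 mod 31 = 8
```

As with 1's-complement numbers, the all-1s pattern 2^a - 1 acts as a second representation of zero in this scheme.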
Hence, we are motivated to restrict the moduli to a power of 2 and odd numbers of the form 2^a - 1. One can prove (left as an exercise) that the numbers 2^a - 1 and 2^b - 1 are relatively prime if and only if a and b are relatively prime. Thus, any list of relatively prime numbers a_(k-2) > ... > a_1 > a_0 can be the basis of the following k-modulus RNS:

RNS(2^a_(k-2) | 2^a_(k-2) - 1 | ... | 2^a_1 - 1 | 2^a_0 - 1)
for which the widest residues are a_(k-2)-bit numbers. Note that to maximize the dynamic range with a given residue width, the even modulus is chosen to be as large as possible.
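The relative-primality claim for such moduli can be spot-checked numerically; it follows from the standard identity gcd(2^a - 1, 2^b - 1) = 2^gcd(a, b) - 1, which the sketch below verifies over a small range of exponents:

```python
# gcd(2^a - 1, 2^b - 1) = 2^gcd(a, b) - 1, so 2^a - 1 and 2^b - 1 are
# relatively prime exactly when a and b are relatively prime.
from math import gcd

for a in range(2, 12):
    for b in range(2, 12):
        assert gcd(2**a - 1, 2**b - 1) == 2**gcd(a, b) - 1
print("identity verified for all exponent pairs in [2, 11]")
```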
Applying this strategy to our desired RNS with the target range [0, 100 000] leads to the following steps:

RNS(2^3 | 2^3 - 1 | 2^2 - 1)              Basis: 3, 2     M = 168
RNS(2^4 | 2^4 - 1 | 2^3 - 1)              Basis: 4, 3     M = 1680
RNS(2^5 | 2^5 - 1 | 2^3 - 1 | 2^2 - 1)    Basis: 5, 3, 2  M = 20 832
RNS(2^5 | 2^5 - 1 | 2^4 - 1 | 2^3 - 1)    Basis: 5, 4, 3  M = 104 160
This last system, RNS(32 | 31 | 15 | 7), possesses adequate range. Note that once the number 4 is included in the basis list, 2 must be excluded, because 4 and 2, and thus 2^4 - 1 and 2^2 - 1, are not relatively prime.
The derived RNS requires 5 + 5 + 4 + 3 = 17 bits for representing each number, with the largest residues being 5 bits wide. In this case, since log2 104 160 = 16.67, only about 0.3 bit of the 17 goes to waste. In general, the representational efficiency of low-cost RNS is provably better than 50% (yet another exercise!), leading to the waste of no more than 1 bit in number representation.
To compare the RNS above to our best result with unrestricted moduli, we list the parameters of the two systems together:

RNS(15 | 13 | 11 | 2^3 | 7)                18 bits    M = 120 120
RNS(2^5 | 2^5 - 1 | 2^4 - 1 | 2^3 - 1)     17 bits    M = 104 160
Both systems provide the desired range. The latter has wider, but fewer, residues. However, the simplicity of arithmetic with low-cost moduli makes the latter a more attractive choice. In general, restricting the moduli tends to increase the width of the largest residues and the optimal choice is dependent on both the application and the target implementation technology.
4.3 ENCODING AND DECODING OF NUMBERS
Since input numbers provided from the outside (machine or human interface) are in standard binary or decimal and outputs must be presented in the same way, conversions between binary/decimal and RNS representations are required.
Conversion from binary/decimal to RNS
The binary-to-RNS conversion problem is stated as follows: given a number y, find its residues with respect to the moduli m_i, 0 ≤ i ≤ k - 1. Let us assume that y is an unsigned binary number. Conversion of signed-magnitude or 2's-complement numbers can be accomplished by converting the magnitude and then complementing the RNS representation if needed.
To avoid time-consuming divisions, we take advantage of the following equality for y = (y_(k-1) ... y_1 y_0)_two:

y mod m_i = (y_(k-1) (2^(k-1) mod m_i) + ... + y_1 (2 mod m_i) + y_0) mod m_i

If we precompute and store 2^j mod m_i for each i and j, then the residue x_i of y can be computed by modulo-m_i addition of some of these constants.
Table 4.1 shows the required lookup table for converting 10-bit binary numbers in the range [0, 839] to RNS(8 | 7 | 5 | 3). Only the residues mod 7, mod 5, and mod 3 are given in the table, since the residue mod 8 is directly available as the 3 least significant bits of the binary number y.
Example 4.1 Represent y = (1010 0100)_two = (164)_ten in RNS(8 | 7 | 5 | 3).

The residue of y mod 8 is x_3 = (y_2 y_1 y_0)_two = (100)_two = 4. Since y = 2^7 + 2^5 + 2^2, the required residues mod 7, mod 5, and mod 3 are obtained by simply adding the values stored in the three rows corresponding to j = 7, 5, 2 in Table 4.1:

x_2 = (2 + 4 + 4) mod 7 = 3
x_1 = (3 + 2 + 4) mod 5 = 4
x_0 = (2 + 2 + 1) mod 3 = 2

Therefore, the RNS(8 | 7 | 5 | 3) representation of (164)_ten is (4 | 3 | 4 | 2)_RNS.
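The table-driven conversion of Example 4.1 can be mimicked in Python (our own sketch; pow(2, j, m) plays the role of the precomputed Table 4.1 entries):

```python
# Binary-to-RNS conversion without division: for each 1 bit y_j of y,
# add the precomputed constant 2^j mod m, then reduce modulo m.
MODULI = (8, 7, 5, 3)
POW2 = {m: [pow(2, j, m) for j in range(10)] for m in MODULI}

def to_rns_table(y):
    ones = [j for j in range(10) if (y >> j) & 1]   # positions of 1 bits
    return tuple(sum(POW2[m][j] for j in ones) % m for m in MODULI)

print(to_rns_table(164))   # (4, 3, 4, 2), as in Example 4.1
```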
In the worst case, k modular additions are required for computing each residue of a k-bit number. To reduce the number of operations, one can view the given input number as a number in a higher radix. For example, if we use radix 4, then storing the residues of 4^j, 2 × 4^j, and 3 × 4^j in a table would allow us to compute each of the required residues using only k/2 modular additions.
The conversion for each modulus can be done by repeatedly using a single lookup table and modular adder, or by several copies of each arranged into a pipeline. For a low-cost modulus m = 2^a - 1, the residue can be determined by dividing y into a-bit segments and adding them modulo 2^a - 1.
TABLE 4.1
Precomputed residues of the first 10 powers of 2

j    2^j    2^j mod 7    2^j mod 5    2^j mod 3
0      1        1            1            1
1      2        2            2            2
2      4        4            4            1
3      8        1            3            2
4     16        2            1            1
5     32        4            2            2
6     64        1            4            1
7    128        2            3            2
8    256        4            1            1
9    512        1            2            2

Conversion from RNS to mixed-radix form
Associated with any residue number system RNS(m_(k-1) | ... | m_2 | m_1 | m_0) is a mixed-radix number system MRS(m_(k-1) | ... | m_2 | m_1 | m_0), which is essentially a k-digit positional number system with position weights

m_(k-2) ... m_1 m_0,  ...,  m_1 m_0,  m_0,  1

and digit sets [0, m_(k-1) - 1], ..., [0, m_2 - 1], [0, m_1 - 1], and [0, m_0 - 1] in its k digit positions. Hence, the MRS digits are in the same ranges as the RNS digits (residues). For example, the mixed-radix system MRS(8 | 7 | 5 | 3) has position weights 7 × 5 × 3 = 105, 5 × 3 = 15, 3, and 1, leading to:
(0 | 3 | 1 | 0)_MRS(8|7|5|3) = (0 × 105) + (3 × 15) + (1 × 3) + (0 × 1) = 48
The RNS-to-MRS conversion problem is that of determining the z_i digits of the MRS, given the x_i digits of the RNS, so that:

y = (x_(k-1) | ... | x_2 | x_1 | x_0)_RNS = (z_(k-1) | ... | z_2 | z_1 | z_0)_MRS
From the definition of MRS, we have:

y = z_(k-1)(m_(k-2) ... m_1 m_0) + ... + z_2(m_1 m_0) + z_1(m_0) + z_0

It is thus immediately obvious that z_0 = x_0. Subtracting z_0 = x_0 from both the RNS and MRS representations, we get
y - x_0 = (x'_(k-1) | ... | x'_2 | x'_1 | 0)_RNS = (z_(k-1) | ... | z_2 | z_1 | 0)_MRS

where x'_j = (x_j - x_0) mod m_j. If we now divide both representations by m_0, we get the following in the reduced RNS and MRS from which m_0 has been removed:

(y - x_0)/m_0 = (x''_(k-1) | ... | x''_2 | x''_1)_RNS = (z_(k-1) | ... | z_2 | z_1)_MRS
Thus, if we demonstrate how to divide the number y' = (x'_(k-1) | ... | x'_2 | x'_1 | 0)_RNS by m_0 to obtain (x''_(k-1) | ... | x''_2 | x''_1)_RNS, we have converted the original problem to a similar problem with one fewer modulus. Repeating the same process then leads to the determination of all the z_i digits in turn.
Dividing y', which is a multiple of m_0, by m_0 is known as scaling and is much simpler than general division in RNS. Division by m_0 can be accomplished by multiplying each residue by the multiplicative inverse of m_0 with respect to the associated modulus. For example, the multiplicative inverses of 3 relative to 8, 7, and 5 are 3, 5, and 2, respectively, because:
(3 × 3) mod 8 = (3 × 5) mod 7 = (3 × 2) mod 5 = 1
Thus, the number y' = (0 | 6 | 3 | 0)_RNS can be divided by 3 through multiplication by (3 | 5 | 2 | -)_RNS:

(0 | 6 | 3 | 0)_RNS / 3 = (0 | 6 | 3 | 0)_RNS × (3 | 5 | 2 | -)_RNS = (0 | 2 | 1 | -)_RNS
Multiplicative inverses of the moduli can be precomputed and stored in tables to facilitate RNS-to-MRS conversion.
Example 4.2 Convert y = (0 | 6 | 3 | 0)_RNS to mixed-radix representation.

We have z_0 = x_0 = 0. Based on the preceding discussion, dividing y by 3 yields (0 | 2 | 1 | -)_RNS; hence z_1 = 1. Subtracting 1 and then dividing by 5 yields (3 | 3 | - | -)_RNS; hence z_2 = 3. We conclude by observing that z_3 = 0. The conversion is now complete:

y = (0 | 6 | 3 | 0)_RNS = (0 | 3 | 1 | 0)_MRS = 48
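Example 4.2's subtract-and-scale loop generalizes directly. Here is a compact Python version (our own implementation, using Python's pow(m, -1, mi) in place of the book's stored multiplicative inverses):

```python
# RNS-to-mixed-radix conversion: repeatedly extract the least
# significant MRS digit, subtract it, and scale by the removed modulus
# using multiplicative inverses.
MODULI = (8, 7, 5, 3)          # m3, m2, m1, m0

def rns_to_mrs(digits):
    moduli = list(MODULI[::-1])            # m0 first
    x = list(digits[::-1])                 # x0 first
    z = []
    while moduli:
        m0, z0 = moduli.pop(0), x.pop(0)
        z.append(z0)                       # next mixed-radix digit
        x = [((xi - z0) * pow(m0, -1, mi)) % mi
             for xi, mi in zip(x, moduli)] # subtract z0, divide by m0
    return tuple(z[::-1])

print(rns_to_mrs((0, 6, 3, 0)))   # (0, 3, 1, 0), i.e., the number 48
```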
Mixed-radix representation allows us to compare the magnitudes of two RNS numbers or to detect the sign of a number. For example, the RNS representations (0 | 6 | 3 | 0)_RNS and (5 | 3 | 0 | 0)_RNS of 48 and 45 provide no clue to their relative magnitudes, whereas the equivalent mixed-radix representations (0 | 3 | 1 | 0)_MRS and (0 | 3 | 0 | 0)_MRS, or (000 | 011 | 001 | 00)_MRS and (000 | 011 | 000 | 00)_MRS when coded in binary, can be compared as ordinary numbers.
Conversion from RNS to binary/decimal
One method for RNS-to-binary conversion is to first derive the mixed-radix representation of the RNS number and then use the weights of the mixed-radix positions to complete the conversion. We can also derive position weights for the RNS directly, based on the Chinese remainder theorem (CRT), as discussed below.
Consider the conversion of y = (3 | 2 | 4 | 2)_RNS from RNS(8 | 7 | 5 | 3) to decimal. Based on RNS properties, we can write:

(3 | 2 | 4 | 2)_RNS = (3 | 0 | 0 | 0)_RNS + (0 | 2 | 0 | 0)_RNS + (0 | 0 | 4 | 0)_RNS + (0 | 0 | 0 | 2)_RNS
                    = 3 × (1 | 0 | 0 | 0)_RNS + 2 × (0 | 1 | 0 | 0)_RNS + 4 × (0 | 0 | 1 | 0)_RNS + 2 × (0 | 0 | 0 | 1)_RNS
Thus, knowing the values of the following four constants (the RNS position weights) would allow us to convert any number from RNS(8 | 7 | 5 | 3) to decimal using four multiplications and three additions:

(1 | 0 | 0 | 0)_RNS = 105
(0 | 1 | 0 | 0)_RNS = 120
(0 | 0 | 1 | 0)_RNS = 336
(0 | 0 | 0 | 1)_RNS = 280
Thus, we find:

(3 | 2 | 4 | 2)_RNS = ((3 × 105) + (2 × 120) + (4 × 336) + (2 × 280)) mod 840 = 779
It only remains to show how the preceding weights were derived. How, for example, did we determine that w_3 = (1 | 0 | 0 | 0)_RNS = 105?
To determine the value of w_3, we note that it is divisible by 3, 5, and 7, since its last three residues are 0s. Hence, w_3 must be a multiple of 105. We must then pick the right multiple of 105 such that its residue with respect to 8 is 1. This is done by multiplying 105 by its multiplicative inverse with respect to 8; since 105 mod 8 = 1, that inverse is 1, and w_3 = 105 itself. Based on the preceding discussion, the conversion process can be formalized in the form of the Chinese remainder theorem.
THEOREM 4.1 (The Chinese remainder theorem) The magnitude of an RNS number can be obtained from the CRT formula:

x = (x_(k-1) | ... | x_2 | x_1 | x_0)_RNS = ( Σ_(i=0 to k-1) M_i ((α_i x_i) mod m_i) ) mod M

where, by definition, M_i = M/m_i, and α_i = (M_i^(-1)) mod m_i is the multiplicative inverse of M_i with respect to m_i.
TABLE 4.2
Values needed in applying the Chinese remainder theorem to RNS(8 | 7 | 5 | 3)

To avoid multiplications in the conversion process, we can store the values of M_i ((α_i x_i) mod m_i) mod M for all possible i and x_i in tables of total size m_0 + m_1 + ... + m_(k-1) words. Table 4.2 shows the required values for RNS(8 | 7 | 5 | 3). Conversion is then performed exclusively by table lookups and modulo-M additions.
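A direct transcription of the CRT formula into Python (our own sketch; the lookups of Table 4.2 are replaced by on-the-fly computation of M_i and α_i):

```python
# RNS-to-binary decoding by the Chinese remainder theorem:
# x = ( sum_i M_i * ((alpha_i * x_i) mod m_i) ) mod M.
from math import prod

MODULI = (8, 7, 5, 3)
M = prod(MODULI)                       # 840

def crt_decode(digits):
    total = 0
    for m, x in zip(MODULI, digits):
        Mi = M // m
        alpha = pow(Mi, -1, m)         # multiplicative inverse of Mi mod m
        total += Mi * ((alpha * x) % m)
    return total % M

print(crt_decode((3, 2, 4, 2)))   # 779
print(crt_decode((1, 0, 0, 0)))   # 105, the weight w3 of the mod-8 position
```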
4.4 DIFFICULT RNS ARITHMETIC OPERATIONS
In this section, we discuss algorithms and hardware designs for sign test, magnitude comparison, overflow detection, and general division in RNS. The first three of these operations are essentially equivalent, in that if an RNS with dynamic range M is used for representing signed numbers in the range [-N, P], with M = N + P + 1, then sign test is the same as comparison with P, and overflow detection can be performed based on the signs of the operands and that of the result. Thus, it suffices to discuss magnitude comparison and general division.
To compare the magnitudes of two RNS numbers, we can convert both to binary or mixed-radix form. However, this would involve a great deal of overhead. A more efficient approach is through approximate CRT decoding. Dividing the equality in the statement of Theorem 4.1 by M, we obtain the following expression for the scaled value of x in [0, 1):
x/M = ( Σ_(i=0 to k-1) (1/m_i) ((α_i x_i) mod m_i) ) mod 1

Here, the addition of terms is performed modulo 1, meaning that in adding the terms (1/m_i)((α_i x_i) mod m_i), each of which is in [0, 1), the whole part of the result is discarded and only the fractional part is kept; this is much simpler than the modulo-M addition needed in standard CRT decoding.
Again, the terms (1/m_i)((α_i x_i) mod m_i) can be precomputed for all possible i and x_i and stored in tables of total size m_0 + m_1 + ... + m_(k-1) words. Table 4.3 shows the required lookup table for approximate CRT decoding in RNS(8 | 7 | 5 | 3). Conversion is then performed exclusively by table lookups and modulo-1 additions (i.e., fractional addition, with the carry-out simply ignored).
Example 4.3 Use approximate CRT decoding to determine the larger of the two numbers x = (0 | 6 | 3 | 0)_RNS and y = (5 | 3 | 0 | 0)_RNS.

Reading values from Table 4.3, we get:

x/M ≈ (.0000 + .8571 + .2000 + .0000) mod 1 ≈ .0571
y/M ≈ (.6250 + .4286 + .0000 + .0000) mod 1 ≈ .0536

Thus, we can conclude that x > y, subject to approximation errors to be discussed next.
If the maximum error in each table entry is ε, then approximate CRT decoding yields the scaled value of an RNS number with an error of no more than kε. In the preceding example, assuming that the table entries have been rounded to four decimal digits, the maximum error in each entry is ε = 0.00005 and the maximum error in the scaled value is 4ε = 0.0002. The conclusion x > y is, therefore, safe.
Of course we can use highly precise table entries to avoid the possibility of erroneous conclusions altogether. But this would defeat the advantage of approximate CRT decoding in simplicity and speed. Thus, in practice, a two-stage process might be envisaged: a quick approximate decoding process is performed first, with the resulting scaled value(s) and error bound(s) used to decide whether a more precise or exact decoding is needed for arriving at a conclusion.
In many practical situations, an exact comparison of x and y might not be required; a ternary decision result x < y, x ≈ y (i.e., too close to call), or x > y might do. In such cases, approximate CRT decoding is just the right tool. For example, in certain division algorithms (to be discussed in Chapter 14), the sign and the magnitude of the partial remainder s are used to choose the next quotient digit q_j from the redundant digit set [-1, 1] according to the following:

s < 0    quotient digit = -1
s ≈ 0    quotient digit = 0
s > 0    quotient digit = 1
In this case, the algorithm's built-in tolerance to imprecision allows us to use it for RNS division. Once the quotient digit in [-1, 1] has been chosen, the value q_j d, where d is the divisor, is subtracted from the partial remainder to obtain the new partial remainder for the next iteration.
TABLE 4.3
Values needed in applying approximate Chinese remainder theorem decoding to RNS(8 | 7 | 5 | 3)
Also, the quotient, derived in positional radix-2 format using the digit set [-1, 1], is converted to RNS on the fly.
In other division algorithms, to be discussed in Chapters 14 and 15, approximate comparison of the partial remainder s and divisor d is used to choose a radix-r quotient digit in [-α, α]. An example is radix-4 division with the quotient digit set [-2, 2]. In these cases, too, approximate CRT decoding can be used to facilitate RNS division [Hung94].
4.5 REDUNDANT RNS REPRESENTATIONS
Just as the digits in a positional radix-r number system do not have to be restricted to the set [0, r - 1], we are not obliged to limit the residue digits for the modulus m_i to the set [0, m_i - 1]. Instead, we can agree to use the digit set [0, β_i] for the mod-m_i residue, provided β_i ≥ m_i - 1. If β_i ≥ m_i, then the resulting RNS is redundant.
One reason to use redundant residues is to simplify the modular reduction step needed after each arithmetic operation. Consider, for example, the representation of mod-13 residues using 4-bit binary numbers. Instead of using residues in [0, 12], we can use pseudoresidues in [0, 15].
Fig. 4.3 Adder design for 4-bit mod-13 pseudoresidues.
Residues 0, 1, and 2 will then have two representations, since 13 = 0 mod 13, 14 = 1 mod 13, and 15 = 2 mod 13. Addition of such pseudoresidues can be performed by a 4-bit binary adder. If the carry-out is 0, the addition result is kept intact; otherwise, the carry-out, which is worth 16 units, is dropped and 3 is added to the result. Thus, the required mod-13 addition unit is as shown in Fig. 4.3.
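In Python, the behavior of the Fig. 4.3 unit can be sketched as follows (our own model; inputs are 4-bit pseudoresidues in [0, 15]):

```python
# Mod-13 pseudoresidue addition with a 4-bit adder: a carry-out is
# worth 16 = 3 mod 13, so the carry is dropped and 3 is added instead.
def add_mod13_pseudo(x, y):
    s = x + y
    if s >= 16:                # carry-out of the 4-bit adder
        s = (s - 16) + 3
    return s                   # always congruent to (x + y) mod 13

print(add_mod13_pseudo(9, 11))   # 7, and indeed (9 + 11) mod 13 = 7
print(add_mod13_pseudo(14, 5))   # 6, and indeed (14 + 5) mod 13 = 6
```

The result always has the correct value mod 13; for the rare operand pairs whose sum exceeds 28, the corrected sum can momentarily exceed 15, a corner case a hardware design must account for.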
One can go even further and make the pseudoresidues 2h bits wide, where normal mod-m residues would be only h bits wide. This simplifies a multiply-accumulate operation, which is done by adding the 2h-bit product of two normal residues to a 2h-bit running total, reducing the (2h + 1)-bit result to a 2h-bit pseudoresidue for the next step by subtracting 2^h m from it if needed (Fig. 4.4). Reduction to a standard h-bit residue is then done only once, at the end of accumulation.
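A behavioral sketch of this double-width scheme (our own Python model of the Fig. 4.4 cell; m and h are parameters, and the correction subtracts the constant 2^h × m):

```python
# Multiply-accumulate with 2h-bit pseudoresidues: add an h-bit-residue
# product to the running total; on overflow past 2^(2h), subtract the
# constant 2^h * m (a multiple of m) to return to 2h bits.
def mac_pseudo(total, x, y, m, h):
    s = total + x * y                  # at most 2h + 1 bits
    if s >= 1 << (2 * h):
        s -= (1 << h) * m
    return s                           # a 2h-bit pseudoresidue of s mod m

m, h = 13, 4
total = 0
for x, y in [(12, 11), (9, 10), (7, 7)]:   # operands already reduced mod 13
    total = mac_pseudo(total, x, y, m, h)
print(total % m, (12*11 + 9*10 + 7*7) % m)   # 11 11: final reduction agrees
```

A single subtraction suffices here because the operands are normal residues (less than m), keeping the corrected sum within 2h bits.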
4.6 LIMITS OF FAST ARITHMETIC IN RNS
How much faster is RNS arithmetic than conventional (say, binary) arithmetic? We will see later in Chapters 6 and 7 that addition of binary numbers in the range [0, M - 1] can be done in
Fig. 4.4 A modulo-m multiply-add cell that accumulates the sum into a double-length redundant pseudoresidue (2h-bit sum in, 2h-bit sum out).
O(log log M) time and with O(log M) cost using a variety of methods such as carry-lookahead, conditional-sum, or multilevel carry-select. Both of these are optimal to within constant factors, given the fixed-radix positional representation. For example, one can use the constant fan-in argument to establish that the circuit depth of an adder must be at least logarithmic in the number k = log_r M of digits. Redundant representations allow O(1)-time, O(log M)-cost addition. What is the best one can do with RNS arithmetic?
Consider the residue number system RNS(m_(k-1) | ... | m_1 | m_0). Assume that the moduli are chosen as the smallest possible prime numbers, to minimize the size of the moduli and thus maximize computation speed. The following theorems from number theory help us in figuring out the complexity.
THEOREM 4.2 The ith prime p_i is asymptotically equal to i ln i.
THEOREM 4.3 The number of primes in [1, n] is asymptotically equal to n/(ln n).
THEOREM 4.4 The product of all primes in [1, n] is asymptotically equal to e^n.
Table 4.4 lists some numerical values that can help us understand the asymptotic approximations given in Theorems 4.2 and 4.3.
Armed with these results from number theory, we can derive an interesting limit on the speed of RNS arithmetic.
TABLE 4.4
The ith prime p_i and the number of primes in [1, n] versus their asymptotic approximations

  i    p_i    i ln i    Error (%)  |     n    Primes in [1, n]    n/(ln n)    Error (%)
  1      2      0.000      100     |     5            2             3.107        55
  2      3      1.386       54     |    10            4             4.343         9
  3      5      3.296       34     |    15            6             5.539         8
  4      7      5.545       21     |    20            8             6.676        17
  5     11      8.047       27     |    25            9             7.767        14
 10     29     23.03        21     |    30           10             8.820        12
 15     47     40.62        14     |    40           12            10.84         10
 20     71     59.91        16     |    50           15            12.78         15
 30    113    102.0         10     |   100           25            21.71         13
 40    173    147.6         15     |   200           46            37.75         18
 50    229    195.6         15     |   500           95            80.46         15
100    521    460.5         12     |  1000          170           144.8          15
THEOREM 4.5 It is possible to represent all k-bit binary numbers in RNS with O(k/log k) moduli such that the largest modulus has O(log k) bits.
Proof: If the largest needed prime is n, by Theorem 4.4 we must have e^n >= 2^k. This inequality implies n ~ k ln 2 < k. The number of moduli required is the number of primes up to n, which by Theorem 4.3 is O(n/log n) = O(k/log k).
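The construction in the proof is direct to carry out: keep taking the smallest unused primes until their product covers the k-bit range. A sketch (the function name is illustrative):

```python
def smallest_prime_moduli(k):
    """Collect the smallest primes 2, 3, 5, ... until their product
    reaches 2**k, so that the RNS can represent all k-bit numbers."""
    moduli, product, candidate = [], 1, 2
    while product < 2 ** k:
        # Trial division by the primes found so far suffices, since any
        # composite candidate has a prime factor among them.
        if all(candidate % p != 0 for p in moduli):
            moduli.append(candidate)
            product *= candidate
        candidate += 1
    return moduli

# For k = 24: nine moduli, the largest (23) needing only 5 bits.
print(smallest_prime_moduli(24))    # [2, 3, 5, 7, 11, 13, 17, 19, 23]
```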
As a result, addition of such residue numbers can be performed in O(log log log M) time and with O(log M) cost. So, the cost of addition is comparable to that of binary representation, whereas the delay is much smaller, though not constant.
If, for implementation ease, we limit ourselves to moduli of the form 2^a or 2^a - 1, the following results from number theory are applicable.
THEOREM 4.6 The numbers 2^a - 1 and 2^b - 1 are relatively prime if and only if a and b are relatively prime.
THEOREM 4.7 The sum of the first j primes is asymptotically O(j^2 ln j).
These theorems allow us to prove the following asymptotic result for low-cost residue number systems.
THEOREM 4.8 It is possible to represent all k-bit binary numbers in RNS with O((k/log k)^(1/2)) low-cost moduli of the form 2^a - 1 such that the largest modulus has O((k log k)^(1/2)) bits.
Proof: If the largest modulus that we need is 2^(p_l) - 1, by Theorem 4.7 we must have l^2 ln l <= k. This implies that l = O((k/log k)^(1/2)). By Theorem 4.2, the lth prime is approximately p_l ~ l ln l = O((k log k)^(1/2)). The proof is complete upon noting that to minimize the size of the moduli, we pick the ith modulus to be 2^(p_i) - 1.
As a result, addition of low-cost residue numbers can be performed in O(log log M) time with O(log M) cost and thus, asymptotically, offers little advantage over standard binary.
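The appeal of the form 2^a - 1 is that residue addition reduces to ordinary binary addition with an end-around carry. A sketch of that step, checked against ordinary modular addition (exponents 2, 3, 5 are used for illustration; being pairwise relatively prime, they yield pairwise relatively prime moduli by Theorem 4.6):

```python
def add_mod_2a_minus_1(x, y, a):
    """Add residues x, y modulo 2**a - 1 with an end-around carry:
    a carry out of bit position a - 1 is fed back into position 0."""
    s = x + y
    if s >= 2 ** a:          # carry-out produced
        s = s - 2 ** a + 1   # discard the carry's weight 2**a, add 1 back
    if s == 2 ** a - 1:      # normalize the two representations of zero
        s = 0
    return s

# With exponents 2, 3, 5, the moduli are 3, 7, 31: a dynamic range of
# 3 * 7 * 31 = 651 at a total width of 2 + 3 + 5 = 10 bits.
```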
PROBLEMS

4.1 RNS representation and arithmetic Consider the RNS system RNS(15 | 13 | 11 | 8 | 7) derived in Section 4.2.
a. Represent the numbers x = 168 and y = 23 in this RNS.
b. Compute x + y, x - y, and x × y, checking the results via decimal arithmetic.
c. Knowing that x is a multiple of 56, divide it by 56 in the RNS. Hint: 56 = 7 x 8.
d. Compare the numbers (5 | 4 | 3 | 2 | 1)RNS and (1 | 2 | 3 | 4 | 5)RNS using mixed-radix conversion.
e. Convert the numbers (5 | 4 | 3 | 2 | 1)RNS and (1 | 2 | 3 | 4 | 5)RNS to decimal.
f. What is the representational efficiency of this RNS compared to standard binary?
4.2 RNS representation and arithmetic Consider the low-cost RNS system RNS(32 | 31 | 15 | 7) derived in Section 4.2.
a. Represent the numbers x = 168 and y = -23 in this RNS.
b. Compute x + y, x - y, and x × y, checking the results via decimal arithmetic.
c. Knowing that x is a multiple of 7, divide it by 7 in the RNS.
d. Compare the numbers (4 | 3 | 2 | 1)RNS and (1 | 2 | 3 | 4)RNS using mixed-radix conversion.
e. Convert the numbers (4 | 3 | 2 | 1)RNS and (1 | 2 | 3 | 4)RNS to decimal.
f. What is the representational efficiency of this RNS compared to standard binary?
4.3 RNS representation Find all numbers for which the RNS(8 | 7 | 5 | 3) representation is palindromic (i.e., the string of four "digits" reads the same forward and backward).
4.4 RNS versus GSD representation We are contemplating the use of 16-bit representations for fast integer arithmetic. One option, radix-8 GSD representation with the digit set [-5, 4], can accommodate four-digit numbers. Another is RNS(16 | 15 | 13 | 11) with complement representation of negative values.
a. Compute and compare the range of representable integers in the two systems.
b. Represent the integers +441 and -228 and add them in the two systems.
c. Briefly discuss and compare the complexity of multiplication in the two systems.
4.5 RNS representation and arithmetic Consider a residue number system that can be used to represent the equivalent of 24-bit, 2's-complement numbers.
a. Select the set of moduli to maximize the speed of arithmetic operations.
b. Determine the representational efficiency of the resulting RNS.
c. Represent the numbers x = +295 and y = -322 in this number system.
d. Compute the representations of x + y, x - y, and x × y; check the results.
4.6 Binary-to-RNS conversion In a residue number system, 11 is used as one of the moduli.
a. Design a mod-11 adder using two standard 4-bit binary adders and a few logic gates.
b. Using the adder of part a and a 10-word lookup table, show how the mod-11 residue of an arbitrarily long binary number can be computed by a serial-in, parallel-out circuit.
c. Repeat part a, assuming the use of mod-11 pseudoresidues in [0, 15].
d. Outline the changes needed in the design of part b if the adder of part c is used.
4.7 Low-cost RNS Consider residue number systems with moduli of the form 2^ai or 2^ai - 1.
a. Prove that mi = 2^ai - 1 and mj = 2^aj - 1 are relatively prime if and only if ai and aj are relatively prime.
b. Show that such a system wastes at most one bit relative to binary representation.
c. Determine an efficient set of moduli to represent the equivalent of 32-bit unsigned integers. Discuss your efficiency criteria.
4.8 Special RNS representations It has been suggested that moduli of the form 2^ai + 1 also offer speed advantages. Evaluate this claim by devising a suitable representation for the (ai + 1)-bit residues and dealing with arithmetic operations on such residues. Then, determine an efficient set of moduli of the form 2^ai and 2^ai ± 1 to represent the equivalent of 32-bit integers.
4.9 Overflow in RNS arithmetic Show that if 0 ≤ x, y < m, then (x + y) mod m causes overflow if and only if the result is less than x (thus the problem of overflow detection in RNS arithmetic is equivalent to the magnitude comparison problem).
4.10 Discrete logarithm Consider a prime modulus p. From number theory, we know that there always exists an integer generator g such that the powers g^0, g^1, g^2, g^3, ... (mod p) produce all the integers in [1, p - 1]. If g^i = x mod p, then i is called the mod-p, base-g discrete logarithm of x. Outline a modular multiplication scheme using discrete log and inverse log tables and an adder.
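As a hint toward a solution, here is a table-based sketch for p = 11 with generator g = 2 (the helper names are illustrative, not from the text):

```python
p, g = 11, 2                     # small prime modulus and one of its generators

# Build the inverse-log table antilog[i] = g**i mod p, and invert it.
antilog, x = {}, 1
for i in range(p - 1):
    antilog[i] = x
    x = x * g % p
log_tab = {value: i for i, value in antilog.items()}

def mul_mod_p(a, b):
    """Multiply residues with two log lookups, one mod-(p-1) addition,
    and one inverse-log lookup."""
    if a == 0 or b == 0:
        return 0                 # zero has no discrete log; treat it separately
    return antilog[(log_tab[a] + log_tab[b]) % (p - 1)]
```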
4.11 Halving even numbers in RNS Given the representation of an even number in an RNS with only odd moduli, find an efficient algorithm for halving the given number.
4.12 Symmetric RNS In a symmetric RNS, the residues are signed integers, possessing the smallest possible absolute values, rather than unsigned integers. Thus, for an odd modulus m, symmetric residues range from -(m - 1)/2 to (m - 1)/2 instead of from 0 to m - 1. Discuss the possible advantages of a symmetric RNS over ordinary RNS.
4.13 Approximate Chinese remainder theorem decoding Consider the numbers x = (0 | 6 | 3 | 0)RNS and y = (5 | 3 | 0 | 0)RNS of Example 4.3 in Section 4.4.
a. Redo the example and its associated error analysis with table entries rounded to two decimal digits. How does the conclusion change?
b. Redo the example with table entries rounded to three decimal digits and discuss.
4.14 Division of RNS numbers by the moduli
a. Show how an RNS number can be divided by one of the moduli to find the quotient and the remainder, both in RNS form.
b. Repeat part a for division by the product of two or more moduli.
4.15 RNS base extension Consider a k-modulus RNS and the representation of a number x in that RNS. Develop an efficient algorithm for deriving the representation of x in a (k + 1)-modulus RNS that includes all the moduli of the original RNS plus one more modulus that is relatively prime with respect to the preceding k. This process of finding a new residue given k existing residues is known as base extension.
4.16 Automorphic numbers An n-place automorph is an n-digit decimal number whose square ends in the same n digits. For example, 625 is a 3-place automorph, since 625^2 = 390625.
a. Prove that x > 1 is an n-place automorph if and only if x mod 5^n = 0 or 1 and x mod 2^n = 1 or 0, respectively.
b. Relate n-place automorphs to a 2-residue RNS with m1 = 5^n and m0 = 2^n.
c. Prove that if x is an n-place automorph, then (3x^2 - 2x^3) mod 10^(2n) is a 2n-place automorph.
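The claims in this problem can be spot-checked numerically before attempting the proofs (the helper name below is illustrative):

```python
def is_automorph(x, n):
    """True if x is an n-digit number whose square ends in those same n digits."""
    return 10 ** (n - 1) <= x < 10 ** n and x * x % 10 ** n == x

# Part a, for x = 625: 625 mod 5**3 = 0 and 625 mod 2**3 = 1.
assert is_automorph(625, 3) and 625 % 5 ** 3 == 0 and 625 % 2 ** 3 == 1

# Part c: (3x**2 - 2x**3) mod 10**(2n) yields a 2n-place automorph.
x2 = (3 * 625 ** 2 - 2 * 625 ** 3) % 10 ** 6
assert x2 == 890625 and is_automorph(x2, 6)
```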
REFERENCES
[Garn59] Garner, H. L., "The Residue Number System," IRE Trans. Electronic Computers, Vol. 8, pp. 140-147, June 1959.
[Jenk93] Jenkins, W. K., "Finite Arithmetic Concepts," in Handbook for Digital Signal Processing, S. K. Mitra and J. F. Kaiser (eds.), Wiley, 1993, pp. 611-675.
[Hung94] Hung, C. Y., and B. Parhami, "An Approximate Sign Detection Method for Residue Numbers and Its Application to RNS Division," Computers & Mathematics with Applications, Vol. 27, No. 4, pp. 23-35, 1994.
[Hung95] Hung, C. Y., and B. Parhami, "Error Analysis of Approximate Chinese-Remainder-Theorem Decoding," IEEE Trans. Computers, Vol. 44, No. 11, pp. 1344-1348, 1995.
[Merr64] Merrill, R. D., "Improving Digital Computer Performance Using Residue Number Theory," IEEE Trans. Electronic Computers, Vol. 13, No. 2, pp. 93-103, April 1964.
[Parh76] Parhami, B., "Low-Cost Residue Number Systems for Computer Arithmetic," AFIPS Conf. Proc., Vol. 45 (1976 National Computer Conference), AFIPS Press, 1976, pp. 951-956.
[Parh93] Parhami, B., and H.-F. Lai, "Alternate Memory Compression Schemes for Modular Multiplication," IEEE Trans. Signal Processing, Vol. 41, pp. 1378-1385, March 1993.
[Sode86] Soderstrand, M. A., W. K. Jenkins, G. A. Jullien, and F. J. Taylor (eds.), Residue Number System Arithmetic, IEEE Press, 1986.
[Szab67] Szabo, N. S., and R. I. Tanaka, Residue Arithmetic and Its Applications to Computer Technology, McGraw-Hill, 1967.
PART II
ADDITION/ SUBTRACTION
Addition is the most common arithmetic operation and also serves as a building block for synthesizing all other operations. Within digital computers, addition is performed extensively both in explicitly specified computation steps and as a part of implicit ones dictated by indexing and other forms of address arithmetic. In simple ALUs that lack dedicated hardware for fast multiplication and division, these latter operations are performed as sequences of additions. A review of fast addition schemes is thus an apt starting point in investigating arithmetic algorithms. Subtraction is normally performed by negating the subtrahend and adding the result to the minuend. This is quite natural, given that an adder must handle signed numbers anyway. Even when implemented directly, a subtractor is quite similar to an adder. Thus, in the following four chapters that constitute this part, we focus almost exclusively on addition:
Chapter 5 Basic Addition and Counting
Chapter 6 Carry-Lookahead Adders
Chapter 7 Variations in Fast Adders
Chapter 8 Multioperand Addition
Chapter 5
BASIC ADDITION AND COUNTING
As stated in Section 3.1, propagation of carries is a major impediment to high-speed addition with fixed-radix positional number representations. Before exploring various ways of speeding up the carry propagation process, however, we need to examine simple ripple-carry adders, the building blocks used in their construction, the nature of the carry propagation process, and the special case of counting. Chapter topics include:
5.1 Bit-Serial and Ripple-Carry Adders
5.2 Conditions and Exceptions
5.3 Analysis of Carry Propagation
5.4 Carry Completion Detection
5.5 Addition of a Constant: Counters
5.6 Manchester Carry Chains and Adders
5.1 BIT-SERIAL AND RIPPLE-CARRY ADDERS
Single-bit half-adders and full adders are versatile building blocks that are used in synthesizing adders and many other types of arithmetic circuits. A half-adder (HA) receives two input bits x and y, producing a sum bit s = x ⊕ y = x'y + xy' and a carry bit c = xy. Figure 5.1 depicts three of the many possible logic realizations of a half-adder. A half-adder can be viewed as a single-bit binary adder that produces the 2-bit sum of its single-bit inputs, namely, x + y = (c_out s)_two, where the plus sign in this expression stands for arithmetic sum rather than logical OR.
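In code, the half-adder's defining property, that its two outputs form the 2-bit sum of its inputs, can be checked exhaustively:

```python
def half_adder(x, y):
    """Sum bit s = x XOR y; carry bit c = x AND y."""
    return x ^ y, x & y

# The pair (c, s) is the 2-bit binary sum of x and y: x + y = 2*c + s.
for x in (0, 1):
    for y in (0, 1):
        s, c = half_adder(x, y)
        assert x + y == 2 * c + s
```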
A single-bit full adder (FA) is defined as follows:
Inputs: Operand bits x, y and carry-in c_in (or x_i, y_i, c_i for stage i)
Outputs: Sum bit s and carry-out c_out (or s_i and c_i+1 for stage i)

s = x ⊕ y ⊕ c_in (odd parity function)
  = x'y'c_in + x'y c_in' + xy'c_in' + xyc_in

c_out = xy + xc_in + yc_in (majority function)
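The parity/majority formulation above translates directly into a bitwise sketch:

```python
def full_adder(x, y, c_in):
    """s is the odd-parity (XOR) of the three inputs; c_out is their majority."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return s, c_out

# (c_out, s) is the 2-bit binary sum of the three input bits.
for x in (0, 1):
    for y in (0, 1):
        for c in (0, 1):
            s, c_out = full_adder(x, y, c)
            assert x + y + c == 2 * c_out + s
```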
76 Basic Addition and Counting
Fig. 5.1 Three implementations of a half-adder: (a) AND/XOR half-adder; (b) NOR-gate half-adder; (c) NAND-gate half-adder with complemented carry.
A full adder can be implemented by using two half-adders and an OR gate as shown in Fig. 5.2a. The OR gate in Fig. 5.2a can be replaced with a NAND gate if the two HAs are NAND-gate half-adders with complemented carry outputs. Alternatively, one can implement a full adder as a two-level AND-OR/NAND-NAND circuit according to the preceding logic equations for s and c_out (Fig. 5.2b). Because of the importance of the full adder as an arithmetic building block, many optimized FA designs exist for a variety of implementation technologies. Figure 5.2c shows a full adder, built of seven inverters and two 4-to-1 multiplexers (Mux), that is suitable for CMOS transmission-gate logic implementation.
Full and half-adders can be used for realizing a variety of arithmetic functions. We will see many examples in this and the following chapters. For instance, a bit-serial adder can be built from a full adder and a carry flip-flop, as shown in Fig. 5.3a. The operands are supplied to the FA one bit per clock cycle, beginning with the least significant bit, from a pair of shift registers, and the sum is shifted into a result register. Addition of k-bit numbers can thus be completed in k clock cycles. A k-bit ripple-carry binary adder requires k full adders, with the carry-out of the ith FA connected to the carry-in input of the (i + 1)th FA. The resulting k-bit adder produces a k-bit sum output and a carry-out; alternatively, c_out can be viewed as the most significant bit of a (k + 1)-bit sum. Figure 5.3b shows a ripple-carry adder for 4-bit operands, producing a 4-bit or 5-bit sum.
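The ripple-carry scheme just described can be sketched by chaining full adders, with the carry-out of each stage feeding the carry-in of the next (bit lists are least-significant-first):

```python
def full_adder(x, y, c):
    """Single-bit full adder: parity for the sum, majority for the carry."""
    return x ^ y ^ c, (x & y) | (x & c) | (y & c)

def ripple_carry_add(x_bits, y_bits, c_in=0):
    """k-bit ripple-carry addition; returns the k sum bits and c_out,
    which can also be read as the most significant bit of a (k+1)-bit sum."""
    sum_bits, carry = [], c_in
    for x, y in zip(x_bits, y_bits):
        s, carry = full_adder(x, y, carry)
        sum_bits.append(s)
    return sum_bits, carry

# 4-bit example: 11 + 6 = 17, i.e., sum bits 0001 (LSB first) with c_out = 1.
print(ripple_carry_add([1, 1, 0, 1], [0, 1, 1, 0]))    # ([1, 0, 0, 0], 1)
```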
The ripple-carry adder shown in Fig. 5.3b leads directly to a CMOS implementation with transmission-gate logic using the full adder design of Fig. 5.2c. A possible layout is depicted in Fig. 5.4, which also shows the approximate area requirements for the 4-bit ripple-carry adder in units of λ (half the minimum feature size). For details of this particular design, refer to [Puck94, pp. 213-223].
The latency of a k-bit ripple-carry adder can be derived by considering the worst-case signal propagation path. As shown in Fig. 5.5, the critical path usually begins at the x_0 or y_0 input, proceeds through the carry-propagation chain to the leftmost FA, and terminates at the s_k-1 output. Of course, it is possible that for some FA implementations, the critical path
Fig. 5.2 Possible designs for a full adder in terms of half-adders, logic gates, and CMOS transmission gates: (a) built of half-adders; (b) built as an AND-OR circuit; (c) suitable for CMOS realization.
might begin at c_0 and/or terminate at c_k. However, given that the delay from carry-in to carry-out is more important than the delay from x to carry-out or from carry-in to s, full-adder designs often minimize the delay from carry-in to carry-out, making the path shown in Fig. 5.5 the one with the largest delay. We can thus write the following expression for the latency of a k-bit ripple-carry adder:

T_ripple-add = T_FA(x, y → c_out) + (k - 2) × T_FA(c_in → c_out) + T_FA(c_in → s)

where T_FA(input → output) represents the latency of a full adder on the path between its specified input and output. As an approximation to the foregoing, we can say that the latency of a ripple-carry adder is kT_FA.
We see that the latency grows linearly with k, making the ripple-carry design undesirable for large k or for high-performance arithmetic units. Note that the latency of a bit-serial adder is also O(k), although the constant of proportionality is larger here because of the latching and clocking overheads.
Full and half-adders, as well as multibit binary adders, are powerful building blocks that can also be used in realizing nonarithmetic functions if the need arises. For example, a 4-bit binary adder with c_in, two 4-bit operand inputs, c_out, and a 4-bit sum output can be used to synthesize the four-variable logic function w + xyz and its complement, as depicted and justified in Fig. 5.6. The logic expressions written next to the arrows in Fig. 5.6 represent the carries between various stages. Note, however, that the 4-bit adder need not be implemented as a ripple-carry adder for the results at the outputs to be valid.
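As far as Fig. 5.6 can be reconstructed, the operand assignment is A = (0, w, z, y) and B = (1, 1, 0, x), written from bit 3 down to bit 0, with c_in = 0. Under that assumption the trick can be verified exhaustively (the function name is illustrative):

```python
def adder_as_logic(w, x, y, z):
    """Feed A = (0 w z y) and B = (1 1 0 x) to a 4-bit adder with c_in = 0.
    Then c_out realizes w + xyz, and sum bit 3 realizes its complement."""
    a = (w << 2) | (z << 1) | y          # bit 3 of A is 0
    b = (1 << 3) | (1 << 2) | x          # bits 3 and 2 of B are 1; bit 1 is 0
    total = a + b
    return (total >> 4) & 1, (total >> 3) & 1   # (c_out, s3)

# Check against f = w OR (x AND y AND z) for all 16 input combinations.
for bits in range(16):
    w, x, y, z = (bits >> 3) & 1, (bits >> 2) & 1, (bits >> 1) & 1, bits & 1
    f = w | (x & y & z)
    c_out, s3 = adder_as_logic(w, x, y, z)
    assert c_out == f and s3 == 1 - f
```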
Fig. 5.3 Using full adders in building bit-serial and ripple-carry adders: (a) bit-serial adder; (b) four-bit ripple-carry adder.
5.2 CONDITIONS AND EXCEPTIONS
When a k-bit adder is used in an ALU, it is customary to provide the k-bit sum along with information about the following outcomes, which are associated with flag bits within a condition/exception register:
Fig. 5.4 Layout of a 4-bit ripple-carry adder in CMOS implementation [Puck94].
Fig. 5.5 Critical path in a k-bit ripple-carry adder.
c_out: Indicating that a carry-out of 1 is produced
Overflow: Indicating that the output is not the correct sum
Negative: Indicating that the addition result is negative
Zero: Indicating that the addition result is zero
When we are adding unsigned numbers, c_out and "overflow" are one and the same, and the "sign" condition is obviously irrelevant. For 2's-complement addition, overflow occurs when two numbers of like sign are added and a result of the opposite sign is produced. Thus:

Overflow = x_k-1 y_k-1 s_k-1' + x_k-1' y_k-1' s_k-1
It is fairly easy to show that overflow in 2's-complement addition can be detected from the leftmost two carries as follows:

Overflow = c_k ⊕ c_k-1 = c_k c_k-1' + c_k' c_k-1
In 2's-complement addition, c_out has no significance. However, since a single adder is frequently used to add both unsigned and 2's-complement numbers, c_out is useful as well. Figure 5.7 shows a ripple-carry implementation of an unsigned or 2's-complement adder with auxiliary outputs for conditions and exceptions. Because of the large number of inputs into the NOR gate that tests for zero, it must be implemented as an OR tree followed by an inverter.
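A small sketch of a flag-producing adder along these lines; the internal carries are recovered bitwise from the identity c_i = x_i ⊕ y_i ⊕ s_i, so the 2's-complement overflow test is c_k ⊕ c_k-1:

```python
def add_with_flags(x, y, k):
    """Add two k-bit operands, returning the sum and the four flags.
    For 2's-complement inputs, 'overflow' is c_k XOR c_{k-1}; for
    unsigned inputs, c_out itself signals overflow."""
    mask = (1 << k) - 1
    raw = (x & mask) + (y & mask)
    s = raw & mask
    c_out = raw >> k                            # this is c_k
    carries = (x & mask) ^ (y & mask) ^ s       # bit i holds c_i
    c_top = (carries >> (k - 1)) & 1            # this is c_{k-1}
    return {"sum": s,
            "c_out": c_out,
            "overflow": c_out ^ c_top,
            "negative": (s >> (k - 1)) & 1,
            "zero": int(s == 0)}
```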
Fig. 5.6 Four-bit binary adder used to realize the logic function f = w + xyz and its complement (operand bits, from position 3 down to 0: A = 0, w, z, y and B = 1, 1, 0, x).
Fig. 5.7 Two's-complement adder with provisions for detecting conditions and exceptions.
5.3 ANALYSIS OF CARRY PROPAGATION
Various ways of dealing with the carry problem were enumerated in Section 3.1. Some of the methods already discussed include limiting the propagation of carries (hybrid signed-digit, RNS) or eliminating carry propagation altogether (GSD). The latter approach, when used for adding a set of numbers in carry-save form, can be viewed as a way of amortizing the propagation delay of the final conversion step over many additions, thus making the per-add contribution of the carry propagation delay quite small. What remains to be discussed, in this and the following chapter, is how one can speed up a single addition operation involving conventional (binary) operands.
We begin by analyzing how and to what extent carries propagate in adding two binary numbers. Consider the example addition of 16-bit binary numbers depicted in Fig. 5.8, where the carry chains of length 2, 3, 6, and 4 are shown. The length of a carry chain is the number of digit positions from where the carry is generated up to and including where it is finally absorbed or annihilated. A carry chain of length 0 thus means "no carry production," and a chain of length 1 means that the carry is absorbed in the next position. We are interested in the length of the longest propagation chain (6 in Fig. 5.8), which dictates the adder's latency.
Given binary numbers with random bit values, for each position i we have:

Probability of carry generation = 1/4 (both operand bits are 1s)
Probability of carry annihilation = 1/4 (both operand bits are 0s)
Probability of carry propagation = 1/2 (operand bits are unequal)
Fig. 5.8 Example addition of two 16-bit numbers and its carry-propagation chains (of lengths 2, 3, 6, and 4).
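The chain-length bookkeeping described above can be sketched directly; with random operands, the longest chain averages on the order of log2 k rather than k:

```python
def longest_carry_chain(x, y, k):
    """Longest carry chain in the addition of k-bit x and y. A chain starts
    at a generate position (both bits 1), runs through propagate positions
    (bits unequal), and its length counts the positions up to and including
    where the carry is absorbed."""
    longest, start = 0, None
    for i in range(k):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        if xi & yi:                       # generate: old chain ends, new one begins
            if start is not None:
                longest = max(longest, i - start)
            start = i
        elif not (xi | yi):               # annihilate: chain absorbed here
            if start is not None:
                longest = max(longest, i - start)
            start = None
        # else propagate: chain continues through this position
    if start is not None:                 # carry propagated out the top
        longest = max(longest, k - start)
    return longest

# A carry generated at position 0 of 0111 + 0001 is absorbed at position 3:
print(longest_carry_chain(0b0111, 0b0001, 4))    # 3
```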