Skip to content
frankpfenning edited this page Aug 11, 2011 · 29 revisions

The type int is the only numeric type in C0. We would like to think of it as the type of integers ..., -3, -2, -1, 0, 1, 2, 3, ... but this is not quite correct because its values only cover a fixed range, namely from -231 to 231-1, inclusively. To understand the reasons requires a brief detour to talk about the underlying machine model; to understand what this means for the usual arithmetic operations requires a brief detour to talk about modular arithmetic and two's complement representation. After this we discuss the arithmetic operators and bit-level operators.

## Machine Model

C0 is a relatively low-level, imperative language that was designed to have a close relationship to the code executed by the processor of your computer. This processor's instructions directly operate on registers that have a fixed size. This native word size is the number of bits (0 or 1) that can be stored and manipulated by single instructions. Throughout history the word size of processors has increased, as design and manufacturing techniques have improved. Modern processors have a word size of 64 bits, although they are often used as if the word size was just 32 bits. In order for the basic operators to map directly to machine instructions, the size of the basic representation of integers in C0 was chosen to be 32 bits.

How many different values of type int are there? One bit can have 2 values (0 or 1), two bits have 2 * 2 = 4 values, three bits have 2 * 2 * 2 = 8 values, etc., since we can chose the bits independently. In general, a word of k bits can have 2k different values. So the type int has 232 = 4,294,967,296 (about 4 billion) different values. If you know something about the binary number system, you might expect these values to be 0, 1, 2, 3, ..., 4,294,967,295, where 0 is represented as 00000000000000000000000000000000 and 4,294,967,295 as 11111111111111111111111111111111. However, this would mean we would have no negative numbers at all (and it would seem pretty convenient to have, for example, -1 as a value). Furthermore, we have to consider what happens if operations such as addition overflow. To understand these two issues, we need a basic understanding of modular arithmetic and two's complement representation.

## Modular Arithmetic

The basic idea behind modular arithmetic is to fix a modulus m > 1 and then restrict numbers to the range 0, 1, 2, ..., m-1. We count as follows: 0, 1, 2, ..., m-1, 0, 1, 2, ..., m-1, 0, ..., so the the successor to m-1 is 0. Another way to think about it is that every arithmetic operation is performed modulo m, that is, we always take the remainder after dividing by m.

For example, if we perform arithmetic modulo 16, then 15+3 = 2 (mod 16) because 15+3 = 18, and 18 divided by 16 leaves a remainder of 2. It is remarkable that almost all the usual arithmetic laws you have learned in high school still hold when we do modular arithmetic. For example, multiplication distributed over addition (a + b) * c = a * c + b * c (mod m) for any modulus m. But we have to be careful with comparisons. For example, it is not necessarily the case that a+1 > a because we wrap around back to 0 (for example, 15+1 < 15 (mod 16)).

The elegant answer to the question how the processor performs arithmetic given a fixed word size w is arithmetic modulo 2w. We just need to figure out how to get negative numbers into the picture...

## Two's Complement Representation

Assume we have the natural numbers 0, 1, 2, .... Then -a is the number b such that a + b = 0. What happens when we apply this idea to modular arithmetic? Let's consider the case of m = 16. What would be -1? It would be the number b such that 1 + b = 0 (mod 16). Perhaps surprisingly such a number exists, namely 15. So -1 = 15 (mod 16). That make sense, since we can always add or subtract 16 from any number and get the same number modulo 16. What's -2? Following the same reasoning, we have -2 = 14 (mod 16) since 2 + 14 = 0 (mod 16). So we don't have to invent the negative numbers; they already exist in modular arithmetic!

Taking things a little further, what is -15? Since 15 + 1 = 0 (mod 16) we see that -15 = 1 (mod 16). We are looking at a table like this, where the number is each row are equal modulo 16.

... -16 0 16 ...
... -15 1 17 ...
... -14 2 18 ...
... -13 3 19 ...
... -12 4 20 ...
... -11 5 21 ...
... -10 6 22 ...
... -9 7 23 ...
... -8 8 24 ...
... -7 9 25 ...
... -6 10 26 ...
... -5 11 27 ...
... -4 12 28 ...
... -3 13 29 ...
... -2 14 30 ...
... -1 15 31 ...
Arithmetic modulo 16 corresponds to a fictional machine with word size 4, because 24 = 16. On such a machine there are 16 different bit patterns, from `0000` to `1111`. For each row in the table we have to decide which number we want it to represent. The obvious choice would be to take the middle column, which gives us the integers 0, 1, ..., 15. This is the so-called *unsigned* representation which is available in C but not in C0. Another choice is to have an equal number of positive (including 0) and negative integers. In that case we would have -8, -7, ..., -1, 0, 1, ..., 7. This is the choice highlighted in bold in the table. This is called the *two's complement representation*.

Two's complement representation has many remarkable properties. Several important ones to remember:

  • The underlying arithmetic operations, such as addition and multiplication, are simply performed in modular arithmetic. No sign bit or other information needs to be taken into account.
  • The range of numbers is from -2k-1 to 2k-1-1 for a word size of k (and therefore a modulus of 2k).
  • The most significant bit of the underlying binary representation is 1 for negative numbers, 0 for positive numbers (including 0).
  • We can compute -a by complementing each bit in the representation of a and then adding 1.
## Arithmetic Operators

The arithmetic operators of C0 are

`-e` unary minus
`e1 * e2` multiplication
`e1 / e2` integer division
`e1 % e2` integer modulus
`e1 + e2` addition
`e1 - e2` subtraction

Clone this wiki locally