"); p.document.close(); } function p2(){ p.window.close(); } function tip(){ bn=window.open('','','toolbar=no,width=600,height=450,scrollbars=yes,top=0,left=0'); bn.document.open(); bn.document.writeln("binary numbers"); bn.document.writeln("Decimal to Binary                                                      Binary to Decimal
Consider the Decimal Number 324                         Consider the Binary Number 101000100"); bn.document.writeln("
      2 | 324 | 0                                                                Decimal Equivalent = "); bn.document.writeln("
      2 | 162 | 0                                                                    1 * 2 8 + 0 * 2 7 + 1 * 2 6 + "); bn.document.writeln("
      2 |   81 | 1                                                                    0 * 2 5 + 0 * 2 4 + 0 * 2 3 + "); bn.document.writeln("
      2 |   40 | 0                                                                    1 * 2 2 + 0 * 2 1 + 0 * 2 0"); bn.document.writeln("
      2 |   20 | 0                                                                  =1 * 2 8 + 1 * 2 6 + 1 * 2 2"); bn.document.writeln("
      2 |   10 | 0                                                                  =256 + 64 + 4"); bn.document.writeln("
      2 |     5 | 1                                                                  =324"); bn.document.writeln("
      2 |     2 | 0"); bn.document.writeln("
             1"); bn.document.writeln("
   Binary Equivalent = 101000100"); bn.document.writeln("
"); bn.document.writeln("Binary conversion for Decimal fraction                 Decimal conversion for Binary fraction
Consider the Decimal Number 0. 359375                Consider the Binary Number . 010111"); bn.document.writeln("
         | . 359375 × 2 |                                                      Decimal Equivalent = "); bn.document.writeln("
      0 | . 718750 × 2 |                                                          0 * 2 -1 + 1 * 2 -2 + 0 * 2 -3 + "); bn.document.writeln("
      1 | . 437500 × 2 |                                                          1 * 2 -4 + 1 * 2 -5 + 1 * 2 -6"); bn.document.writeln("
      0 | . 875000 × 2 |                                                        =1 * 2 -2 + 1 * 2 -4 + 1 * 2 -5 + 1 * 2 -6"); bn.document.writeln("
      1 | . 750000 × 2 |                                                        =0. 25 + 0. 0625 + 0. 03125 + 0. 015625"); bn.document.writeln("
      1 | . 500000 × 2 |                                                        =. 359375"); bn.document.writeln("
      1 | . 000000 × 2 |                                                                   "); bn.document.writeln("
   Binary Equivalent = . 010111"); bn.document.writeln("

Similarly for Octal, Decimal & Hexadecimal substitute 8, 10 & 16 respectively instead of 2
"); bn.document.writeln("
"); bn.document.writeln("
"); bn.document.close(); }
Truncation Errors

These are the errors that arise from using approximate formulae in computations.

Example :

Assume that a function f and all its higher order derivatives with respect to the independent variable x are known at a point, say x = x0. To find the value of the function at a neighbouring point, say x = x0 + dx, one can use the Taylor series expansion of f about x0:

f(x0 + dx) = f(x0) + dx * f'(x0) + (dx^2 / 2!) * f''(x0) + . . .

The right-hand side of the above equation is an infinite series, and one has to truncate it after some finite number of terms to calculate f(x0 + dx), whether by computer or by hand. Hence the value obtained is only an approximation to f(x0 + dx).

Numerical Example :

  • Let f(x) = x + e^x.  At x = 0.5, f(x) = 2.14872127070013
  • The Taylor series expansion for f(0.6) = f(0.5 + 0.1) gives

        f(0.5 + 0.1) =

        f(0.5) + (0.1)*f'(0.5)                                                                    = 2.41359339777014
        f(0.5) + (0.1)*f'(0.5) + (0.1)^2 * f''(0.5)/2!                                            = 2.42183700412364
        f(0.5) + (0.1)*f'(0.5) + (0.1)^2 * f''(0.5)/2! + (0.1)^3 * f'''(0.5)/3!                   = 2.42211179100209
        f(0.5) + (0.1)*f'(0.5) + (0.1)^2 * f''(0.5)/2! + (0.1)^3 * f'''(0.5)/3! + (0.1)^4 * f''''(0.5)/4! = 2.42211866067405

    The exact value, correct up to 14 decimal places, is 2.42211880039051
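These partial sums can be reproduced with a short script (a sketch; note that for f(x) = x + e^x every derivative of order two and higher is simply e^x):

```python
import math

x0, dx = 0.5, 0.1

def deriv(k, x):
    """k-th derivative of f(x) = x + e^x."""
    if k == 0:
        return x + math.exp(x)
    if k == 1:
        return 1 + math.exp(x)
    return math.exp(x)          # all higher derivatives are e^x

exact = deriv(0, x0 + dx)       # f(0.6), computed directly
s = deriv(0, x0)                # zeroth-order term f(x0)
for k in range(1, 5):
    s += dx**k / math.factorial(k) * deriv(k, x0)
    print(f"order {k}: {s:.14f}   truncation error = {exact - s:.2e}")
```

Each added term shrinks the truncation error by roughly a factor of dx, which is why the printed approximations agree with the exact value to more and more digits.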
    Round off Errors

    These are the errors which arise as a result of rounding or chopping of numbers by the computer. To understand this rounding or chopping one needs some knowledge of the representation of real numbers in computers. In general, real numbers are stored as floating point quantities; e.g., the fixed number 39.428 is stored as 0.39428 * 10^2 (normalized form) and 0.00248 is stored as 0.248 * 10^-2.
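Python exposes the analogous normalized decomposition for binary floating point through math.frexp, which returns a mantissa m with 0.5 <= m < 1 and an integer exponent e such that the value equals m * 2^e:

```python
import math

for x in (39.428, 0.00248):
    m, e = math.frexp(x)          # x == m * 2**e, with 0.5 <= m < 1
    print(f"{x} = {m} * 2^{e}")
    assert x == math.ldexp(m, e)  # ldexp reassembles the number exactly
```

The base here is 2 rather than 10, which is why the mantissa printed for 39.428 is not simply 0.39428.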

    Though different computers use slightly different techniques, the general procedure is similar. A floating point number has four parts to understand:

    1. base
    2. sign (requires one bit)
    3. fractional part or mantissa (consumes most of the available 32, 64 or 80 bits)
    4. exponent part or characteristic

    The first of these identifies the number system (binary, octal, decimal or hexadecimal) and does not need any storage space. The other three parts have a fixed total length, typically 32, 64 or 80 (or more) bits, depending on the precision of the representation. Symbolically the number is represented as

    +/- .d1 d2 ... dp * B^e

    where the di are digits or bits with values from zero to B-1, and
    B = the base that is used
    p = the number of significant digits (bits)
    e = an integer exponent

    The significant bits constitute the fractional part of the number. In normalized floating point with a binary base, the first digit of the fractional part is always 1. Some systems take advantage of this fact and do not store that first bit, thereby gaining one bit of precision. This suppressed first bit is called the hidden bit.

    IEEE, VAX and IBM standards are some of the methods followed by computers to store these floating point numbers. IEEE and VAX use the binary number system, whereas IBM uses the hexadecimal system. Let us now see the IEEE system in detail.

    IEEE Method   Total    # bits   # bits   Bias**   Max.      Min.      Largest     Smallest     Approx. Prec.
                  # bits   for p    for e    value    Exponent  Exponent  Dec. No.    Dec. No.     (digits)
    Single          32       23*       8       127       127      -126    1.701E38    1.175E-38         7
    Double          64       52*      11      1023      1023     -1022    8.988E307   2.225E-308       16
    Extended        80       64       15     16383     16383    -16382    6E4931      3E-4931          19

    *  plus one hidden bit
    ** the bias value is added to the exponent so that it can be stored as an unsigned number
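The single precision layout in the table (1 sign bit, 8 exponent bits stored with bias 127, and 23 fraction bits plus a hidden bit) can be checked by unpacking the raw bits of a float. The following is a sketch using Python's struct module; the helper name is our own:

```python
import struct

def fields32(x):
    """Split an IEEE single precision number into its sign bit,
    biased exponent, and 23 stored fraction bits."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF        # 8 exponent bits, bias 127
    frac = bits & 0x7FFFFF           # 23 stored fraction bits
    return sign, exp, frac

# 1.5 = +1.1 (binary) * 2^0: hidden bit 1, stored fraction .100...,
# and exponent 0 stored as 0 + 127.
sign, exp, frac = fields32(1.5)
print(sign, exp - 127, bin(frac))
```

The hidden bit never appears in frac; only the digits after it are stored, which is how single precision achieves 24 significant bits in 23 stored ones.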

    It is clear from the above table that any computer adopting the IEEE notation to store floating point numbers has its own bounds. For a single precision variable these bounds are -(1 - 2^-24) * 2^127 to -2^-126 for negative numbers, and 2^-126 to (1 - 2^-24) * 2^127 on the positive side. Also, there exist gaps between the numbers in these two intervals. To understand this, consider a small number system in binary, i.e., B = 2, with p = 2 and -1 <= e <= 2. For this system the possible normalized numbers are either +/-.10_2 * 2^e or +/-.11_2 * 2^e, with -1 <= e <= 2. The smallest and largest numbers in this system are respectively

              -.11_2 * 2^2 = -(1 * 2^-1 + 1 * 2^-2) * 2^2 = -(1/2 + 1/4) * 4 = -3
               .11_2 * 2^2 = 3.

    The list of all the +ve numbers is

              .10_2 * 2^-1 = 1/4          .11_2 * 2^-1 = 3/8
              .10_2 * 2^0  = 1/2          .11_2 * 2^0  = 3/4
              .10_2 * 2^1  = 1            .11_2 * 2^1  = 3/2
              .10_2 * 2^2  = 2            .11_2 * 2^2  = 3

    The -ve numbers are distributed similarly, mirroring the list above. Together these give the complete list of possible numbers in this system.
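The full set can be enumerated directly (a small sketch of the toy system B = 2, p = 2, -1 <= e <= 2):

```python
# Enumerate every normalized number +/- .f1f2 * 2^e. Since the system is
# normalized, the first fraction bit is 1, so the fraction is either
# .10 (= 1/2) or .11 (= 3/4), and the exponent e runs from -1 to 2.
values = sorted(
    sign * frac * 2**e
    for sign in (1, -1)
    for frac in (0.5, 0.75)
    for e in (-1, 0, 1, 2)
)
print(values)
# Note the gap at the top: no representable number lies strictly
# between 2 and 3, and none between -1/4 and 1/4 except the gap around 0.
```

All sixteen values land in [-3, -1/4] and [1/4, 3], and the spacing between neighbours doubles each time the exponent increases.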
    So for the considered number system, all the numbers lie in the intervals [-3, -1/4] and [1/4, 3], and the above are the only possible numbers. That is, any number between 2 and 3 will be chopped or rounded to either 2 or 3. Similarly, in the IEEE method there are also gaps in the two intervals, though these gaps are very small compared to the ones presented above.

    It is also not possible to convert some numbers, say 0.6, into binary form without some round off error.
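This can be seen directly in Python: constructing a Decimal from the float 0.6 reveals the exact binary value that was actually stored, which differs slightly from 0.6.

```python
from decimal import Decimal

# 0.6 has an infinite repeating binary expansion (.1001 1001 1001 ...),
# so the stored double is only the nearest representable value.
print(Decimal(0.6) == Decimal("0.6"))  # False: stored value differs from 0.6
print(Decimal(0.6))                    # prints the exact stored value

# By contrast 1/4 = .01 in binary is exactly representable.
print(Decimal(0.25) == Decimal("0.25"))  # True
```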

    Because of the above-mentioned gaps in representation and the rounding or chopping in conversions, round off errors are inevitable in computer arithmetic.