The floating-point issue & how to escape it

Computers are better than humans at calculations, but even they stumble on floating-point maths.

As we all know, doing the following maths by hand gives 1.2 - 1.0 = 0.2.

0.2 is the desired result, but when we try the same calculation in a programming language using native floating-point types, we get 0.19999999999999996. The problem is caused by the internal representation of floating-point numbers, which uses a fixed number of binary digits to represent a decimal number. Some decimal numbers simply cannot be represented exactly in binary, so in many cases this leads to small round-off errors. We know similar cases in decimal maths: there are many results that can't be represented with a fixed number of decimal digits.
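To see this for yourself, here is a minimal Python sketch (any language with native IEEE 754 doubles behaves the same way); the standard `decimal` module lets us peek at the exact value the float actually stores:

```python
from decimal import Decimal

print(1.2 - 1.0)  # 0.19999999999999996, not the 0.2 we expected

# Neither 1.2 nor 0.2 can be stored exactly in binary.
# Decimal(1.2) reveals the exact value the 64-bit float really holds:
print(Decimal(1.2))
# 1.1999999999999999555910790149937383830547332763671875
```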

For example, with whole numbers everything is exact: 5 + 10 = 15, every time, on every machine.

But with decimal (fractional) numbers, things are different.

If we calculate 1.2 - 1.0 by hand, we get 0.2; give the same maths to a computer and the result is 0.19999999999999996, which looks plain wrong to a human. This happens whenever a computer works with fractional numbers stored as binary floats.
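A minimal sketch of that contrast in Python:

```python
# Integer arithmetic is exact:
print(5 + 10 == 15)        # True

# Floating-point arithmetic is not:
print(1.2 - 1.0 == 0.2)    # False -- the left side is 0.19999999999999996
```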

This is because humans mostly calculate in base 10. Each decimal place holds a digit from 0 to 9; each digit you move to the left represents another multiple of 10, and each digit to the right another division by 10. In base 10 we can exactly express any fraction whose denominator is built from the prime factors of the base. The prime factors of 10 are 2 and 5, so ½, ¼, ⅕, ⅛ and 1/10 can all be expressed properly (0.5, 0.25, 0.2, 0.125, 0.1). But take a fraction that can't be expressed in base 10, like ⅓: we can write 0.3, and 0.3 times 3 is 0.9; or 0.333, and 0.333 times 3 is 0.999. One third is a number we can get ever closer to, but the decimal expansion never gets there.
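A short Python sketch using the standard `decimal` module makes the base-10 version of the problem visible: no matter how many digits we keep, our approximation of ⅓ never multiplies back to exactly 1:

```python
from decimal import Decimal

print(Decimal(1) / Decimal(3))   # 0.3333333333333333333333333333 (truncated)
print(Decimal("0.3") * 3)        # 0.9
print(Decimal("0.333") * 3)      # 0.999
print(Decimal("0.333333") * 3)   # 0.999999 -- closer, but never exactly 1
```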

Computers, on the other hand, use binary (base 2). Eight represented in binary is 1000, so you need four binary digits (bits) to represent it. The same way, 255 in binary is 11111111, so you need 8 bits to represent 255; it is the highest number that fits inside 8 bits. Each time you add another bit to the left in base 2 the maximum value doubles, and each place to the right halves the value. In binary we can exactly represent ½, ¼ and ⅛, since each denominator is a power of two. But with ⅕ and 1/10 the problem occurs: 5 is not a power of 2, so their binary expansions repeat forever, just like ⅓ in decimal.
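We can peek at this from Python too: `Decimal(x)` shows the exact value a binary float stores, so the powers of two come back clean while 0.1 does not:

```python
from decimal import Decimal

# Powers of two are stored exactly:
print(Decimal(0.5))    # 0.5
print(Decimal(0.125))  # 0.125

# 1/10 has no finite binary expansion, so the nearest float is stored:
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625
```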

For example, 0.1 + 0.3 happens to come out as 0.4, but if we add 0.1 + 0.2 we get 0.30000000000000004. This problem may sound minute, but here is where it bites: the comparison (0.3 * 3) == 0.9 evaluates to false. That is just one case, and it is caused purely by round-off error.
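That is why comparing floats for equality directly is dangerous. A common defence, sketched below in Python, is to compare within a small tolerance, for example with the standard library's `math.isclose`:

```python
import math

print(0.3 * 3)                      # 0.8999999999999999
print(0.3 * 3 == 0.9)               # False -- exact comparison fails
print(math.isclose(0.3 * 3, 0.9))   # True  -- tolerance-based comparison
```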

Solutions:

  1. Use double-precision floats (this shrinks the error, but does not remove it).
  2. Round off the results (but rounding behaviour can change from platform to platform).
  3. If precision is very important, store the value as an integer (for example, cents instead of dollars) and use arithmetic to adjust based on your needs; see the sketch after this list.
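Here is a rough Python sketch of options 2 and 3; names like `price_cents` are purely illustrative:

```python
# Option 2: round to a fixed number of decimal places before comparing.
result = 0.1 + 0.2                # 0.30000000000000004
print(round(result, 2) == 0.3)    # True after rounding to 2 places

# Option 3: keep fixed-precision values (e.g. money) as integer units.
# Tracking cents instead of dollars keeps the arithmetic exact:
price_cents = 120                 # $1.20
discount_cents = 100              # $1.00
change_cents = price_cents - discount_cents
print(change_cents)               # 20 -- exact
print(change_cents / 100)         # 0.2 -- convert only for display
```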