
For the storage format of floating-point numbers, see: Single-precision floating-point format (32 bits) and Double-precision floating-point format (64 bits).

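  • Single-precision type float: usually 32 bits, at least 6 significant decimal digits, range roughly 10^-38 to 10^38
  • Double-precision type double: usually 64 bits, 15-17 significant decimal digits, range roughly 10^-308 to 10^308
  • Extended-precision type long double: at least as precise as double, often more; the exact width depends on the platform
  • A signed 32-bit integer variable has a maximum value of 2^31 − 1 = 2,147,483,647; an IEEE 754 32-bit base-2 floating-point variable has a maximum value of (2 − 2^−23) × 2^127 ≈ 3.4028235 × 10^38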
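
As a quick sanity check of the figures in the list above, the following Python sketch recomputes the two maxima. It assumes the interpreter's float is an IEEE 754 double (true on mainstream platforms) and uses the standard struct module to round-trip a value through 4 bytes. The variable names are illustrative only.

```python
import struct
import sys

# Largest value of a signed 32-bit integer.
INT32_MAX = 2**31 - 1

# Largest finite IEEE 754 single-precision value, from the formula above.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127

# Pack into 4 bytes and unpack again: the value survives unchanged,
# so it is exactly representable as a 32-bit float.
round_tripped = struct.unpack("<f", struct.pack("<f", FLOAT32_MAX))[0]

print(INT32_MAX)                     # 2147483647
print(FLOAT32_MAX)                   # roughly 3.4028235e38
print(round_tripped == FLOAT32_MAX)  # True
print(sys.float_info.dig)            # decimal digits a double always preserves
print(sys.float_info.max)            # largest finite double, roughly 1.8e308
```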



The portable way to get epsilon in C++ is:

#include <limits>

std::numeric_limits<double>::epsilon()

Then the comparison function becomes:

#include <cmath>
#include <limits>

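bool AreSame(double a, double b) {
    // Absolute comparison against machine epsilon: only meaningful
    // when a and b have a magnitude of around 1.
    return std::fabs(a - b) < std::numeric_limits<double>::epsilon();
}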

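An absolute epsilon test like the one above stops working once the operands are much larger than 1, because the gap between neighbouring doubles grows with magnitude. The Python sketch below contrasts it with a relative test built on the standard math.isclose function. The helper names are hypothetical, the tolerance of 1e-9 is an arbitrary choice for illustration, and IEEE 754 doubles are assumed.

```python
import math
import sys

# Same quantity as std::numeric_limits<double>::epsilon() for IEEE doubles.
EPS = sys.float_info.epsilon


def are_same_abs(a, b):
    """Absolute test, as in the C++ function above."""
    return abs(a - b) < EPS


def are_same_rel(a, b, rel_tol=1e-9):
    """Relative test: the tolerance scales with the operands."""
    return math.isclose(a, b, rel_tol=rel_tol)


# Near 1 the absolute test behaves as intended.
print(are_same_abs(0.1 + 0.2, 0.3))        # True

# Near 1e16 neighbouring doubles are 2.0 apart, so two values that are
# as close as doubles can be still fail the absolute test.
big = 1e16
neighbour = big + 2.0
print(are_same_abs(big, neighbour))        # False
print(are_same_rel(big, neighbour))        # True
```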

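#include <cstdio>

int main() {
    double a = 12.03;
    double b = 22;
    // Mathematically equal products, but each intermediate result is
    // rounded, so the evaluation order changes the truncated integer.
    long long c = a * b * 100000000L;
    printf("c[%lld]\n", c);              // 26465999999
    c = a * 100000000L * b;
    printf("c[%lld]\n", c);              // 26466000000
}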
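
The order dependence in the snippet above comes from rounding each intermediate product and then truncating. The Python sketch below looks at the same numbers. It assumes IEEE 754 doubles and uses the standard fractions module to obtain exact values. Truncated results are printed without expected values because they can vary with platform and evaluation order.

```python
from fractions import Fraction

a = 12.03
b = 22.0
scale = 100000000

# The double stored for 12.03 is not exactly 12.03.
stored = Fraction(a)
intended = Fraction("12.03")
print(stored == intended)         # False

# Two evaluation orders; every intermediate product is rounded to a double.
left = a * b * scale
right = a * scale * b
print(int(left), int(right))      # truncation exposes any last-bit difference

# Rounding to the nearest integer instead of truncating absorbs the error.
print(round(left), round(right))  # 26466000000 26466000000

# Exact rational arithmetic gives the intended result in any order.
exact = intended * 22 * scale
print(exact)                      # 26466000000
```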


Python 2.7.5 (default, Jun 17 2014, 18:11:42) 
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1.1 + 0.1
1.2000000000000002

  • Actually, the error arises because 0.1 cannot be represented as a finite binary floating-point number.
  • Most decimal fractions cannot be represented exactly in binary floating point. A good explanation is here: Floating Point Arithmetic: Issues and Limitations
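
To see what is actually stored for 0.1, the following Python sketch converts the double to an exact fraction with the standard fractions module. It assumes IEEE 754 doubles. The denominator is a power of two, which is why the value can only approximate 1/10.

```python
from fractions import Fraction

# Exact rational value of the double nearest to 0.1.
stored = Fraction(0.1)
print(stored)                        # 3602879701896397/36028797018963968
print(stored.denominator == 2**55)   # True: a binary fraction
print(stored == Fraction(1, 10))     # False: not exactly one tenth

# The small representation errors surface in ordinary arithmetic.
print(1.1 + 0.1 == 1.2)              # False

# Fractions whose denominator is a power of two are exact.
print(0.5 + 0.25 == 0.75)            # True
```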

What can I do to avoid this problem? That depends on what kind of calculations you’re doing.

  • If you really need your results to add up exactly, especially when you work with money: use a special decimal datatype.
  • If you just don’t want to see all those extra decimal places: simply format your result rounded to a fixed number of decimal places when displaying it.
  • If you have no decimal datatype available, an alternative is to work with integers, e.g. do money calculations entirely in cents. But this is more work and has some drawbacks.
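
The three options above can be sketched in Python as follows. This is only an illustration: it uses the standard decimal module as the "special decimal datatype", and the amounts and variable names are made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Option 1: a decimal datatype. Build values from strings, not floats.
total = Decimal("1.10") + Decimal("0.10")
print(total)                          # 1.20

# Rounding to cents with an explicit rounding rule.
rounded = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)                        # 2.68

# Option 2: keep the float, but round only when displaying it.
shown = f"{1.1 + 0.1:.2f}"
print(shown)                          # 1.20

# Option 3: do the arithmetic in integer cents.
total_cents = 110 + 10
dollars, cents = divmod(total_cents, 100)
as_text = f"{dollars}.{cents:02d}"
print(as_text)                        # 1.20
```

The integer approach needs extra care for division and percentages, since the rounding rule has to be applied by hand at each step.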



If you are looking for a data type that supports money / currency, try decimal_for_cpp.

The cpp_dec_float back-end is used in conjunction with number: It acts as an entirely C++ (header only and dependency free) floating-point number type that is a drop-in replacement for the native C++ floating-point types, but with much greater precision.

#include <iostream>
#include <iomanip>
#include <string>
#include <boost/multiprecision/cpp_dec_float.hpp>

int main()
{
    namespace mp = boost::multiprecision;
    // here I'm using a predefined type that stores 100 digits,
    // but you can create custom types very easily with any level
    // of precision you want.
    typedef mp::cpp_dec_float_100 decimal;

    decimal tiny("0.0000000000000000000000000000000000000000000001");
    decimal huge("100000000000000000000000000000000000000000000000");
    decimal a = tiny;

    while (a != huge)
    {
        std::cout << std::fixed << a << '\n';
        a *= 10;
    }

    // With plain double:
    // (10000000000 - 5000000000) * 2.01 = 10049999999.999998
    {
        // Separate scope so the names a, b, c do not clash with the loop above.
        // The value is passed as a string so no binary rounding error is carried over.
        mp::cpp_dec_float_50 a(std::to_string(2.01));
        mp::cpp_dec_float_50 b = ((10000000000 - 5000000000)) * a;
        long long c = b.convert_to<long long>();
        std::cout << c << '\n';
    }

    return 0;
}
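
For comparison, the same idea can be sketched in Python with the standard decimal module, which also lets the working precision be chosen. This is an analogue for illustration, not a description of how cpp_dec_float works internally. The precision of 50 digits simply mirrors cpp_dec_float_50.

```python
from decimal import Decimal, localcontext

# With binary doubles the product may land just below the intended integer.
print((10000000000 - 5000000000) * 2.01)

with localcontext() as ctx:
    ctx.prec = 50                     # 50 significant decimal digits

    # Built from a string, so 2.01 is stored exactly.
    factor = Decimal("2.01")
    product = (10000000000 - 5000000000) * factor
    as_int = int(product)

    # Division is rounded to the chosen precision.
    third = Decimal(1) / Decimal(3)

print(product)                        # 10050000000.00
print(as_int)                         # 10050000000
print(third)                          # 0. followed by fifty 3s
```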



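Floating-point numbers have limited precision. Although it depends on the system, PHP typically uses the IEEE 754 double-precision format, which gives a maximum relative error due to rounding of about 1.11e-16. Non-elementary arithmetic operations may give larger errors, and error propagation must be taken into account when several operations are combined. In addition, rational numbers that can be represented exactly in base 10, such as 0.1 or 0.7, have no exact representation in base 2, which is used internally, no matter how large the mantissa is. They therefore cannot be converted to the internal binary format without a small loss of precision. This can lead to confusing results: for example, floor((0.1+0.7)*10) will usually return 7 instead of the expected 8, because the internal representation of the result is something like 7.9999999999999991118...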
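
The floor example is not specific to PHP. The Python sketch below reproduces it under the assumption that floats are IEEE 754 doubles, and shows two common ways around it. The rounding to 9 decimal places is an arbitrary choice for illustration.

```python
import math
from decimal import Decimal

value = (0.1 + 0.7) * 10
print(value)                          # 7.999999999999999
print(math.floor(value))              # 7, not 8

# Exact decimal expansion of the stored double, slightly below 8.
print(Decimal(value))

# The maximum relative rounding error quoted above is 2^-53.
unit_roundoff = 2.0**-53
print(unit_roundoff)                  # about 1.11e-16

# Workaround 1: round away the noise before taking the floor.
print(math.floor(round(value, 9)))    # 8

# Workaround 2: do the arithmetic in decimal from the start.
exact = (Decimal("0.1") + Decimal("0.7")) * 10
print(math.floor(exact))              # 8
```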


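The size of a float is platform-dependent, although a maximum of about 1.8e308 with a precision of roughly 14 decimal digits is common (the 64-bit IEEE format).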


LNUM          [0-9]+
DNUM          ([0-9]*[\.]{LNUM}) | ({LNUM}[\.][0-9]*)
EXPONENT_DNUM [+-]?(({LNUM} | {DNUM}) [eE][+-]? {LNUM})
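
The three definitions above read like lex-style patterns for floating-point literals. As a rough sketch, they can be transcribed into Python regular expressions. This assumes the spaces inside the patterns are layout only and carry no meaning. The names FLOAT_LITERAL and is_float_literal are hypothetical helpers, not part of any PHP API.

```python
import re

# Direct transcription of the three definitions; names mirror the grammar.
LNUM = r"[0-9]+"
DNUM = rf"(?:[0-9]*\.{LNUM}|{LNUM}\.[0-9]*)"
EXPONENT_DNUM = rf"[+-]?(?:(?:{LNUM}|{DNUM})[eE][+-]?{LNUM})"

# A float literal is taken here to be either form.
FLOAT_LITERAL = re.compile(rf"(?:{EXPONENT_DNUM}|{DNUM})")


def is_float_literal(text):
    """Return True if the whole string matches DNUM or EXPONENT_DNUM."""
    return FLOAT_LITERAL.fullmatch(text) is not None


for sample in ["1.5", ".5", "1.", "1e3", "-1.5E-10", "42", "1e", "abc"]:
    print(sample, is_float_literal(sample))
```

Under this transcription a plain integer such as 42 is rejected, since it matches LNUM only.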





System.out.println(0.05 + 0.01);  // 0.060000000000000005
System.out.println(1.0 - 0.42);   // 0.5800000000000001
System.out.println(4.015 * 100);  // 401.49999999999994
System.out.println(123.3 / 100);  // 1.2329999999999999


// Constructors
BigDecimal(int);       // creates an object with the given int value
BigDecimal(double);    // creates an object with the exact value of the given double
BigDecimal(long);      // creates an object with the given long value
BigDecimal(String);    // creates an object from the decimal number written in the string

// Methods (BigDecimal is immutable; arithmetic returns a new object)
add(BigDecimal);       // returns the sum as a new BigDecimal
subtract(BigDecimal);  // returns the difference as a new BigDecimal
multiply(BigDecimal);  // returns the product as a new BigDecimal
divide(BigDecimal);    // returns the quotient as a new BigDecimal
toString();            // converts the value to a string
doubleValue();         // returns the value as a double
floatValue();          // returns the value as a float
longValue();           // returns the value as a long
intValue();            // returns the value as an int

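Note that when working with decimal fractions in BigDecimal, you should create the objects with the BigDecimal(String) constructor. The BigDecimal(double) constructor, as in BigDecimal b = new BigDecimal(0.1), copies the exact binary value of the double, so the precision has already been lost before BigDecimal sees it. The int and long constructors, such as new BigDecimal(1), are exact.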
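
The same distinction exists in Python's standard decimal module, which makes it easy to try out. The sketch below is an analogue for illustration, not Java code. Decimal(repr(x)) plays the role that the BigDecimal Javadoc assigns to valueOf(double), namely going through the shortest string form of the double.

```python
from decimal import Decimal

from_int = Decimal(1)             # exact, like BigDecimal(int)
from_string = Decimal("0.1")      # exact, like BigDecimal(String)
from_double = Decimal(0.1)        # exact value of the double, not 0.1
via_repr = Decimal(repr(0.1))     # via the double's string form

print(from_int)                   # 1
print(from_string)                # 0.1
print(from_double)                # 0.1000000000000000055511151231257827021181583404541015625
print(via_repr == from_string)    # True
print(from_double == from_string) # False
```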


    /* The results of this constructor can be somewhat unpredictable.
     * One might assume that writing {@code new BigDecimal(0.1)} in
     * Java creates a {@code BigDecimal} which is exactly equal to
     * 0.1 (an unscaled value of 1, with a scale of 1), but it is
     * actually equal to
     * 0.1000000000000000055511151231257827021181583404541015625.
     * This is because 0.1 cannot be represented exactly as a
     * {@code double} (or, for that matter, as a binary fraction of
     * any finite length).  Thus, the value that is being passed
     * <i>in</i> to the constructor is not exactly equal to 0.1,
     * appearances notwithstanding.
     *
     * When a {@code double} must be used as a source for a
     * {@code BigDecimal}, note that this constructor provides an
     * exact conversion; it does not give the same result as
     * converting the {@code double} to a {@code String} using the
     * {@link Double#toString(double)} method and then using the
     * {@link #BigDecimal(String)} constructor.  To get that result,
     * use the {@code static} {@link #valueOf(double)} method.
     */
    public BigDecimal(double val) {
        // ... (constructor body omitted in this excerpt)
    }


import java.math.BigDecimal;

public class Main {
        public static void main(String[] args) {
                System.out.println("Hello, World!");

                // Built from doubles: the binary rounding error is carried over.
                BigDecimal a = new BigDecimal(1.01);
                BigDecimal b = new BigDecimal(1.02);
                // Built from strings: the decimal values are stored exactly.
                BigDecimal c = new BigDecimal("1.01");
                BigDecimal d = new BigDecimal("1.02");
                System.out.println(a.add(b)); // 2.0300000000000000266453525910037569701671600341796875
                System.out.println(c.add(d)); // 2.03
        }
}