Usefulness of signaling NaN?

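I've recently done quite a bit of reading on IEEE 754 and the x87 architecture. I was thinking of using NaN as a "missing value" in some numeric calculation code, and I was hoping that a signaling NaN would let me catch a floating-point exception in the cases where I don't want to proceed with "missing values." Conversely, I would use a quiet NaN to let the "missing value" propagate through a computation. However, signaling NaNs don't work the way I expected based on the (very limited) documentation available.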

Here is a summary of what I know (all of this using x87 and VC++):

  • _EM_INVALID (the IEEE "invalid" exception) controls the behavior of the x87 when encountering NaNs
  • If _EM_INVALID is masked (the exception is disabled), no exception is generated and operations can return quiet NaN. An operation involving signaling NaN will not cause an exception to be thrown, but will be converted to quiet NaN.
  • If _EM_INVALID is unmasked (exception enabled), an invalid operation (e.g., sqrt(-1)) causes an invalid exception to be thrown.
  • The x87 never generates signaling NaN.
  • If _EM_INVALID is unmasked, any use of a signaling NaN (even initializing a variable with it) causes an invalid exception to be thrown.
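
The masked case in the list above can be observed without the VC++ control-word API by using the status flags from the standard `<cfenv>` header instead. This is a minimal sketch, not the VC++ mechanism itself: it assumes the default floating-point environment (all exceptions masked), and `invalidRaisedBySqrt` is a hypothetical helper name introduced only for illustration.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: returns true if std::sqrt(x) raised the IEEE
// "invalid" status flag. Assumes exceptions are masked (the default),
// so the operation sets a flag instead of trapping.
bool invalidRaisedBySqrt(double x, double& result) {
    volatile double input = x;          // volatile discourages compile-time folding
    std::feclearexcept(FE_ALL_EXCEPT);  // start from a clean set of status flags
    volatile double r = std::sqrt(input);
    result = r;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

Calling it with -1.0 should report the invalid flag and hand back a NaN, while a call with 4.0 should leave the flag clear. Unmasking the exception so that it traps is platform-specific and is not shown here.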






If _EM_INVALID is not masked (exception is enabled), then one cannot even initialize a variable with a signaling NaN: double dVal = std::numeric_limits<double>::signaling_NaN(); because this throws an exception (the signaling NaN value is loaded into an x87 register to store it to the memory address).


  1. Mask _EM_INVALID.
  2. Initialize the variable with signaling NaN.
  3. Unmask _EM_INVALID.


Is there any utility or purpose whatsoever to a signaling NaN? I understand one of the original intents was to initialize memory with it so that use of an uninitialized floating-point value could be caught.






#include <cmath>
#include <vector>

const double MISSING_VALUE = 1.3579246e123;
using std::vector;

vector<double> missingAllowed(1000000, MISSING_VALUE);
vector<double> missingNotAllowed(1000000, MISSING_VALUE);

// ... populate missingAllowed and missingNotAllowed with (user) data...

for (vector<double>::iterator it = missingAllowed.begin(); it != missingAllowed.end(); ++it) {
    if (*it != MISSING_VALUE) *it = sqrt(*it); // sqrt() could be any operation
}

for (vector<double>::iterator it = missingNotAllowed.begin(); it != missingNotAllowed.end(); ++it) {
    if (*it != MISSING_VALUE) *it = sqrt(*it);
    else *it = 0;
}

Note that the check for the "missing value" must be performed on every loop iteration. While I understand that in most cases the sqrt function (or any other mathematical operation) will likely overshadow this check, there are cases where the operation is minimal (perhaps just an addition) and the check is costly. In addition, the "missing value" takes a legal input value out of play and could cause bugs if a calculation legitimately arrives at that value (unlikely though it may be). To be technically correct, the user input data should also be checked against that value and an appropriate course of action taken. I find this solution inelegant and less than optimal performance-wise. This is performance-critical code, and we definitely do not have the luxury of parallel data structures or data element objects of some sort.


#include <cmath>
#include <limits>
#include <vector>

using std::vector;

vector<double> missingAllowed(1000000, std::numeric_limits<double>::quiet_NaN());
vector<double> missingNotAllowed(1000000, std::numeric_limits<double>::signaling_NaN());

// ... populate missingAllowed and missingNotAllowed with (user) data...

for (vector<double>::iterator it = missingAllowed.begin(); it != missingAllowed.end(); ++it) {
    *it = sqrt(*it); // if *it is a quiet NaN, then sqrt(*it) is a quiet NaN too
}

for (vector<double>::iterator it = missingNotAllowed.begin(); it != missingNotAllowed.end(); ++it) {
    try {
        *it = sqrt(*it);
    } catch (FPInvalidException&) { // assuming a translator installed via _set_se_translator
        *it = 0;
    }
}

Now the explicit check is eliminated and performance should be improved. I think this would all work if I could initialize the vector without touching the FPU registers...

Furthermore, I would imagine any self-respecting sqrt implementation checks for NaN and returns NaN immediately.
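
The quiet-NaN half of that idea can be checked in isolation. The sketch below assumes an IEEE 754 conforming `std::sqrt` and the default (masked) floating-point environment; `QuietNaNResult` and `sqrtOfQuietNaN` are hypothetical names used only for this illustration.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical result type: what happened when sqrt was fed a quiet NaN.
struct QuietNaNResult {
    bool resultIsNaN;    // did the NaN propagate to the result?
    bool invalidRaised;  // was the IEEE "invalid" flag set along the way?
};

QuietNaNResult sqrtOfQuietNaN() {
    // volatile discourages the compiler from folding the call away
    volatile double missing = std::numeric_limits<double>::quiet_NaN();
    std::feclearexcept(FE_ALL_EXCEPT);  // clear any stale status flags
    volatile double r = std::sqrt(missing);
    double value = r;
    return { std::isnan(value), std::fetestexcept(FE_INVALID) != 0 };
}
```

On a conforming implementation the NaN should come out the other side with the invalid flag still clear, which is the "missing value propagates silently" behavior wanted for the `missingAllowed` vector.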










A signalling NaN is represented by any bit pattern between 7FF0000000000001 and 7FF7FFFFFFFFFFFF or between FFF0000000000001 and FFF7FFFFFFFFFFFF

A quiet NaN is represented by any bit pattern between 7FF8000000000000 and 7FFFFFFFFFFFFFFF or between FFF8000000000000 and FFFFFFFFFFFFFFFF
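
The two ranges above differ only in the top fraction bit, so a bit pattern can be classified with integer operations alone. This is a small sketch assuming the x86 convention described above (top fraction bit set means quiet); `NaNKind` and `classifyBits` are hypothetical names.

```cpp
#include <cstdint>

enum class NaNKind { NotNaN, Signaling, Quiet };

// Classifies a 64-bit IEEE 754 double bit pattern using the ranges above.
// Works purely on integers, so no floating-point register is involved.
NaNKind classifyBits(std::uint64_t bits) {
    const std::uint64_t exponentMask = 0x7FF0000000000000ULL; // 11 exponent bits
    const std::uint64_t fractionMask = 0x000FFFFFFFFFFFFFULL; // 52 fraction bits
    const std::uint64_t quietBit     = 0x0008000000000000ULL; // top fraction bit

    const bool exponentAllOnes = (bits & exponentMask) == exponentMask;
    const bool fractionNonZero = (bits & fractionMask) != 0;
    if (!exponentAllOnes || !fractionNonZero) {
        return NaNKind::NotNaN; // finite number or infinity
    }
    return (bits & quietBit) ? NaNKind::Quiet : NaNKind::Signaling;
}
```

The sign bit is ignored, which matches the fact that each NaN kind has one positive and one negative range.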






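#include <cstdint>
#include <cstring>

const uint64_t sNan = 0xFFF7FFFFFFFFFFFFULL;     // largest negative signaling NaN pattern
double myData[1000];                              // array size chosen for illustration
// index and myErrorFlags are assumed to be defined elsewhere
uint64_t bits = sNan & ~myErrorFlags;             // clear fraction bits to encode error flags
std::memcpy(&myData[index], &bits, sizeof bits);  // copy raw bits instead of assigning a double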
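
A slightly more defensive variant of the same idea is sketched below. It builds the pattern from a payload rather than clearing bits from a mask, so the result cannot accidentally collapse to infinity or lose its exponent bits. The function names and the `payload` parameter are hypothetical, and whether the compiler keeps the copy out of floating-point registers is an assumption about typical code generation, not something the C++ standard guarantees.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit doubles");

// Writes a signaling NaN bit pattern into *slot using only integer and
// byte-copy operations. Only the low 51 fraction bits of `payload` are kept,
// so the quiet bit always stays clear. A zero payload is replaced by 1,
// because an all-zero fraction would encode infinity rather than a NaN.
void storeSignalingNaN(double* slot, std::uint64_t payload) {
    const std::uint64_t exponentBits = 0x7FF0000000000000ULL;
    const std::uint64_t payloadMask  = 0x0007FFFFFFFFFFFFULL; // excludes the quiet bit
    std::uint64_t fraction = payload & payloadMask;
    if (fraction == 0) fraction = 1;
    const std::uint64_t bits = exponentBits | fraction;
    std::memcpy(slot, &bits, sizeof bits);
}

// Reads the raw bits back without interpreting them as a double.
std::uint64_t rawBits(const double* slot) {
    std::uint64_t bits;
    std::memcpy(&bits, slot, sizeof bits);
    return bits;
}
```

For example, `storeSignalingNaN(&cell, 0x2A)` leaves the bits `0x7FF000000000002A` in `cell`, which falls inside the first signaling range listed above.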

