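I have recently done quite a bit of reading on IEEE 754 and the x87 architecture. I was thinking of using NaN as a "missing value" in some numeric calculation code, and I was hoping that a signaling NaN would let me catch a floating point exception in the cases where I don't want to proceed with "missing values." Conversely, I would use a quiet NaN to let the "missing value" propagate through a computation. However, signaling NaNs don't work the way I expected based on the (very limited) documentation that exists on them.
Here is a summary of what I know (all of this using x87 and VC++):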
- _EM_INVALID (the IEEE "invalid" exception) controls the behavior of the x87 when encountering NaNs
- If _EM_INVALID is masked (the exception is disabled), no exception is generated and operations can return quiet NaN. An operation involving a signaling NaN will not cause an exception to be thrown, but the signaling NaN will be converted to a quiet NaN.
- If _EM_INVALID is unmasked (exception enabled), an invalid operation (e.g., sqrt(-1)) causes an invalid exception to be thrown.
- The x87 never generates signaling NaN.
- If _EM_INVALID is unmasked, any use of a signaling NaN (even initializing a variable with it) causes an invalid exception to be thrown.
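The masked case in the list above can also be observed without a trap, through the sticky status flag. This is a minimal sketch, assuming a standard `<cfenv>` implementation; it uses `std::fetestexcept` rather than the VC++ `_controlfp`/`_EM_INVALID` interface, and the helper name `invalid_flag_after_sqrt` is made up for illustration:

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: computes sqrt(x) with the invalid exception left
// masked (the default state) and reports whether the sticky FE_INVALID
// status flag was raised by the operation.
bool invalid_flag_after_sqrt(double x, double& result) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double input = x;               // volatile: keep the call from being folded at compile time
    volatile double out = std::sqrt(input);  // masked: returns a value, never traps
    result = out;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

Calling it with -1.0 should report the flag as raised and leave a NaN in `result`; calling it with 4.0 should not raise the flag. Compilers may additionally want `#pragma STDC FENV_ACCESS ON` (or `#pragma fenv_access(on)` in VC++) for flag tests to be reliable under optimization; the `volatile` variables here are only a workaround.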
The Standard Library provides a way to access the NaN values:
std::numeric_limits<double>::signaling_NaN();
and
std::numeric_limits<double>::quiet_NaN();
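The problem is that I see no use whatsoever for the signaling NaN. If _EM_INVALID is masked, it behaves exactly the same as a quiet NaN. Since no NaN compares equal to any other NaN, there is no logical difference.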
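For reference, the comparison behavior and the bit-level difference between the two kinds of NaN can be sketched as follows. This assumes the x86 binary64 layout, where bit 51 of the fraction is the quiet bit; `bits_of`, `is_nan_bits`, and `is_signaling_bits` are illustrative names, not library functions. The classifiers take raw bits rather than a `double` so that the value never has to pass through a floating point register on the way in.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Copies the raw 64 bits of a double into an integer with memcpy.
std::uint64_t bits_of(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

// NaN: exponent field all ones and a non-zero fraction.
bool is_nan_bits(std::uint64_t u) {
    return (u & 0x7FF0000000000000ULL) == 0x7FF0000000000000ULL
        && (u & 0x000FFFFFFFFFFFFFULL) != 0;
}

// Assumption (x86 convention): a NaN with the top fraction bit (bit 51)
// clear is signaling; with that bit set it is quiet.
bool is_signaling_bits(std::uint64_t u) {
    return is_nan_bits(u) && (u & 0x0008000000000000ULL) == 0;
}
```

With these, a pattern such as `0x7FF0000000000001` classifies as signaling and `0x7FF8000000000000` as quiet, while an ordinary comparison like `q != q` is true for either kind and so cannot tell them apart.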
If _EM_INVALID is not masked (exception is enabled), then one cannot even initialize a variable with a signaling NaN:
double dVal = std::numeric_limits<double>::signaling_NaN();
because this throws an exception (the signaling NaN value is loaded into an x87 register to store it to the memory address).
You might think of the following, as I did:
- Mask _EM_INVALID.
- Initialize the variable with signaling NaN.
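- Unmask _EM_INVALID.
However, step 2 causes the signaling NaN to be converted to a quiet NaN, so subsequent uses of it will not cause exceptions to be thrown! So WTF?!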
Is there any utility or purpose whatsoever to a signaling NaN? I understand that one of the original intents was to initialize memory with it so that use of an uninitialized floating point value could be caught.
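Can someone tell me if I am missing something here?
EDIT:
To further illustrate what I had hoped to do, here is an example:
Consider performing mathematical operations on a vector of data (doubles). For some operations, I want to allow the vector to contain a "missing value" (pretend this corresponds to a spreadsheet column, for example, in which some of the cells have no value, but their existence is significant). For some operations, I do not want to allow the vector to contain a "missing value." Perhaps I want to take a different course of action if a "missing value" is present in the set, perhaps performing a different operation (so this is not an invalid state to be in).
The original code would look something like this: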
const double MISSING_VALUE = 1.3579246e123;
using std::vector;
vector<double> missingAllowed(1000000, MISSING_VALUE);
vector<double> missingNotAllowed(1000000, MISSING_VALUE);
// ... populate missingAllowed and missingNotAllowed with (user) data...
for (vector<double>::iterator it = missingAllowed.begin(); it != missingAllowed.end(); ++it) {
if (*it != MISSING_VALUE) *it = sqrt(*it); // sqrt() could be any operation
}
for (vector<double>::iterator it = missingNotAllowed.begin(); it != missingNotAllowed.end(); ++it) {
if (*it != MISSING_VALUE) *it = sqrt(*it);
else *it = 0;
}
Note that the check for the "missing value" must be performed on every loop iteration. While I understand that in most cases the sqrt function (or any other mathematical operation) will likely overshadow this check, there are cases where the operation is minimal (perhaps just an addition) and the check is costly. Not to mention that the "missing value" takes a legal input value out of play and could cause bugs if a calculation legitimately arrives at that value (unlikely though that may be). Also, to be technically correct, the user input data should be checked against that value and an appropriate course of action taken. I find this solution inelegant and less than optimal performance-wise. This is performance-critical code, and we definitely do not have the luxury of parallel data structures or data element objects of some sort.
The NaN version would look like this:
using std::vector;
vector<double> missingAllowed(1000000, std::numeric_limits<double>::quiet_NaN());
vector<double> missingNotAllowed(1000000, std::numeric_limits<double>::signaling_NaN());
// ... populate missingAllowed and missingNotAllowed with (user) data...
for (vector<double>::iterator it = missingAllowed.begin(); it != missingAllowed.end(); ++it) {
*it = sqrt(*it); // if *it is a quiet NaN, sqrt(*it) is a quiet NaN too
}
for (vector<double>::iterator it = missingNotAllowed.begin(); it != missingNotAllowed.end(); ++it) {
try {
*it = sqrt(*it);
} catch (FPInvalidException&) { // assuming _seh_translator set up
*it = 0;
}
}
Now the explicit check is eliminated and performance should be improved. I think this would all work if I could initialize the vector without touching the FPU registers...
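One way this might be attempted is to write the signaling NaN as raw bytes instead of assigning it as a `double`. The following is only a sketch under assumptions: the bit pattern `0x7FF0000000000001` is assumed to be a signaling NaN (x86 convention, quiet bit 51 clear), and `fill_with_signaling_nan` and `raw_bits` are hypothetical helpers. Whether the compiler really implements the `memcpy` with integer moves rather than an x87 load/store is not guaranteed by the language, so the disassembly would need checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed bit pattern for a signaling NaN on x86: exponent all ones,
// quiet bit (bit 51) clear, non-zero payload.
const std::uint64_t kSignalingNaNBits = 0x7FF0000000000001ULL;

// Hypothetical helper: overwrites every element with the signaling NaN
// pattern via memcpy, so the value is written as raw bytes rather than
// being assigned as a double.
void fill_with_signaling_nan(std::vector<double>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::memcpy(&v[i], &kSignalingNaNBits, sizeof(double));
    }
}

// Reads an element back as raw bits, again without a double assignment.
std::uint64_t raw_bits(const std::vector<double>& v, std::size_t i) {
    std::uint64_t u;
    std::memcpy(&u, &v[i], sizeof u);
    return u;
}
```

The vector would first be constructed with an ordinary value, for example `vector<double> v(1000000, 0.0)`, and then filled; constructing it directly from `signaling_NaN()` copies the value as a `double`, which is exactly the step that may quiet it.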
Furthermore, I would imagine any self-respecting sqrt implementation checks for NaN and returns NaN immediately.
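The propagation part of that expectation is easy to check. Here is a small sketch, again assuming a standard `<cfenv>` implementation; `sqrt_of_quiet_nan_is_nan` is a made-up name. Under IEEE 754 a quiet NaN operand should propagate through sqrt without raising the invalid flag, and the helper reports both observations so that can be verified on a given platform.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical helper: takes sqrt of a quiet NaN and returns whether the
// result is NaN. raised_invalid receives the state of the sticky
// FE_INVALID flag after the call.
bool sqrt_of_quiet_nan_is_nan(bool& raised_invalid) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double input = std::numeric_limits<double>::quiet_NaN();
    volatile double out = std::sqrt(input);  // volatile keeps the call at run time
    raised_invalid = std::fetestexcept(FE_INVALID) != 0;
    double r = out;
    return std::isnan(r);
}
```

This only shows what the result is, not how quickly a particular sqrt implementation reaches it; any early-out for NaN inputs is an implementation detail.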