The code below produces different results depending on the -O and -fno-inline flags. I get the same (strange) results with g++ 10.1 and 10.2 and with clang++ 10 on x86. Is this because the code is ill-formed, or is this a genuine bug?
The "invalid" flag in the Nakshatra constructor should be set whenever its nakshatra (double) field is >= 27.0. But when initialized via Nakshatra(Nirayana_Longitude{360.0}), the flag is not set, even though the value after scaling becomes exactly 27.0. I assume the reason is that the argument 360.0, after scaling, becomes 26.9999999999999990008 (raw 0x4003d7fffffffffffdc0) in an 80-bit internal register, which is < 27.0, but becomes 27.0 once stored as a 64-bit double. Still, this behavior looks weird: the same nakshatra seems to be simultaneously < 27.0 and >= 27.0. Is this the way it's supposed to be?
Is this expected behaviour because my code contains UB or is otherwise ill-formed? Or is it a compiler bug?
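To make my assumption concrete, here is a minimal sketch (separate from the repro below) that performs the scaling explicitly in long double and then rounds to double. The helper names scale_extended, scale_as_stored and has_wider_long_double are made up for illustration, and the sketch assumes long double is wider than double, which holds for the 80-bit x87 format but not on every platform.

```cpp
#include <limits>

// Hypothetical helper: scale the longitude while keeping the product in
// long double, mimicking a value that stays in an 80-bit x87 register.
long double scale_extended(double longitude) {
    const double factor = 27.0 / 360.0;  // same constant as in the repro
    return static_cast<long double>(longitude) * static_cast<long double>(factor);
}

// Hypothetical helper: the same product after it is rounded to a 64-bit
// double, as happens when it is stored into the nakshatra field.
double scale_as_stored(double longitude) {
    return static_cast<double>(scale_extended(longitude));
}

// True if long double carries more mantissa bits than double
// (assumption: needed for the two helpers above to disagree).
bool has_wider_long_double() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

If the assumption holds, scale_extended(360.0) < 27.0L is true while scale_as_stored(360.0) >= 27.0 is also true, which is exactly the inconsistency I observe.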
Minimal code to reproduce (two .cpp files + one header, could not reproduce with less):
main.cpp:
#include "nakshatra.h"
#include <iostream>
#include <iomanip>
int main() {
    Nakshatra n{Nirayana_Longitude{360.0}};
    std::cout << std::fixed << std::setprecision(40) << std::boolalpha;
    std::cout << n.nakshatra << "\n";
    std::cout << "invalid (should be true): " << n.invalid << "\n";
    std::cout << "n.nakshatra >= 27.0: " << (n.nakshatra >= 27.0) << "\n";
}
nakshatra.h:
struct Nirayana_Longitude {
    double longitude;
};
class Nakshatra
{
public:
    double nakshatra;
    bool invalid = false;
    Nakshatra(double nakshatra_value) : nakshatra(nakshatra_value) {
        if (nakshatra < 0.0 || nakshatra >= 27.0) {
            invalid = true;
        }
    }
    Nakshatra(Nirayana_Longitude longitude);
};
nakshatra.cpp:
#include "nakshatra.h"
// This constructor has to be implemented in a separate .cpp file to reproduce the bug;
// moving it to either nakshatra.h or main.cpp fixes the problem (perhaps due to the
// compiler removing the relevant code from runtime).
Nakshatra::Nakshatra(Nirayana_Longitude longitude) : Nakshatra(longitude.longitude * (27.0 / 360.0))
{
}
To compile and run:
$ g++ -O2 main.cpp nakshatra.cpp && ./a.out
or
$ clang++ -O2 main.cpp nakshatra.cpp && ./a.out
(or ./a.exe for Windows/msys2)
Actual output:
27.0000000000000000000000000000000000000000
invalid (should be true): false
n.nakshatra >= 27.0: true
Expected output:
27.0000000000000000000000000000000000000000
invalid (should be true): true
n.nakshatra >= 27.0: true
Compiling with -O, -Og, -O1 or -O2 triggers this strange behaviour. Compiling without -O works fine, as does any -O level combined with -fno-inline. Reproduced with g++.exe (Rev5, Built by MSYS2 project) 10.2.0 (Windows 7) and g++ 10.1 on Ubuntu 18.04.5 LTS, as well as with clang 10 (Linux). Clang 11 under msys2 seems to work fine (did not verify extensively).
I also failed to reproduce this behaviour when all the code is combined into a single file, and I could not reproduce it in wandbox even with gcc 10.1, so the CPU may be relevant: Intel Core i5-660 (3.33GHz). Unfortunately, the otherwise excellent Compiler Explorer doesn't support multiple compilation units, so I can't reproduce it there.
For completeness, this is one example of the assembly code generated when the strange behavior is shown.
0x00401734 <+0>: fldl 0x404080 ; (27.0/360.0)?
0x0040173a <+6>: fmull 0x4(%esp) ; argument *= (27.0/360.0), giving 26.9999999999999990008 (raw 0x4003d7fffffffffffdc0)
0x0040173e <+10>: fstl (%ecx)
0x00401740 <+12>: movb $0x0,0x8(%ecx) ; set invalid=false
0x00401744 <+16>: fldz
0x00401746 <+18>: fcomip %st(1),%st
0x00401748 <+20>: ja 0x40175a <_ZN9NakshatraC2E18Nirayana_Longitude+38> ; jump if argument < 0.0
0x0040174a <+22>: flds 0x404088 ; 27.0?
0x00401750 <+28>: fxch %st(1)
0x00401752 <+30>: fcomip %st(1),%st
0x00401754 <+32>: fstp %st(0)
0x00401756 <+34>: jb 0x401760 <_ZN9NakshatraC2E18Nirayana_Longitude+44> ; jump (leaving invalid=false) if argument < 27.0
0x00401758 <+36>: jmp 0x40175c <_ZN9NakshatraC2E18Nirayana_Longitude+40>
0x0040175a <+38>: fstp %st(0)
0x0040175c <+40>: movb $0x1,0x8(%ecx) ; set invalid=true
0x00401760 <+44>: ret $0x8
UPDATE: after reading answers to another question, I see that compiling with either -mfpmath=sse -msse2 -ffp-contract=off or -ffloat-store fixes the bug. Still, the question remains: do g++ and clang 10 diverge from the C++ standard for optimized 32-bit x86 by default, or is this behavior permitted by the C++ standard? A double being < 27.0 and >= 27.0 at the same time looks inconsistent to me.
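For comparison with those flags, a source-level workaround might look like the sketch below. It is only a sketch under assumptions: the helpers force_double, is_invalid_nakshatra and float_eval_method are hypothetical names that are not part of my code, and I have not verified the sketch against the affected compilers. The idea is to push the scaled value through a volatile 64-bit double before comparing, so that the comparison and the stored field see the same rounded value. FLT_EVAL_METHOD from <cfloat> reports how the implementation evaluates floating-point expressions.

```cpp
#include <cfloat>

// Hypothetical helper: the volatile store/load is intended to round any
// excess-precision intermediate to a genuine 64-bit double.
inline double force_double(double x) {
    volatile double stored = x;
    return stored;
}

// Hypothetical helper: same range check as the Nakshatra constructor,
// but applied to the value after it has been rounded to double.
inline bool is_invalid_nakshatra(double longitude) {
    const double value = force_double(longitude * (27.0 / 360.0));
    return value < 0.0 || value >= 27.0;
}

// Reports the implementation's evaluation method:
// 0 = evaluate in the operand type, 2 = evaluate in long double,
// -1 = indeterminate.
inline int float_eval_method() {
    return FLT_EVAL_METHOD;
}
```

With this, is_invalid_nakshatra(360.0) should return true regardless of whether the multiplication itself was carried out in a wider register.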