TITLE: An Adventure with C++ Compilers
AUTHOR: Eugene Wallingford
DATE: November 01, 2016 4:04 PM
DESC:
-----
BODY:
I am a regular reader of John Regehr's blog, which provides a steady diet of cool compiler conversation. One of Regehr's frequent topics is undefined behavior in programming languages, and what that means for implementing and testing compilers. A lot of those blog entries involve C and C++, which I don't use all that often any more, so reading them is more spectator sport than contact sport.

This week, I got to see up close how capricious C++ compilers can feel.

My students are implementing a compiler for a simple subset of a Pascal-like language. We call the simplest program in this language print-one:
    $ cat print-one.flr
    program main();
      begin
        return 1
      end.
One of the teams is writing their compiler in C++. The team completed its most recent version, a parser that either validates its input or reports the error that makes the input invalid. They were excited that it finally worked:
    $ g++ -std=c++14 compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program
They had tested their compiler on two platforms. I sat down at my desktop computer to exercise their compiler.
    $ g++ compiler.cpp -o compiler
    In file included from compiler.cpp:7:
    In file included from ./parser.cpp:3:
    In file included from ./ast-utilities.cpp:4:
    ./ast-utilities.hpp:7:22: warning: in-class initialization of non-static data
          member is a C++11 extension [-Wc++11-extensions]
        std::string name = "Node";
                    ^
    [...]
    24 warnings generated.
Oops, I forgot the -std=c++14 flag. Still, it compiled, and all of the warnings come from a part of the code that has no effect on program validation. So I tried the executable:
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program
Hmm. The warnings are unrelated to the part of the executable that I am testing, but maybe they are creating a problem. So I recompile with the flag:
    $ g++ -std=c++14 compiler.cpp -o compiler
    error: invalid value 'c++14' in '-std=c++14'
What? I check my OS and compiler specs:
    $ sw_vers -productVersion
    10.9.5
    $ g++ --version
    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
    Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
    [...]
Oh, right, Apple doesn't ship gcc any more; it ships clang and links gcc to the clang exe. I know my OS is a bit old, but it still seems odd that the -std=c++14 flag isn't supported. I google for an answer (thanks, StackOverflow!) and find that I need to use -std=c++1y. Okay:
    $ g++ -std=c++1y compiler.cpp -o compiler
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program
Now the student compiler compiles but gives incorrect, or at least unintended, behavior. I'm surprised that both my clang and the students' gcc compile their compiler yet produce executables that give different answers. I know that gcc and clang aren't 100% compatible, but my students are using a relatively small part of C++. How can this be? Maybe it has something to do with how clang processes the c++1y standard flag. So I backed up to the previous standard:
    $ g++ -std=c++0x compiler.cpp -o compiler
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program
Yes, that's c++0x, not c++0y. The student compiler still compiles and still gives incorrect or unintended behavior. Maybe it is a clang problem? I upload their code to our student server, which runs Linux and gcc:
    $ cat /etc/debian_version 
    8.1
    $ g++ --version
    [...]
    gcc version 4.7.2 (Debian 4.7.2-5)
This version of gcc doesn't support either c++14 or c++1?, so I fell back to c++0x:
    $ g++ -std=c++0x compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program
Hurray! I can test their code. I'm curious. I have a MacBook Pro running a newer version of OS X. Maybe...
    $ sw_vers
    ProductName:Mac OS X
    ProductVersion:10.10.5
    BuildVersion:14F2009
    $ g++ --version
    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/c++/4.2.1
    Apple LLVM version 7.0.2 (clang-700.1.81)
    [...]

    $ g++ -std=c++14 compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program
Now, the c++14 flag works, and it produces a compiler that produces the correct behavior -- or at least the intended behavior.

I am curious about this anomaly, but not curious enough to research the differences between clang and gcc, the differences between the different versions of clang, or what Apple or Debian are doing. I'm also not curious enough to figure out which nook of C++ my students have stumbled into that could expose a rift in the behavior of these various C++ compilers, all of which are standard tools and pretty good.

At least now I remember what it's like to program in a language with undefined behavior and can read Regehr's blog posts with a knowing nod of the head.
-----