File Conventions

There are some strict conventions to obey regarding the files and their contents.

Rule: One class LikeThis per set of files like-this.*

Each class LikeThis is implemented in a single set of files named like-this.*. Note that mixed-case class names are mapped onto lower-case words separated by dashes.

There can be exceptions: for instance, auxiliary classes used in a single place do not need a dedicated set of files.
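The name mapping can be made concrete with a small helper. This is only an illustrative sketch: the function file_stem is hypothetical (it is not part of the project), and it assumes plain ASCII class names without acronyms (a name such as ASTNode would come out as a-s-t-node).

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: map a mixed-case class name onto the stem of
// its set of files, e.g. "LetExp" -> "let-exp".
std::string file_stem(const std::string& class_name)
{
  std::string res;
  for (std::size_t i = 0; i < class_name.size(); ++i)
    {
      const auto c = static_cast<unsigned char>(class_name[i]);
      // A capital letter starts a new word, except at the very beginning.
      if (std::isupper(c) && i != 0)
        res += '-';
      res += static_cast<char>(std::tolower(c));
    }
  return res;
}
```

Under these assumptions, the class LetExp would live in let-exp.hh, let-exp.hxx and let-exp.cc.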

Rule: *.hh: Declarations

The *.hh file should contain only declarations, i.e., prototypes, extern declarations of variables, etc. Short inlined methods are accepted when there are few of them; otherwise, create an *.hxx file. The documentation belongs in the *.hh file too.

There is no good reason for huge objects to be defined here.
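As a rough illustration of this split, the sketch below shows in a single listing what would normally be two files. The names error_count and report_error are hypothetical, chosen for the example only.

```cpp
// --- sample/sample.hh (sketch): declarations and documentation only. ---
namespace sample
{
  /// Number of errors reported so far.
  extern int error_count;

  /// Report one error, and return the updated count.
  int report_error();
}

// --- sample/sample.cc (sketch): the matching definitions. ---
namespace sample
{
  int error_count = 0;

  int report_error()
  {
    return ++error_count;
  }
}
```

Clients that include the header see only the two documented declarations; the variable and the function body exist once, in the implementation file.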

As much as possible, avoid including useless headers (GotW007, GotW034):

  • when detailed knowledge of a class is not needed, instead of

    #include <foo.hh>
    

    write

    // Fwd decl.
    class Foo;
    

    or better yet: use the appropriate fwd.hh file (read below).

  • if you need output streams, then include ostream, not iostream. Actually, if you merely need to declare the existence of streams, you might want to include iosfwd.
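Both pieces of advice can be combined in one header. The following single-listing sketch assumes a hypothetical class Foo and a hypothetical function print; the "file" boundaries are shown as comments, and the class definition stands in for the contents of foo.hh.

```cpp
// --- foo-printer.hh (sketch) ---
#include <iosfwd>   // Declares std::ostream without defining it.

class Foo;          // Fwd decl: enough for references and pointers.

// Only the existence of Foo and std::ostream is needed here.
std::ostream& print(std::ostream& o, const Foo& f);

// --- foo-printer.cc (sketch) ---
#include <ostream>  // The full definition is needed to use operator<<.

// Stands in for `#include <foo.hh>`.
class Foo
{
public:
  int value = 42;
};

std::ostream& print(std::ostream& o, const Foo& f)
{
  return o << "Foo(" << f.value << ')';
}

// --- a client (sketch) ---
#include <sstream>
#include <string>

std::string to_string(const Foo& f)
{
  std::ostringstream s;
  print(s, f);
  return s.str();
}
```

Only the implementation file pays for the complete definitions of Foo and std::ostream; files that merely include the header parse two short declarations.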

Rule: *.hxx: Inlined definitions

Some definitions must be visible in several places: templates, inline functions, etc. Declare and document them in the *.hh file, and implement them in the *.hxx file. The *.hh file includes the *.hxx file as its last directive; conversely, the *.hxx file includes the *.hh file first. Read below.

Rule: *.cc: Definitions of functions and variables

Big objects should be defined in the *.cc file corresponding to the declaration/documentation file *.hh.

There are less clear-cut cases between *.hxx and *.cc. For instance, short but time-consuming functions should stay in the *.cc files, since inlining is not expected to speed them up significantly. As another example, features that require massive header inclusions are better defined in the *.cc file.

As a concrete example, consider the accept methods of the AST classes. They are short enough to be eligible for an *.hxx file:

void LetExp::accept(Visitor& v)
{
  v(*this);
}

We will leave them in the *.cc file though, since this way only the *.cc file needs to load ast/visitor.hh; the *.hh is kept short, both directly (its contents) and indirectly (its includes).

Rule: Explicit template instantiation

There are several strategies to compile templates. The most common strategy consists in leaving the code in a *.hxx file, and letting every user of the class template instantiate the code. While correct, this approach has several drawbacks:

  • Because the *.hh file includes the *.hxx file, each time a simple declaration of a template is needed, the full implementation comes with it. And if the implementation requires other declarations such as std::iostream, you force all the client code to parse the iostream header!

  • The instantiation is performed several times, which is time and space consuming.

  • The dependencies are tight: the clients of the template depend upon its implementation.

To circumvent these problems, we may control template instantiations using explicit template instantiation definitions (available since C++ 1998) and declarations (introduced by C++ 2011).

This mechanism is compatible with the way templates are usually handled in the Tiger compiler, i.e., where both template declarations and definitions are accessible from the included header, though often indirectly (see above). We use the following two-fold strategy:

  • First, we add an explicit template instantiation definition in the implementation file of the template’s client (for instance temp/temp.cc) to instruct the compiler that we want to instantiate a template (e.g. misc::endo_map<T>) for a given (set of) parameter(s) (e.g. temp::Temp) in this compilation unit (temp/temp.o). This explicit instantiation definition is written using a template clause.

/**
 ** \file temp/temp.cc
 ** \brief temp::Temp.
 */

#include <temp/temp.hh>

// ...

namespace misc
{
  // Explicit template instantiation definition to generate the code.
  template class endo_map<temp::Temp>;
}

  • Then, we block the automatic (implicit) instantiation of the template for this (set of) parameter(s). The compiler would otherwise trigger it by default whenever the implementation of the template is available, which is the case in our example, since the header of the template (misc/endomap.hh) also includes its implementation (misc/endomap.hxx). To do so, we add an explicit template instantiation declaration matching the previous explicit instantiation definition, using an extern template clause.

/**
 ** \file temp/temp.hh
 ** \brief Fresh temps.
 */

#pragma once

#include <misc/endomap.hh>

namespace temp
{
  struct Temp { /* ... */ };
}

// ...

namespace misc
{
  // Explicit template instantiation declaration.
  extern template class endo_map<temp::Temp>;
}

Any translation unit containing this explicit declaration will not generate this very template instantiation, unless an explicit definition is seen (in our case, this will happen within temp/temp.cc only).

You will notice that both the approach and the syntax used here recall the ones used to declare and define global variables in C and C++.
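The parallel can be seen side by side in the following sketch, which gathers in one listing what would normally be split between a header and an implementation file. The names counter and Box are hypothetical, used for illustration only.

```cpp
// --- Declarations: what a header would contain. ---
extern int counter;               // Declares a variable.

template <typename T>
struct Box
{
  T value;
  T get() const { return value; }
};

extern template struct Box<int>;  // Declares an instantiation.

// --- Definitions: what a single *.cc file would contain. ---
int counter = 0;                  // Defines the variable.
template struct Box<int>;         // Defines the instantiation.
```

In both cases the extern form only promises that the entity exists somewhere, and exactly one translation unit provides it. When both forms of explicit instantiation appear in the same translation unit, the definition must follow the declaration, which is what happens when temp/temp.cc includes temp/temp.hh.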

We can further improve the previous design by factoring explicit instantiation code using the preprocessor.

/**
 ** \file temp/temp.hh
 ** \brief Fresh temps.
 */

#pragma once

#include <misc/endomap.hh>

#ifndef MAYBE_EXTERN
# define MAYBE_EXTERN extern
#endif

namespace temp
{
  struct Temp { /* ... */ };
}

// ...

namespace misc
{
  // Explicit template instantiation declaration.
  MAYBE_EXTERN template class endo_map<temp::Temp>;
}

/**
 ** \file temp/temp.cc
 ** \brief temp::Temp.
 */

#define MAYBE_EXTERN
#include <temp/temp.hh>
#undef MAYBE_EXTERN

// ...

Explicit template instantiation declarations (not definitions) are only available since C++ 2011. Before that, we used to introduce a fourth type of file, *.hcc: files that had to be compiled once for each concrete template parameter.

Rule: Guard included files (*.hh & *.hxx)

Use the #pragma once directive to ensure that the contents of a file are read only once. This is critical for *.hh and *.hxx files that include one another.

One typically has:

/**
 ** \file sample/sample.hh
 ** \brief Declaration of sample::Sample.
 */

#pragma once

// ...

#include <sample/sample.hxx>

/**
 ** \file sample/sample.hxx
 ** \brief Inlined definition of sample::Sample.
 */

#pragma once

#include <sample/sample.hh>

// ...

Rule: fwd.hh: Forward declarations

Dependencies can be a major problem in the development of big projects. It is not acceptable to “recompile the world” when a single file changes. To fight this problem, you are encouraged to use fwd.hh files that contain simple forward declarations. Anything that defeats the purpose of a fwd.hh file must be avoided, e.g., including actual header files. These forward files should be included by the *.hh files instead of more complete headers.

The expected benefits are manifold:

  • A forward declaration is much shorter.

  • Actual definitions usually rely on other classes, and therefore on other #include directives, etc. Forward declarations need nothing.

  • While it is not uncommon to change the interface of a class, changing its name is infrequent.

Consider for example ast/visitor.hh, which is included directly or indirectly by many other files. Since it needs a declaration of each AST node, one could be tempted to use ast/all.hh, which includes virtually all the headers of the ast module. All the files including ast/visitor.hh would then bring in the whole ast module, where the much shorter and much simpler ast/fwd.hh would suffice.

Of course, usually the *.cc files need actual definitions.
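The sketch below illustrates the idea in a single listing. It is not the actual contents of the ast module: the node class IntExp, the visitor Counter and the simplified class bodies are assumptions made for the example; only LetExp, Visitor and the file names come from the text above.

```cpp
// --- ast/fwd.hh (sketch): forward declarations only, no #include. ---
namespace ast
{
  class IntExp;
  class LetExp;
  class Visitor;
}

// --- ast/visitor.hh (sketch): would only #include <ast/fwd.hh>. ---
namespace ast
{
  class Visitor
  {
  public:
    virtual ~Visitor() = default;
    // References to incomplete types are fine in declarations.
    virtual void operator()(IntExp& e) = 0;
    virtual void operator()(LetExp& e) = 0;
  };
}

// --- some *.cc file (sketch): here the full definitions are needed. ---
namespace ast
{
  // Simplified stand-ins for the real node classes.
  class IntExp
  {
  public:
    void accept(Visitor& v) { v(*this); }
  };

  class LetExp
  {
  public:
    void accept(Visitor& v) { v(*this); }
  };

  // Hypothetical visitor counting the nodes it is applied to.
  class Counter : public Visitor
  {
  public:
    void operator()(IntExp&) override { ++count; }
    void operator()(LetExp&) override { ++count; }
    int count = 0;
  };
}
```

With this layout, changing the interface of a node class does not force a recompilation of the files that only include ast/visitor.hh, since that header depends on the node names alone.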

Rule: Module, namespace, and directory likethis

The compiler is composed of several modules that are dedicated to a set of coherent specific tasks (e.g., parsing, AST handling, register allocation etc.). A module name is composed of lower case letters exclusively, likethis, not like_this nor like-this. This module’s files are stored in the directory with the same name, which is also that of the namespace in which all the symbols are defined.

Unlike file names, module names do not use dashes: this avoids clashes with Swig and with namespace names.

Rule: libmodule.*: Pure interface

The interface of the module named module contains only pure functions: these functions should not depend upon globals, nor have side effects on global objects. Global variables are forbidden here.

Rule: tasks.*: Impure interface

Tasks are the place for side effects. That’s where globals such as the current AST, the current assembly program, etc., are defined and modified.
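The division of labor between the two interfaces might look as follows. This is a sketch under assumptions: the function count_lines, the globals the_source and the_line_count, and the task count are hypothetical and only serve to contrast a pure function with a task that reads and writes globals.

```cpp
#include <string>

// --- parse/libparse.hh (sketch): pure interface. ---
namespace parse
{
  // The result depends on the argument only: no globals, no side effects.
  inline int count_lines(const std::string& source)
  {
    int res = 0;
    for (char c : source)
      if (c == '\n')
        ++res;
    return res;
  }
}

// --- parse/tasks.cc (sketch): impure interface. ---
namespace parse
{
  namespace tasks
  {
    // Globals live with the tasks, not in libparse.
    std::string the_source;
    int the_line_count = 0;

    // A task: reads and updates the globals, delegating the actual
    // computation to the pure interface.
    void count()
    {
      the_line_count = parse::count_lines(the_source);
    }
  }
}
```

Keeping the computation pure makes it usable and testable without any global setup; only the thin task layer has to know about the shared state.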