Skip to content

Coding Guidelines

AleksandarZeljic edited this page May 7, 2020 · 6 revisions

This Style Guide is based on CVC4 Coding Guidelines (and by extension Google C++ Style Guide).

Table of Contents

Header Files

Self-Contained Headers

All header files should be self-contained. A header should have header guards and include all other headers it needs.

In general, header files should contain declarations only. The only exception to that rule are template functions and functions that can be inlined.

A template that is explicitly instantiated for all relevant sets of template arguments, or that is a private implementation detail of a class, is allowed to be defined in the one and only .cpp file that instantiates the template.

Do not use the using namespace directive in header files.

The #define Guard

All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be +marabou<NAMESPACE>…​<NAMESPACE><FILE>_h (two underscores between namespaces and at start and end).

Notice that the <FILE> section may have single underscores in it.

Example:

#ifndef _marabou__parser__acas_h_
#define _marabou__parser__acas_h_



namespace marabou {
namespace parser{
namespace acas{
...
} // namespace acas
} // namespace parser
} // namespace marabou

#endif /*_marabou__parser__acas_h_ */

Forward Declarations

Avoid using forward declarations where possible. Justify the use of forward declarations, e.g., significantly faster compile time, cyclic dependencies.

Inline Functions

All functions defined in a header file that are not template functions should be functions that are likely to be inlined by the compiler. Note that a function that is defined entirely inside a class, struct or union definition is implicitly an inline function.

As a basic rule, do not define functions in the header if they have more than 5 lines or contain loop or switch statements. For every function that violates that rule, add a justification as a comment.

Names and Order of Includes

Within each section the includes should be ordered alphabetically. Note that older code might not conform to this rule and should be fixed when convenient.

You should include all the headers that define the symbols you rely upon, except in the unusual case of forward declaration. However, any includes present in the related header do not need to be included again in the related.cpp (i.e., foo.cpp can rely on foo.h’s includes).

Scoping

Namespaces

In the future we want to more everything in Marabou to reside in the Marabou namespace or a sub-namespace. Generally, the sub-namespace follows the module (and thus the source directory) structure. The name of a sub-namespace should contain lowercase letters only, even for acronyms (e.g. marabou::smt, marabou::main, marabou::parser) See also Naming of Namespaces.

Note that certain base types might reside in namespace Marabou directly instead of a sub-namespace despite belonging to a module. These base types are user-visible classes (such as marabou::Exception), core functionality classes, and utility classes.

Terminate namespaces with comments as shown in the header guards example.

Always enclose functions and methods in the same namespace hierarchy that they’ve been declared in. Use using directives exclusively for importing other namespaces in source files and only directly after the list of includes.

Unnamed Namespaces and Static Variables

When definitions in a .cpp file do not need to be referenced outside that file, place them in an unnamed namespace or declare them static. Do not use either of these constructs in .h files.

Nonmember, Static Member, and Global Functions

Prefer grouping functions with a namespace instead of using a class as if it were a namespace. Static methods of a class should generally be closely related to instances of the class or the class’s static data.

For a header myproject/foo_bar.h, for example, write

namespace myproject {
namespace foo_bar {
void Function1();
void Function2();
}  // namespace foo_bar
}  // namespace myproject

instead of

namespace myproject {
class FooBar
{
 public:
  static void Function1();
  static void Function2();
};
}  // namespace myproject

If you define a nonmember function and it is only needed in its .cpp file, use internal linkage to limit its scope.

Local Variables

Place a function’s variables in the narrowest scope possible, and initialize variables in the declaration. Declare variables as close to the first use as possible. Initialization should be used instead of declaration and assignment, e.g.:

int i;
i = f();      // Bad -- initialization separate from declaration.
int j = g();  // Good -- declaration has initialization.
std::vector<int> v;
v.push_back(1);  // Bad -- Prefer initializing using brace initialization.
v.push_back(2);
std::vector<int> v = {1, 2};  // Good -- v starts initialized.

There is one caveat: if the variable is an object, its constructor is invoked every time it enters scope and is created, and its destructor is invoked every time it goes out of scope.

// Inefficient implementation:
for (int i = 0; i < 1000000; ++i)
{
  Foo f;  // My ctor and dtor get called 1000000 times each.
  f.DoSomething(i);
}

It may be more efficient to declare such a variable used in a loop outside that loop:

Foo f;  // My ctor and dtor get called once each.
for (int i = 0; i < 1000000; ++i)
{
  f.DoSomething(i);
}

Static and Global Variables

Variables of class type with static storage should only be used sparely and their usage should be well justified. They cause hard-to-find bugs due to indeterminate order of construction and destruction. The order in which class constructors and initializers for static variables are called is only partially specified in C++ and can even change from build to build, which can cause bugs that are difficult to find.

Likewise, global and static variables are destroyed when the program terminates, regardless of whether the termination is by returning from main() or by calling exit(). The order in which destructors are called is defined to be the reverse of the order in which the constructors were called. Since constructor order is indeterminate, so is destructor order.

Classes

This section lists the main dos and don’ts you should follow when writing a class.

Naming

Constructors

Do not call virtual methods in constructors because these calls will not get dispatched to the subclass implementations.

Avoid initialization that can fail if you can’t signal an error.

Implicit Conversions

Implicit conversions allow an object of one type (called the source type) to be used where a different type (called the destination type) is expected, such as when passing an int argument to a function that takes a double parameter.

Use implicit conversion operators sparely and only in well justified cases. Prefer using the explicit keyword for conversion operators.

Copyable and Movable Types

Support copying and/or moving if these operations are clear and meaningful for your type. Otherwise, disable the implicitly generated special functions that perform copies and moves.

If a class requires a user-defined destructor, a user-defined copy constructor, a user-defined copy assignment operator, a user-defined move constructor, or a user-defined move assignment operator, you should implement all of them (Rule of Five).

Structs vs. Classes

Use a struct only for passive objects that carry data; everything else is a class.

Inheritance

Composition is often more appropriate than inheritance. When using inheritance, make it public. If you want to do private inheritance, you should be including an instance of the base class as a member instead.

Make your destructor virtual if necessary. If your class has virtual methods, its destructor should be virtual.

Explicitly annotate overrides of virtual functions or virtual destructors with an override or (less frequently) final specifier. Older (pre-C++11) code will use the virtual keyword as an inferior alternative annotation. For clarity, use exactly one of override, final, or virtual when declaring an override. Rationale: A function or destructor marked override or final that is not an override of a base class virtual function will not compile, and this helps catch common errors. The specifiers serve as documentation; if no specifier is present, the reader has to check all ancestors of the class in question to determine if the function or destructor is virtual or not.

Virtual functions without an implementation in the base class (a pure virtual function) must be annotated with the pure-specifier = 0.

Operator Overloading

Overload operators judiciously. Define overloaded operators only if their meaning is obvious, unsurprising, and consistent with the corresponding built-in operators. Only overload built-in operators.

Certain operator overloads are inherently hazardous. Overloading unary & can cause the same code to have different meanings depending on whether the overload declaration is visible. Overloads of &&, ||, and , (comma) cannot match the evaluation-order semantics of the built-in operators.

Declaration Order

Class definitions should be organized as follows:

A class definition should start with any friend declarations, followed by a public section, then protected, then private. Omit sections that would be empty. Do not alternate between sections.

Within a section, class members should be ordered as follows:

  1. Types (typedefs and nested structs and classes)

  2. Constants

  3. Constructors, destructors, copy and move constructors and operators

  4. Methods (static go before non-static methods)

  5. Data members (static go before non-static data members)

Functions

Parameter Ordering

When defining a function, prefer the following order: inputs, then outputs.

Write Short Functions

Prefer small and focused functions.

Input Parameters

All input parameters passed by reference or pointer must be labeled const.

Default Arguments

Default arguments are banned on virtual functions, where they don’t work properly, and in cases where the specified default might not evaluate to the same value depending on when it was evaluated. (For example, don’t write void f(int n = counter+);+.)

Ownership and Smart Pointers

Prefer to have single, fixed owners for dynamically allocated objects. Prefer to transfer ownership with smart pointers.

"Ownership": is a bookkeeping technique for managing dynamically allocated memory (and other resources). The owner of a dynamically allocated object is an object or function that is responsible for ensuring that it is deleted when no longer needed. Ownership can sometimes be shared, in which case the last owner is typically responsible for deleting it. Even when ownership is not shared, it can be transferred from one piece of code to another.

"Smart" pointers: are classes that act like pointers, e.g. by overloading the * and → operators. Some smart pointer types can be used to automate ownership bookkeeping, to ensure these responsibilities are met. std::unique_ptr is a smart pointer type introduced in C++11, which expresses exclusive ownership of a dynamically allocated object; the object is deleted when the std::unique_ptr goes out of scope. It cannot be copied, but can be moved to represent ownership transfer. std::shared_ptr is a smart pointer type that expresses shared ownership of a dynamically allocated object. std::shared_ptrs can be copied; ownership of the object is shared among all copies, and the object is deleted when the last std::shared_ptr is destroyed.

If dynamic allocation is necessary, prefer to keep ownership with the code that allocated it. If other code needs access to the object, consider passing it a copy, or passing a pointer or reference without transferring ownership. Prefer to use std::unique_ptr to make ownership transfer explicit. For example:

std::unique_ptr<Foo> FooFactory();
void FooConsumer(std::unique_ptr<Foo> ptr);

Do not design your code to use shared ownership without a very good reason. One such reason is to avoid expensive copy operations, but you should only do this if the performance benefits are significant, and the underlying object is immutable (i.e. std::shared_ptr<const Foo>). If you do use shared ownership, prefer to use std::shared_ptr.

Never use std::auto_ptr. Instead, use std::unique_ptr.

Other C++ Features

Friends

We allow use of friend classes and functions, within reason.

Friends extend, but do not break, the encapsulation boundary of a class. In some cases this is better than making a member public when you want to give only one other class access to it. However, most classes should interact with other classes solely through their public members.

Exceptions

Do not use throw specifiers since they are deprecated in C++11. Document exceptions that can be thrown.

Run-Time Type Information (RTTI)

Do not use Run Time Type Information (RTTI) such as typeid or dynamic_cast.

Decision trees based on type are a strong indication that your code is on the wrong track.

if (typeid(*data) == typeid(D1))
{
  ...
}
else if (typeid(*data) == typeid(D2))
{
  ...
}
else if (typeid(*data) == typeid(D3))
{
  ...
}

Code such as this usually breaks when additional subclasses are added to the class hierarchy. Moreover, when properties of a subclass change, it is difficult to find and modify all the affected code segments.

Casting

C++ introduced a different cast system from C that distinguishes the types of cast operations. The problem with C casts is the ambiguity of the operation; sometimes you are doing a conversion (e.g., (int)3.5) and sometimes you are doing a cast (e.g., (int)"hello"). Brace initialization and C++ casts can often help avoid this ambiguity. Additionally, C++ casts are more visible when searching for them.

Do not use C-style casts. Instead, use C++-style casts like static_cast<float>(double_value), or brace initialization for conversion of arithmetic types like int64 y = int64{1} << 42. Do not use cast formats like int y = (int)x or int y = int(x) (but the latter is okay when invoking a constructor of a class type).

  • Use brace initialization to convert arithmetic types (e.g. int64{x}). This is the safest approach because code will not compile if conversion can result in information loss. The syntax is also concise.

  • Use static_cast as the equivalent of a C-style cast that does value conversion, when you need to explicitly up-cast a pointer from a class to its superclass, or when you need to explicitly cast a pointer from a superclass to a subclass. In this last case, you must be sure your object is actually an instance of the subclass.

  • Use const_cast to remove the const qualifier (see const).

  • Use reinterpret_cast to do unsafe conversions of pointer types to and from integer and other pointer types. Use this only if you know what you are doing and you understand the aliasing issues.

See the RTTI section for guidance on the use of dynamic_cast.

Preincrement and Predecrement

Use prefix form (++i) of the increment and decrement operators with iterators and other template objects. When the return value is ignored, the "pre" form (++i) is never less efficient than the "post" form (i++), and is often more efficient.

Use of const

We strongly recommend that you use const whenever it makes sense to do so:

  • If a function guarantees that it will not modify an argument passed by reference or by pointer, the corresponding function parameter should be a reference-to-const (const T&) or pointer-to-const (const T*), respectively.

  • Declare methods to be const whenever possible. Accessors should almost always be const. Other methods should be const if they do not modify any data members, do not call any non-const methods, and do not return a non-const pointer or non-const reference to a data member.

  • Consider making data members const whenever they do not need to be modified after construction.

Integer Types

<stdint.h> defines types like int16_t, uint32_t, int64_t, etc. You should always use those in preference to short, unsigned long long and the like, when you need a guarantee on the size of an integer.

Preprocessor Macros

Avoid defining macros, especially in headers; prefer inline functions, enums, and const variables. See also Naming of Macros.

Exporting macros from headers (i.e. defining them in a header without `#undef`ing them before the end of the header) is strongly discouraged. If you do export a macro from a header, it must have a globally unique name. To achieve this, it must be named with a prefix consisting of your project’s namespace name (but upper case).

0 and nullptr/NULL

Use 0 for integers, 0.0 for reals, nullptr for pointers, and '\0' for chars.

sizeof

Prefer sizeof(varname) to sizeof(type).

auto

auto is permitted when it increases readability, particularly as described below. Never initialize an auto-typed variable with a braced initializer list.

Specific cases where auto is allowed or encouraged:

  • (Encouraged) For iterators and other long/cluttery type names, particularly when the type is clear from context (calls to find, begin, or end for instance).

  • (Allowed) When the type is clear from local context (in the same expression or within a few lines). Initialization of a pointer or smart pointer with calls to new commonly falls into this category, as does use of auto in a range-based loop over a container whose type is spelled out nearby.

  • (Allowed) When the type doesn’t matter because it isn’t being used for anything other than equality comparison.

  • (Encouraged) When iterating over a map with a range-based loop (because it is often assumed that the correct type is std::pair<KeyType, ValueType> whereas it is actually std::pair<const KeyType, ValueType>). This is particularly well paired with local key and value aliases for .first and .second (often const-ref).

      
    for (const auto& item : some_map)
    {
      const KeyType& key = item.first;
      const ValType& value = item.second;
      // The rest of the loop can now just refer to key and value,
      // a reader can see the types in question, and we've avoided
      // the too-common case of extra copies in this iteration.
    }

Lambda expressions

Lambda expressions are a concise way of creating anonymous function objects. They’re often useful when passing functions as arguments. For example:

std::sort(v.begin(), v.end(), [](int x, int y)
{
  return Weight(x) < Weight(y);
});

They further allow capturing variables from the enclosing scope either explicitly by name, or implicitly using a default capture. Explicit captures require each variable to be listed, as either a value or reference capture:

int weight = 3;
int sum = 0;
// Captures `weight` by value and `sum` by reference.
std::for_each(v.begin(), v.end(), [weight, &sum](int x)
{
  sum += weight * x;
});

Default captures implicitly capture any variable referenced in the lambda body, including this if any members are used:

const std::vector<int> lookup_table = ...;
std::vector<int> indices = ...;
// Captures `lookup_table` by reference, sorts `indices` by the value
// of the associated element in `lookup_table`.
std::sort(indices.begin(), indices.end(), [&](int a, int b)
{
  return lookup_table[a] < lookup_table[b];
});

Lambdas were introduced in C++11 along with a set of utilities for working with function objects, such as the polymorphic wrapper std::function.

Use lambda expressions where appropriate:

  • Use lambda expressions where appropriate, with formatting as described below.

  • Prefer explicit captures if the lambda may escape the current scope. For example, instead of:

    {
      Foo foo;
      ...
      executor->Schedule([&] { Frobnicate(foo); })
      ...
    }
    // BAD! The fact that the lambda makes use of a reference to 'foo' and
    // possibly 'this' (if 'Frobnicate' is a member function) may not be
    // apparent on a cursory inspection. If the lambda is invoked after
    // the function returns, that would be bad, because both 'foo'
    // and the enclosing object could have been destroyed.
    prefer to write:
    {
      Foo foo;
      ...
      executor->Schedule([&foo] { Frobnicate(foo); })
      ...
    }
    // BETTER - The compile will fail if `Frobnicate` is a member
    // function, and it's clearer that `foo` is dangerously captured by
    // reference.
  • Use default capture by reference ([&]) only when the lifetime of the lambda is obviously shorter than any potential captures.

  • Use default capture by value ([=]) only as a means of binding a few variables for a short lambda, where the set of captured variables is obvious at a glance. Prefer not to write long or complex lambdas with default capture by value.

  • Keep unnamed lambdas short. If a lambda body is more than maybe five lines long, prefer to give the lambda a name, or to use a named function instead of a lambda.

  • Specify the return type of the lambda explicitly if that will make it more obvious to readers, as with auto.

Template metaprogramming

Avoid complicated template programming.

Template metaprogramming often leads to extremely poor compiler time error messages: even if an interface is simple, the complicated implementation details become visible when the user does something wrong.

C++11

We require a C++ compiler with C++11 support for Marabou and we encourage the use of libraries and language extensions.

Nonstandard Extensions

Nonstandard extensions to C++ may not be used unless there is a good reason. Instead of using the extensions directly, create a portability wrapper macro (with a marabou_ prefix) that is appropriately guarded for compilers that do not support the extension.

Example:

#ifdef __has_builtin
#if __has_builtin(__builtin_expect)
#define marabou_PREDICT_FALSE(x) (__builtin_expect(x, false))
#else
#define marabou_PREDICT_FALSE(x) x
#endif
#else
#define marabou_PREDICT_FALSE(x) x
#endif

Aliases

Public aliases are for the benefit of an API’s user, and should be clearly documented.

There are several ways to create names that are aliases of other entities:

typedef Foo Bar;
using Bar = Foo;
using other_namespace::Foo;

We prefer using instead of typedef. Like other declarations, aliases declared in a header file are part of that header’s public API unless they’re in a function definition, in the private portion of a class, or in an explicitly-marked internal namespace. Aliases in such areas or in .cpp files are implementation details (because client code can’t refer to them), and are not restricted by this rule.

Naming

Namespaces

The name of a namespace is uncapitalized and acronyms are all in lowercase letters, e.g., marabou::smt, marabou::main, marabou::parser.

Classes

Classes, and type names in general, are in CamelCase. There are exceptions to this rule, however, where lower case names with embedded underscores are acceptable:

  • STL type names are not CamelCase, and when implementing traits and STL-like typedefs (iterators for instance), you should match the STL style as closely as possible:

    class Foo
    {
      class Item;
      typedef Item*       iterator;
      typedef const Item* const_iterator;
    };
    for low-level exception types (similar to STL’s bad_alloc). However, for derived classes of marabou::Exception, the names should be CamelCase.
  • Low-level unions and structs may be in a similar lower case form, e.g., low_level_struct.

Note that acronyms (SMT, Marabou) inside type names should generally be lowercase after the first letter, such as in marabou::SmtEngine. This is preferred to SMTEngine or SMT_Engine, but use judgment on a case-by-case basis.

Members

Members are prefixed with _, and are generally in lowerCamelCase after the prefix, e.g.,

class Foo
{
  int _thisIsAnInt;
};

Macros

Preprocessor macro names are ALWAYS_UPPERCASE, and generally prefixed with MARABOU_ to avoid name clashes.

Files and Directories

File names are all CamelCase. Class FooSolver has its definition in FooSolver.h and its implementation in FooSolver.cpp.

File extensions .cpp and .h are generally preferred.

Enums

The names of enums are in CamelCase. Enumerators in enums are of the form ALL_CAPS_WITH_UNDERSCORES.

Comments

Though a pain to write, comments are absolutely vital to keeping our code readable. When writing your comments, write for your audience: the next contributor who will need to understand your code. Be generous---the next one may be you!

TODO/FIXME Comments

Use TODO/FIXME comments for code that is temporary, a short-term solution, or good-enough but not perfect.

TODO/FIXMEs should include the string TODO/FIXME in all caps, followed by the name, e-mail address, bug ID, or other identifier of the person or issue with the best context about the problem referenced by the TODO/FIXME. The main purpose is to have a consistent TODO/FIXME that can be searched to find out how to get more details upon request. A TODO/FIXME is not a commitment that the person referenced will fix the problem. Thus when you create a TODO/FIXME with a name, it is almost always your name that is given.

// TODO(kl@gmail.com): Use a "*" here for concatenation operator.
// TODO(Zeke) change this to use relations.
// TODO(bug 12345): remove the "Last visitors" feature

Formatting

Follow existing patterns.