Coding Conventions

Fabian Schiebel edited this page Sep 17, 2023 · 12 revisions

Coding Guidelines

PhASAR mostly adheres to the LLVM coding guidelines, which are published as the LLVM Coding Standards in the LLVM documentation.

In addition, we have the following rules:

Pre-Commit Hook

Our pull requests are automatically checked for coding style and correctness. If any of the checks fails, merging is blocked. To improve your code quality while developing new features and to prevent the style checks from failing in the CI, we recommend using our pre-commit hook. The pre-commit hook runs the coding-style checks before committing and blocks the commit if any of these checks fails. In addition, it transforms your code so that these checks succeed in the next commit.

To ensure that the pre-commit hook is automatically run before a commit is created, please install the hook by running the following commands in PhASAR's root directory:

$ pip install pre-commit
$ pre-commit install

Use the following script to run a set of very useful clang-tidy checks (and automated fixes) on the entire code base. The script runs many clang-tidy checks and may be expensive. You may wish to run it only once in a while, but definitely before you check in any code.

$ cd PHASAR_ROOT
$ utils/run-phasar-checks.sh

Branches

The PhASAR repository consists of different kinds of branches.

  • master: Here we push our releases. Code inside the master-branch is required to be complete and stable.
  • development: Tracks our latest improvements on PhASAR. Direct pushes to this branch are not allowed; instead, we rely on peer-reviewed pull requests. Code inside development is expected to be stable, but does not have to be complete.
  • feature branches: Our feature branches have a descriptive name in CamelCase prefixed by f-. Here, our main development happens. There are no requirements on the code-quality inside of feature branches. Feature branches are branched from development and eventually merged back to development. After merging a feature branch back to development, it gets deleted. We use squash as merge strategy.

We do not use force-push!

Source Tree

PhASAR's source tree consists of several folders, many of them containing code. When contributing to PhASAR and adding files, please make sure to put your files into the correct folders:

  • include/phasar/ -- All your header (.h) files
  • lib/ -- The source (.cpp) files corresponding to your header. The sub-folder structure matches the one in include/phasar/.
  • unittests -- The unit-tests corresponding to your feature. The sub-folder structure matches the one in include/phasar/.
  • test -- The test-resources necessary to run your unit-tests
  • tools -- executable tools that make use of PhASAR.

Within include/phasar/ and lib/ we have a sub-folder structure that contains one sub-folder for each individual PhASAR library, e.g., lib/DB/ contains the sources for the library phasar_db; the corresponding headers are in include/phasar/DB/. For each of these sub-folders, include/phasar/ contains a header file that aggregates the headers of the sub-folder, e.g., include/phasar/DB.h aggregates all headers from include/phasar/DB/.

Hence, when adding a new header, please make sure to add an entry to the corresponding aggregating header of the PhASAR sub-folder.

There are two sub-folders of PhASAR that require special attention: PhasarLLVM and PhasarClang. They mirror the rest of PhASAR's sub-folder structure and contain all elements that depend on LLVMCore and libclang, respectively. For example, while the DB sub-folder contains the generic interface ProjectIRDBBase, PhasarLLVM/DB contains the implementation LLVMProjectIRDB that depends on LLVM's IR.

Abstractions

In general, we want PhASAR to be easy to use and highly performant at the same time. However, these requirements often conflict. For the APIs that external users of PhASAR see and use (including the APIs to implement custom dataflow analyses), we prefer usability and readability over performance. Inside the internals of PhASAR, we permit "uglier" code that performs well, as long as it is well documented, well abstracted and benchmarked.

#include Order

We group the #included headers into four groups which are placed in the following order:

  1. PhASAR's own headers
  2. LLVM's headers
  3. Other non-STL and non-LibC headers
  4. STL- and LibC headers

All headers except the STL and LibC headers are included with double quotes; the STL and LibC headers are included with angle brackets.

#include vs Forward Declaration

We prefer #include over forward declarations for all non-PhASAR headers. For PhASAR's own headers we prefer forward declarations to improve incremental builds. However, for the various templates that PhASAR consists of, forward declarations are often not usable; in such cases we use #include and aim to use the extern template feature (explicit instantiation declarations, available since C++11) to lower compilation times.
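As a rough illustration of the extern template approach, the sketch below pairs an explicit instantiation declaration in a header with the matching explicit instantiation definition in one source file. ExampleWorklist, exampleWorklistUsage and both file names are hypothetical and not part of PhASAR; the "files" are shown in a single listing so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// ---- ExampleWorklist.h (hypothetical header) ----
template <typename T> class ExampleWorklist {
public:
  void push(T Item);
  [[nodiscard]] std::size_t size() const noexcept;

private:
  std::vector<T> Items;
};

template <typename T> void ExampleWorklist<T>::push(T Item) {
  Items.push_back(std::move(Item));
}

template <typename T> std::size_t ExampleWorklist<T>::size() const noexcept {
  return Items.size();
}

// Explicit instantiation declaration: files that include this header do not
// implicitly instantiate the non-inline members of ExampleWorklist<int>.
extern template class ExampleWorklist<int>;

// ---- ExampleWorklist.cpp (hypothetical source file) ----
// Explicit instantiation definition: the members are compiled once, here.
template class ExampleWorklist<int>;

// ---- Usage from any other file ----
inline void exampleWorklistUsage() {
  ExampleWorklist<int> Worklist;
  Worklist.push(1);
  Worklist.push(2);
  assert(Worklist.size() == 2);
}
```

Only the instantiations named in the header benefit; any other ExampleWorklist<T> is still instantiated implicitly wherever it is used.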

Type Aliases

PhASAR follows the LLVM rules to name types and type aliases in CamelCase. However, often we have type aliases for template type parameters. Those type aliases are written in lower_case with trailing _t. So, for a template parameter D the corresponding type alias would be named d_t.

We prefer using type aliases over complex or repeated type-expressions.
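A minimal sketch of both alias styles follows. ExampleFactStore and its members are hypothetical names chosen for illustration; only the n_t/d_t naming scheme and the CamelCase rule are taken from the text above.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <type_traits>
#include <utility>

template <typename N, typename D> class ExampleFactStore {
public:
  // Aliases for template type parameters: lower_case with trailing _t.
  using n_t = N;
  using d_t = D;

  // Ordinary type aliases: CamelCase. They also replace the repeated,
  // more complex type expressions below.
  using FactSet = std::set<d_t>;
  using FactsPerNode = std::map<n_t, FactSet>;

  void addFact(n_t Node, d_t Fact) {
    Facts[std::move(Node)].insert(std::move(Fact));
  }

  [[nodiscard]] std::size_t numFactsAt(const n_t &Node) const {
    auto It = Facts.find(Node);
    return It == Facts.end() ? 0 : It->second.size();
  }

private:
  FactsPerNode Facts;
};

static_assert(std::is_same_v<ExampleFactStore<int, char>::d_t, char>);
```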

Inline Functions

We prefer keeping the dependencies of our header files low to reduce the number of re-compiled compilation units if a single header changes. This also leads to functions being only forward-declared in headers and implemented in the corresponding source files. However, sometimes it might be appropriate to inline small functions into the header.

We permit this for one- or two-liners whose implementation does not require additional #includes, or where benchmarks show significant performance gains from inlining a function into the header. However, we refrain from inlining functions whose content is subject to frequent changes.
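The following sketch shows how this split might look. ExampleStats, exampleStatsUsage and the file names are hypothetical; header and source file are shown in one listing so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// ---- ExampleStats.h (hypothetical header) ----
class ExampleStats {
public:
  void record(std::string Name);

  // One-liner that needs no additional #include: acceptable in the header.
  [[nodiscard]] std::size_t numRecords() const noexcept {
    return Names.size();
  }

  // Longer function that may change more often: only declared here.
  [[nodiscard]] std::string summary() const;

private:
  std::vector<std::string> Names;
};

// ---- ExampleStats.cpp (hypothetical source file) ----
void ExampleStats::record(std::string Name) {
  Names.push_back(std::move(Name));
}

std::string ExampleStats::summary() const {
  std::string Result;
  for (const auto &Name : Names) {
    if (!Result.empty()) {
      Result += ", ";
    }
    Result += Name;
  }
  return Result;
}

// ---- Usage ----
inline void exampleStatsUsage() {
  ExampleStats Stats;
  Stats.record("a");
  Stats.record("b");
  assert(Stats.numRecords() == 2);
  assert(Stats.summary() == "a, b");
}
```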

Global Variables

We aim to avoid global variables as they are especially bug-prone. We allow constexpr globals. If you need a mutable global variable, consider using a getter function with a static local variable instead. Such function-local statics are initialized on first use and therefore do not suffer from the undefined initialization order of globals across translation units.
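A minimal sketch of the getter pattern, assuming a hypothetical registry of names; ExampleDefaultLimit, getExampleRegistry and exampleRegistryUsage are illustrative names, not PhASAR APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Allowed: a constexpr global (hypothetical name and value).
inline constexpr unsigned ExampleDefaultLimit = 5;

// Instead of a mutable namespace-scope std::vector<std::string>, expose the
// state through a getter. The static local is initialized on the first call.
inline std::vector<std::string> &getExampleRegistry() {
  static std::vector<std::string> Registry;
  return Registry;
}

inline void exampleRegistryUsage() {
  const std::size_t SizeBefore = getExampleRegistry().size();
  getExampleRegistry().push_back("first");
  // Every call returns the same object, so the new entry is visible.
  assert(getExampleRegistry().size() == SizeBefore + 1);
}
```

The getter still exposes shared mutable state, so the usual concerns about concurrent access remain; it only removes the initialization-order problem.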

As an exception, we permit static globals within the source-file of command-line tools that make use of LLVM's command-line options parser.

File Header

Each code file inside PhASAR starts with a header of the form:

/******************************************************************************
 * Copyright (c) <year> <owner>.
 * All rights reserved. This program and the accompanying materials are made
 * available under the terms of LICENSE.txt.
 *
 * Contributors:
 *     <authors> and others
 *****************************************************************************/

Here, <year>, <owner> and <authors> are placeholders.

This file header might be changed in the future.

Include Guard

At the time of this writing, PhASAR prefers include guards over #pragma once as the latter is a non-standard extension. This rule may change in the future.

The include-guard macro contains the full path to the header file starting from PhASAR's include/ folder. The macro is written in UPPER_CASE with directory separators (/) replaced by underscores (_). It ends with the _H suffix.
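For illustration, a hypothetical header at include/phasar/DB/ExampleHeader.h would, under the rule above, be guarded as sketched below. The file name, the example namespace and its content are assumptions made for this sketch only.

```cpp
// Hypothetical header file: include/phasar/DB/ExampleHeader.h
// Path below include/: phasar/DB/ExampleHeader.h
// Guard macro: upper case, '/' replaced by '_', suffix _H.
#ifndef PHASAR_DB_EXAMPLEHEADER_H
#define PHASAR_DB_EXAMPLEHEADER_H

namespace example {
// Placeholder content so that the header is not empty.
inline constexpr int ExampleVersion = 1;
} // namespace example

#endif // PHASAR_DB_EXAMPLEHEADER_H
```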

Enums

We prefer the type-safe enum class over C-style enums. For enums with more than two variants, we prefer the macro-based generator pattern, which generates not only the enum class itself but also the to_string function. As an example, see DataFlowAnalysisType.h.
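One common way to realize such a generator is an X-macro, sketched below. This is an illustration of the general pattern and not a copy of DataFlowAnalysisType.h; all macro, enum and enumerator names are hypothetical.

```cpp
#include <string_view>

// Single list of enumerators; X is applied to each entry.
#define EXAMPLE_ANALYSIS_KINDS(X) \
  X(Taint) \
  X(Typestate) \
  X(Constants)

// Generate the enum class from the list.
enum class ExampleAnalysisKind {
#define EXAMPLE_MAKE_ENUMERATOR(NAME) NAME,
  EXAMPLE_ANALYSIS_KINDS(EXAMPLE_MAKE_ENUMERATOR)
#undef EXAMPLE_MAKE_ENUMERATOR
};

// Generate to_string from the same list, so both always stay in sync.
constexpr std::string_view to_string(ExampleAnalysisKind Kind) noexcept {
  switch (Kind) {
#define EXAMPLE_MAKE_CASE(NAME) \
  case ExampleAnalysisKind::NAME: \
    return #NAME;
    EXAMPLE_ANALYSIS_KINDS(EXAMPLE_MAKE_CASE)
#undef EXAMPLE_MAKE_CASE
  }
  return "<unknown>";
}

static_assert(to_string(ExampleAnalysisKind::Typestate) == "Typestate");
```

Adding a new variant then only requires one new X(...) entry in the list.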

If you want to use enums as bit-flags, use the EnumFlags header.

Library Dependencies

We avoid introducing new library dependencies as they add non-negligible overhead for maintenance. If possible, we use the utilities from LLVM and the STL. Otherwise, we prefer writing our own version of the desired utility, if the maintenance of the additional code does not exceed the maintenance for an additional library dependency.

Standard Library

Many utilities that PhASAR uses are present in both the STL and LLVM. We have no general preference between the two; the choice depends heavily on the situation. We aim to use the best-fitting data structure for the respective task, and this often leads us to prefer LLVM's data structures over the ones from the STL. However, this remains a case-by-case decision.

We prefer LLVM over the STL for the following:

  • llvm::raw_ostream over std::ostream -- we also follow LLVM's rule to ban <iostream>
  • more to be added...

Exceptions and Noexcept

We avoid the use of exceptions. This is because of our heavy use of LLVM's functionality, which makes PhASAR inherently exception-unsafe. However, in the periphery of PhASAR's core, exceptions are permitted but not encouraged.

Move constructors, move assignment and destructors must be noexcept.
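A minimal sketch of a class that follows this rule; ExampleBuffer is a hypothetical type, and the static_asserts merely check the property at compile time.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

class ExampleBuffer {
public:
  ExampleBuffer() = default;
  explicit ExampleBuffer(std::vector<int> Values) : Data(std::move(Values)) {}

  ExampleBuffer(const ExampleBuffer &) = default;
  ExampleBuffer &operator=(const ExampleBuffer &) = default;

  // Move operations and the destructor are explicitly noexcept.
  ExampleBuffer(ExampleBuffer &&Other) noexcept
      : Data(std::move(Other.Data)) {}
  ExampleBuffer &operator=(ExampleBuffer &&Other) noexcept {
    Data = std::move(Other.Data);
    return *this;
  }
  ~ExampleBuffer() noexcept = default;

private:
  std::vector<int> Data;
};

static_assert(std::is_nothrow_move_constructible_v<ExampleBuffer>);
static_assert(std::is_nothrow_move_assignable_v<ExampleBuffer>);
static_assert(std::is_nothrow_destructible_v<ExampleBuffer>);
```

One practical benefit is that std::vector can move such elements instead of copying them when it reallocates.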

Trailing Return Types

Since C++11, functions can be declared or defined with either a leading or a trailing return type. For better readability, we use trailing return types only where they avoid explicitly qualifying the return type with its namespace or parent-type name.

So, for example we write:

// No extra qualification required for typename int => Leading return type
int IDESolverTest::isZeroValue(d_t) const {...}
// Trailing return type to avoid typing IDESolverTest::EdgeFunctionPtrType 
auto IDESolverTest::getNormalEdgeFunction(...) -> EdgeFunctionPtrType {...}