Coding Guidelines

Conversions in C and C++

C is relatively permissive in how values can be converted from one type to another with the C-style cast operator (type). As the example program demonstrates, C accepts many implausible casts, which can produce unexpected results. Although the example uses classes, the same problem arises with structs.

Conversions in C++

For making conversions more robust and better documented, C++ introduced several type cast operators:

static_cast
dynamic_cast
const_cast
reinterpret_cast

Each cast operator serves a specific purpose. Using it documents the intended reason for the conversion and gives the compiler a chance to check that the conversion is acceptable.

static_cast

The syntax for using the static_cast operator is:

static_cast<type-name>(expression)

The conversion is valid only if type-name can be converted implicitly to the type of expression, or vice versa. In the example program, static_cast rejects conversions between unrelated types, whereas C-style casts allow them. It does not, however, protect against conversion errors between related types.

More generally, static_cast should also be avoided where possible, and class hierarchies should be designed so that downcasting is not needed. While upcasting is safe, downcasting may lead to unexpected errors. If one needs to access the methods of a specialized class through a pointer to its base class, it probably means that the class design is not appropriate.
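As a minimal sketch of this design advice, the hypothetical Sensor hierarchy below (all class and function names are assumptions, not part of the example program) exposes the specialized behaviour through a virtual function, so that client code never needs to downcast.

```cpp
#include <cstdint>

// Hypothetical hierarchy, used only for illustration.
class Sensor {
public:
    virtual ~Sensor() = default;
    // Every sensor reports its value in milli-units.
    virtual int32_t readMilli() const = 0;
};

class TemperatureSensor : public Sensor {
public:
    int32_t readMilli() const override { return 21500; }  // 21.5 degrees C
};

class PressureSensor : public Sensor {
public:
    int32_t readMilli() const override { return 1013000; }  // 1013 hPa
};

// Client code works on the base interface only: no cast is needed,
// and adding a new Sensor subclass requires no change here.
bool exceeds(const Sensor& sensor, int32_t limitMilli) {
    return sensor.readMilli() > limitMilli;
}
```

A call such as exceeds(TemperatureSensor{}, 20000) dispatches to the right readMilli() through the virtual function, which is the alternative to a downcast that the paragraph above recommends.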

dynamic_cast

dynamic_cast is part of the run-time type identification (RTTI) mechanism of C++. The goal of RTTI is to provide a standard way to determine the type of objects at run time. The dynamic_cast operator generates a pointer to a derived type from a pointer to a base type, if this conversion is possible. When the conversion is not possible, the operator returns nullptr. Note that dynamic_cast can perform this check only for classes that have at least one virtual function.
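The sketch below illustrates the two forms of dynamic_cast, assuming that RTTI and exceptions are enabled. The Shape, Circle and Square classes and both helper functions are hypothetical names chosen for illustration. With a pointer, a failed conversion yields nullptr. With a reference, it throws std::bad_cast.

```cpp
#include <typeinfo>  // std::bad_cast

// Hypothetical hierarchy, used only for illustration.
class Shape {
public:
    virtual ~Shape() = default;  // at least one virtual function is required
};

class Circle : public Shape {
public:
    explicit Circle(int radius) : _radius(radius) {}
    int radius() const { return _radius; }
private:
    int _radius;
};

class Square : public Shape {};

// Pointer form: a failed conversion yields nullptr, which must be checked.
int radiusOrZero(const Shape* shape) {
    const Circle* circle = dynamic_cast<const Circle*>(shape);
    return circle != nullptr ? circle->radius() : 0;
}

// Reference form: there is no null reference, so a failed conversion
// throws std::bad_cast instead.
bool isCircle(const Shape& shape) {
    try {
        static_cast<void>(dynamic_cast<const Circle&>(shape));
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

The pointer form is usually the better fit when exceptions are disabled, since the failure is reported through the returned value.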

Conclusion

RTTI increases the memory footprint and has an associated cost at run time. Also, in general, compile-time errors are preferable to run-time errors, so the programmer should favour robust static checks. For these reasons, RTTI is not enabled in many embedded development frameworks.

In conclusion, one should avoid downcasting and use static_cast rather than C-style casts whenever a conversion is required.

Example: Type Casting Comparison

#include <cstdint>
#include <iostream>
#include <limits>

class Base {
private:
    int32_t _d1 = 0;
public:
    virtual void m1(int32_t i) { _d1 -= i; }
};

class Derived : public Base {
private:
    int32_t _d2 = 1;
public:
    void m1(int32_t i) override { _d2 += i; }
    void m2(int32_t i) { _d2 -= i; }
};

class Unrelated {
private:
    int16_t _d3 = 1;
public:
    void m() { _d3++; }
};

int main() {

    // Declare one variable of each type.
    Base b;
    Derived d;
    Unrelated u;

    //###################
    // USING c-type cast
    // Primitive types casts
    {
        uint32_t ui32 = std::numeric_limits<uint32_t>::max();
        uint16_t ui16 = std::numeric_limits<uint16_t>::max();
        float f = std::numeric_limits<float>::max();
        // Accepted implicit conversions -> these conversions should be explicit since they may lose information.
        // Note that making them explicit does not change the result but it shows that the conversion is intended.
        // 2^32 - 1 is converted to 4.29... e+09 (precision is lost: the nearest float is 2^32)
        f = ui32;
        // 4.29... e+09 (2^32) does not fit in uint32_t: the behavior is undefined
        // (the result depends on the platform, for example 0 or the maximum value)
        ui32 = f;
        ui32 = std::numeric_limits<uint32_t>::max();
        // std::numeric_limits<uint32_t>::max() is converted to std::numeric_limits<uint16_t>::max()
        ui16 = ui32;
        // seeing f as a pointer to uint32_t
        uint32_t *pui32 = (uint32_t*) &f;
        uint32_t fAsUint32 = *pui32;

        // Using cast for Base, Derived and Unrelated
        // You can cast almost anything

        // This cast is accepted and correct since Base is a base class of Derived
        // and d is a Derived instance
        Base *pb = (Base *) &d;
        // it calls m1 from Derived as expected.
        pb->m1(1);

        // This cast is accepted because a Base pointer could point to a Derived instance.
        // It is correct because pb is effectively pointing to a Derived instance.
        Derived *pd = (Derived *) pb;
        // it calls m1 from Derived as expected.
        pd->m1(1);

        // But the cast is also accepted when pb points to a Base instance.
        // One can then call methods of Derived on a Base instance! -> undefined behavior.
        pb = &b;
        pd = (Derived *) pb;
        // it calls m1 from Base and not Derived...
        pd->m1(1);

        // It is even possible to cast to a type which is totally unrelated to the expression !
        Unrelated *pu = (Unrelated *) &b;
        // it calls m() from Unrelated...
        pu->m();

        // The address of any object can be stored in almost any type (as long as it can store the address)
        uint64_t dummy = (uint64_t) &b;

        // In C++, some c-style casts that would be accepted in C are not accepted
        // Here char cannot store the address to b.
        // The code below does not compile.
        // char ch = (char) &b;
    }

    std::cout << "c-style cast done" << std::endl;

    //###################
    // USING static_cast
    // Primitive types
    // Conversion with static casts behaves similarly to c-type cast.
    // It does however reject casting between unrelated types !
    {
        uint32_t ui32 = std::numeric_limits<uint32_t>::max();
        uint16_t ui16 = std::numeric_limits<uint16_t>::max();
        float f = std::numeric_limits<float>::max();
        f = static_cast<float>(ui32);
        ui32 = static_cast<uint32_t>(f);
        ui32 = std::numeric_limits<uint32_t>::max();
        ui16 = static_cast<uint16_t>(ui32);

        // The code below does not compile
        // uint32_t* pui32 = static_cast<uint32_t*>(&f);


        // Using cast for Base, Derived and Unrelated
        // static_cast prevents conversions between unrelated types.
        Base *pb = static_cast<Base *>(&d);
        Derived *pd = static_cast<Derived *>(pb);
        pb = &b;
        pd = static_cast<Derived *>(pb);
        // the cast above is accepted although pb points to a Base instance:
        // using pd may crash at run time...
        // The code below does not compile
        // Unrelated* pu = static_cast<Unrelated*>(&b);

        // The code below does not compile
        // uint64_t dummy = static_cast<uint64_t>(&b);
    }

    std::cout << "static_cast done" << std::endl;

    //###################
    // USING dynamic_cast
    // Using dynamic_cast requires RTTI. In most cases, for embedded systems, RTTI is disabled.
    // For embedded systems, even more than for other systems, compile-time errors are preferred to run-time errors.
    //
    // dynamic_cast<Type *>(pt) converts the pointer pt to a pointer of type Type *
    // if the pointed-to object (*pt) is of type Type or else derived directly or indirectly from type Type.
    // Otherwise, the expression evaluates to the null pointer.

    // dynamic_cast works only for classes that have virtual functions.
    // dynamic_cast works only when the target type is a pointer or reference to a class type.
    {
        Base *pb = dynamic_cast<Base *>(&d);
        // pb has the static type Base *, and Base is not derived from Derived.
        // But the cast succeeds at run time since the pointed-to object is of type Derived.
        Derived *pd = dynamic_cast<Derived *>(pb);
        pd->m1(1);

        // b is of type Base, which is not derived from Unrelated.
        // The cast cannot succeed at run time since Base and Unrelated are not related: pu is nullptr.
        Unrelated* pu = dynamic_cast<Unrelated*>(&b);
        if (pu != nullptr) {
            pu->m();
        }
    }

    std::cout << "Main done" << std::endl;
    return 0;
}

Towards Safer Code

Rules

There are a number of rules for writing safer applications:

  • Do not discard values returned by functions. When exceptions are disabled (as in many embedded development frameworks), errors are reported to callers through return values. Discarding them means ignoring errors. Use the [[nodiscard]] attribute to make the compiler warn when a return value is discarded.
  • Use enums as error types and prefer scoped enums to unscoped enums. Scoped enums provide better type safety because they do not convert implicitly to int.

Example: Safe Error Handling

#include <iostream>

enum UnscopedError { Ok, NotPossible,  NotValid,  DontEvenTry };
enum class ScopedError { Ok, NotPossible,  NotValid,  DontEvenTry };

class SafeClass {
public:
    SafeClass() {}
    explicit SafeClass(int size) {
    }

    // [[nodiscard]] with a message requires C++20; plain [[nodiscard]] requires C++17.
    [[nodiscard("Why are you discarding me?")]] int getter() {
        return 1;
    }

    UnscopedError m1() {
        return NotPossible;
    }

    [[nodiscard("Why are you discarding me?")]] ScopedError m2() {
        return ScopedError::NotPossible;
    }

private:
};


int main() {
    SafeClass s;

    // Valid use of getter.
    int i = s.getter();

    // Use of getter while discarding the return value.
    // The compiler issues a warning (and warnings may be treated as errors).
    // s.getter();

    // m1() returns an unscoped enum that can be implicitly converted to int.
    // Verifying error codes is thus very lax.
    int ok = s.m1();

    // the code below does not compile -> cannot be implicitly converted to int
    // ok = s.m2();
    ScopedError rc = s.m2();
    // A ScopedError variable cannot be implicitly converted to int:
    // the conversion must be requested explicitly.
    if (rc != ScopedError::Ok) {
       std::cout << "Error " << static_cast<int>(rc) << std::endl;
    }

    return 0;
}

User Defined Conversions

User-defined conversions are sometimes very useful. However, they should be used with care, because they can also lead to unexpected conversions and to errors that are hard to locate.

Constructor

When a class defines a constructor that can be called with a single argument, this constructor also defines a conversion that can be applied implicitly. If converting from the argument type to the class type is not intended, the explicit keyword should be added in front of the constructor. This is demonstrated in the String class. Add the explicit keyword to the constructors and observe the errors in the main() function.
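As a minimal sketch of the effect of explicit, the hypothetical Buffer class below (the class and the capacityOf() function are assumptions made for illustration) rejects the implicit conversion from an integer at compile time, while explicit construction remains available.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical buffer class, used only for illustration.
class Buffer {
public:
    // explicit: an integer is a capacity, not a buffer.
    explicit Buffer(std::size_t capacity) : _capacity(capacity) {}
    std::size_t capacity() const { return _capacity; }
private:
    std::size_t _capacity;
};

std::size_t capacityOf(const Buffer& buffer) { return buffer.capacity(); }

// The implicit conversion from an integer is rejected at compile time...
static_assert(!std::is_convertible<int, Buffer>::value,
              "int must not convert implicitly to Buffer");
// ...while explicit construction remains available.
static_assert(std::is_constructible<Buffer, int>::value,
              "Buffer can still be built explicitly from an int");
```

With this class, capacityOf(Buffer(16)) compiles, whereas capacityOf(16) is rejected by the compiler.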

Conversion operators

Conversion operators such as operator int(), which allow implicit and explicit conversions to the target type (int in this case), must be defined with care. Such conversion operators make sense when an instance of the class can be represented as a value of the target type. In embedded programming, this applies for instance to an input pin, where operator int() can be considered a shortcut for a read() method that delivers the value read on the input pin. The example program demonstrates an operator int() conversion operator that is probably not as wise.
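The input pin case can be sketched as follows. The InputPin class and the isPressed() function are hypothetical names. A real driver would read a hardware register, whereas this sketch stores the level in a member so that it runs on any host.

```cpp
// Hypothetical input pin, used only for illustration.
class InputPin {
public:
    explicit InputPin(int level) : _level(level) {}
    // Explicit read method.
    int read() const { return _level; }
    // Shortcut: the pin can be represented by its current level,
    // so converting it to int is natural.
    operator int() const { return read(); }
private:
    int _level;
};

bool isPressed(const InputPin& button) {
    // Implicit conversion to int through operator int().
    return button == 1;
}
```

Here the conversion carries no surprise, because the pin has exactly one meaningful integer value. The same cannot be said of a string, which could plausibly convert to its length, its parsed numeric value, or its address.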

Example: User Defined Conversions

#include <cstring>

class String {
public:
    String() {}
    String(const char* szArray) {
        // not implemented
    }
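    String(int size) {
        // value-initialize the buffer so that it holds a valid (empty) C string
        szArray = new char[size]();
    }

    operator int() {
        return szArray != nullptr ? static_cast<int>(std::strlen(szArray)) : 0;
    }

    int getLength() {
        return szArray != nullptr ? static_cast<int>(std::strlen(szArray)) : 0;
    }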

private:
    char* szArray = nullptr;
};

int main() {
    // create a string by calling explicitly the constructor
    String* s1 = new String("s1");
    // create a string by calling implicitly the constructor
    String s2 = "s2";
    // create a string by calling explicitly the constructor
    String* s3 = new String(10);
    // create a string by calling implicitly the constructor
    // this creates a string of size 10 -> is it what was expected?
    String s4 = 10;
    // '123' is a multi-character literal of type int with an implementation-defined
    // value (typically combining the ASCII values), so this
    // creates a string by calling the constructor with a very big int value
    String s5 = '123';

    // with the int() operator, conversion from String to int becomes possible
    // Is it really meaningful?
    // In that case, one should rather define a method for
    // getting the string length.
    int i2 = s2;

    // use of getLength
    i2 = s2.getLength();

    // use of getLength while discarding the return value
    s2.getLength();

    return 0;
}

Variable Scope and Initialization

Minimize global and shared data

The reasons are the following:

  • Avoid data with external linkage, because it creates coupling between different, potentially distant, parts of the program.
  • Names of variables declared in the global namespace pollute this namespace.
  • The order of initialization of variables defined in different compilation units is difficult to understand and to handle.
  • In concurrent programs, access to global shared data needs to be serialized.
  • A declaration of a name in a block can hide a declaration in an enclosing block or a global name. Hidden names are difficult for programmers to spot.

Declare variables as locally as possible

As a general rule, a variable represents a state, and the lifetime of this state should be as short as possible. If a variable lives longer than needed, then

  • it makes the program harder to understand and maintain.
  • it pollutes its context with its name, meaning that its name exists in a context where it has no meaning.
  • it can't always be initialized with a meaningful value. You should not declare a variable that you cannot initialize sensibly.

Remember that the requirement of older versions of C, where variables had to be declared at the beginning of a scope, does not apply in C++.
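A minimal sketch of local declarations, using a hypothetical firstNegativeIndex() function: each variable is declared in the narrowest scope that needs it, and only at the point where a meaningful initial value is available.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, used only for illustration.
// Returns the index of the first negative value, or -1 if there is none.
int firstNegativeIndex(const std::vector<int>& values) {
    // i exists only inside the loop, where it has a meaning.
    for (std::size_t i = 0; i < values.size(); ++i) {
        // value is declared where its initial value is known,
        // and it is const because it never changes afterwards.
        const int value = values[i];
        if (value < 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```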

Always initialize a variable

Uninitialized variables are an important source of bugs in C and C++ programs. Always initializing includes:

  • clearing memory before using it.
  • initializing variables upon declaration.

Recall that the compiler is not required to initialize local variables of built-in types unless the program does it explicitly. Such local variables, as well as class attributes of built-in types that are missing from the constructor initializer list, may thus still be uninitialized when they are first used.
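One way to make sure that no attribute is forgotten is to give every attribute a default member initializer, as in the sketch below. The Counter class is a hypothetical example. A constructor then only lists the attributes whose value differs from the default.

```cpp
#include <cstdint>

// Hypothetical class, used only for illustration.
class Counter {
public:
    // Both attributes take their default member initializer.
    Counter() = default;
    // Only _count is overridden; _enabled still takes its default value.
    explicit Counter(uint32_t start) : _count(start) {}

    void increment() { ++_count; }
    uint32_t count() const { return _count; }
    bool enabled() const { return _enabled; }

private:
    // Default member initializers: no constructor can leave these uninitialized.
    uint32_t _count = 0;
    bool _enabled = true;
};
```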

Example: Variable Scope and Name Hiding

#include <iostream>

// This code should not be taken as a good example.
// It demonstrates the problem of variable name hiding.

int x; // global x

void hide()
{
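    int x; // local x hides global x (and is not initialized)
    {
        // Uncomment the declaration below to declare x a second time:
        // the inner x then hides the first local x inside this block.
        /* int x; // hides first local x */
        x = 22; // assigns to the innermost visible x

        // with the inner declaration enabled, the first local x is not accessible here
        std::cout << x << std::endl;

        // global x has static storage duration and is therefore zero-initialized:
        // this reads 0, although the program never initialized it explicitly
        x = ::x;
        std::cout << x << std::endl;
    }
    // with the inner declaration enabled, this reads the first local x
    // before it is initialized (undefined behavior); as written, it prints 0
    std::cout << x << std::endl;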
    x = 33; // assign to first local x
    std::cout << x << std::endl;
    // access to global scope x is possible with ::
    ::x = 22;
    std::cout << ::x << std::endl;
}

int* px = &x; // take address of global x

int main() {
    hide();
    std::cout << *px << std::endl;
}

This example demonstrates:

  • How local variables can hide global variables with the same name
  • The problem of variable name shadowing and scope confusion
  • Using the scope resolution operator :: to access global variables
  • The risks of uninitialized local variables, and the fact that global variables are silently zero-initialized
  • Why variable scope should be kept as narrow as possible