Big Integers: API Design in C++

Preface

I was writing an arbitrary-precision integer library in C++ (more on this in a later post) when I ran into a couple of issues regarding API design. For context, the library is header-only, since I wanted installation to be as simple as possible. It’s also small, because I was writing it for fun to learn about arbitrary-precision arithmetic and library API design in general.

In the library, I exposed a big integer class (we’ll call it BigInt), which stores a std::vector of 32-bit integers that is resized based on the value the object represents. It also stores a bool for the sign of the number. The particular class internals are not important for this post; if you’re interested in that, check out the other blog posts.

For any arbitrary-precision arithmetic library to be useful, there have to be algorithms that do the arithmetic. The functions that implement these algorithms are generally not class members, because making every one of them a member would bloat the class’s interface. Likewise, if one is overloading operators for BigInt, one might also want to define them outside the class.

The goal was to have a set of internal functions, not members of the class, that implement the algorithms. The public API would in turn use these internal functions to do its work. As such, I had the following code:

namespace mylib {

    class BigInt {
    public:
        ...
    private:
        ...
    };

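    namespace internal {
        // Defined before the public API so the public functions can call them.
        // `inline` avoids multiple-definition errors in a header-only library.
        inline BigInt internal_func_1(const BigInt& value) { ... }
        inline BigInt internal_func_2(const BigInt& value) { ... }
        inline BigInt internal_func_3(const BigInt& value) { ... }
    }

    inline BigInt public_api_1(const BigInt& value) { ... }
    inline BigInt public_api_2(const BigInt& value) { ... }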
}

The problem

The problem was that, in order for the various algorithms to do their job, they needed access to the private member variables of BigInt.

One way to give access is by implementing getters for the members. However, since I didn’t want to expose BigInt’s private members for reading and writing by users of the library, I did not want to define publicly accessible getters. For this reason, I looked for a different way to accomplish the same thing.
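To illustrate the concern, here is a minimal sketch of the getter approach. Everything in it is hypothetical: the namespace getter_sketch, the members m_limbs and m_negative, and the rule that zero should never be negative are assumptions made for the example, not the library’s actual code.

```cpp
#include <cstdint>
#include <vector>

namespace getter_sketch {  // hypothetical namespace, for illustration only

    class BigInt {
    public:
        // Public getters: handy for the algorithms, but visible to every user.
        std::vector<std::uint32_t>& limbs() { return m_limbs; }
        const std::vector<std::uint32_t>& limbs() const { return m_limbs; }
        bool& negative() { return m_negative; }
        bool negative() const { return m_negative; }

    private:
        std::vector<std::uint32_t> m_limbs;  // assumed layout: empty means zero
        bool m_negative = false;
    };

    // Any user can now build a state the library would presumably never
    // produce itself (assumed invariant: zero is not negative).
    inline BigInt make_negative_zero() {
        BigInt x;
        x.negative() = true;  // writes straight through the getter
        return x;
    }
}
```

Once the getters are public, nothing stops user code from doing what make_negative_zero does, and every algorithm then has to cope with states it never created.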

Another possibility that C++ provides to give our algorithms access to BigInt’s private members is by declaring these functions as friends of the class. This only gives internal functions access to BigInt’s privates and doesn’t give users access, which sounds amazing.

However, because the internal functions are declared under the mylib::internal namespace, each one has to be forward-declared before the class and then friended inside it, one by one. Every new internal function means another edit to BigInt itself, which couples the internal algorithms to the class very tightly. After tweaking the files for a bit, I scrapped this idea.
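To make that coupling concrete, here is a minimal sketch of what the friend-function approach ends up looking like. The functions is_zero and is_negative, the two-argument constructor, and the members m_limbs and m_negative are made up for illustration; they are not the library’s real code.

```cpp
#include <cstdint>
#include <vector>

namespace mylib {

    class BigInt;  // forward declaration so the functions below can name BigInt

    namespace internal {
        // Step 1: every internal function is declared before the class...
        inline bool is_zero(const BigInt& value);
        inline bool is_negative(const BigInt& value);
    }

    class BigInt {
    public:
        BigInt() = default;
        // Hypothetical constructor: one 32-bit magnitude plus a sign flag.
        BigInt(std::uint32_t magnitude, bool negative)
            : m_negative(negative && magnitude != 0) {
            if (magnitude != 0) m_limbs.push_back(magnitude);
        }

    private:
        // Step 2: ...and then listed again here, one friend line per function.
        friend bool internal::is_zero(const BigInt& value);
        friend bool internal::is_negative(const BigInt& value);

        std::vector<std::uint32_t> m_limbs;  // assumed layout: empty means zero
        bool m_negative = false;
    };

    namespace internal {
        // Step 3: only now can the functions be defined and touch the privates.
        inline bool is_zero(const BigInt& value) { return value.m_limbs.empty(); }
        inline bool is_negative(const BigInt& value) { return value.m_negative; }
    }
}
```

With two functions this is tolerable, but each additional algorithm adds a line in two more places, and the class definition has to change every time.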

Eventually, however, with the help of AI and various internet sources, I figured out a solution that worked for the library.

Accessor structs

I created two structs in the internal namespace called BigIntRepr and BigIntReprAccess.

First, BigIntRepr would be the only private member of the BigInt class, and would store all the necessary data. The internal algorithms would then all operate on BigIntRepr instead. This separates the class that users interact with from the bulk of the internal logic.

Second, the BigIntReprAccess struct would be responsible for getting the BigIntRepr out of the public integer class for the internal algorithms to use. This struct would be the only friend of the BigInt class, and would implement static methods that return a reference to the BigIntRepr inside an instance of the public integer class.

When the public API needs to call the internal functions, it would first obtain the BigIntRepr using BigIntReprAccess, and call the internal functions using just the extracted instance of BigIntRepr. For example:

namespace mylib {

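    namespace internal {
        // BigIntRepr must be fully defined here: BigInt stores one by value,
        // and a data member cannot have an incomplete type.
        struct BigIntRepr { ... };
        struct BigIntReprAccess;
    }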

    class BigInt {
    public:
        ...
    private:
        friend struct internal::BigIntReprAccess;
        internal::BigIntRepr m_repr;
    };

    struct internal::BigIntReprAccess {
        static BigIntRepr& get(BigInt& x) { return x.m_repr; }
        static const BigIntRepr& get(const BigInt& x) { return x.m_repr; }
    };
    
    // Defined before the public API so the public functions can call them.
    // `inline` avoids multiple-definition errors in a header-only library.
    namespace internal {
        inline BigInt internal_func_1(const BigIntRepr& value) { ... }
        inline BigInt internal_func_2(const BigIntRepr& value) { ... }
        inline BigInt internal_func_3(const BigIntRepr& value) { ... }
    }

    // These public APIs would use internal::BigIntReprAccess::get to access
    // internal::BigIntRepr in order to call the internal functions.
    inline BigInt public_api_1(const BigInt& value) { ... }
    inline BigInt public_api_2(const BigInt& value) { ... }
}

As demonstrated, BigInt needs only a single friend declaration, and the declarations that have to come before the class stay short. The representation also stays out of BigInt’s public interface, so users cannot tamper with it unless they deliberately reach into mylib::internal. For this library, that made it a better fit than implementing getters for the private members or friending the internal functions one by one.
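For readers who want something they can compile, here is a small self-contained sketch of the same pattern. It rests on several assumptions that are not from the library: the fields limbs and negative, the int64 constructor, the is_negative query, the negate functions, and the extra wrap helper on BigIntReprAccess (used to turn a BigIntRepr back into a BigInt) are all invented for the example.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

namespace mylib {

    namespace internal {
        // Complete definition, because BigInt stores one of these by value.
        struct BigIntRepr {
            std::vector<std::uint32_t> limbs;  // assumed: least significant first, empty means zero
            bool negative = false;
        };
        struct BigIntReprAccess;

        // Internal algorithm: works on the representation only.
        inline BigIntRepr negate(const BigIntRepr& r) {
            BigIntRepr out = r;
            if (!out.limbs.empty()) out.negative = !out.negative;  // zero stays non-negative
            return out;
        }
    }

    class BigInt {
    public:
        BigInt() = default;
        explicit BigInt(std::int64_t v) {
            m_repr.negative = v < 0;
            // Take |v| in unsigned arithmetic so the most negative value does not overflow.
            std::uint64_t mag = v < 0 ? 0 - static_cast<std::uint64_t>(v)
                                      : static_cast<std::uint64_t>(v);
            while (mag != 0) {
                m_repr.limbs.push_back(static_cast<std::uint32_t>(mag & 0xFFFFFFFFu));
                mag >>= 32;
            }
        }
        bool is_negative() const { return m_repr.negative; }

    private:
        friend struct internal::BigIntReprAccess;  // the only friend
        explicit BigInt(internal::BigIntRepr repr) : m_repr(std::move(repr)) {}
        internal::BigIntRepr m_repr;
    };

    struct internal::BigIntReprAccess {
        static BigIntRepr& get(BigInt& x) { return x.m_repr; }
        static const BigIntRepr& get(const BigInt& x) { return x.m_repr; }
        // Hypothetical helper: wrap a finished representation in a BigInt.
        static BigInt wrap(BigIntRepr repr) { return BigInt(std::move(repr)); }
    };

    // Public API: unwrap, run the internal algorithm, wrap the result.
    inline BigInt negate(const BigInt& value) {
        using internal::BigIntReprAccess;
        return BigIntReprAccess::wrap(internal::negate(BigIntReprAccess::get(value)));
    }
}
```

A caller would only ever write something like mylib::negate(mylib::BigInt(5)). In this sketch the internal algorithm returns a BigIntRepr rather than a BigInt, which keeps the internal namespace free of any dependency on the public class; that is one possible choice, not the only one.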

Conclusion

This pattern allows the representation to evolve independently of the public API. It also encourages a cleaner codebase: algorithms operate on the internal data, while the class exists purely as a stable interface for users.
