Heuristic42
Blog
Opengl
Meta
Rendering
1
comment
Nov 19 at 15:47
Matrices
Hello, I hope this message finds you doing well. I believe…
–
anonymous
created
Oct 20 at 20:30
Iterators: pointers vs cursors
You're already doing both of these by hand. This post emphaisze…
–
pknowles
comment
Oct 10 at 10:27
Matrices
[deleted]
–
anonymous
comment
Oct 4 at 19:12
Matrices
[deleted]
–
anonymous
comment
Sep 30 at 18:51
Matrices
[deleted]
–
anonymous
comment
Sep 23 at 16:15
Matrices
[deleted]
–
anonymous
comment
Sep 21 at 6:52
Contributing
I kind of predicted what was bound to happen when my favourite …
–
anonymous
comment
Sep 7 at 1:21
Route contention when running docker and a VPN
Thank you for this. Between this and the overwriting of iptabl…
–
anonymous
comment
Sep 6 at 17:57
Making a real EMF Reader
Sorry for the random quoted text comments. I am one of those p…
–
anonymous
comment
Sep 6 at 17:48
Making a real EMF Reader
["ove! Play a tone with a buzzer and has 5 LEDs to show the “EM…
–
anonymous
comment
Sep 6 at 17:47
Making a real EMF Reader
["easure direction Measure the magnetic fie"](#q107-644-685)
–
anonymous
comment
Aug 20 at 17:01
Matrices
[deleted]
–
anonymous
comment
Aug 11 at 22:32
Matrices
[deleted]
–
anonymous
edited
Jun 8 at 22:29
Rethinking writing files with memory mapping and C++
This post introduces the motivation behind the [decodless C++ o…
–
admin
created
Jun 8 at 22:16
Rethinking writing files with memory mapping and C++
This post introduces the motivation behind the [decodless C++ o…
–
pknowles
comment
Jun 5 at 13:36
Contributing
[deleted]
–
anonymous
comment
Apr 19 at 11:24
Matrices
[deleted]
–
anonymous
comment
Apr 13 at 0:25
Matrices
[deleted]
–
anonymous
comment
Apr 5 at 9:43
Matrices
[deleted]
–
anonymous
comment
Mar 27 at 17:19
Matrices
[deleted]
–
anonymous
comment
Mar 25 at 4:59
Matrices
[deleted]
–
anonymous
comment
Mar 5 at 15:39
Matrices
[deleted]
–
anonymous
comment
Feb 7 at 5:45
Microsoft Natural Ergonomic 4000 Replacement
Thank you so much for sharing your thoughts here, it tells me e…
–
anonymous
comment
Jan 28 at 23:31
Microsoft Natural Ergonomic 4000 Replacement
Oh man, I feel this post. Not sure if you've seen the "new" new…
–
anonymous
…
View All
Log in
Writing custom C++ containers, iterators and value references
leave this field blank to prove your humanity
Article title
*
Article revisions must have a non-empty title
Article body
*
Generic containers are awesome. 1. The effort required to write a container and access data no longer affects the design of a good system. Efficient bug free code need only be written once. A common interface allows reuse. 2. More mistakes can be caught at compile time. Type safety is amazing. Long gone should be the days of writing inline linked list access, where each time there's a chance to mess something up. No really, please stop. Good code is modular. Great code strikes a balance in the boundaries of its modules --- the internal implementation can be optimized (bigger is better) but every bit of code needs to work with every other bit of code within the interface boundary and that's hard to maintain (smaller is better). Abstractions are great, but they [cost](https://www.youtube.com/watch?v=rHIkrotSwcc) and can hurt performance. Containers are an easy example where good boundaries are fairly intuitive. See [High Cohesion](https://en.wikipedia.org/wiki/Cohesion_(computer_science)#High_cohesion), [Loose Coupling](https://en.wikipedia.org/wiki/Loose_coupling) and [Single-responsibility](https://en.wikipedia.org/wiki/Single-responsibility_principle). When discussing containers, performance and modularity I think [Structure of Arrays](https://en.wikipedia.org/wiki/AoS_and_SoA) is an important thing to bring up. One concept I've recently started to respect is iterator safety over indices. for(size_t i = 0; i < foo.size(); ++i) { doSomethingWithBar(bar[i]); } This is a simple mistake that the compiler should be able to help with. Imagine a more complex example with many arrays and nested loops. A further issue is iterating over a structure of arrays, but perhaps [zip](https://en.cppreference.com/w/cpp/ranges/zip_view) will be a solution there. When conforming to existing C++ convetions, some things just become easier. Consider the following: void convertArrayFooToBar(FooInput, BarOutput); void convertArrayBarToFoo(BarInput, FooOutput); void convertArrayFooToBaz(FooInput, BazOutput); void convertArrayBazToFoo(BazInput, FooOutput); void convertArrayBazToBar(BazInput, BarOutput); void convertArrayBarToBaz(BarInput, BazOutput); Sure, each function can internally use a common templated conversion function but that's still only written for this one specific example. If these array types were iterator compatible it would be far easier to write generic code for them; once, without the chance for mistakes in each one or forgetting a permutation. // Fine-grained object conversion functions namespace FooBarBaz { template <class Result, class From> Result convert(const From& value); template <> Bar convert(const Foo& value) { ... specialize } ... } // Single convert-anything template template <class ArrayResult, class ArrayFrom> void convertArray(const ArrayFrom& input, ArrayResult& output) { using ResultType = typename ArrayResult::value_type; std::transform(std::begin(input), std::end(input), std::begin(output), [](const auto& foo){ return FooBarBaz::convert<ResultType>(foo); }); } # Example use cases A few ideas to demonstrate how amazing custom containers can be. 1. `std::vector` uses separate heap allocations. What if you just need a small vector of items that live on the stack. This could even fall back to heap allocations, e.g. [eastl::fixed_vector](https://github.com/electronicarts/EASTL/blob/master/include/EASTL/fixed_vector.h). 2. One could make a [circular buffer](https://www.boost.org/doc/libs/1_83_0/doc/html/circular_buffer.html). 3. `std::span` assumes contiguous data. In computer graphics it's common to interleave values in memory. [Extending it with a stride](https://github.com/NVIDIAGameWorks/Displacement-MicroMap-Toolkit/blob/main/meshops_core/include/meshops/meshops_array_view.h) would allow more convenient views of data. 4. Pointers don't work between address spaces, e.g. addresses within a file on disk or within a shared memory block. Boost's [offset_ptr](https://www.boost.org/doc/libs/1_36_0/doc/html/interprocess/offset_ptr.html) instead holds a relative offset from its own memory address. Containers to traverse memory-mapped files could be written around this concept. # Container The container itself is often the simpler object to implement. It just holds the data, or references it in the case of views. The [iterator](#iterator) is where the complexity may lie. The main thing a container needs is a `begin()` and `end()` function. With those, you get a pile of things for free. In C++ 20 begin and end are grouped into [ranges](https://en.cppreference.com/w/cpp/ranges) and with range adaptors existing containers "just work"™ with other library features. class my_container { public: my_iterator begin() { ... } my_iterator end() { ... } private: ??? }; The next main thing are some common functions such as `size()` and types such as a `value_type`, a.k.a. *named requirements*. Here's a more complete boilerplate example. Full source code is available on [GitHub](https://github.com/pknowles/container_boilerplate). Fair warning, this is not well tested. template <class T, size_t N> class my_container { public: using value_type = T; using iterator = my_iterator<value_type>; using const_iterator = my_iterator<const value_type>; using reference = decltype(*std::declval<iterator>()); using const_reference = decltype(*std::declval<const_iterator>()); using size_type = size_t; my_container() : m_array{} {}; my_container(const my_container& other) noexcept = default; my_container(std::initializer_list<T> init) { std::copy(init.begin(), init.end(), begin()); } // Copy constructor that also handles const conversion template <class U, class = std::enable_if_t<std::is_same_v<T, U> || std::is_convertible_v<U, T>>> explicit my_container(my_container<U, N>& other) { std::copy(other.begin(), other.end(), begin()); } template <class Iterator> my_container(Iterator init_begin, Iterator init_end) { std::copy(init_begin, init_end, this->begin()); } #ifdef __cpp_lib_ranges template <std::ranges::input_range Range> my_container(Range&& range) { std::ranges::copy(range, this->begin()); } #endif reference operator[](size_type index) noexcept { return *(begin() + index); } const_reference operator[](size_type index) const noexcept { return *(begin() + index); } iterator begin() noexcept { return iterator(m_array); } const_iterator begin() const noexcept { return const_iterator(m_array); } iterator end() noexcept { return begin() + N; } const_iterator end() const noexcept { return begin() + N; } value_type* data() noexcept { return m_array; } const value_type* data() const noexcept { m_array; } size_type size() const noexcept { return N; } reference front() noexcept { return *begin(); } const_reference front() const noexcept { return *begin(); } reference back() noexcept { return *(end() - 1); } const_reference back() const noexcept { return *(end() - 1); } private: value_type m_array[N]; }; # Named Requirements and Concepts C++ templates are evaluated when they're used. There's no formal declaration ahead of time. For example, you can write: template<class T> Result doThing(const T& object) { return object.thing(); } Being a template, this code doesn't get compiled until you make the call `doThing(my_object);`. At that point, the function `doThing` for `my_object`'s type gets generated and compiled. If the type doesn't have a member function `thing()` then compilation will fail. [Named requirements](https://en.cppreference.com/w/cpp/named_req) are basically a convention without formal validation where you say "objects I want to call doThing() with will all have a `.thing()`. Thankfully, C++ 20 introduced [concepts](https://en.cppreference.com/w/cpp/language/constraints) to do better formal validation of templates. [C++ named requirements: Container](https://en.cppreference.com/w/cpp/named_req/Container) At the time of writing there are no official *concepts* for containers, but [ranges](https://en.cppreference.com/w/cpp/ranges) has *range concepts*. Many of these requirements can be written by reusing another single definition, e.g. `using value_type = typename iterator::value_type;` or implementing `operator[]` as `return *(begin() + index);`. While writing the container it is invaluable to already have some test cases in place. There are a lot of great [compile time tests](https://en.cppreference.com/w/cpp/meta) that the standard library already provides. // Copy construction static_assert(std::is_copy_constructible_v<my_view<int>>); static_assert(std::is_trivially_copy_constructible_v<my_view<int>>); // Const conversion static_assert(std::is_constructible_v<my_view<const int>, my_view<int>>, "can't make const view from non-const view"); static_assert(!std::is_constructible_v<my_view<int>, my_view<const int>>, "can make non-const view from const view"); # Iterator Iterators can just be raw pointers and it's way easier if they actually are. That said, `std::vector` and `std::span` probably cover most uses for pointer-only iterators so I imagine it's uncommon to have a container that doesn't also need a custom iterator class. In my experience, the iterator class is often where most of the complexity lies. Here's a boilerplate iterator example. Again, this is not well tested. While the iterator itself could be a raw pointer, i.e. `template <class T> my_iterator = T*;`, this class instead has a `value_type* m_ptr;` to demonstrate various operators, types and [`my_value_reference`](#value_reference). template <class T> class my_iterator { public: using iterator_category = std::random_access_iterator_tag; using value_type = T; using difference_type = std::ptrdiff_t; using reference = my_value_reference<T>; // could just be value_type& using const_reference = reference; // must be the same my_iterator() noexcept = default; my_iterator(const my_iterator& other) noexcept = default; my_iterator& operator=(const my_iterator& other) noexcept = default; // TODO: implement this my_iterator(value_type* ptr) noexcept : m_ptr(ptr) {} reference operator*() noexcept { // TODO: implement this return *m_ptr; } const_reference operator*() const noexcept { // TODO: implement this return *m_ptr; } reference operator[](difference_type index) noexcept { return *(*this + index); } const_reference operator[](difference_type index) const noexcept { return *(*this + index); } my_iterator& operator++() noexcept { // TODO: implement this ++m_ptr; return *this; } my_iterator operator++(int) noexcept { my_iterator temp = *this; ++(*this); return temp; } my_iterator& operator--() noexcept { // TODO: implement this --m_ptr; return *this; } my_iterator operator--(int) noexcept { my_iterator temp = *this; --(*this); return temp; } my_iterator& operator+=(difference_type n) noexcept { // TODO: implement this m_ptr += n; return *this; } my_iterator operator+(difference_type n) const noexcept { my_iterator result(*this); return result += n; } friend my_iterator operator+(difference_type n, const my_iterator& it) noexcept { return it + n; } my_iterator& operator-=(difference_type n) noexcept { return *this += (-n); } my_iterator operator-(difference_type n) const noexcept { return *this + (-n); } friend my_iterator operator-(difference_type n, const my_iterator& it) noexcept { return it - n; } difference_type operator-(const my_iterator& other) const noexcept { return std::distance(other.m_ptr, m_ptr); } bool operator==(const my_iterator& other) const noexcept { return (other - *this) == 0; } bool operator!=(const my_iterator& other) const noexcept { return !(*this == other); } bool operator<(const my_iterator& other) const noexcept { return 0 < (*this - other); } bool operator<=(const my_iterator& other) const noexcept { return 0 <= (*this - other); } bool operator>(const my_iterator& other) const noexcept { return 0 > (*this - other); } bool operator>=(const my_iterator& other) const noexcept { return 0 >= (*this - other); } private: value_type* m_ptr = nullptr; }; Again, compile-time validation with static_assert can help catch bugs way before they happen or you start trying to use the container in real code. template <class iterator> concept input_and_output_iterator = std::input_iterator<iterator> && std::output_iterator<iterator, typename std::iterator_traits<iterator>::value_type>; static_assert(std::ranges::random_access_range<my_container>); static_assert(std::ranges::sized_range<my_container>); static_assert(input_and_output_iterator<typename my_container::iterator>); # Value reference Typically iterators will dereference directly to their value type. This is generally easy for const access. However sometimes it's not possible to return a writable reference. A reference wrapper or proxy object can be used instead. For example, - The value is encoded differently in memory such as a bitfield. In this case, the reference wrapper provides `operator value_type() const` and `operator=(const value_type& value)` to read and write values respectively. - The value is cached and its reference must not go out of scope while in use. The reference wrapper could hold a reference count (TBH, this sounds rather expensive). Here's a boilerplate value reference example. Again, this is not well tested. template <class T> class my_value_reference { public: using value_type = T; my_value_reference() = delete; my_value_reference& operator=(const my_value_reference& other) { return operator=(other.operator value_type()); } my_value_reference(value_type& reference) noexcept : m_reference{reference} {} my_value_reference& operator=(const value_type& value) { m_reference = value; return *this; } const my_value_reference& operator=(const value_type& value) const { m_reference = value; return *this; } operator value_type() const { return m_reference; } private: value_type& m_reference; }; # Extras ## Rule of three, five and zero Rule of Three: If you make a custom destructor, copy constructor, or copy assignment operator, make sure to define all three. Rule of Five: Add move constructor and move assignment operator to the Rule of Three. Rule of Zero: Ideally, have none. It might be possible to move resource management to individual members so that aggregate classes don't need to do anything special. See [here](https://en.cppreference.com/w/cpp/language/rule_of_three) for reference. ## Const objects Const containers, iterators and value references can be confusing especially when combined with const values. Remember that a const pointer can point to non-const (mutable) data. An iterator should be considered like a pointer. Whether you have a const iterator or not, the iterator's value_type would be the same. In fact, iterator concepts validate that `iterator::reference` is the same as `iterator::const_reference`. However, the container's `const_iterator::reference` may be a `const value_type&` which it would be for the case of `std::vector` but not `std::span`. A container for type `T` is a different type to a container of type `const T` and there is no cast between them (well, reinterpret_cast would do it but that'd be undefined behaviour, e.g. if the `const T` type were to specialize its memory layout). One use case is when trying to make a const object immutable despite having non-const pointers, in which case there are wrappers such as [propagate_const](https://en.cppreference.com/w/cpp/experimental/propagate_const) to provide a transitive/deep const view of an object. ## Exceptions Just like `const`, adding [noexcept](https://en.cppreference.com/w/cpp/language/noexcept_spec) can allow the compiler to make some optimizations. Libraries can also check methods with the [`noexcept()` operator](https://en.cppreference.com/w/cpp/language/noexcept). > For example, containers such as std::vector will move their elements if the elements' move constructor is `noexcept`, and copy otherwise... Note that the compiler won't validate that called functions are also `noexcept` and `std::terminate()` is called if one actually happens to throw. ## Testing Sadly not everything can be compile-time validated. Unit testing is important, especially for generic containers that will see a lot of use and are expected to work in all cases. I wish there were some standard test suites people could easily throw at their own containers. Maybe there are and I just haven't found them. [gcc](https://gcc.gnu.org/onlinedocs/gcc-7.5.0/libstdc++/manual/manual/test.html) and [llvm](https://libcxx.llvm.org/TestingLibcxx.html) have standard library test suites, but they're very likely not easily pluggable with custom containers. ## Performance I'm always surprised when the compiler is able to optimize crazy amounts inline functions away completely. In many cases, custom C++ iterators are just as fast to execute as writing out C structures and code by hand. That said, all intermediate operations still need to conform to C++ spec. If a copy constructor for an iterator gets called, it gets called, and the compiler needs to optimize it away. Sometimes a little mistake can lead to something significant and as always testing is needed. **Benchmark** Profile some basic loops over items in your container and compare them to bare bones hand written loops with plain old data structs. For example, the repository for above code samples includes [test_benchmark.cpp](https://github.com/pknowles/container_boilerplate/blob/main/tests/test_benchmark.cpp). **Compiler output** If the benchmarks aren't looking good, take the loops and throw each into [godbolt](https://godbolt.org/). E.g. `void loop_with_my_container()` and `void loop_with_raw_array()`. Remember to compile with `-O2`. **Debug Performance** One major drawback to custom C++ containers is their performance in debug builds. Benchmarks show an easy 10x or more penalty. One thought is if it's possible to switch on optimizations and disable debug code generation for custom containers (after, and separately to thoroughly testing them of course). This is possible to do with compiler specific pragma directives [GCC's `#pragma GCC optimize`](https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html) or [MSVC's `#pragma optimize()`](https://learn.microsoft.com/en-us/cpp/preprocessor/optimize), but I have not tried this yet.
Toggle Preview
Edit message
*
A description of the changes made
Discard Draft
Save Draft
leave this field blank to prove your humanity
Flag
the thing you clicked
for moderator attention.
Reason choice:
Spam, promoting, advertising without disclosure
Rude, inappropriate, generally offensive
Too arrogant or demeaning to others
Other
Reason:
The reason for raising the flag
Error