Saturday, December 26, 2009

STL, allocators, etc

Every time I find a link even remotely related to the internals of STL, I get interested in it. One of the reasons is because I've been bit very badly because of its various nitty gritties. There were some very hard to find memory "leaks" first, some weird performance issues and some interesting side effects like the one in the previous blog post.

The performance issues were encountered in a module which was using STL very heavily ( ok, STL was not the only reason for the performance hit - but when the execution time of a process comes down from 6 hours to 2.5 hours just by changing a certain configuration for STL, annoyed me a lot). The only thing i managed to write about it was this one line. Maybe I'll write more about it some other day, just so that I never forget about it.

The memory "leaks" fiasco was very interesting too. The memory usage of some process ( heap size ) was continuously increasing and no memory leak detector could find any leak. There was hardly any explicit pointer handling in the suspect code. Infact it was ridden with all value types with heavy usage of some STL containers. It's still a perfect whodunnit, as no one knows the answer for sure. I suspected memory fragmentation to start with, but then no one thought it made sense, or whether such a thing can happen ( whatever that means :) ). Finally, the case was closed prematurely by someone by citing this page from sgi.

Just recently, I came across a link, which explains in detail how EA Sports wrote their own implementation of the STL - EASTL. I didn't understand the full article as such, but there were some very interesting statements and references in it


...the std allocator design and its use by std containers make it hard to work with and leads to suboptimal performance and code bloat...


...the use of virtual functions is avoided as well, given the potential cache miss they may cause...


...The compiler optimizes for if statements to evaluate as true, so write your code to follow this when possible...


The third point is also emphasized in another article I read recently about the pitfalls of object oriented programming

In addition, it was also interesting to find an article on custom allocators by someone from outside the gaming industry. stackoverflow usually has interesting discussions on almost all topics. There were lots of discussions about custom allocators , one of which mentioning that it's a premature optimization for most things other than game development/embedded devices development. But then there were also discussions whether all the so-called-features of C++ are used in game development at all :).

After having been subjected to all those articles and having understood it only moderately, I thought I'll try my hand at writing an allocator just to check out some simple behaviour by printing some debug messages.

The following program just tracks the memory allocation/deallocation pattern on inserting elements into a vector. It's pretty interesting to see the pattern when the vector doubles its capacity to ensure all the elements are continuous.


#include <iostream>
#include <memory>
#include <cstdlib>
#include <vector>

namespace std {

template <>
class allocator<int>
{

public:
typedef size_t size_type;
typedef ptrdiff_t difference_type;
typedef int* pointer;
typedef const int* const_pointer;
typedef int& reference;
typedef const int& const_reference;
typedef int value_type;

allocator<int>() {}
allocator<int>(const allocator<int>&) {}

template <class U>
struct rebind { typedef allocator<U> other; };

pointer allocate(size_type n, const void * = 0) {
int* t = (int*) malloc(n * sizeof(int));
std::cout<< "Used allocator<int> to allocate "<<n<<" objects at address "
<< t << " (+)" << std::endl;
return t;
}

void deallocate(void* p, size_type n) {
if (p) {
free(p);
std::cout<< "Used allocator<int> to deallocate "<<n<<" objects at address "
<< p << " (-)" <<std::endl;
}
}

pointer address(reference x) const { return &x; }
const_pointer address(const_reference x) const { return &x; }

allocator<int>& operator=(const allocator<int>&) { return *this; }

void construct(pointer p, const int& val)
{
cout<<"Constructing object at "<<p<<endl;
new ((int*) p) int(val);
}

size_type max_size() const { return size_t(-1); }

};

}

int main()
{

std::vector<int> abcd;

for( int i = 1; i <= 10 ; i++ ) {
std::cout<<"Inserting '"<<i<<"' into the vector "<<std::endl;
abcd.push_back(i);
std::cout<<"-----------"<<std::endl;
}

return 0;
}


which printed the following on my 32 bit linux machine


Inserting '1' into the vector
Used allocator<int> to allocate 1 objects at address 0x804c008 (+)
Constructing object at 0x804c008
-----------
Inserting '2' into the vector
Used allocator<int> to allocate 2 objects at address 0x804c018 (+)
Constructing object at 0x804c01c
Used allocator<int> to deallocate 1 objects at address 0x804c008 (-)
-----------
Inserting '3' into the vector
Used allocator<int> to allocate 4 objects at address 0x804c028 (+)
Constructing object at 0x804c030
Used allocator<int> to deallocate 2 objects at address 0x804c018 (-)
-----------
Inserting '4' into the vector
Constructing object at 0x804c034
-----------
Inserting '5' into the vector
Used allocator<int> to allocate 8 objects at address 0x804c040 (+)
Constructing object at 0x804c050
Used allocator<int> to deallocate 4 objects at address 0x804c028 (-)
-----------
Inserting '6' into the vector
Constructing object at 0x804c054
-----------
Inserting '7' into the vector
Constructing object at 0x804c058
-----------
Inserting '8' into the vector
Constructing object at 0x804c05c
-----------
Inserting '9' into the vector
Used allocator<int> to allocate 16 objects at address 0x804c068 (+)
Constructing object at 0x804c088
Used allocator<int> to deallocate 8 objects at address 0x804c040 (-)
-----------
Inserting '10' into the vector
Constructing object at 0x804c08c
-----------
Used allocator<int> to deallocate 16 objects at address 0x804c068 (-)

0 Comments:

Post a Comment

<< Home