Pages

Sorted vectors instead of set

A set is a perfect data structure if you plan to perform on it a mix of different operations (insertions, lookups, erasures) with no control on what is going on next.

If you can see a definite pattern in its usage (a setup where data are inserted, a lookup where data are retrieved, a reorganization when the data is changed, ...) it could be worthy using a less fancy but more efficient sorted vector.

The idea is setting the data, sorting the vector, and then looking up using the standard searching functions for sorted containers.

Here is an usage example.

First thing we put some data in our vector:

std::vector<int> vi;
vi.reserve(10);
for(int i = 0; i < 10; ++i)
vi.push_back(i);

std::random_shuffle(vi.begin(), vi.end());
dump(vi);

Once we have the data ready (I shuffled the vector to emulate a random initialization), we can sort it:

std::sort(vi.begin(), vi.end());
dump(vi);

Time to perform some data search. We can use the standard alogorithm binary_search():

int target = 5;
if(std::binary_search(vi.begin(), vi.end(), target))
std::cout << "binary_search found " << target << std::endl;

Let's add some more data, duplicating the values, so to make the test a bit more interesting:

vi.reserve(20);
for(int i = 0; i < 10; ++i)
vi.push_back(i);
std::sort(vi.begin(), vi.end());
dump(vi);

We can use lower_bound() to get the first element with the given value, but we should pay attention writing the condition that ensures we have a good value:

typedef std::vector<int>::const_iterator CIT;
CIT it = std::lower_bound(vi.begin(), vi.end(), target);
if(it != vi.end() && !(*it < target))
std::cout << "lower_bound found: " << *it << std::endl;

In this simple case we could have checked just for equality but, generally speaking, the negation of less is the correct way of approaching the problem.

Otherwise we could use equal_range() that does not suffer that issue:

std::pair<CIT, CIT> range = std::equal_range(vi.begin(), vi.end(), target);
if(range.first != range.second)
std::cout << "equal_range found: " << *range.first << std::endl;

Many more details on sorted vectors in Item 23 of Effective STL, part of the Effective C++ book series by Scott Meyers.

No comments:

Post a Comment