Pages

Standard partial sorting

Sometimes using std::sort(), or its stable and hence more costly version std::stable_sort(), is unnecessary expensive. We could use instead other standard functions like partial_sort(), nth_element(), or partition().

First step, let's create a shuffled vector of ints:

std::vector<int> vi;
vi.reserve(100);
for(int i = 0; i < 100; ++i)
vi.push_back(i);
std::random_shuffle(vi.begin(), vi.end());
dump(vi);

If we need just the top 20 elements, we could do just a partial sort, that puts in the correct order only the range specified by the first two iterators:

std::partial_sort(vi.begin(), vi.begin() + 20, vi.end());
dump(vi.begin(), vi.begin() + 20);

If we are not iterested in the order, but only in the top elements, we could call instead nth_element():

std::nth_element(vi.begin(), vi.begin() + 20, vi.end());
dump(vi.begin(), vi.begin() + 20);

The nth_element() function() is written in such a way that it makes it useful for getting a specific nth element in a collection (as its name shows) limiting the ordering effort on the collection. So we could use it, for instance, to get the median element in a collection:

std::vector<int>::iterator target = vi.begin() + vi.size() / 2;
std::nth_element(vi.begin(), target, vi.end());
std::cout << "Median is: " << *target << std::endl;

A less costly function is partition(), that applies a predicate to a collection to partition it in two groups, returning the iterator to the last element of the group respecting the passed condition. Given this function:
bool even(int value) { return value%2 == 0; }
we can use it to partition our vector:

std::vector<int>::iterator lastEven = std::partition(vi.begin(), vi.end(), even);
dump(vi.begin(), lastEven);

For more details on partitioning and partial sorting, have a look at Item 31 of Effective STL, in the Effective C++ book series by Scott Meyers.

No comments:

Post a Comment