Splitting a string in plain C++

To split a string I would normally use the Boost tokenizer class. Sometimes I can't, for instance when I am having fun solving an online programming problem where no extra library can be used. In that case I fall back on this homemade split function.

boost::tokenizer is smarter, but this bare-bones version is usually enough for what I need:
#include <string>
#include <vector>

std::vector<std::string> split(const std::string& input, char sep) // 1
{
    std::vector<std::string> tokens;
    std::size_t beg = 0, end = 0; // 2
    while ((end = input.find(sep, beg)) != std::string::npos) // 3
    {
        if(beg < end) // 4
            tokens.push_back(input.substr(beg, end - beg));
        beg = end + 1; // 5
    }
    if(beg < input.size()) // 6
        tokens.push_back(input.substr(beg));

    return tokens;
}
1. It takes a constant reference to the string we want to split and the single character used as separator. It returns the tokens found in a vector of strings.
2. I loop over the input string, keeping in beg and end the positions that delimit each token.
3. Find the next position of the separator, looping as long as one is found.
4. If the token is not empty, extract it from the input and push it to the tokens vector.
5. The next token starts right after the current separator.
6. Push the last token, unless it is empty, as happens when the input ends with a separator.
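
To make the handling of empty tokens in notes 4 and 6 concrete, here is a small self-check. It repeats split() so that the block compiles on its own. The helper split_examples() and the sample strings are my own illustration, not part of the original function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same split() as above, repeated so this block compiles on its own.
std::vector<std::string> split(const std::string& input, char sep)
{
    std::vector<std::string> tokens;
    std::size_t beg = 0, end = 0;
    while ((end = input.find(sep, beg)) != std::string::npos)
    {
        if (beg < end) // skip empty tokens between adjacent separators
            tokens.push_back(input.substr(beg, end - beg));
        beg = end + 1;
    }
    if (beg < input.size()) // skip an empty token after a trailing separator
        tokens.push_back(input.substr(beg));

    return tokens;
}

// Hypothetical helper (illustration only): checks a few edge cases.
void split_examples()
{
    using Tokens = std::vector<std::string>;

    assert((split("a b c", ' ') == Tokens{"a", "b", "c"}));
    // Leading, repeated and trailing separators produce no empty tokens.
    assert((split("  a  b ", ' ') == Tokens{"a", "b"}));
    // An empty input, or one made only of separators, gives no tokens.
    assert(split("", ' ').empty());
    assert(split("   ", ' ').empty());
    // No separator at all: the whole input is the only token.
    assert((split("abc", ' ') == Tokens{"abc"}));
}
```

Dropping empty tokens is a design choice that suits whitespace-separated words. For input such as CSV, where an empty field is meaningful, the two beg checks would have to go.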

As an example, see how I have used this split() function to solve the Swap Numbers CodeEval problem, which asks to swap the first and last character of each word in a blank-separated string:
std::string solution(const std::string& input)
{
    std::vector<std::string> words = split(input, ' '); // 1
    for(std::string& word : words) // 2
        std::swap(word.front(), word.back());

// ...
1. Call the split() function defined above on the input string, with a blank as separator.
2. Each word in the vector has its first and last character swapped in place.
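
The rest of solution() is elided above. What follows is only a sketch of one possible completion, assuming the swapped words should be joined back with single blanks. That joining step is my assumption, not something shown in the snippet. Note that front() and back() must not be called on an empty string. Here that is safe, because split() never returns an empty token.

```cpp
#include <string>
#include <utility>
#include <vector>

// Same split() as above, repeated so this block compiles on its own.
std::vector<std::string> split(const std::string& input, char sep)
{
    std::vector<std::string> tokens;
    std::size_t beg = 0, end = 0;
    while ((end = input.find(sep, beg)) != std::string::npos)
    {
        if (beg < end)
            tokens.push_back(input.substr(beg, end - beg));
        beg = end + 1;
    }
    if (beg < input.size())
        tokens.push_back(input.substr(beg));

    return tokens;
}

std::string solution(const std::string& input)
{
    std::vector<std::string> words = split(input, ' ');

    std::string result;
    for (std::string& word : words)
    {
        // Safe: split() guarantees that word is not empty.
        std::swap(word.front(), word.back());

        // Assumed output format: words separated by a single blank.
        if (!result.empty())
            result += ' ';
        result += word;
    }
    return result;
}
```

With this sketch, solution("1ab2 3c4") returns "2ab1 4c3", and a one-character word is left unchanged, since its first and last character are the same.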
