How do I iterate over the words of a string in C++?
I need to iterate over the words of a string composed of words separated by whitespace. I’m not interested in C string functions or character manipulation/access. I prefer elegance over efficiency.
Here’s my current solution using istringstream:
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
    string s = "Somewhere down the road";
    istringstream iss(s);
    do {
        string subs;
        iss >> subs;
        cout << "Substring: " << subs << endl;
    } while (iss);
}
Are there more elegant or alternative approaches to iterate over words in a C++ string?
Modern C++ offers several alternatives to the istringstream approach for iterating over the words of a string. These include std::string_view (C++17) combined with standard algorithms, a custom iterator that plugs into a range-based for loop, the C++20 ranges library for a more functional style, and Boost's string algorithms. All of them avoid C string functions; they differ in how much code they need and which language standard they require.
Contents
- String Views with Algorithms
- Range-Based For Loops
- C++20 Ranges Library
- Boost String Algorithms
- STL Algorithms with Custom Predicates
- Performance Comparison
- Recommendations
String Views with Algorithms
std::string_view, introduced in C++17, lets you refer to each word without copying its characters. Combined with std::find_if and std::find_if_not, it gives a compact splitter:
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>
#include <string_view>
#include <vector>
std::vector<std::string_view> split_words(std::string_view text) {
    // std::isspace needs a value representable as unsigned char.
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    std::vector<std::string_view> words;
    auto start = text.begin();
    while (true) {
        start = std::find_if_not(start, text.end(), is_space);  // skip whitespace
        if (start == text.end()) break;
        auto end = std::find_if(start, text.end(), is_space);   // one past the word
        words.emplace_back(&*start, static_cast<std::size_t>(end - start));
        start = end;
    }
    return words;
}
int main() {
    std::string s = "Somewhere down the road";
    auto words = split_words(s);
    for (const auto& word : words) {
        std::cout << "Word: " << word << std::endl;
    }
}
The returned views reference the original string's memory rather than copying it, so the source string must stay alive and unmodified while the views are in use.
Range-Based For Loops
If you want the loop itself to read naturally, you can wrap the scanning logic in a small class that supplies begin() and end(), so it works directly in a range-based for loop:
#include <cstddef>
#include <iostream>
#include <string>
// Acts as both the range and its iterator: begin() and end() return
// positioned copies. The string passed in must outlive this object.
class WordIterator {
    const std::string& str;
    std::size_t pos;  // start of the current word; npos marks the end
    static const char* whitespace() { return " \t\n\r"; }
public:
    explicit WordIterator(const std::string& s, std::size_t p = 0) : str(s), pos(p) {}
    // Position on the first word (npos if the string contains none).
    WordIterator begin() const {
        return WordIterator(str, str.find_first_not_of(whitespace()));
    }
    WordIterator end() const {
        return WordIterator(str, std::string::npos);
    }
    bool operator!=(const WordIterator& other) const {
        return pos != other.pos;
    }
    // Skip the rest of the current word, then the whitespace after it.
    WordIterator& operator++() {
        pos = str.find_first_of(whitespace(), pos);
        pos = str.find_first_not_of(whitespace(), pos);
        return *this;
    }
    // Return the current word by value.
    std::string operator*() const {
        std::size_t end_pos = str.find_first_of(whitespace(), pos);
        return str.substr(pos, end_pos - pos);
    }
};
int main() {
    std::string s = "Somewhere down the road";
    for (const auto& word : WordIterator(s)) {
        std::cout << "Word: " << word << std::endl;
    }
}
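This keeps the call site down to a single range-based for loop and needs only C++11. Each dereference returns the word as a new std::string, and the class can be reused with any string that outlives the loop.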
C++20 Ranges Library
The C++20 ranges library expresses the split as a pipeline of views:
#include <iostream>
#include <ranges>
#include <string>
#include <string_view>
int main() {
    std::string s = "Somewhere down the road";
    // Split on single spaces; each piece is a subrange of s.
    auto words = s
        | std::views::split(' ')
        | std::views::transform([](auto&& rng) {
              // Iterator/sentinel constructor (C++20); safe for empty pieces too.
              return std::string_view(rng.begin(), rng.end());
          });
    for (const auto& word : words) {
        std::cout << "Word: " << word << std::endl;
    }
}
Splitting on ' ' produces an empty piece for every extra space. The following version drops those empty pieces; note that it still treats only the space character as a separator, not tabs or newlines:
#include <iostream>
#include <ranges>
#include <string>
#include <vector>
std::vector<std::string> split_words(const std::string& text) {
    std::vector<std::string> result;
    for (auto&& piece : text | std::views::split(' ')) {
        // Consecutive spaces yield empty pieces; skip them.
        if (!piece.empty()) {
            result.emplace_back(piece.begin(), piece.end());
        }
    }
    return result;
}
int main() {
    std::string s = "Somewhere down the road";
    auto words = split_words(s);
    for (const auto& word : words) {
        std::cout << "Word: " << word << std::endl;
    }
}
Boost String Algorithms
If you can use external libraries, Boost provides a one-call split:
#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>
int main() {
    std::string s = "Somewhere down the road";
    std::vector<std::string> words;
    // token_compress_on merges runs of adjacent separators into one.
    boost::split(words, s, boost::is_space(),
                 boost::token_compress_on);
    for (const auto& word : words) {
        std::cout << "Word: " << word << std::endl;
    }
}
With boost::token_compress_on, runs of adjacent separators are merged. Leading or trailing whitespace still produces an empty first or last token, so trim the input first (for example with boost::trim) if that matters.
STL Algorithms with Custom Predicates
You can also build the splitter from std::find_if and std::find_if_not with a whitespace predicate, copying each word into its own std::string:
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
#include <vector>
std::vector<std::string> split_words(const std::string& text) {
    // std::isspace needs a value representable as unsigned char.
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    std::vector<std::string> words;
    auto start = text.begin();
    while (true) {
        start = std::find_if_not(start, text.end(), is_space);  // skip whitespace
        if (start == text.end()) break;
        auto end = std::find_if(start, text.end(), is_space);   // one past the word
        words.emplace_back(start, end);
        start = end;
    }
    return words;
}
int main() {
    std::string s = "Somewhere down the road";
    auto words = split_words(s);
    for (const auto& word : words) {
        std::cout << "Word: " << word << std::endl;
    }
}
Performance Comparison
| Approach | Performance | Memory Usage | Readability | C++ Standard |
|---|---|---|---|---|
| istringstream | Moderate | High (copies) | Good | C++98 |
| String Views | Excellent | Low (references) | Good | C++17 |
| Custom Iterator | Good | Low | Moderate | C++11 |
| C++20 Ranges | Excellent | Low | Excellent | C++20 |
| Boost | Good | Moderate | Excellent | External |
Recommendations
For modern C++ development, I recommend:
- C++17+: Use string views with algorithms for the best balance of performance and elegance
- C++20: If available, the ranges library provides the most elegant and expressive solution
- Legacy Code: The istringstream approach remains perfectly fine for simple cases
- Large Projects: Consider Boost string algorithms for consistent, well-tested solutions
The most elegant approach depends on your C++ standard and specific requirements. For maximum elegance and modern syntax, C++20 ranges are ideal, while string views offer excellent performance with C++17.
Conclusion
Modern C++ offers several ways to iterate over the words in a string, each with different trade-offs. The istringstream method you are currently using is adequate once the loop is written as while (iss >> subs), which avoids the empty substring that the do/while form prints after the last word. For a terser, more composable solution, consider C++20 ranges; for minimal copying on C++17, use string views and keep the source string alive while the views are in use. Choose the approach that best fits your project's C++ standard and performance requirements.