At GoingNative back in September, Andrei Alexandrescu posed an interesting question about API design and C++11 that has had me scratching my head for a month. It was about the design of std::getline
:
// Read a line from sin and fill in buf. Return sin. std::istream& getline(std::istream& sin, std::string& buf) { buf.clear(); // ... fill in buf return sin; }
Seasoned programmers recognize this pattern: The function takes the buffer by non-const reference and fills it in. They also know why the interface is designed this way: Because containers like std::string
are too expensive to copy to consider returning one by value. APIs designed like this have traditionally had the benefit of being efficient, at the expense of some awkwardness at the call site:
std::string buf; std::getline(std::cin, buf); use_line(buf);
In C++11, standard containers like std::string
are moveable, so returning one by value is darn near free. So, perhaps a better API design would look like this:
// Should getline look like this instead? std::string getline(std::istream& sin) { std::string buf; // ... fill in buf return buf; // This gets moved out efficiently }
That allows a more concise, natural usage, and doesn’t force the user to create a named variable:
use_line(getline(std::cin));
That’s nice, right? I mean, aside from the obvious shortcoming that now you can’t tell whether getline
succeeded or not. Oops. But even overlooking that, there’s an issue here.
Performance, Performance, Performance
You might think that because of move semantics, we don’t have to worry about the lousy performance of returning expensive collections by value, and you’d be right. Sort of. But consider this use of getline
:
std::string buf; while(std::getline(std::cin, buf)) use_line(buf);
Now consider what this code would be doing if, instead of taking buf
as an out parameter, getline
created a new string
each time and returned it by value. Well, it’s creating a new string
each time, duh. But the code above doesn’t do that. After a few times through the loop, buf
will probably be big enough to hold whatever lines will be read next, and that space can be reused with no further allocations. Much, much faster.
Back To The Drawing Board
During GoingNative, Andrei left getline
there. (It turns out he prefers a different design, and we’ll be arriving at a similar conclusion.) I wanted to continue the discussion. Out parameters are ugly and awkward to use, they hurt API composability, they force you to declare objects and initialize them in separate steps, they cause acne, etc. Surely something could be done!
I studied the problematic code some more:
std::string buf; while(std::getline(std::cin, buf)) use_line(buf);
What is this code doing? It’s reading a bunch of lines and processing them one at a time, right? You might even say, it’s returning a range of lines. Then it hit me: std::getline
is the wrong API! It should be called getlines
(plural), and it should return a range of strings. Take a look:
for(std::string& buf : getlines(std::cin)) use_line(buf);
This API feels right-er to me. Not only is it easier to use (look ma! one fewer line!), it doesn’t force a two-step initialization of any objects, and ranges and range operations compose. (More on that later.) It also doesn’t suffer from the performance problems of my first attempt, although it takes some work to see why.
Lazy Ranges
What does my getlines
function return? Surely it doesn’t fill in a std::vector
of string
‘s and return that. That would be (a) dumb, (b) expensive, and (c) impossible in practice since a potentially infinite number of lines could be read from an istream
. Instead, getlines
does something smarter: it returns a lazy range.
A lazy range is something that generates elements on demand. The STL already has such a thing: std::istream_iterator
. You can create a range out of istream_iterator
s that pulls characters — or ints or whatever — from an istream
on demand. We need something like that, but for lines.
Unfortunately, we can’t press istream_interator
into service for us. Instead, we need to write our own iterator type, and build a valid range out of that. This is a painful and verbose programming exercise, but Boost.Iterator can help. It has some helpers that let you build iterators from a fairly minimal interface. Without further ado, here is the lines_iterator
:
struct lines_iterator : boost::iterator_facade< lines_iterator, std::string, // value type std::input_iterator_tag // category > { lines_iterator() : psin_{}, pstr_{}, delim_{} {} lines_iterator(std::istream *psin, std::string *pstr, char delim) : psin_(psin), pstr_(pstr), delim_(delim) { increment(); } private: friend class boost::iterator_core_access; void increment() { if(!std::getline(*psin_, *pstr_, delim_)) *this = lines_iterator{}; } bool equal(lines_iterator const & that) const { return pstr_ == that.pstr_; } std::string & dereference() const { return *pstr_; } std::istream *psin_; std::string *pstr_; char delim_; };
The magic happens when you increment a lines_iterator
, which happens in lines_iterator::increment
. std::getline
is called, and it fills in a buffer referred to by pstr_
. Note that it uses the same buffer every time. And when you dereference a lines_iterator
, it returns a reference to that buffer. No copying, no unnecessary allocation.
Where does the buffer referred to by pstr_
live? In the lines_range
object, which is returned by getlines
.
using lines_range_base = boost::iterator_range<lines_iterator>; struct lines_range_data {std::string str_;}; struct lines_range : private lines_range_data, lines_range_base { explicit lines_range(std::istream & sin, char delim = 'n') : lines_range_base{ lines_iterator{&sin, &str_, delim}, lines_iterator{}} {} }; inline lines_range getlines(std::istream& sin, char delim = 'n') { return lines_range{sin, delim}; }
lines_range
is really just a boost::iterator_range
of lines_iterator
s. Some contortion was needed to initialize the str_
member before the iterator_range
constructor was called (hence the need for lines_range_data
), but that’s just an implementation artifact.
The long and short of it is this: when you call getlines
, you get back a lines_range
object, which is basically a free operation. Now you can call .begin()
and .end()
on it, or directly iterate over it using a range-based for
loop, like I showed. No more memory allocations are done using this interface than with the original std::getline
API. Nice, eh?
Composability of Ranges and Range Algorithms
There’s lots of reasons to prefer the range-based getlines
API — and range-based interfaces in general. The most immediate benefit is that people can use range-based for
loops, as I showed above. But the real power comes once you start using range algorithms and range adaptors. Both Boost and Adobe’s ASL provide powerful utilities for working with ranges, and the C++ Standardization Committee has a working group dedicated to ranges for some future version of the standard. And for good reason! Range operations compose, so for instance you could do something like this:
// Read some lines, select the ones that satisfy // some predicate, transform them in some way and // echo them back out boost::copy( getlines(std::cin) | boost::adaptors::filtered(some_pred) | boost::adaptors::transformed(some_func), std::ostream_iterator<std::string>(std::cout, "n"));
That’s strong stuff. I shudder to think what the equivalent code would look like with straight iterators and STL algorithms.
But what if you just want to read a single line? Doesn’t the new getlines
hurt you for this simple usage scenario? Nope! All we need is one perfectly general function that returns the first element of a range. Let’s call it front
:
using std::begin; // return the front of any range template<typename Range> auto front(Range && rng) -> decltype(boost::make_optional(*begin(rng))) { for(auto x : rng) return x; return boost::none; }
Since a range might be empty, we need to return an optional
. Now you can read a single line from an istream
like this:
if(auto s = front(getlines(std::cin))) use_line(*s);
Compare this to the original and I think you’ll see it’s no worse:
std::string str; if(std::getline(std::cin, str)) use_line(str);
Stateful Algorithms
So have we completely addressed all of Andrei’s concerns with getline
? Yes and no. Certainly we’ve fixed getline
, but Andrei’s point was bigger. He was showing that you can’t just blindly pass and return by value, hoping that move semantics will magically make your programs faster. And that’s a valid point. I can’t say anything that changes that fact.
I think getline
is a curious example because what looks at first blush like a pure out parameter is, in fact, an in/out parameter; on the way in, getline
uses the passed-in buffer’s capacity to make it more efficient. This puts getline
into a large class of algorithms that work better when they have a chance to cache or precompute something. And I can say something about that.
If your algorithm needs a cache or a precomputed data structure, then your algorithms are inherently stateful. One option is to pass the state in every time, as getline
does. A better option is to encapsulate the state in some object that implements the algorithm. In our case, the state was the buffer and the object was the range. To take another case, Boyer-Moore search is faster than strstr
because it precomputes stuff. In the Boost implementation, boyer_moore
is a stateful function object that keeps its precomputed part private.
Summary
Here are the key take-aways:
- If your algorithm runs faster with a cache or a precomputed data structure, encapsulate the state in an object that implements the algorithm, rather than forcing your users to pass the state in.
- API design must be guided by the expected usage scenarios of the API, and also the common idioms of modern C++11.
- Ranges are a powerful abstraction because operations on them compose.
- Boost.Iterator and Boost.Range greatly simplify the job of implementing custom ranges.
Thanks for reading!
Having toyed with ranges (due to being unsatisfied with Boost.Range), I have to say that so far I have favoured non-reference element types for those ranges that ‘produce’ values. Conversely I use ranges over references when any reference ‘element’ can really outlive the current iteration: if something grabs a hold of an element, and the range is exhausted, the expectation is that the reference is still valid and meaningful. (Although I have to say I use that expectation for the user code, and I don’t need it in the range code proper.)
But then to implement, say, filtered ranges I need to re-implement caching: between having fetched a value from the parent range, and before handing it off to the predicate to test it, I have to store it somewhere. If more composite ranges do that caching, and are used with one another, it stacks up. (Unless it doesn’t, if the compiler can see through. Oh well.)
So it’s not just a matter of a trade-off between caching for efficiency, and passing values for ease of correctness (in user code). There’s also a concern of implementation.
(And now my mind is full of ideas about a ‘peek’ operation for ranges, and a peek_range that transforms a non-peekable range into a peekable range. Thanks!)
I’m curious about the trouble you’ve had with Boost.Range. Please, please share, either here, or on the Boost developers list, or even on the mailing list for the Ranges standardization working group. Your experience is valuable.
Regarding returning references from an input iterators’s
operator*
, I don’t see the problem. Can you clarify? Thelines_iterator
should probably return astd::string const&
, but either way, the string lives in thelines_range
so lifetime isn’t an issue. Returning by-value will reintroduce all the performance problems Andrei was complaining about.The caching of filter iterators, and fat iterators in general, are a problem. Have you seen N3752? It suggests something along the lines of cursors and property maps to implement ranges. That might be a way forward.
I didn’t intend to condemn Boost.Range with my comment, my apologies. My dissatisfaction stems from one of its core design, which is to interplay nicely with iterators. In my opinion that makes it really, really hard to write new ranges (and perhaps relatedly, Boost.Range comes with few ranges of its own — even something like zipping is still sitting in trac I think?).
OTOH I understand completely the need to play nicely with all the preexisting iterator-based code (e.g. all of comes for free).
My problem with returning a reference to a cached element is that consider writing a function that returns a tuple of the three elements of a range (we’ll assume that there are at least three elements as a precondition). You could decay everything and copy it into the tuple, but that kind of defensive copying is starting to grate me. I’d rather hold whichever element the range provides — then a client can choose to transform with a decaying, copying functor if they really want to get a copy.
Or perhaps it’s a composite range that gives a sliding window of three consecutive elements, in a tuple (so you’d get an element, and both neighbours — useful for some algorithms I would say). But then if you start using that range amongst other composite ranges, it might not be obvious if or where an ‘untimely’ reference is passed.
In the end I was interested to see what Andrei-style ranges would look like in C++11, and so I’m playing with that.
Ah, but it’s very easy to tell. If the range is an input range, you have to cache the values as they are produced. The sliding window adaptor you describe would be responsible for caching the values if it is adapting an input range. For forward ranges or better, it doesn’t have to. This is one of the reasons why the STL has a distinction between input iterators and forward iterators.
Yes, but also no. Because that discounts input ranges that may very well have a reference element type, which elements are in fact timely.
In addition, I separated the traversal concepts (forward – bidirectional – random-access) from that of ‘repeatability’, whether a range is saveable (in D-speak) or not. Admittedly this is more out of curiosity than out of a genuine need for functionality, but if there’s a generic low-hanging fruit I’d like to reap it. More special cases is not what I want.
You’re not crazy for wanting to do this. The Boost.Iterators library is refined enough so that you can specify traversal and element access separately. Check out the docs for its new-style iterators. Maybe you’re looking for something like that?
“At this point, Andrei gave up. The design of std::getline is satisfactory, don’t mess with it.”
That’s quite unfair, if you read one of his comment in the reddit thread about the talk, he actually said that he prefer a range-based API and it’s what they use at facebook for getline !
“That’s a range, so you’ll have an easy time convincing me that it’s better than both alternatives :o). My example, however, was focused on ref vs. value, as opposed to larger API issues.
Incidentally we have a class at Facebook called exactly LineReader, and with some ancillary stuff it allows stuff like:
for (auto& line : byLine(stream)) { … }
where stream is an iostream, a FILE* etc. That idiom has the best efficiency*convenience measure in our codebase.”
(see http://www.reddit.com/r/programming/comments/1m1izv/goingnative_2013_writing_quick_code_in_c_quickly/cc4zrb1)
Ah! I didn’t see the reddit thread, thanks. During GoingNative, Andrei didn’t mention ranges as a possible fix, but he’s a stout range proponent so I’m not surprised. I’ll add this discussion to the blog. Thanks.
Nice article. I’ve similar API in my experimental project.
However, I’ve this doubt:
lines_iterator() = default;
Does it ensure the members are value-initialized or default-initialized? If they’re default-initialized, then the pointer members (and other built-in types) will remain uninitialized and the equal() function will invoke undefined behavior when reading their value.
This is a very good question. A quick review of the standard wasn’t enlightening. I’ll get back to you when I know the answer.
Edit
OK, you’re right. The relevant part of the latest draft (N3291) is 12.1/6:
A ctor-initializer is the “
: mem1(), mem2()
” stuff that initializes members in a constructor. That is, by declaringlines_iterator() = default
, I’m defininglines_iterator() {}
, which leaves the pointers uninitialized! This is most definitely a bug, and I’ll fix it. Thanks for reporting it.I use this technique, so I was surprised that you abandoned it in your code. Unfortunately, many things in C++ need you to reference several disjoint parts of the Standard. This is one of them. (Look at Section 8.5, Initializers.)
If a class has a default constructor that is
delete
d, otherwise canceled, ambiguous, or inaccessible; both default- and value-initialization return an error.If a class uses a code-ful default constructor you wrote, both default- and value-initialization will use it.
If a class uses a (mostly) code-less default constructor, default-initialization will use the random garbage bits already in the object’s region of storage. However, value-initialization will do zero-initialization first!
Basically the answer to “Does ‘
MyClass() = default
‘ do default- or value-initialization?” is: both! Whichever initialization you do to the top-level object is what happens to its sub-objects. I use this in my complex-number class. This is an advantage overstd::complex
since I’m not forced to run the code to zero-fy my complex numbers before running them over with an external initialization function.Look at:
MyClass a, b{}, c();
Object
a
is a default-initialized object,b
is value-initialized, andc
is a function declaration. (You can only use the()
for value-initialization when it wouldn’t trigger the “most vexing parse.”) The only way I see that you would gave up on “= default
” for the default constructor is if you created your default-value objects like “a
.” Well, don’t, you may get random garbage bits. Always use “b
” when you need a zero-fied default-value object. (And you can use “b
” in an r-value way.) This option is a big deal after C++98/03 not having it.So, go back to “
lines_iterator() = default
,” but make sure any use of that constructor is “line_iterator{}
” or “line_iterator x{}
.” Never use “line_iterator y
” wherey
will be read without any initialization in between. It may work with code-ful class types, but for any other type.I posted a little coroutine based solution here:
http://www.reddit.com/r/cpp/comments/1of2mb/containers_out_parameters_vs_move_semantics/ccricg2
I can’t seem to get more than one line of code in to a comment here 😉
Very cool. I was going to say a few words about how complicated it is to define iterators in C++, even with the help of Boost.Iterators, and how C#’s yield keyword makes it so trivial. Your solution with coroutines looks a lot like how the C# code would look. Nice.
Sorry about the code comment trouble. Lemme try:
If the preview is to be believed, that works. The trick is to indent your code 4 spaces.
Small suggestion: I would implement
front
as a range of zero or one element and call it in afor
loop rather than inside anif
.I’ve been just writing about infinite lazy lists in Haskell and comparing them to C++. As you mentioned, writing input iterators or ranges in C++ is hard. Your example was simple enough, but try to convert this Haskell program that generates Pythagorean triples to a lazy iterator.
triples =
[(x, y, z) | z <- [1..]
, x <- [1..z]
, y <- [x..z]
, x * x + y * y == z * z]
main = print $ take 4 triples
You’ll have to use a state machine with accompanying inversion of control. I hope C++ adopts
yield
and simplifies it. I suspect that the general solutions would involve coalgebras and anamorphisms 😉Re:
front
, that’s an interesting suggestion, but I’m not sure how I feel about it. It feels a little too cute to use a range for this whenoptional
expresses the intent perfectly.Pythagorean triples sounds like an interesting little programming challenge. Maybe some ambitious soul reading this will show an elegant range-based solution for C++. That said, the
yield
keyword would be a nice addition. In lieu of that, there’s a coroutines library in Boost that would greatly simplify this problem.Some languages or libraries regard optional values as a 0-to-1 range, or introduce a 0-to-1 range that’s as good as an optional value.
That Pythagorean triples are regarded as a challenge is what I find embarrassing about the state of the art regarding ranges in C++. I wish it wasn’t one.
Think of
front
astake 1
, wheretake n
returns the list of the firstn
elements. The problem with optional is that you might accidentally dereference it without checking (I presume, it throws an exception if it’s empty). Iffront
returned a range then you’d be pretty much forced to use a loop, which is safe.Maybe
is safe in Haskell because it (pretty much) forces you to pattern match — it’s harder to missNothing
.Rather than using optional on front I would rather use an empty() function. Right off I’d prefer that be part of the range concept itself, but it could be done in this same style with a generic free function:
Now we can test explicitly:
Where front() no longer returns an optional but maps to the same function as std::vector or std::list.
This doesn’t address your safety concern but I think being able to directly test for an empty set should be part of the range abstraction in general.
Your suggestion feels very much in the spirit of C++. And I have no doubt this is what the standardization committee would want. It’s really a very interesting question. I like
optional
because I don’t have to save the result ofgetlines
anywhere before I use it. But in some sense,optional
is messy, and there will certainly be some applications where you know the range is non-empty and you want to be able to callfront
without the check. Then again, Bartosz’ solution holds some appeal for me. But then again, even Haskell’shead
function has undefined behavior if called on an empty list. What to do?@Eric’s reply, regarding “what to do”:
Usually I think you would be using an algorithm that acts on a Range (for_each, map, reduce, etc) and not need to handle the empty case directly..
When you do need to, perhaps storing in local scope isn’t the worst price to pay. You could also trivially write an optional_front() helper on top of empty() and front().
Or, perhaps use a helper like “if_not_empty”:
Regarding front() returning an optional.. std::vector front() is undefined in the vector is empty. Why return an optional which you then have to check? Why not just check the original range if it is empty in the first place? So, I suggest front() be undefined in the Range is empty, which I think is also likely an important criteria if on-par performance is considered importnat.
<nod> See my response to Jay Miller above.
Pingback: Eric Niebler: out parameters, move semantics, and stateful algorithms | The other branch
I always knew® that this was the right way to do it. Even in C++98. But after implementing the “iterator_facade” part I always got stuck with the ‘.begin()/.end()’ part. This closes the circle with the
iterator_range
class. Thank you Eric. I have two small questions, 1) How is it that .begin()/.end() is automatically handled byiterator_range
? 2) What would you do for lazy sequence that even in principle has no end? or for “converging” lazy sequences whose end can be defined in practice by a convergence criterion (I know this is a mathematical matter, but how (or even if) would you do it if the convergence criterion is well defined.Just look at the docs for
iterator_range
. There you can see public.begin()
and.end()
member functions, thatlines_range
acquires through inheritance. Stylistically, it’s a bit sloppy to publicly inherit implementation like that, but it’s quick and easy.lines_range
is a lazy sequence that even in principle has no end. You can keep sending lines tolines_range
till the cows come home, and thefor
loop will keep looping blissfully forever.I think
lines_range
is also a sequence like this, the convergence criterion being: whenstd::getline
returns anistream
in a bad state. You can define an InputIterator likelines_iterator
that checks any ol’ predicate, mathematical or not, and moves itself into the end iterator’s state, signifying the end of the sequence.Hope that helps.
I was thinking about this further, and came up with this simple solution:
https://gist.github.com/mmocny/7204756
I suspect its relying on undefined behaviour, but all tested compilers are producing such odd (and correct) output, that I’m almost tempted to believe there may be a rule I’m unaware of that makes it correct.
Can anyone take a look and let me know if returning a reference to a temporary created using function default arguments yields dangling reference or if the lifetime is extended? The output suggests it is extended (take a look at when destructors are called, especially in the “piped” test case)
Interesting thought, but it’s a dangerous interface. Here’s why. Given a function like:
you are allowed to do this:
It’s exactly equivalent to:
The temporary string is constructed at the call site and has a lifetime of the full expression, so it survives long enough to be streamed out.
But here’s the problem:
The trouble is, the hidden temporary object constructed on the first line is not directly bound to the const reference, so it doesn’t have its lifetime extended. As a result, the object
s
refers to dies after the first statement, and so the second statement is undefined behavior.Sorry.
Like almost every other C++ developer I encountered the problem line mode iterator missing from language. I really like your solution, someone (Stepanov?) should have made it standard before.
There are some adjustments I consider for version I will use.
Some of your members are externally owned dumb pointers which is not something you normally want. I’d rather wrap iostream in container.
struct line_stream_container {
std::istream& stream;
std::string current;
//TODO: constructors and swap
};
And inside lines_iterator
class lines_iterator {
//Your code ommited
line_stream_container& container;
linea_iterator(line_stream_container& c) container(c) {
increment();
}
const std::string& dereference () const noexcept {
return container.current;
}
//More code ommited
}
This clarifies that iterator is valid as long as underlying container is valid, there is still a problem of two iterators using the same string, referenced string so we must somehow make it clear that this iterator is input iterator.
P.S: I don’t like the term “input iterator”. It is not a real iterator. I beleive better technical term for “input iterator” is “sequencer”.
I disagree that there is anything wrong with using a raw pointer as I did. I read
std::string *
as: this is a pointer to a string defined and owned elsewhere. Using a reference as you do changes the semantics. First of all, your iterator no longer gets a compiler-generated assignment operator, and defining it yourself becomes problematic because you can’t rebind a reference. When using a pointer like I do, the compiler-generated copy and assign operators Just Work and do the right thing.And check out my next post which shows a better way to define this range and its iterator type.
P.S: for my previous post. “Sequencer” is a bad name, “Generator” is much better. Such input iterator is semantically closer to generators in than real iterators over containers.
Another method to create this sort of functionality relies upon imbuing the associated
istream
with a modified locale (and just using the associatestd::istream_iterator<std::string>
). Although modifying the locale changes the associated behaviour of the input stream (this can be reset easily).So, the object below sets up the
std::locale::cctype
:struct line_reader: std::ctype<char> {
line_reader(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(table_size, std::ctype_base::mask());
rc['n'] = std::ctype_base::space;
return &rc[0];
}
};
If you are using
std::cin
, then to imbue it with the modified locale:std::cin.imbue(std::locale(std::locale(), new line_reader()));
Finally, the line reader iterator is:
std::istream_iterator<std::string>(std::cin);
and the terminating iterator is:
`std::istream_iterator();
Pingback: Meeting C++ 2013 « Домик Миа
Very useful information indeed! Thanks for posting the content and you can find more on default arguments here below with examples.
Default arguments in Cpp with examples