Input Iterators vs Input Ranges

Posted on November 7, 2013 by Eric Niebler — 45 Comments ↓

This post was inspired by some shortcomings of the std::getline solution I described in my previous post, which just goes to show that there is no interface so simple that it can’t be done wrong. Or at least sub-optimally.

Input Iterators and Lazy Ranges

In the previous article, I analyzed the interface of std::getline and proposed a range-based solution as a better alternative. Users of the new range-based getlines API would read lines from a stream like this:

for(std::string const & line : getlines(std::cin))
{
    use_line(line);
}

The range object returned from getlines is lazy; that is, it fetches lines on demand. It’s a good design, and I’m still happy with it. The implementation leaves much to be desired though. Both the range object itself, as well as the iterators it yields, are fatter than they need to be. That got me thinking about std::istream_iterator, and input iterators and ranges in general. My conclusion: Naked input iterators like std::istream_iterator that don’t “belong” to a range have serious problems.

Fat Input Iterators

If you’re not already familiar with std::istream_iterator, take a minute to look it up in your favorite C++ reference. It is parametrized on the type of thing you want to extract from a stream. An istream_iterator<int> reads ints, an istream_iterator<string> reads strings, etc. Although the implementation is unspecified, reading an element would typically happen first when the iterator is constructed, and then each time the iterator is incremented. The element is stored in a data member so that it can be returned when you dereference the iterator. OK so far?

The implication for istream_iterator<string> is that it is a hulking behemoth of an iterator. Not only is it fat because it holds a string, but copying one means copying a string, too. That’s potentially a dynamic allocation, just from copying an iterator! STL algorithms generally assume iterators are cheap to copy and take them by value nonchalantly. What’s more, a default-constructed istream_iterator<string> is used as a dummy end-of-sequence iterator. Naturally, it contains a string too, but it never uses it! istream_iterator definitely needs to go on a diet. We’ll fix that, but we’re not done describing the problems yet. Read on.

Surprising Side-Effects

Say we wanted to return a range of istream_iterator<string>s. We could return a std::pair of them, and that would work, sort of. Better, we could return a boost::iterator_range (which is basically a std::pair of iterators with begin and end member functions) to get something that users could iterate over with a range-based for loop:

// Return a lazy range of strings
boost::iterator_range<std::istream_iterator<std::string>>
get_strings( std::istream & sin )
{
    return boost::make_iterator_range(
        std::istream_iterator<std::string>{sin},
        std::istream_iterator<std::string>{}
    );
}

//...

for(std::string const & str : get_strings( std::cin ))
{
    use_string(str);
}

But think of the waste: the range holds two iterators, each of which holds a string and a reference to the stream. Wouldn’t it be smarter if the returned range just held a reference to the stream, and constructed the iterators on-demand in its begin and end member functions, like this:

template< class T >
class istream_range
{
    std::istream & sin_;
public:
    using iterator = std::istream_iterator<T>;
    using const_iterator = iterator;

    explicit istream_range( std::istream & sin )
      : sin_(sin)
    {}
    iterator begin() const
    {
        return std::istream_iterator<T>{sin_};
    }
    iterator end() const
    {
        return std::istream_iterator<T>{};
    }
};

OMG, isn’t this soooo clever? The range object went from about 24 bytes (with libstdc++ 4.7), to 4 bytes — the size of just one pointer! And if you play around with istream_range, it will seem to work. Check it out:

// Read a bunch of strings from a stream
std::istringstream sin{"This is his face"};

for(auto const & str : istream_range<std::string>{sin})
{
    std::cout << str << std::endl;
}

As we might expect, the above prints:

This
is
his
face

But all is not roses. Take a look at this:

std::istringstream sin{"This is his face"};
istream_range<std::string> strings{sin};

if(strings.begin() != strings.end())
    std::cout << *strings.begin() << std::endl;

This code checks to see if the range is non-empty, and if so it prints the first element of the range. What would you expect this to print? This, right? After all, that’s the first string in the stream. If you try it, this is what you’ll get:

is

Huh? That’s hardly what any reasonable person would expect. Chalk this gotcha up to a quirk of the implementation of istream_iterator. As mentioned above, when you construct one from a stream, it eagerly fetches a value out of the stream and saves it (or, most implementations do). That’s fine, unless you happen to throw that iterator away and construct a new one, which fetches a second value from the stream. That, sadly, is what the above code is doing, but it’s not obvious.

If the fatness was the first problem with std::istream_iterator, the second is that its constructor has surprising side-effects.

Lone Range-er to the Rescue!

The solution to istream_iterator‘s woes will be to replace it with istream_range. Put simply, if we’re reading strings from a stream, the string needs to live somewhere. The iterator seemed like the logical place when we were all thinking strictly in terms of iterators. But with ranges, we now have a much better place to put it: in the range object.

With the string safely tucked away in the range object, we neatly side-step the issue of fat istream iterators. The iterator only needs to hold a pointer to the range. It goes without saying that the iterator cannot outlive the range that produced it, but that’s true of all the standard containers and their iterators.

The range object also gives us a better place to put the surprising side-effect: in the range object’s constructor. By moving the side-effect out of the iterator’s constructor, it is now perfectly acceptable to construct the iterators on-demand in the begin and end member functions. We’re left with an optimally small range — it holds only a string and an istream & — and an optimally small and efficient iterator — it holds only a pointer.

Without further ado, here is the complete solution:

template< class T >
class istream_range
{
    std::istream & sin_;
    mutable T obj_;

    bool next() const
    {
        return sin_ >> obj_;
    }
public:
    // Define const_iterator and iterator together:
    using const_iterator = struct iterator
      : boost::iterator_facade<
            iterator,
            T const,
            std::input_iterator_tag
        >
    {
        iterator() : rng_{} {}
    private:
        friend class istream_range;
        friend class boost::iterator_core_access;

        explicit iterator(istream_range const & rng)
          : rng_(rng ? &rng : nullptr)
        {}

        void increment()
        {
            // Don't advance a singular iterator
            BOOST_ASSERT(rng_);
            // Fetch the next element, null out the
            // iterator if it fails
            if(!rng_->next())
                rng_ = nullptr;
        }

        bool equal(iterator that) const
        {
            return rng_ == that.rng_;
        }

        T const & dereference() const
        {
            // Don't deref a singular iterator
            BOOST_ASSERT(rng_);
            return rng_->obj_;
        }

        istream_range const *rng_;
    };

    explicit istream_range(std::istream & sin)
      : sin_(sin), obj_{}
    {
        next(); // prime the pump
    }

    iterator begin() const { return iterator{*this}; }
    iterator end() const   { return iterator{};     }

    explicit operator bool() const // any objects left?
    {
        return sin_;
    }

    bool operator!() const { return !sin_; }
};

This solution has a major advantage over std::istream_iterator even in the pre-ranges world of C++98: the iterators are as svelte and cheap to copy as a single pointer. One might go so far as to wonder how a potentially inefficient and error-prone component as istream_iterator ever made it into the standard in the first place. (But, I just mentioned “efficient” and “iostreams” in the same sentence, so how smart am I, right Andrei?)

As an added bonus, I added a cute contextual conversion to bool for testing whether the range is empty or not. That lets you write code like this:

if( auto strs = istream_range<std::string>{std::cin} )
    std::cout << *strs.begin() << std::endl;

If you don’t like the Boolean conversion trick, you can do it the old, boring way too:

istream_range<std::string> strs{std::cin};
if( strs.begin() != strs.end() )
    std::cout << *strs.begin() << std::endl;

You can call strs.begin() as many times as you like, and it has no untoward side-effects. Adapting this code to improve my getlines implementation from the previous post is a trivial exercise.

Home on the Range

In the post-ranges world, the advantages of istream_range over istream_iterator are even clearer. As I mentioned in my previous post ranges are awesome because they compose. With filters and transformers and zippers and the whole zoo of range adapters, you can do things with ranges and range algorithms that you wouldn’t dream of doing with raw iterators before.

Conclusion

So far, the ranges discussion as I’ve heard it has been framed mostly in terms of ranges’ added convenience and power. To this impressive list of advantages, we can now add efficiency. Win, win, win.

Caveat to the Boost.Range Users

Please read this if you are an avid user of Boost’s range adapters. As they are currently written, they interact poorly with the istream_range I’ve presented here. Some things will work, like this:

// read in ints, echo back the evens
auto is_even = [](int i) {return 0==i%2;};
boost::copy( istream_range<int>{std::cin}
               | boost::adaptors::filtered(is_even),
             std::ostream_iterator<int>(std::cout) );

And some things will fail, like this:

// read in ints, echo back the evens
auto is_even = [](int i) {return 0==i%2;};
auto evens = istream_range<int>{std::cin}
               | boost::adaptors::filtered(is_even);
boost::copy( evens, std::ostream_iterator<int>(std::cout) );

The problem is that the temporary istream_range<int> goes out of scope before we have a chance to iterate over it. Had we gone with an iterator_range< std::istream_iterator<int> >, it would have actually worked, but only because of a quirk of the current Boost.Range implementation. The Boost.Range adapters only work when either (A) the adapted range happens to be an lvalue, or (B) the range’s iterators can outlive their range. These less-than-ideal assumptions made sense in C++98, but not in C++11. On modern compilers, Boost.Range can and should store a copy of any adapted rvalue ranges. In my opinion, it’s time for a range library for the modern world.

45 Replies to “Input Iterators vs Input Ranges”

ld on November 7, 2013 at 5:37 am said:

Have you started studying how forward and bidirectional ranges come into life with this scheme? (If this is an upcoming post, I’ll wait patiently.) You may want to try random-access ranges first.

Reply ↓
- Eric Niebler on November 7, 2013 at 5:43 am said:
  
  It’s unlikely that there will be many iterators that benefit performance-wise from this approach besides input iterators. Forward iterators and better in all likelihood do not store any values within the iterators themselves. One counter-example I can think of would be something like boost::counting_iterator when used to count an infinite precision integer type. That would be a random-access iterator that could benefit from being implemented as a counting_range instead. But the examples are few and far between.
  
  Reply ↓
  - ld on November 7, 2013 at 6:13 am said:
    
    Consider a filtered range — if you want to make the range forward or better, you need to store a copy of the predicate in both iterators. You can take an extra step by separately putting the predicate in its own variable and use [&the_actual_predicate](E const& element) { return the_actual_predicate(element); } as the ‘shallow’, lightweight predicate. But then you can’t return the result because it cannot live separate from the actual predicate.
    
    Correct me if I’m wrong but this sounds to be the exact same situation that motivates the solution outlined in this post. (A quick glance at Boost.Iterator suggests that is how they do it.)
    
    Reply ↓
    - Eric Niebler on November 7, 2013 at 7:03 am said:
      
      By jove, you’re right! I had considered this but dismissed it because I knew that Boost.Range provided a filtered range, and I (incorrect) assumed it would do it the “right” way. instead, I see it’s implemented in terms of boost::filter_iterator, which is fat as you observe.
      
      I also noticed that Boost.Range has an istream_range, but it’s implemented in terms of std::istream_iterator, so it’s fat and inefficient, too. I’m going to try to find out why.
      
      Reply ↓
gnzlbg on November 7, 2013 at 9:07 am said:

As far as I know almost all (or all) adapted ranges in Boost.Range are implemented as a pair of Boost.Iterators (e.g. transformed produces a range that contains two boost::transform_iterators…). This is very problematic. For example, boost::for_each(zip(a,b), print) works as expected but boost::sort(zip(a,b), predicate) returns a compile-time error deep in the std library. This is a problem with boost::zip_iterator that Boost.Range just carries along (e.g. thurst::zip_iterator doesn’t have this problem). It also doesn’t play with move semantics at all.

However, even tho counting_range is also a pair of counting_iterators, both clang and gcc actually vectorize for(auto i : counting_range(0, 10)) {…} correctly. That is, they produce the same code as for (int i = 0; i < 10; ++i) {…} ! So… Benchmark! Boost.Range iterators are fat, but are they really slower than the hand-rolled code? Probably the answer is “sometimes”, e.g. one could benchmark a for loop with an if vs a filtered counting_iterator range vs a “native” filtered_range implementation.

Have you taken a look at D’s standard library? Ranges are its basic unit of abstraction (although some of their algorithms also use the concept of a position as in e.g. “the n-th range element”).

It would be great to have a similar and hopefully better library in C++1y… maybe Boost.Range 3.0?

Reply ↓
- Eric Niebler on November 7, 2013 at 9:28 am said:
  
  It’s all true what you say. Maybe you missed the “Caveat to the Boost.Range Users” addendum I added to this post within the last few hours. And regarding “Boost.Range 3.0”, I suggested just such a thing on the Boost developers list today. We’ll see where that leads. Thanks.!
  
  Reply ↓
Michael on November 8, 2013 at 11:06 am said:

This is cool. The string obj_ is owned by the range and iterator keeps reference to the range like if range was a container…

The lesson here is that iterator should not have own state, it’s just a pointer to somewhere in container.

One possible “improvement” for such range is numeric limit. If you want to read three strings at most “3” is saved as member variable inside range instead of enlarging the iterator.

Reply ↓
- Michael on November 8, 2013 at 11:35 am said:
  
  P.S: one more question: why can’t we remove “const” from next() and “mutable” from _obj?
  
  Reply ↓
  - Eric Niebler on November 8, 2013 at 2:20 pm said:
    
    Because the iterator gets a const istream_range pointer in begin, which is const.
    
    Reply ↓
Gregory on November 8, 2013 at 2:27 pm said:

I second the idea of having ranges as basic abstraction (as in std.range and std.algorithm of D). The problem with fat iterators is not that they are fat, but that there’re two of them. IMHO having fat range that algorithm operates on directly is better then having two iterators that store additional data sometimes by reference and sometimes by value.

Also, I think it’s beneficial to make a distinction between range and container. Container may be expensive to copy – range is not. It should be possible to make range out of container (in that case range would pair of iterators), but range does not nessesary store iterators at all (but it may store another range by value).

Of course going such way, would mean reimplemeting algorithms in terms of ranges, and having iterator-based algorithms to call range-based ones. But maybe such direction worth pursuing anyway…

Reply ↓
- Eric Niebler on November 9, 2013 at 1:59 am said:
  
  Personally, I’m not convinced that ranges makes a better primitive than iterators. In fact, I see problems with doing that. The concept of position is fundamental. There is a difference between a position in a sequence, and a sequence of one element. I have yet to see a C++ proposal in which ranges are the primitives which doesn’t sacrifice something from the status quo. But I agree with you that a range doesn’t need to store a pair of iterators. The range I presented here doesn’t.
  
  My gut tells me that there should be a way to define a range such that the iterators can be automatically generated. This is a direction I’d like to pursue.
  
  Reply ↓
  - Rein Halbersma on November 10, 2013 at 3:13 am said:
    
    Here’s a little challenge to see if iterators can do everything that D ranges can: define a C++ iterator-based range that iterates over all set 1-bits of a 64-bit integer, and that can be used in C++ range-based for statement, yielding identical code as the low-level C bit-twiddling, or the high-level D-style range iteration. See below for an implementation of C and D style bit iteration:
    
    http://coliru.stacked-crooked.com/a/1d85b03259293648
    
    If the Standard would not mandate the range-based for implementation to use begin() / end() iterators of the same type, but rather the D primitives empty(), pop_front() and front(), then native performance could be achieved on exotic data structures. An extra container requirement for providing these 3 functions would be all it takes (and empty() would only have to be defined for C-arrays / initializer-lists).
    
    Reply ↓
    - Casey Carter on January 27, 2014 at 11:40 pm said:
      
      class CPPBitRange { public: explicit CPPBitRange(uint64_t& b): b_{b} {} struct iterator : boost::iterator_facade<iterator, int, std::input_iterator_tag, int> { iterator(CPPBitRange* rng) : rng{rng} {} int dereference() const { return __builtin_ctzll(rng->b_); } void increment() { rng->b_ &= rng->b_ - 1; } bool equal(const iterator& other) const { return rng ? (other.rng || rng->b_ == 0) : (!other.rng || other.rng->b_ == 0); } CPPBitRange* rng; }; iterator begin() { return this; } iterator end() { return nullptr; } private: uint64_t& b_; };
      
      Live code at coliru. Note that the C-style, D-style, and CPP-style versions produce identical machine instructions. It is not lost on me that the C++-style implementation takes twice as many lines as D-style.
      
      Reply ↓
      - Casey Carter on January 27, 2014 at 11:59 pm said:
        
        Here’s a better version that uses a CRTP base class to turn the D-style range into a C++-style input range instead of writing another class.
      - Eric Niebler on January 28, 2014 at 12:04 am said:
        
        Regarding your CRTP base class that turns a D-style range into a C++-style input range, that’s exactly the point! It shouldn’t take a mountain of code to implement a custom range. We just need better supporting libraries.
      - Rein Halbersma on January 28, 2014 at 6:41 am said:
        
        This is really cool! One nitpick: your equal() inside iterator doesn’t actually compare the range pointers. This means that e.g. the begin() iterators of two different ranges compare equal. Instead, you could do
        
        return (rng == other.rng) || (!other.rng && rng->b_ == 0) || (!rng && other.rng->b_ == 0);
        
        This works because the first OR-clause takes care of two equal range pointers and also of two nullptr values. This means that it is safe to dereference the second range pointer inside the remaining two OR-clauses once the first range pointer is a nullptr. It yields the same assembly as your example.
      - Eric Niebler on January 28, 2014 at 10:43 am said:
        
        It’s a violation of preconditions to compare iterators from different ranges. It’s probably better to assert than return false and continue on as if nothing in the world is wrong.
      - Rein Halbersma on January 28, 2014 at 6:51 am said:
        
        The DInputRange adapter is even cooler! Same nitpick for the equal() inside iterator:
        
        return (rng == other.rng) || (!other.rng && rng->empty()) || (!rng && other.rng->empty());
        
        actually compares if iterators belong to the same range. Again, identical assembly. I’m really impressed that in the context of a range-for, the compiler can prove that begin() is constructed from a this-pointer that can never compare equal to the nullptr from end().
      - Rein Halbersma on January 28, 2014 at 11:04 am said:
        
        Ah, bad mistake to forget about the UB of inter-range iterator comparisons! Withdrawn 🙂
        
        Btw, for your istream_range, you could also apply Casey’s nullptr logic inside equal() instead of checking for it in the constructor and nulling out inside increment(). What would be your preferred style?
      - Eric Niebler on January 28, 2014 at 11:10 am said:
        
        Six-and-one-half, really. Either way would be fine, AFAICT.
  - Anders Sjögren on November 12, 2013 at 2:32 am said:
    
    In C#/.Net, they differentiate between IEnumerator (most importantly supporting T Current(), and bool MoveNext() ), and IEnumerable which supports IEnumerator GetEnumerator().
    
    Thus, the Enumerable is the collection and the Enumerator is the current enumeration (with position) in the Enumerable.
    
    This could then be made richer with random-access Move(int offset), etc, but this is in my experience so much nicer to work with than the raw iterators. And if with compile-time generics it will make the code faster as well when filtering data, etc, using standard methods, since you don’t actually have to place the filtered data in a new container.
    
    The current non-lazy way of the std-algorithms is just not cutting it in 2013, IMHO.
    
    Reply ↓
Gregory on November 9, 2013 at 5:03 am said:

Probably you’re right that D does not have an ideal solution. It conflates two notions: a view of container and a device that traverses container.
And probably that’s where .Net library does the right thing.
It has notions of Enumerable – a container or some view of container and Enumerator – a device for iteration over Enumerable.
Enumerable does not represent position in any way, it’s just a sequence of elements, possibly lazily generated sequence.
Enumerator is essentially a position inside that sequence, it’s very similar to input/forward-only iterator except that it knows where the sequence ends.
Note that your input range = Enumerator, and that it has some correspondence to a position inside stream.

In my opinion usage of the word ‘range’ should be very carefull, because it may mean very different things. Instead we may use ‘view’ – non-owning reference to container and ‘enumerator’ – smart iterator that knows where to stop.

Most of the time we need enumerator only when we’re inside an algorithm, but there are places when returning enumerator from an algortithm is the right thing to do, e.g. find. D version of find is strange.

So, in my ideal world algorithms should operate on views, in order to traverse views algorithms should create enumerators and return those enumerators when needed. Enumerators should store views by value. Also in my ideal world getting view of container should be the operation supported directly by container. View should not be invalidated when elements are added/removed to/from container or container is moved.

And indeed there might be a way to automatically generate begin/end iterators from enumerators, but if those iteators are slim it should be done very carefully. E.g. that version of find would not work

It find (Cont& c, T val)
{
Enum e = get_enumerator(c);
return find(begin(e), end(e), val);
}

And that variant would

Enum find (Cont& c, T val)
{
Enum e = get_enumerator(c);
find(begin(e), end(e), val);
return e;
}

Reply ↓
- Michael on November 9, 2013 at 6:02 am said:
  
  Enumerators and iterators have same syntax but different semantics, so using the word enumerator rather than iterator sounds logical. From readability point of view I’d rather have std::istream_enumerator than std::istream_iterator
  
  Reply ↓
Rein Halbersma on November 10, 2013 at 5:08 am said:

Your istream_range solution is pretty awesome. I tested the generated assembly on gcc.godbolt.org and compared it to plain old while() looping and D-style iteration http://coliru.stacked-crooked.com/a/d22192e264e59715

Incredibly, the conditional nulling-out inside your iterator increment() appears to be optimized away altogether. It’s very surprising that the compiler can prove that a nullptr value will be compared to the other nullptr value lurking inside the end() iterator. Essentially the same assembly is generated for all 3 approaches.

Nevertheless, the D input range approach that I linked to above is much simpler to implement and does not even contain any branching or nulling-out to begin with, so no heroic optimizations are required there. A change in the C++ Standard range-based for requirements would still be very nice, wouldn’t you agree?

Reply ↓
- Eric Niebler on November 11, 2013 at 8:38 am said:
  
  That’s amazing! Thanks for doing the measurement. I’m a little surprised at the result, to be honest.
  
  I have sitting on my hard-drive a range facade class that implements input iterators on top of something like a D range. That is, you only have to implement a D-style range, and you get the iterators for free. This, I think, is a promising way forward. Given the amount of code that uses the iterator interface, I don’t think changing the paradigm now is going to happen, nor am I convinced that it isnecessary.
  
  Reply ↓
  - gnzlbg on November 11, 2013 at 8:57 am said:
    
    I have sitting on my hard-drive a range facade class that implements
    input iterators on top of something like a D range.
    
    I hope we’ll see another nice blog post soon!
    
    Reply ↓
Rein Halbersma on November 11, 2013 at 9:09 am said:

Automatic iterator generation from D-style range would be cool, so I am very interested in your approach to the bit iteration challenge I posed earlier! It can be done using proxy iterators/references (along the lines of Howard Hinnant’s revamped vector<bool>). But optimizing all that cruft away is even harder than for input stream iterators.

Compare again the input stream iteration: having an end() iterator wrapping nullptr and another iterator that conditionally sets its own range pointer to nullptr upon a read failure; it all is very indirect and puts a heavy burden on the compiler.

Direct ideas should be implementable directly in C++. IMHO, the termination condition in a ranged-based for loop is over-specified to use an indirect approach. A range’s empty() is all that should be required. Similarly, the Standard Containers requirements also do not mandate that empty() is implemented as begin() == end() (even though it is required to have that semantics).

Reply ↓
- Eric Niebler on November 11, 2013 at 9:22 am said:
  
  I’m also interested in trying my hand at your challenge! I don’t know if we should infer too much from it, though. It’s seems like a niche use case (and your “optimal” code is non-portable, I’ll point out).
  
  Compare again the input stream iteration: having an end() iterator
  wrapping nullptr and another iterator that conditionally sets its own
  range pointer to nullptr upon a read failure; it all is very indirect
  and puts a heavy burden on the compiler.
  
  I agree, but there’s another design consideration to weigh: the ability to generically express an algorithm in terms of one abstraction. We can’t expect people to implement their algorithms twice, and if we decide the range abstraction isn’t powerful enough (still an open question), then we have to go with the one that works.
  
  Reply ↓
Dave Abrahams on November 12, 2013 at 12:05 am said:

This is a pretty old topic, if memory serves me right…Did you have a look through this discussion, and in particular, Giovanni’s suggestion there?

Reply ↓
- Eric Niebler on November 12, 2013 at 7:04 am said:
  
  Thanks for the link. I’m still making my way through the thread. It looks like two things are being discussed there: “fat” iterators and range lifetime. I think the lifetime discussion is out of date. We have rvalue references now. Range adapters can move rvalue ranges into themselves, eliminating the lifetime issue. Once that problem is solved, the issue of fat iterators can safely be solved by having the iterators point to the ranges and storing common data there.
  
  I don’t much care for the solution of breaking iterators into “common” and “unique” parts. It makes you do one of two things: (1) reconstitute the parts back into a “fat” iterator, which is what we’re trying to avoid, or (2) have the iterator store a pointer to the common bits stored elsewhere. “Elsewhere” is clearly in the range object, so we’re back to my solution above.
  
  I see Arno was part of that discussion. Have you read his N3752 (“Indexed-Based Ranges”)? Arno and I discussed this issue at length at Meeting C++ last week. I’m busy coding up my range design, and I’ve asked Arno to share his code since he’s clearly saving more space in his iterators than I am.
  
  Reply ↓
- gnzlbg on November 17, 2013 at 7:22 am said:
  
  I like the point about using expression templates to “merge” multiple adaptors chained together into a single one. Its a nice thought.
  
  Reply ↓
  - Eric Niebler on November 17, 2013 at 12:31 pm said:
    
    I don’t think expression templates are necessary. And with auto, expression templates are actually pretty dangerous if they hold their child nodes by references. To make them safe, you have to move rvalues into the nodes, and if you’re doing that, you may as well move them directly into range adaptors and forget the expression templates completely.
    
    Reply ↓
    - gnzlbg on November 18, 2013 at 4:12 am said:
      
      You mean that a range adaptor constructed from an rvalue range (e.g. another adaptor) should store a copy of the range in itself (using move)?
      
      Reply ↓
      - Eric Niebler on November 18, 2013 at 8:29 am said:
        
        Yes.
Matthias Vallentin on November 16, 2013 at 11:53 am said:

I like this style of factoring state in ranges and iterators. The design you propose requires that users implement their own ranges. Have you thought about how to facilitate the implementation process for users? For example, do you picture a range_facade similar to iterator_facade to coerce users correctly providing begin, end, operator bool (or whatever a full range interface requires). It would be nice if users could just focus on implementing range traversal without needing to think about both iterators and ranges. Perhaps a range_facade obviates the need to implement the used iterators entirely.

Reply ↓
- Eric Niebler on November 17, 2013 at 12:25 pm said:
  
  Oh, absolutely! I have a range facade class here, but I’m not happy with it yet. I’ll post it (and blog about it?) when it’s ready.
  
  Reply ↓
Trass3r on February 2, 2014 at 1:49 pm said:

Interestingly enough I came up with something similar to the previously posted CRTP approach just yesterday. I didn’t see any need for that extra layer of boost::iterator_facade but of course there may be some cases I didn’t consider.
Here’s some crude code that sketches the idea:
https://gist.github.com/Trass3r/8774345

Reply ↓
Ahmed Charles on February 5, 2014 at 11:13 am said:

One interesting thing about this code that someone hasn’t mentioned before is that the mutable inside the range class isn’t thread safe, like C++11 implies it should be. Granted, this is actually fine, since the next function is the only one that modifies it and while next is const, it’s only called by non-const functions (the constructor and the increment of the iterator). So, it allows you to have a const range while iterators can iterate over it.

However, what does it actually mean to have a const input range? If you call begin() and increment that iterator, calling begin() again results in something different, so I’d probably suggest that begin and end shouldn’t be const and input ranges shouldn’t have const_iterator defined, since they can’t actually provide that guarantee.

Reply ↓
- Eric Niebler on February 5, 2014 at 8:31 pm said:
  
  Hi Ahmet! That’s an interesting point. In a sense, the problem already exists today. Consider what happens if you create two std::istream_iterators from the same std::istream (std::cin, say), and then iterate in parallel from two different threads. Boom. It’s a race condition on the stream. My range code above introduces a new race condition on a data member of the range, but I don’t think it’s any worse than the race condition that already exists. The fact that it exists at all is kind of annoying, though. If you’re hard-core paranoid, you can protect all reading and writing of the internal variable with a mutex. I think a logical extension would be that either (a) iterator dereference has to return by value (yuk!), or (b) the state would have to move back into the iterator (double yuk!).
  
  The problem only happens when you iterate over the same istream_range from different threads. I think the real answer is simply: don’t do that. 🙂
  
  what does it actually mean to have a const input range?
  
  This is a provocative question. Let me chew on that a bit, ok?
  
  Reply ↓
  - Ahmed Charles on February 6, 2014 at 8:25 pm said:
    
    I agree that there is a possible race condition with istream_iterator, but it isn’t const and it can’t be created from a const stream, which means that there isn’t an implied guarantee that there isn’t a race condition.
    
    Having a const range (even if created from a mutable stream) and const_iterator (even if it’s a typedef) on the range, implies thread safety, which is new with this design.
    
    As for the provocative question, take as much time as you’d like.
    
    Reply ↓
    - Ahmed Charles on April 29, 2014 at 10:02 pm said:
      
      I’m finally reading your group of four articles on ranges and was wondering if you ever came to some understanding of what it means to have a const input range? Or well, more generally, how const impacts ranges and their iterators?
      
      Reply ↓
      - Eric Niebler on April 29, 2014 at 10:19 pm said:
        
        Funny, I was just talking it over with Andrew Sutton this morning. I think I just need to make istream_range‘s begin and end member functions non-const, and that’s the end of it. I need to do likewise for the flatten_view, used to implement view::for_each, so I think there is just some class of ranges which maintain iteration state in the range object itself, and those ranges can’t have const begin/end members.
      - Ahmed Charles on April 30, 2014 at 2:09 am said:
        
        I agree with that solution.
Yongwei Wu on October 20, 2016 at 6:05 am said:

Hi Eric, I think there may be a simple reason for the traditional style of iterators—they are safe to be used like *it++, though it is not normally recommended.

Reply ↓
- Eric Niebler on October 20, 2016 at 10:18 am said:
  
  I don’t disagree. Iterators are essential, and I’m not trying to get rid of them. And the reason for the iterator syntax is so that raw pointers can model the Iterator concept.
  
  Reply ↓