Iterators++, Part 3

This is the fourth and final post in a series about proxy iterators, the limitations of the existing STL iterator concept hierarchy, and what could be done about it. The first three posts describe the problems of proxy iterators, the way to swap and move their elements, and how to rigorously define what an Iterator is.

This time around I’ll be focusing on the final problem: how to properly constrain the higher-order algorithms so that they work with proxy iterators.

A Unique Algorithm

In this post, I’ll be looking at one algorithm in particular and how it interacts with proxy iterators: unique_copy. Here is its prototype:

template <class InIter, class OutIter, class Fn>
OutIter unique_copy(InIter first, InIter last,
                    OutIter result, Fn binary_pred);

This algorithm copies elements from one range to another, skipping adjacent elements that are equal, using a predicate for the comparison.

Consider the following invocation:

std::stringstream sin{"1 1 2 3 3 3 4 5"};
unique_copy(
  std::istream_iterator<int>{sin},
  std::istream_iterator<int>{},
  std::ostream_iterator<int>{std::cout, " "},
  std::equal_to<int>{} );

This reads a bunch of ints from sin and writes the unique ones to cout. Simple, right? This code prints:

1 2 3 4 5

Think for a minute how you would implement unique_copy. First you read an int from the stream. Then you write it out to the other stream. Then you read another int. You want to compare it to the last one. Ah! You need to save the last element locally so that you can do the comparisons. Interesting.

When I really want to understand how some part of the STL works, I check out how the feature is implemented in ye olde SGI STL. This codebase is so old, it may have first been written on parchment and compiled by monks. But it’s the cleanest and most straightforward STL implementation I know, and I recommend reading it through. Here, modulo some edits for readability, is the relevant part of unique_copy:

// Copyright (c) 1994
// Hewlett-Packard Company
// Copyright (c) 1996
// Silicon Graphics Computer Systems, Inc.
template <class InIter, class OutIter, class Fn,
          class _Tp>
OutIter
__unique_copy(InIter first, InIter last,
              OutIter result,
              Fn binary_pred, _Tp*) {
  _Tp value = *first;
  *result = value;
  while (++first != last)
    if (!binary_pred(value, *first)) {
      value = *first;
      *++result = value;
    }
  return ++result;
}

(The calling code ensures that first != last, which explains why this code skips that check. And the strange _Tp* argument is so that the iterator’s value type can be deduced; the monks couldn’t compile traits classes.) Note the value local variable, and especially note the comparison, where the code passes a value and a reference to binary_pred. Keep that in mind because it’s important!

The Plot Thickens

You probably know more about unique_copy now than you ever cared to. Why do I bring it up? Because it’s super problematic when used with proxy iterators. Think about what happens when you try to pass vector<bool>::iterator to the above __unique_copy function:

std::vector<bool> vb{true, true, false, false};
using R = std::vector<bool>::reference;
__unique_copy(
  vb.begin(), vb.end(),
  std::ostream_iterator<bool>{std::cout, " "},
  [](R b1, R b2) { return b1 == b2; }, (bool*)0 );

This should write a “true” and a “false” to cout, but it doesn’t compile. Why? The lambda is expecting to be passed two objects of vector<bool>‘s proxy reference type, but remember how __unique_copy calls the predicate:

if (!binary_pred(value, *first)) { /*...*/

That’s a bool& and a vector<bool>::reference. Ouch!

They’re just bools, and bools are cheap to copy, so take them by value. Problem solved. Well, sure, but what if they weren’t bools? What if we proxied a sequence of things that are expensive to copy? Now the problem is harder.

So for lack of anything better (and pretending that bools are expensive to copy, bear with me), you write the lambda like this:

[](bool& b1, R b2) { return b1 == b2; }

Yuk. Now you port this code to another STL that happens to call the predicate with reversed arguments and the code breaks again. 🙁

My point is this: once we introduce proxy iterators into the mix, it becomes non-obvious how to define predicates for use with the algorithms. Sometimes the algorithms call the predicates with references, sometimes with values, and sometimes — like unique_copy — with a mix of both. Algorithms like sort first call the predicate one way, and then later call it another way. Vive la différence!

A Common Fix

This problem has a very simple solution in C++14: a generic lambda. We can write the above code simply, portably, and optimally as follows:

std::vector<bool> vb{true, true, false, false};
std::unique_copy(
  vb.begin(), vb.end(),
  std::ostream_iterator<bool>{std::cout, " "},
  [](auto&& b1, auto&& b2) { return b1 == b2; } );

No matter what unique_copy throws at this predicate, it will accommodate it with grace and style.

But still. Polymorphic function objects feel like a big hammer. Some designs require monomorphic functions, like std::function or virtuals, or maybe even a function pointer if you have to interface with C. My point is, it feels wrong for the STL to require the use of a polymorphic function for correctness.

To restate the problem, we don’t know how to write a monomorphic predicate for unique_copy when our sequence is proxied because value_type& may not convert to reference, and reference may not convert to value_type&. If only there were some other type, some other reference-like type, they could both convert to…

But there is! If you read my last post, you know about common_reference, a trait that computes a reference-like type (possibly a proxy) to which two other references can bind (or convert). In order for a proxy iterator to model the Iterator concept, I required that an iterator’s reference type and its value_type& must share a common reference. At the time, I insinuated that the only use for such a type is to satisfy the concept-checking machinery. But there’s another use for it: the common reference is the type we could use to define our monomorphic predicate.

I can imagine a future STL providing the following trait:

// An iterator's common reference type:
template <InputIterator I>
using iterator_common_reference_t =
  common_reference_t<
    typename iterator_traits<I>::value_type &,
    typename iterator_traits<I>::reference>;

We could use that trait to write the predicate as follows:

using I = vector<bool>::iterator;
using C = iterator_common_reference_t<I>;
auto binary_pred = [](C r1, C r2) {
  return r1 == r2;
};

That’s certainly a fair bit of hoop-jumping just to define a predicate. But it’s not some new complexity that I’m introducing. unique_copy and vector<bool> have been there since 1998. I’m just trying to make them play nice.

And these hoops almost never need to be jumped. You’ll only need to use the common reference type when all of the following are true: (a) you are dealing with a proxied sequence (or are writing generic code that could deal with proxied sequences), (b) taking the arguments by value is undesirable, and (c) using a polymorphic function is impossible or impractical for some reason. I wouldn’t think that’s very often.

Algorithm Constraints

So that’s how things look from the perspective of the end user. How do they look from the other side, from the perspective of the algorithm author? In particular, how should unique_copy look once we use Concepts Lite to constrain the algorithm?

The Palo Alto TR takes a stab at it. Here is how it constrains unique_copy:

template <InputIterator I, WeaklyIncrementable Out,
          Semiregular R>
requires Relation<R, ValueType<I>, ValueType<I>> &&
         IndirectlyCopyable<I, Out>
Out unique_copy(I first, I last, Out result, R comp);

There’s a lot going on there, but the relevant part is Relation<R, ValueType<I>, ValueType<I>>. In other words, the type R must be an equivalence relation that accepts arguments of the range’s value type. For all the reasons we’ve discussed, that doesn’t work when dealing with a proxied range like vector<bool>.

So what should the constraint be? Maybe it should be Relation<R, ValueType<I>, Reference<I>>? But no: unique_copy doesn’t always copy a value into a local; it does so only when neither the input nor the output iterator models ForwardIterator. So sometimes unique_copy calls the predicate like pred(*i,*j) and sometimes like pred(value, *i). The constraint has to be general enough to accommodate that.

Maybe it could also use the iterator’s common reference type? What if we constrained unique_copy like this:

template <InputIterator I, WeaklyIncrementable Out,
          Semiregular R>
requires Relation<R, CommonReferenceType<I>,
                     CommonReferenceType<I>> &&
         IndirectlyCopyable<I, Out>
Out unique_copy(I first, I last, Out result, R comp);

This constraint makes a promise to callers: “I will only pass objects of type CommonReferenceType<I> to the predicate.” But that’s a lie. It’s not how unique_copy is actually implemented. We could change the implementation to fulfill this promise by casting the arguments before passing them to the predicate, but that’s ugly and potentially inefficient.

Really, I think we have to check that the predicate is callable with all the possible combinations of values and references. That sucks, but I don’t see a better option. With some pruning, these are the checks that I think matter enough to be required:

Relation<R, ValueType<I>, ValueType<I>> &&
Relation<R, ValueType<I>, ReferenceType<I>> &&
Relation<R, ReferenceType<I>, ValueType<I>> &&
Relation<R, ReferenceType<I>, ReferenceType<I>> &&
Relation<R, CommonReferenceType<I>, CommonReferenceType<I>>

As an implementer, I don’t want to write all that, and our users don’t want to read it, so we can bundle it up nice and neat:

IndirectRelation<R, I, I>

That’s easier on the eyes and on the brain.

Interesting Indirect Invokable Implications

In short, I think that everywhere the algorithms take a function, predicate, or relation, we should add a constraint like IndirectFunction, IndirectPredicate, or IndirectRelation. These concepts will require that the function is callable with a cross-product of values and references, with an extra requirement that the function is also callable with arguments of the common reference type.

This might seem very strict, but for non-proxy iterators, it adds exactly zero new requirements. And even for proxy iterators, it’s only saying in code the things that necessarily had to be true anyway. Rather than making things harder, the common reference type makes them easier: if your predicate takes arguments by the common reference type, all the checks succeed, guaranteed.

It’s possible that the common reference type is inefficient to use. For instance, the common reference type between bool& and vector<bool>::reference is likely to be a variant type. In that case, you might not want your predicate to take arguments by the common reference. Instead, you’d want to use a generic lambda, or define a function object with the necessary overloads. The concept checking will tell you if you forgot any overloads, ensuring that your code is correct and portable.

Summary

That’s the theory. I implemented all this in my Range-v3 library. Now I can sort a zip range of unique_ptrs. So cool.

Here, in short, are the changes we would need to make the STL fully support proxy iterators:

  1. The algorithms need to use iter_swap consistently whenever elements need to be swapped. iter_swap should be a documented customization point.
  2. We need an iter_move customization point so that elements can be moved out of and back into sequence. This gives iterators a new rvalue_reference associated type.
  3. We need a new common_reference trait that, like common_type, can be specialized on user-defined types.
  4. All iterators need to guarantee that their value_type and reference associated types share a common reference. Likewise for value_type/rvalue_reference, and for reference/rvalue_reference.
  5. We need IndirectFunction, IndirectPredicate, and IndirectRelation concepts as described above. The higher-order algorithms should be constrained with them.

From the end users’ perspective, not a lot changes. All existing code works as it did before, and all iterators that are valid today continue being valid in the future. Some proxy iterators, like vector<bool>‘s, would need some small changes to model the Iterator concept, but afterward those iterators are on equal footing with all the other iterators for the first time ever. Code that deals with proxy sequences might need to use common_reference when defining predicates, or it might need to use a generic lambda instead.

So that’s it. To the best of my knowledge, this is the first comprehensive solution to the proxy iterator problem, a problem we’ve lived with from day one, and which only promises to get worse with the introduction of range views. There’s some complexity for sure, but the complexity seems to be necessary and inherent. And honestly I don’t think it’s all that bad.

Future Directions

I’m unsure where this goes from here. I plan to sit on it for a bit to see if any better solutions come along. There’s been some murmuring about a possible language solution for proxy references, but there is inherent complexity to proxy iterators, and it’s not clear to me at this point how a language solution would help.

I’m currently working on what I believe will be the first draft of a Ranges TS. That paper will not address the proxy iterator problem. I could imagine writing a future paper that proposes the changes I suggest above. Before I do that, I would probably try to start a discussion on the committee mailing lists to feel people out. If any committee members are reading this, feel free to comment below.

Thanks for following along, and thanks for all your encouraging and thought-provoking comments. Things in the C++ world are moving fast these days. It’s tough to keep up with it all. I feel blessed that you all have invested so much time exploring these issues with me. <3

As always, you can find all code described here in my range-v3 repo on github.

26 thoughts on “Iterators++, Part 3”

  1. So the alternatives to common_reference would be
    – require predicates et al to supply all the other overloads (ie all combinations of value X reference)
    or
    – polymorphic functions/lambdas
    or
    – algorithms always make temporaries/casts (ie functions always take values)
    or
    – algorithms check which overloads are available and make temporaries if necessary (ie so I can choose to only provide value_type predicates if I want, and take the possible performance hit)

    Did I miss anything?

    P.S. We (obviously) already have these syntactic constraints. cppreference.com, for example, describes the predicate for std::sort basically as “it must compile”:

    comparison function object (i.e. an object that satisfies the requirements of Compare) which returns true if the first argument is less than (i.e. is ordered before) the second.

    The signature of the comparison function should be equivalent to the following:

    bool cmp(const Type1 &a, const Type2 &b);

    The signature does not need to have const &, but the function object must not modify the objects passed to it.
    The types Type1 and Type2 must be such that an object of type RandomIt can be dereferenced and then implicitly converted to both of them.

    So it is nice that you have built it into a concept – and more importantly – delved into the concept to reach some understanding. I’m still not sure whether common_reference is worthwhile if everyone just avoids it by implementing the overloads or picking something that happens to compile.

    • So the alternatives to common_reference would be
      – require predicates et al to supply all the other overloads (ie all combinations of value X reference)

      Right. I think we also want to test callability with all arguments of the common reference type, but that’s worthy of a discussion.

      or
      – polymorphic functions/lambdas

      Right.

      or
      – algorithms always make temporaries/casts (ie functions always take values)

      They can’t always take values. for_each is allowed to mutate elements through the function, so it has to be allowed to take args by non-const ref. Also, the algorithms must work with move-only types.

      or
      – algorithms check which overloads are available and make temporaries if necessary (ie so I can choose to only provide value_type predicates if I want, and take the possible performance hit)

      This seems like a bad idea to me. Sometimes functions appear to exist if you test for them with decltype(), but which fail to compile when you actually try to call them. See std::equal_to::operator(), for instance.

      P.S. We (obviously) already have these syntactic constraints. cppreference.com, for example, describes the predicate for std::sort basically as “it must compile” […]

      Ha! I think we can give things a bit more semantic weight than “it must compile”. 😉

      So it is nice that you have built it into a concept – and more importantly – delved into the concept to reach some understanding. I’m still not sure whether common_reference is worthwhile if everyone just avoids it by implementing the overloads or picking something that happens to compile.

      Again, I think “it must compile” is aiming too low. I believe common_reference is essential for giving the Iterator concept some mathematical legitimacy. If there is no substitutability between an iterator’s value and reference types, then what is an iterator, anyway? I would be over the moon if common_type worked the way I need it to, but it doesn’t; hence, common_reference.

  2. The example of using common_reference_t on vector<bool> does the wrong thing:

    auto binary_pred = [](C r1, C r2) {
      return r1 == r2;
    };

    If C is of type variant<bool, bit_reference>, this would check whether both r1 and r2 are initialized with the same alternative (bool& or bit_reference) and then compare their values. The thing we want to check is whether they refer to the same value:

    auto binary_pred = [](C r1, C r2) {
      struct Visitor : boost::static_visitor<bool> {
        bool operator()(bool& r1, bool& r2) const { return r1 == r2; }
        bool operator()(bool& r1, bit_reference r2) const { return r1 == r2; }
        bool operator()(bit_reference r1, bit_reference r2) const { return r1 == r2; }
        bool operator()(bit_reference r1, bool& r2) const { return r1 == r2; }
      };
      return boost::apply_visitor(Visitor(), r1, r2);
    };

    Also I think the requirements for Relation are too strict:

    Relation<R, ValueType, ValueType> &&
    Relation<R, ValueType, ReferenceType> &&
    Relation<R, ReferenceType, ValueType> &&
    Relation<R, ReferenceType, ReferenceType> &&
    Relation<R, CommonReferenceType, CommonReferenceType>

    The CommonReferenceType invocation imposes a requirement to provide a variant overload for vector<bool>, which is not really necessary.

    • The common reference between two types must behave like a reference. When I say that it must be a variant, you are taking me too literally. I mean it must be a reference-like proxy that is probably implemented in terms of a variant. It must still behave like a reference, though.

  3. First, nice post! Second: I was expecting the IndirectRelation to be implemented using the cartesian product of ValueType, ReferenceType, and CommonReferenceType. What about the following cases:

    Relation<R, CommonReferenceType, ValueType> &&
    Relation<R, ValueType, CommonReferenceType> &&
    Relation<R, ReferenceType, CommonReferenceType> &&
    Relation<R, CommonReferenceType, ReferenceType>

    Why aren’t they needed?

    Or are they already covered by implicit conversions?

    • I think the algorithms will never pass CommonReferenceType, as an explicit cast would be needed to create it. Only combinations of ValueType& (bool&) and ReferenceType (bit_reference) will be passed by the algorithms.

      And adding a CommonReferenceType invocation will force the implementer to provide an overload taking variant<bool&, bit_reference> (which is tricky, as my original post showed) even if he provides all four combinations of bool& and bit_reference. CommonReferenceType may be used to implement one overload that fulfills all the requirements (because of implicit conversion).

    • I can’t imagine an algorithm needing those overloads. If one argument is converted to the common reference type, the other argument could also be cast before the call. I left those out to keep the number of overloads people need to implement reasonable.

  4. Introducing requirements based on this concept of common_reference, which have nothing to do with actual program execution, seems very much outside the spirit of C++, or any programming language I’m familiar with for that matter.

    Quite simply: if the efficient implementation of something requires a common_reference type for some reason, then naturally the user will have to implement it. But it is totally unreasonable to require the user implement something that doesn’t get executed. C++ is a programming language, not a theorem language.

    As a side note, if the predicate is polymorphic, then having the IndirectRelation requirement check all 4 overloads (assuming we leave out the common_reference ones) means that the compiler must instantiate 4 overloads even though in many cases the algorithm only requires 1. While forcing a bit of extra work on the compiler is much more palatable than forcing extra work on the user, Concepts really shouldn’t be slowing things down. C++ is slow enough to compile as is. Potentially this could be worked around by having a syntax for specifying “diagnostic only” requirements that aren’t used for overload resolution and are only checked if instantiating the function leads to an error: if the compiler then determines that one of the “diagnostic only” requirements isn’t met, that failure will be shown in place of the actual template instantiation error that occurred.

  5. In our code, we have the proxy problem outside of iterators and containers, so I think dealing with it in the context of iterators is too restrictive:

    Say you have a

    rectangle type
    with members
    T left, top, right, bottom

    And you have a point type
    with members
    T x, y

    Isn’t it natural to be able to access any corner of the rectangle as a point<T&>, a proxy for the respective coordinates of the rectangle?

    The same problems you address with iter_swap/_move now appear in a non-iterator context.

    We approached the problem by having a custom decay, basically calculating the value_type of point<T&> as point (implemented by applying decay recursively, you get it).

    A rvalue_reference could be determined similarly.

    An iterator has a reference type, decltype(*it). But instead of making the iterator also have an rvalue_reference and a value_type, I want the reference type to be an intermediary. Regardless what the iterator is, if the reference type is the same, the rvalue_reference and value_type will be, too, no?

    • I think you have a valid point. There are definitely uses of proxies in generic code that don’t involve iterators.

      Arno writes:

      An iterator has a reference type, decltype(*it). But instead of making the iterator also have an rvalue_reference and a value_type, I want the reference type to be an intermediary. Regardless what the iterator is, if the reference type is the same, the rvalue_reference and value_type will be, too, no?

      I’m not sure about this. How would you handle a type that is an lvalue reference to a proxy? If a proxy is a logical reference type, then do you do “reference collapsing,” such that a reference_wrapper<int> & is treated as an int &? That works in some cases and hurts in others. For instance, what is the value type of a vector<reference_wrapper<int>>? It should be a reference_wrapper<int> IMO, not an int. Now, what happens if you use view::transform on it with an identity function like [](auto t){return t;}. The value type should still be reference_wrapper<int>, IMO.

      Replace “reference_wrapper” above with any arbitrary proxy-like type, and you see the general difficulty.

      But there is something to what you say. I really want the ability to generically manipulate a reference-like type; or rather, to manipulate the referred to type without caring whether it’s a true reference or a proxy reference. Think of something like Haskell’s fmap that lets me reach inside a reference type, do some type manipulation, and compute a new reference-like type, or strip the reference entirely. Consider that in my discussion above, there is no general way that common_type can be implemented in terms of common_reference. To me, that clearly indicates that I haven’t solved the root problem quite yet.

      I think one or two carefully chosen meta-abstractions would solve the problem, but I haven’t given it thought yet.

      • As you say, the same type sometimes serves as a proxy and sometimes as a value. But certain functions (such as iterator::operator* or my rect::topright() ) can be said to return a reference-like thing, which, if you want to store it by-value, should be “decayed”. Clearly, then, decaying is non-recursive, it only peels away one layer. If the type is already a reference, it is implemented as std::decay. If it is a (proxy) value, it must be implemented custom for that proxy type.

        Then we still have the problem of ranges that return by value. Clearly, since the same type can be a proxy and a value, as already stated above, there is no hope of automatically detecting this value-ness from the type of the expression *it.

        The same problem occurs again internally to a transform_adaptor. Is the transformation function really returning a value, or is it returning a proxy?

        In each case, only a function-dependent trait in the vein of std::result_of (iterator::operator* being a function, too) can tell. It should probably default to treating all values as values, but can be customized if a proxy is intended. References are always references.

        This is all uncomfortably complicated, but I am afraid this complexity is inherent to the proxy problem.

        Arno

        To make the simple case simple, unless the implementor of the function says otherwise, non-references should probably be treated as values.

        • Implementing this in our code, I just learned that decaying an rvalue reference is just like decaying a value (so apply custom decay) rather than like decaying an lvalue reference (implemented by std::decay, ignoring proxy resolution).

          Arno

        • The same problem occurs again internally to a transform_adaptor. Is the transformation function really returning a value, or is it returning a proxy?

          In each case, only a function-dependent trait in the vein of std::result_of (iterator::operator* being a function, too) can tell.

          I solved this a different way in range-v3. In addition to view::transform — to which you give a single function that takes values of the underlying sequence — I have view::iter_transform, to which you give three functions that each take iterators into the underlying sequence. The three functions are used to determine: value_type, reference, and rvalue_reference.

          From the range-v3 tests, I have:

          using ValueType =
            std::tuple<std::string, std::string>;
          using RvalueReference =
            std::tuple<std::string&&, std::string&&>;
          
          auto fun = overload(
            [](copy_tag, I i, I j) {
              return ValueType{};},
            [](I i, I j) {
              return std::tie(*i, *j);},
            [](move_tag, I i, I j) {
              return RvalueReference{
                std::move(*i), std::move(*j)};}
          );
          
          auto rng = view::iter_transform(v0, v1, fun); 
          

          The above code basically implements view::zip with all its weird associated types on top of a general view::iter_transform adaptor.

          This is all uncomfortably complicated, but I am afraid this complexity is inherent to the proxy problem.

          Indeed.

  6. Hi There,

    really love your work on range v3. But one thing really puzzles me. I may just have overlooked it, but where is the enumerate()? I mean the thing that gives you a {index, element} iterating through a container/view?

  7. Hi, I have really enjoyed these posts on iterators, views and ranges. I am developing functions to do moving window analysis on raster files and thought it would be useful to use iterator concepts. I keep running into problems that you have given lots of thought.
    For instance, I only read chunks at a time of my raster data so need proxy iterators, I iterate over my data in different orders (by row, or by column), so need different views. I also have special views that are quite bulky, even if they do not own the data, for instance a view that gives a histogram of a window of surrounding values as I iterate from pixel-to-pixel. I want to combine different views and ranges, so need a zip_view.

    That brings me to a couple of remarks / questions:

    In your Range v3 proposal you stipulate that Views must be lightweight and have reference semantics. Is this a logical requirement, or just one that makes it easier to develop your library? I can see it is convenient to be able to copy views around, but would think that it would be better to avoid unnecessary copies.

    When I look at the view::zip / view::zip_with implementation I get the impression that the input ranges are moved into the zipped view. Why is that appropriate? Will it not leave the input ranges in an unusable state? Or is it that the ranges are copied first and then the copy is moved? But that would also not be appropriate, as Ranges, unlike Views, do not have to be lightweight and can own data.

    In my own implementation of zip_range I copy ranges, except those wrapped in a std::ref of which the zip_range keeps a reference. I suppose in your terminology the view::zip would contain a number of views and a number of range references. Would that make sense?

    Sorry if I misread things, I am still struggling with reading C++11 code.

    • Clarification: In the proposal you say that Ranges do not own their elements, but Iterables might. Whereas in the documentation Ranges may own their elements, but Views do not. I meant to refer to the documentation because it seems to be an improvement.
