# Range Comprehensions

I’ve been busy since I last wrote about ranges. I have a lot of news to share, but in this post, I’m going to narrowly focus on a recent development that has me very excited. It’s a new feature that I’m calling range comprehensions, and they promise to greatly simplify the business of creating custom ranges.

## List Comprehensions

If you’re familiar with list comprehensions from Haskell or Python, your ears might have pricked up when I said “range comprehensions.” List comprehensions give you a pithy way to generate new lists from existing ones, either by transforming thenm, filtering them, or combining them, or whatever. Here, for instance, is a Haskell program for generating the first 10 Pythagorean triples:

```main = print (take 10 triples)

triples = [(x, y, z) | z <- [1..]
, x <- [1..z]
, y <- [x..z]
, x^2 + y^2 == z^2]
```

The way to read the `triples` line is this: generate a list of tuples `(x, y, z)` where `z` goes from 1 to infinity, `x` goes from 1 to `z` (inclusive), and `y` goes from `x` to `z`, but only yield those triples for which `x^2 + y^2 == z^2` is true. The code then generates every combination of `x`, `y`, and `z` in the specified ranges in some order and filters it, yielding a list of the Pythagorean triples. Beautiful. Of particular interest is the fact that, since Haskell is lazy, there is no problem with a comprehension that generates an infinite list.

## Back-Story

Back in October, I published a blog post about API design and `std::getline` in which I showed how a range-based interface is better than the existing one. My friend Bartosz Milewski commented that ranges are difficult to work with and challenged me to show the range-based equivalent of the above pithy Haskell program. I admit that at the time, I had no answer for Bartosz.

Recently, Bartosz published a blog post about just this problem. In his post, Bartosz describes some pretty simple results from category theory (if any category theory can be described as “simple”), and applies it to the problem of generating the Pythagorean triples lazily in C++. It’s a great post, and you should read it. Here, finally, was my answer. Although Bartosz’s code was terribly inefficient, somewhat difficult to reason about, and not formulated in terms of STL-ish concepts, I knew the direction I had to take.

## Introducing Range Comprehensions

Without further ado, here is my solution to the Pythagorean triples problem:

```using namespace ranges;

// Lazy ranges for generating integer sequences
auto const intsFrom = view::iota;
auto const ints = [=](int i, int j)
{
return view::take(intsFrom(i), j-i+1);
};

// Define an infinite range of all the Pythagorean
// triples:
auto triples =
view::for_each(intsFrom(1), [](int z)
{
return view::for_each(ints(1, z), [=](int x)
{
return view::for_each(ints(x, z), [=](int y)
{
return yield_if(x*x + y*y == z*z,
std::make_tuple(x, y, z));
});
});
});

// Display the first 10 triples
for(auto triple : triples | view::take(10))
{
std::cout << '('
<< std::get<0>(triple) << ','
<< std::get<1>(triple) << ','
<< std::get<2>(triple) << ')' << '\n';
}
```

Lines 4 and 5 define `intsFrom` and `ints`, which are lazy ranges for generating sequences of integers. Things don’t get interesting until line 12 with the definition of `triples`. That’s the range comprehension. It uses `view::for_each` and `yield_if` to define a lazy range of all the Pythagorean triples.

### view::for_each

What is `view::for_each`? Like `std::for_each`, it takes a range and function that operates on each element in that range. But `view::for_each` is lazy. It returns another range. The function that you pass to `view::for_each` must also return a range. Confused yet?

So many ranges, but what’s going on? Conceptually, it’s not that hard. Let’s say that you call `view::for_each` with the range `{1,2,3}` and a function `f(x)` that returns the range `{x,x*x}`. Then the resulting range will consist of the elements: `{1,1,2,4,3,9}`. See the pattern? The ranges returned by `f` all got flattened. Really, range flattening is all that’s going on.

Now look again at line 12 above:

```auto triples =
view::for_each(intsFrom(1), [](int z)
{
return view::for_each(ints(1, z), [=](int x)
{
return view::for_each(ints(x, z), [=](int y)
{
return yield_if(x*x + y*y == z*z,
std::make_tuple(x, y, z));
});
});
});
```

For every integer `z` in the range 1 to infinity, we call `view::for_each` (which, recall, returns a flattened range). The inner `view::for_each` operates on all the integers `x` from 1 to `z` and invokes a lambda that captures `z` by value. That function returns the result of a third invocation of `view::for_each`. That innermost lambda, which finally has `x`, `y`, `z`, makes a call to a mysterious-looking function that is provocatively named `yield_if`. What’s that?

### yield_if

The semantics of `yield_if` is to “inject” the tuple `(x,y,z)` into the resulting sequence, but only if is a Pythagorean triple. Sounds tricky, but it’s really very simple. Recall that the function passed to `view::for_each` must return a range. Therefore, `yield_if` must return a range. If the condition `x*x + y*y == z*z` is false, it returns an empty range. If it’s true, it returns a range with one element: `(x,y,z)`. Like I said, simple. There’s also a function called `yield` that unconditionally returns a single-element range.

Now that you know how it works, you can forget it. You can just use `view::for_each` and `yield_if` as if you were writing a stateful function that suspends itself when you call `yield` or `yield_if`, kind of like a coroutine. After all, I picked the name “yield” to evoke the `yield` keyword from C#. That keyword gives the function it appears in precisely those coroutine-ish semantics. What’s more, C# functions that have `yield` statements automatically implement C#’s `IEnumerable` interface. `IEnumerable` fills the same niche as the Iterable concept I’ve described in previous posts. That is, you can loop over the elements.

For instance, in C# you can do this (taken from Wikipedia):

```// Method that takes an iterable input (possibly an
//  array) and returns all even numbers.
public static IEnumerable<int>
GetEven(IEnumerable<int> numbers) {
foreach(int i in numbers) {
if((i % 2) == 0) {
yield return i;
}
}
}
```

With range comprehensions, equivalent code looks like this:

```auto GetEvens =
view::for_each(numbers, [](int i)
{
return yield_if((i % 2) == 0, i);
});
```

That’s darn near the same thing, and we don’t need any special keyword or compiler magic.

## Performance

Ranges that return ranges that return ranges, oy vey. How horribly does it perform at runtime? As it turns out, not horribly at all, but a lot depends on how good your optimizer is.

I wrote a simple benchmark program that iterates over the first 3000 triples and does some trivial computation with them. I do this in two ways. One is with the range comprehension above, and the other is with this triply-nested `for` loop:

```for(int z = 1;; ++z)
{
for(int x = 1; x <= z; ++x)
{
for(int y = x; y <= z; ++y)
{
if(x*x + y*y == z*z)
{
result += (x + y + z);
++found;
if(found == 3000)
goto done;
}
}
}
}
done:
```

You would expect this solution to fly and the range-based one to crawl. But here are the numbers using a hot-off-the-presses gcc-4.9 with `-O3`:

``````Raw loop:             2.2s
Range comprehension:  2.3s
``````

That’s it?! Yes, all that extra work being done by the range comprehension is totally transparent to the optimizer, which generates near-optimal code. Kind of amazing, isn’t it?

If, however, your compiler of choice is clang, I have some bad news for you. The range comprehension is (wait for it) 15 times slower. Dear god, that’s awful. I guess that goes to show that despite the astounding awesome-ness of clang in most respects, its optimizer still has some ways to go.

## Summary

Haskell and Python have list comprehensions. C# has LINQ and `yield`. And now C++ has range comprehensions. It’s now pretty trivial to generate non-trivial sequences on the fly, lazily and efficiently, in a way that plays well with all the STL algorithms. Like I said, I’m pretty excited.

#### Acknowledgements

My deep thanks to Bartosz Milewski for getting me 90% of the way there. I couldn’t have done this without his insights, and the insights of all the functional programmers and category theoreticians that came before. Math FTW!

## 44 Replies to “Range Comprehensions”

1. “Bartosz Milewski commented that ranges are difficult to work worth”, I assume worth is supposed to be with?

Great post by the way 🙂

• Thanks! I’ll fix that as soon as I get a chance.

2. Nice post!

yield_if is so simple and so nice, good work!

Any idea about what clang is doing wrong there?
Could you post the compiler flags and clang-version you used?

If it is an inlining problem one can try to help the compiler with the attributes [[gcc::flatten / gcc::always_inline]] and [[gcc::const / gcc::pure]] and see if it is still failing(note that const is stronger than pure).

• I use clang built from svn trunk (updated within the past week). The compiler options are unsurprising: “`-std=c++11 -O3 -DNDEBUG`“. That’s it.

I’m disinclined to hack the source to accommodate clang’s lousy optimizer. It’s a bug, and the LLVM guys should fix it. Filing the bug is on my to-do list.

• Forget it. What I meant is that testing if the performance returns to normal when inlining is forced via an attribute would indicate that the problem is related to clang’s inliner, and that this information could enrich the bug report.

But I just tried those, and there wasn’t any difference since it seems that clang doesn’t support flatten, and although it supports always_inline/hot they seem to have no effect.

• But you were right in a sense. Check out this comment on Reddit to see a partial work-around for the clang performance problem. You just have to bump up the inlining threshold (using an unsupported switch, boo!)

• I just tried it and it works! Now I only get a 2x slowdown!

After checking out your code I’m pretty convinced that you haven’t built enough layers of abstraction for the inliner to perform this badly by default. You should definitely submit a bug report.

Now I just wonder where the remaining 2x slowdown comes from…

• Can you implement the Pythagorean triples example using them?

• Right. And for the uninitiated, by `bind` Paul means the function in Haskell that makes lists a Monad. In Boost.Range, the equivalent would be an adaptor that flattens a range of ranges into a range.

3. This was a great post. I’m a big range fan and a D enthusiast, and it is always welcoming to seeing ranges getting expressive implementations in C++ also.

• Thanks. And to put this work in its proper historical perspective, ranges in one form or another have been in Boost since roughly forever, so C++ ranges aren’t some Jonny-come-lately, me-too technology.

4. does anybody knows the complexity of that haskell code for triplets,
i assume it is terrible, unlike imperative solution that would do sqrt on y.

5. that would do sqrt on z^2-y^2 and then check if the diff is a square of an int by testing floor^2 and ceil^2 of the root

6. How do the performance of your C++ example compate to the performance of the Haskell and C++++ versions?

• Good question. Maybe some Haskell apologist can tell me what I’m doing wrong, but I couldn’t get any Haskell program to do the equivalent to my benchmark and complete in any reasonable amount of time. I’m using this program (compiled with `ghc -O2`):

``````import Data.List
main = print (foldl' (+)
0
(map (\(x,y,z)->x+y+z)
(take 1000 triples)))
triples = [(x, y, z) | z <- [1..]
, x <- [1..z]
, y <- [x..z]
, x^2 + y^2 == z^2]
``````

At 1000 triples, it already takes 7.7s. At 3000, it didn’t come back and I killed it.

In comparison, the raw nested for loops above finished in 2.2s and the range comprehension in 2.3s.

• I got an answer on Reddit. The following Haskell finishes in 8.5s:

``````import Data.List
default(Int, Double)

main = print (foldl' (+)
0
(map (\(x,y,z)->x+y+z)
(take 3000 triples)))

triples :: [(Int, Int, Int)]
triples = [(x, y, z) | z <- [1..]
, x <- [1..z]
, y <- [x..z]
, x^(2 :: Int) + y^(2 :: Int) == z^(2 :: Int)]
``````

That’s still quite a bit slower than the C++, both the raw nested loops and the fancy range comprehension. Go figure.

• Nothing to do with the range being infinite. But yeah, `Integer` is the default in Haskell, and it’s unlimited precision. The operations on it are apparently very slow.

• bad wording, i mean that Integer can represent -inf to +inf (with inf RAM) aka infinite range, not in a sense [1..]

• I’m using GHC 7.8.2 with LLVM 3.4svn. I’m also using GCC 4.8.2. Here are two toy programs that compute the same thing in almost the same way:

``` -- HUSKELL.hs import Data.List (foldl')```

``` ```

```main = print vals where vals = foldl' (+) 0 \$ map add \$ take 3000 triples add (x, y, z) = x + y + z triples = [ (x, y, z) :: (Int, Int, Int) | z <- [1..], x <- [1..z], y <- [x..z], sq x + sq y == sq z ] sq = (^ (2 :: Int)) ```

``` // SEPPLES.cpp```

``` include constexpr int max_triples = 3000; ```

```int main(int argc, char **argv) { int result = 0; int found = 0; for (int z = 1;; ++z) { for (int x = 1; x <= z; ++x) { for (int y = x; y <= z; ++y) { if (xx + yy == z*z) { result += (x + y + z); ++found; if (found == max_triples) { goto done; } } } } } done: printf("%d\n", result); } ```

I built and executed these programs as follows:

``` \$ ghc -O2 -fllvm HUSKELL.hs [1 of 1] Compiling Main ( HUSKELL.hs, HUSKELL.o ) Linking HUSKELL ... \$ g++ -O3 -o SEPPLES -std=c++11 SEPPLES.cpp \$ time ./HUSKELL 10650478 ./HUSKELL 3.47s user 0.00s system 99% cpu 3.484 total \$ time ./SEPPLES 10650478 ./SEPPLES 3.42s user 0.00s system 99% cpu 3.429 total ```

Toy benchmarks! \o/

7. I see people are ignoring me, so Ill just be a di*k and post performance measurements:
http://ideone.com/ikmxwI
2390ms
82ms
( n had to be reduced from 3000 cuz of timeouts 🙂 )

So basically you can freely delete performance chapter of a post because it is like a bad joke… it is like writing your sort is within 1% of the bubble sort, aka nobody who cares about perf is not gonna use bubble sort so your measurements are useless.

So you should probably fix your yield_if code to calculate it properly but then honestly imperative solution looks 10x more readable imho(lambda stuck inside 2 levels of for == :/ ).

• Um, the post is about range comprehensions, not about the optimal way to find Pythagorean triples. It’s an example. Jeez.

• Um,
making a elegant solution to a problem ignoring its shi**y alg complexity is a Haskell way not a cpp one. 😛
aka dont drink the Haskell kool-aid 😛
Yes Haskell way can be made efficient, but then you loose(in practice mostly useless anyway) nice concise way of declaring pt range.
My point was not that u are ignorant of crappy alg complexity of alg but
a) nobody discussed this
b) it is kind of irritating for me personally to have those kind of problems- where you need to worry about complexity of your declarative syntax sugar cuz it is not as obvious as imperative, but then again it should not happen ofter IRL

also though your solution is nice proof of concept syntax is horrible, for eg if it was “languaged in”
c++ we could:
-get rid of ugly input params to inner functions([=](int y))
-get rid of ugly ints(1,n) and have [1..n]

though again in most cases your ranges will be simple and you wont suffer as much in pt.

but tbh i have real problems with cpp, for example if serialization of “nonC” structures could be made in ISO portable way that would benefit me IRL, while in most cases I dont really care about comprehensions…
also often I wish map had a .keys.begin()/end() and .values.begin()/end() than that I can define list comprehensions

If somebody wants to enlighten me when range comprehensions is so important please do, all I can think of is some nice interview questions for ggl, fb… 🙂
where it reduces line count from 10 to 5 🙂

8. Great post and I hope to see some more discussion in the ranges SG about how to handle lazy ranges. Personally, I think that core language support like yield return would be preferable, but this library-base solution is a great start (as you undoubtedly remember range-based for was a reaction to your FOREACH macro becoming popular). So, ideally I would like to be able to write the above triples as a function something similar to the following:
``` auto triples() { for (auto z : intsFrom(1)) for (auto x : ints(1, z)) for (auto y : ints(x, z)) if (xx + yy == z*z) yield return std::make_tuple(x, y, z); } ```

• actually i prefer
for (auto z : intsFrom(1), auto x : ints(1, z),auto y : ints(x, z))

• That’s not a bad suggestion for a language extension. Although I think a similar syntax has been suggested to mean `zip`.

• maybe zip could be

for (auto z , auto x , auto y: intsFrom(1), ints(1, z), ints(x, z))

or maybe brackets could mean pair/tuple (ofc only in range based for):

for ( (z,&x,y) : intsFrom(1), ints(1, z), ints(x, z)) // x is auto&

but for some reason I feel ppl from ISO love their auto so that wont fly…

and ofc I must rage a bit now:
what is more often used:
nested loops or zip?
but ofc zip is so functional, functional is so cool => lets add zip to core lang, who cares about nested loops 😛

9. This is one of the things that I love about python. Lets me generate ranges on the fly. I think this would be one of the killer features of ranges: being able to generate them on the fly. Very interesting post, indeed. Can’t wait for a finished proposal.

• Sigh. Thanks.

• Yeah, I was at C++Now with Chandler, poking him in the back while he worked on this. I’m not sure what the current state is, but last I heard from him, the difficulty clang has optimizing this code is a somewhat intractable problem, and he had given up and turned his attention elsewhere. Which makes me sad.

10. Eric this maybe a bit offtopic, but will your range library support what Sean Parrent calls projections or something like that…

I mean if I have persons vector
can i do:
persons | filter(Person::get_name())

or
sort(persons, Person::get_name()) // uses < on results of get_name

P.S. Please please put some paypal or btc way to donate…. you may not need the money, but I think that it is polite and moral to give back to people like you that put a lot of hard work to improve the humanity by giving millions of C++ developers a better tools 😀

• I mean if I have persons vector can i do:
persons | filter(Person::get_name())

You probably mean “transform” instead of “filter”, but yes, you can do this:

``````persons | transform(&Person::get_name)
``````

This isn’t a projection though. It’s what I call an “invokable” — something that can be passed to `std::bind`. And yes, I got the idea from Sean.

or
sort(persons, Person::get_name())

Yes, I support projections, except that to specify a projection to `sort`, you have to specify the comparison function, so it would look like:

``````sort(persons, std::less<>{}, &Person::get_name);
``````

This idea also comes from Sean.

When I presented my work to the committee in Urbana-Champaign, we took straw polls regarding whether or not to support invokables and projections in the algorithms. Support was unanimously in favor of invokables, and strongly in favor of projections. I think we’re likely to see them in a future version of the standard library.

P.S. Please please put some paypal or btc way to donate…. you may not need the money, but I think that it is polite and moral to give back to people like you that put a lot of hard work to improve the humanity by giving millions of C++ developers a better tools 😀

Wow. You know, this comment is worth far more to me than the \$XY dollars you might donate. THANK YOU. I’ll think of some way to give people a way to express thanks. I’m not sure whether it’ll be money or kudos or what.

• Hi Eric, I gave a bit of \$ to Boost since they(including you) also do amazing work. If you ever setup a donation account please make a blog post about it so I notice.
I know it is unlikely that you will earn more than 5%-25% of what you could as staff eng at ggl or msft, but I think principle is important. People that do amazing work should be appreciated in more than a “good job Jimmy” way 😀

• For personal reasons, I decided against giving users a way to donate money. Money is not how I measure my success. Besides, on the off-chance that it brought in real money (unlikely) I would have to declare it on my taxes, which is more work than it’s worth. 😉

Believe me, I consider myself amply compensated for my work. And I really appreciate the sentiment.

11. I’m not sure the compiler remark was a fair comment against C#, and as you pointed out, C# has Linq:

public static IEnumerable GetEvens(IEnumerable numbers) => numbers.Where(i => (i % 2) == 0);

or my preference:
public static IEnumerable Evens(this IEnumerable numbers) => numbers.Where(i => (i % 2) == 0);

• Sorry, what example is missing a capture?

• The topmost one. It will not compile with `ints` not captured. [I suppose I never got notifications working here]

This site uses Akismet to reduce spam. Learn how your comment data is processed.