Opened on 03/01/2018 at 10:59:01 AM

Closed on 08/29/2019 at 05:43:52 PM

#6434 closed change (rejected)

Define helpers for non-array iterables

Reported by: mjethani Assignee:
Priority: Unknown Milestone:
Module: Core Keywords: closed-in-favor-of-gitlab
Cc: agiammarchi, sergz, sebastian, hfiguiere, fhd, greiner, kzar, erikvold Blocked By:
Blocking: #6425 Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

Description (last modified by mjethani)

Background

ES6 has support for non-array iterables (e.g. generators), which can help with performance because the values may be generated "lazily" only as required.

For example, here's a function that generates a Fibonacci series:

function* fibonacci()
{
  yield* function* _(a = 0, b = 1)
  {
    yield a;
    yield* _(b, a + b);
  }();
}

The following loop will find the first number in the series that is greater than or equal to 1,000 and print its value:

for (let i of fibonacci())
{
  if (i >= 1000)
  {
    console.log(i);
    break;
  }
}

If the series were an array, it would be possible to write code like the following:

console.log(fibonacci().find(i => i >= 1000));

Unfortunately Array.prototype.find.call does not work here either:

console.log([].find.call(fibonacci(), i => i >= 1000)); // prints undefined

Since this is an endless series, we cannot simply convert it into an array. We could modify the generator to add a limit parameter:

function* fibonacci(n = Infinity)
{
  yield* function* _(a = 0, b = 1)
  {
    if (n-- > 0)
    {
      yield a;
      yield* _(b, a + b);
    }
  }();
}

And then pass in a rough estimate:

console.log([...fibonacci(2000)].find(i => i >= 1000)); // prints 1597

But this is less than ideal and not practical in many situations.

What we really need here is a generic equivalent of Array.prototype.find, something like this:

function find(iterable, callback, thisArg)
{
  for (let item of iterable)
    if (callback.call(thisArg, item))
      return item;
}

Which could be used like this:

console.log(find(fibonacci(), i => i >= 1000)); // prints 1597

What to change

Write generic implementations of the every, filter, find, flatten, includes, indexOf, map, reduce, reverse, slice, and possibly other members of Array.prototype.

filter, flatten, map, reverse, and slice should be lazy wrappers.

For example:

function* map(iterable, callback, thisArg)
{
  for (let item of iterable)
    yield callback.call(thisArg, item);
}

i.e. they should return iterators.

Attachments (0)

Change History (16)

comment:1 Changed on 03/01/2018 at 10:59:19 AM by mjethani

  • Summary changed from Define helpers functions for non-array iterables to Define helpers for non-array iterables

comment:2 Changed on 03/01/2018 at 11:01:14 AM by mjethani

  • Description modified (diff)

comment:3 Changed on 03/01/2018 at 11:12:00 AM by mjethani

  • Description modified (diff)

comment:4 Changed on 03/01/2018 at 11:13:29 AM by mjethani

  • Blocking 6433 removed

comment:5 Changed on 03/01/2018 at 04:21:00 PM by sergz

What about using some library like https://github.com/cognitect-labs/transducers-js?

comment:6 follow-up: Changed on 03/02/2018 at 11:25:29 AM by mjethani

The implementation is fairly straightforward and I'm not sure it would be worth including an external dependency just for this, but I may be wrong.

comment:7 in reply to: ↑ 6 Changed on 03/02/2018 at 12:10:40 PM by sergz

Replying to mjethani:

The implementation is fairly straightforward and I'm not sure it would be worth including an external dependency just for this, but I may be wrong.

I agree that the implementation is likely straightforward but I find it so common and old known concept that it should be possible to find a decent light library instead of doing it again, at least we should try.

In addition, for that functionality I would like to recommend to actually use the approach which is called transducers because it will definitely improve robustness (by saving us from some mistakes) and improve code readability.

comment:8 Changed on 03/05/2018 at 01:36:12 PM by greiner

What's the real-world impact of this? Is this something that currently noticeably impacts the performance of any critical components?

It sounds more like it'd help us write code that's easier to maintain which I think would be great.

Finally, are there any upcoming standards that may tackle this problem? Because in that case we may want to stay close to those with our solution.

Last edited on 03/05/2018 at 01:41:32 PM by greiner

comment:9 Changed on 03/05/2018 at 01:48:20 PM by mjethani

We would have to investigate that, and I can take that up, but also I think that just having the tools to write efficient code around makes it more likely that efficient code gets written.

For example, this:

yield* getURLsFromElement(element).map(url => url.trim());

It could be written as:

yield* map(getURLsFromElement(element), url => url.trim());

It would be a no-brainer to use this generic map function here if it existed, but of course since it doesn't exist, the choice is between yield* and Array.prototype.map (better readability) on the one hand and for...of and yield (better performance) on the other.

comment:10 Changed on 03/05/2018 at 02:40:10 PM by agiammarchi

I've followed this ticket with a lot of interest and to start with, I'd like to underline that:

  • I am pro cleaner/easier-to-write code.
  • I am pro efficiency, when needed.

What I am a bit skeptical about, are approaches like "this way to rule them all", specially when it doesn't necessarily result in better performance.

There are developers that started writing plib(1).sum(2).then(console.log) when Promises went out, thinking that even linear operations should've been asynchronous ... and I've seen big projects failing because of the negative performance impact the "everything as a promise" approach caused.

Same goes when you have everything as a generator: it could be just overhead and pressure for the garbage collector without any real world benefit.

If your code already has the a whole list of numbers, as example, and all you need is some map/reduce that does not involve any asynchronous opearation, you're better off with just map and you won't have any advantage or performance boost making that map a generator based one.

We all know a generator creates an object holding a value with a next() method and a done property, it doesn't come for free.

We can also partially visualize the pseudo-internal overhead of generators through babel [1] which is nothing compared to a simple map [2]

If you are wondering if performance are really that bad with generators, compared with linear operations, the answer is 99% of the time: yes, that bad'' [3]

As summary, I think we should clean up where possible but don't try to sell ourselves generators as a better/general purpose approach, because specially when performance matters, and asynchronous code is not involved, these are usually much slower than normal code.

[1] http://bit.ly/2I74hwP
[2] http://bit.ly/2H4rYo9
[3] https://jsperf.com/gen-map

comment:11 Changed on 03/05/2018 at 07:14:27 PM by mjethani

This ticket is really not about whether we should use generators but rather about whether we should write generic equivalents of map, reduce, and so on for the cases where we have already decided to use generators.

You can see an example of our use of generators in lib/content/elemHideEmulation.js

Here's a snippet from that file:

*getSelectors(prefix, subtree, styles)
{   
  for (let element of this.getElements(prefix, subtree, styles))
    yield [makeSelector(element, ""), element];
}

If we had a map function, we could write the same code like this:

*getSelectors(prefix, subtree, styles)
{   
  yield* map(this.getElements(prefix, subtree, styles), 
             element => [makeSelector(element, ""), element]);
}

I don't think anybody is arguing that we should use generators for everything.

The name "generator" suggests that the values are generated so I don't think it would be a good idea to use a generator for static values.

comment:12 Changed on 03/05/2018 at 07:41:26 PM by kzar

  • Component changed from Platform to Core

Having played with Clojure in which nearly all sequences are lazy by default and are easily mapped etc I agree with Manish's thinking behind this issue. However I think these things would belong in adblockpluscore though so I've updated the module here.

I also agree with Sergz, if we can reuse something like the transducers library we really should. In fact why not the transducers library, it's written by Cognitect who are the guys behind Clojure and are pretty much thought leaders with some of this stuff. We should at least evaluate it first before deciding to roll our own.

Finally I agree with Andrea and Manish in that we shouldn't force any of this stuff on people, so I guess no linting or coding style guideline changes are required.

comment:13 Changed on 03/20/2018 at 01:46:18 PM by abpbot

A commit referencing this issue has landed:
Issue 6434 - Reimplement positionInParent with indexOf

comment:14 Changed on 03/20/2018 at 07:08:19 PM by sebastian

I don't think we should blindly start adding dead code, that might (or might not) be useful at some point in the future. If we have a use case in our existing code base (or in a future change) where dealing with generators in a more functional programming style could improve the code, we can discuss it then, and if appropriate add the respective helper functions.

However, the examples from comment 9 and 11 don't strike me as a large improvement. Using a for..of loop is the standard way in modern JavaScript to deal with those situations, and the boilerplate saved by helper functions like map() is minimal (and IMO just makes the code more obscure).

As for performance, it's not an improvement. If at all, generators mitigate peak memory usage, which is only relevant when processing large amount of data. On the other hand, generator functions still prevent some JIT optimizations (as we have seen here).

I'm not saying, we should avoid generator functions at all cost. But in the few situations where they help code legibility and/or peak memory usage, I feel sufficiently equipped with for..of loops. You might feel differently if you are coming from functional programming languages. But enforcing foreign programming paradigms, on a language that wasn't designed for them, through helper functions, just adds unnecessary abstraction that makes the code obscure, more difficult to debug and less efficient.

Last edited on 03/20/2018 at 07:32:53 PM by sebastian

comment:15 Changed on 11/15/2018 at 05:27:00 AM by erikvold

  • Cc erikvold added

comment:16 Changed on 08/29/2019 at 05:43:52 PM by sebastian

  • Keywords closed-in-favor-of-gitlab added
  • Resolution set to rejected
  • Status changed from new to closed

Sorry, but we switched to GitLab. If this issue is still relevant, please file it again in the new issue tracker.

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from (none).
 
Note: See TracTickets for help on using tickets.