Opened on 03/12/2018 at 01:35:50 PM

Closed on 03/13/2018 at 02:41:51 PM

#6467 closed change (fixed)

Use Map object for caching filter matches

Reported by: mjethani Assignee: mjethani
Priority: Unknown Milestone:
Module: Core Keywords:
Cc: sergz, kzar, hfiguiere Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29719569/

Description (last modified by mjethani)

Background

Using a Map object for maintaining a URL-to-filter cache in CombinedMatcher would be significantly better in terms of performance. It also makes the code easier to read.

Here's a simple test:

function foo()
{
  const getRandom = n => Math.floor(Math.random() * n).toString(2);

  let cache = Object.create(null);

  for (let i = 0; i < 1000000; i++)
    cache[getRandom(1000)] = Math.random();

  console.log(Object.keys(cache).length + " entries");

  let lookupKeys = new Array(1000000);
  for (let i = 0; i < lookupKeys.length; i++)
    lookupKeys[i] = getRandom(1000000);

  let s = +new Date();
  let n = 0;
  for (let i = 0; i < lookupKeys.length; i++)
  {
    let key = lookupKeys[i];
    if (key in cache)
      n++;
  }

  console.log(n + " hits");

  return +new Date() - s;
}

This creates a cache of ~600-650 entries and then tries a million random matches. It gets about the same number of hits out of a million. We are mainly testing the speed of key in cache here.

let n = 0;
for (let i = 0; i < 10; i++)
  n += foo();
console.log(n + " milliseconds");

On my machine this takes ~4 seconds on Node.js (V8).

Now if we change it to the following:

function foo()
{
  const getRandom = n => Math.floor(Math.random() * n).toString(2);

  let cache = new Map();

  for (let i = 0; i < 1000000; i++)
    cache.set(getRandom(1000), Math.random());

  console.log(cache.size + " entries");

  let lookupKeys = new Array(1000000);
  for (let i = 0; i < lookupKeys.length; i++)
    lookupKeys[i] = getRandom(1000000);

  let s = +new Date();
  let n = 0;
  for (let i = 0; i < lookupKeys.length; i++)
  {
    let key = lookupKeys[i];
    if (cache.has(key))
      n++;
  }

  console.log(n + " hits");

  return +new Date() - s;
}

On my machine this takes ~900 milliseconds.

We can also test the speed of setting the value, which I find to also be faster for a Map object.

Lookups are significantly faster so that the following code:

if (key in cache)
  return cache[key];

Can be changed to:

let value = cache.get(key);
if (value !== undefined)
  return value;

This saves us an extra lookup if we only assume that there will never be an entry with a value of undefined.

lib/matcher.js already uses Map objects for other things.

What to change

Change CombinedMatcher.resultCache to be of type Map.<string,Filter> and use the aforementioned single lookup optimization.

Attachments (0)

Change History (5)

comment:1 Changed on 03/12/2018 at 01:36:29 PM by mjethani

  • Cc sergz kzar hfiguiere added
  • Review URL(s) modified (diff)

comment:2 Changed on 03/12/2018 at 01:50:44 PM by mjethani

  • Description modified (diff)

comment:3 Changed on 03/13/2018 at 02:39:50 PM by abpbot

A commit referencing this issue has landed:
Issue 6467 - Use Map object for caching filter matches

comment:4 Changed on 03/13/2018 at 02:41:40 PM by mjethani

  • Owner set to mjethani

comment:5 Changed on 03/13/2018 at 02:41:51 PM by mjethani

  • Resolution set to fixed
  • Status changed from new to closed

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from mjethani.
 
Note: See TracTickets for help on using tickets.