Opened 7 months ago

Last modified 7 months ago

#6834 new change

Index filter objects by hash

Reported by: mjethani Assignee:
Priority: Unknown Milestone:
Module: Core Keywords:
Cc: Blocked By:
Blocking: #6729, #6831 Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

Description (last modified by mjethani)


When Adblock Plus is loaded with the default list of subscriptions, there are initially ~18.5 MB of internal string objects in the V8 heap. The first step towards reducing the memory consumption related to these string objects is to stop holding on to the filter text. All strings derived from the filter text are still "sliced strings" in V8, which means anything we do to "optimize" them right now (see #6710 and #6831) is only going to make it worse. Once we've lost the references to the filter text itself, then we can start optimizing the strings by "interning" them and reusing the interned strings (see #6729).

For example, take the following two filters:,,

This creates two strings and four "sliced strings" in V8 (two each for the domain and selector parts).

Our goal is to reuse "," so that we end up with three strings (",", "foo", and "bar") and two "concatenated strings" ("," and ","). In order for this to work, we need to lose any references to the original filter text. We should then generate the filter text on demand by concatenating the substrings whenever required (e.g. during serialization).

But first we should stop using the filter text as the key for the filter object cache.

One way to do this is to index the entries in the Filter.knownFilters cache of Filter instances by a numeric hash of the filter text (preferably a V8 "Smi") rather than the filter text itself.

SuperFastHash by Paul Hsieh, which is used by Chromium for various non-cryptographic purposes, appears to be a good candidate hash function for this purpose. In my test, out of ~82,000 filters in the default list of subscriptions, I got only one collision.

What to change


Change History (3)

comment:1 Changed 7 months ago by mjethani

  • Description modified (diff)

comment:2 Changed 7 months ago by mjethani

  • Summary changed from Index filter objets by hash to Index filter objects by hash

comment:3 Changed 7 months ago by mjethani

  • Blocking 6831 added
  • Component changed from Unknown to Core
  • Review URL(s) modified (diff)
Note: See TracTickets for help on using tickets.