Opened 7 months ago

Last modified 4 months ago

#7435 closed change

Use request object for filter matching — at Version 6

Reported by: mjethani
Assignee: mjethani
Priority: P2
Milestone:
Module: Core
Keywords:
Cc: sebastian, greiner, kzar, sergz, jsonesen
Blocked By:
Blocking: #7000, #7355
Platform: Unknown / Cross platform
Ready: yes
Confidential: no
Tester: Ross
Verified working: yes
Review URL(s):

https://gitlab.com/eyeo/adblockplus/adblockpluscore/merge_requests/45

Description (last modified by mjethani)

Background

Currently the matcher does the following:

  1. Lower-cases the request URL for each keyword
  2. Checks whether the request is third-party for every request

There's no need to lower-case the URL for each keyword; it can be lower-cased once and the lower-case version cached.

For ~15% of the requests on the Alexa Top 50 (home pages only), I found that there was no need to check if the request was third-party.

All of this can be solved by using a single request object in the matcher and simply passing it around.

I find that this approach, if done correctly, can further speed up filter matching for request blocking filters by ~10% on popular sites.
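The idea described in this section can be illustrated with a minimal sketch. This is not the actual URLRequest class from the patch; the class name SketchRequest, its properties, and the thirdPartyChecks counter are hypothetical, and the third-party test is reduced to a plain hostname comparison purely for illustration.

```javascript
// Hypothetical sketch only; not the URLRequest class from lib/url.js.
class SketchRequest {
  constructor(href, documentHostname) {
    this.href = href;
    this.documentHostname = documentHostname;
    this._lowerCaseHref = null;
    this._thirdParty = null;
    // Instrumentation for this example: counts how often the
    // third-party check actually runs.
    this.thirdPartyChecks = 0;
  }

  // Lower-cased on first access, then reused for every keyword.
  get lowerCaseHref() {
    if (this._lowerCaseHref === null)
      this._lowerCaseHref = this.href.toLowerCase();
    return this._lowerCaseHref;
  }

  // Computed only if a filter actually asks for it, then cached.
  get thirdParty() {
    if (this._thirdParty === null) {
      this.thirdPartyChecks++;
      let requestHostname = new URL(this.href).hostname;
      // Simplification (assumption): exact hostname comparison.
      this._thirdParty = requestHostname !== this.documentHostname;
    }
    return this._thirdParty;
  }
}

let request = new SketchRequest("https://Example.com/Image.PNG",
                                "example.com");
console.log(request.lowerCaseHref);
console.log(request.thirdParty);
```

Because both values are cached on the object, passing the same request to many filters costs one toLowerCase() call at most, and no third-party check at all if no filter reads thirdParty.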

What to change

See patch.

Integration notes

The signature of the matches() public method of the RegExpFilter class has now changed:

-  matches(location, typeMask, docDomain, thirdParty, sitekey)
+  matches(request, typeMask, sitekey)

The first argument, request, should be of type URLRequest, which is exported from lib/url.js.

For example, the following call:

filter.matches("https://example.com/image.png", RegExpFilter.typeMap.IMAGE,
               "example.com", false, null)

should now look like this:

filter.matches(URLRequest.from("https://example.com/image.png", "example.com"),
               RegExpFilter.typeMap.IMAGE, null)
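For callers that cannot migrate every call site at once, a small adapter along the following lines might help. This is only a sketch: matchesLegacy is a hypothetical name, and URLRequest is passed in as a parameter so that the snippet stays self-contained (in real code it would be the class exported from lib/url.js). The old thirdParty argument is dropped because the new signature has no such parameter.

```javascript
// Hypothetical adapter from the old argument order to the new one.
// `URLRequest` is injected so this sketch runs without adblockpluscore.
function matchesLegacy(URLRequest, filter, location, typeMask, docDomain,
                       thirdParty, sitekey) {
  // `thirdParty` is intentionally unused: the new matches() signature
  // does not accept it.
  let request = URLRequest.from(location, docDomain);
  return filter.matches(request, typeMask, sitekey);
}
```

An adapter like this is best treated as a temporary shim, since it creates a new request object on every call and so gives up the benefit of reusing one object across filters.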

The JSDoc for the URLRequest class incorrectly marks it as @package. Please ignore this: the class is public, but there may be further changes to it.

After #7260, adblockpluschrome should no longer be calling the isThirdParty() function. For any other clients calling this function, the interface has changed: the first argument should now be the hostname of the request (type string) rather than the request URL (type URL|URLInfo).
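A client that still holds a request URL could extract the hostname before calling isThirdParty(), roughly as sketched below. The wrapper name callIsThirdParty is hypothetical, and isThirdParty is passed in as a parameter so that the snippet runs on its own. Only the first argument is described here, so the remaining arguments are forwarded untouched.

```javascript
// Hypothetical wrapper: converts a request URL to a hostname string
// before delegating to the injected isThirdParty function.
function callIsThirdParty(isThirdParty, requestURL, ...otherArgs) {
  // The standard URL parser yields the hostname as a string.
  let requestHostname = new URL(requestURL).hostname;
  return isThirdParty(requestHostname, ...otherArgs);
}
```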

Hints for testers

[TBD]

Change History (6)

comment:1 Changed 7 months ago by mjethani

  • Description modified (diff)
  • Ready set
  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:2 Changed 7 months ago by abpbot

A commit referencing this issue has landed:
Issue 7435 - Use request object for filter matching

comment:3 Changed 7 months ago by mjethani

  • Cc sebastian greiner added
  • Description modified (diff)

comment:4 Changed 7 months ago by mjethani

  • Cc kzar added

comment:5 Changed 7 months ago by mjethani

  • Cc sergz added

comment:6 Changed 7 months ago by mjethani

  • Cc jsonesen added
  • Description modified (diff)