Opened on 04/03/2019 at 10:06:51 AM

Closed on 04/13/2019 at 01:13:25 PM

Last modified on 07/25/2019 at 10:34:20 PM

#7435 closed change (fixed)

Use request object for filter matching

Reported by: mjethani Assignee: mjethani
Priority: P2 Milestone:
Module: Core Keywords:
Cc: sebastian, greiner, kzar, sergz, jsonesen Blocked By:
Blocking: #7000, #7355 Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Ross Verified working: yes
Review URL(s):

https://gitlab.com/eyeo/adblockplus/adblockpluscore/merge_requests/45

Description (last modified by mjethani)

Background

Currently the matcher does the following:

  1. Lower-cases the request URL once for each keyword
  2. Checks whether the request is third-party for every request

There's no need to lower-case the URL for each keyword; it can be lower-cased once and the lower-case version cached.

For ~15% of the requests on the Alexa Top 50 (home pages only), I found that there was no need to check if the request was third-party.

All of this can be solved by using a single request object in the matcher and simply passing it around.

I find that this approach, if done correctly, can further speed up filter matching for request blocking filters by ~10% on popular sites.
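The idea can be sketched roughly as follows. This is only an illustration, not the code from the patch: the class name SketchRequest and its fields are hypothetical, and the third-party check is simplified to a plain hostname comparison, which is an assumption made to keep the example short.

```javascript
"use strict";

// Hypothetical sketch; this is NOT the URLRequest class from lib/url.js.
class SketchRequest
{
  constructor(href, documentHostname)
  {
    this.href = href;
    this.documentHostname = documentHostname;
    this._lowerCaseHref = null;
    this._thirdParty = null;
  }

  // Lower-cased on first access only; later keyword lookups reuse the result.
  get lowerCaseHref()
  {
    if (this._lowerCaseHref === null)
      this._lowerCaseHref = this.href.toLowerCase();
    return this._lowerCaseHref;
  }

  // Computed only if some filter actually asks for it.
  // Assumption: simple hostname comparison, for illustration only.
  get thirdParty()
  {
    if (this._thirdParty === null)
    {
      let {hostname} = new URL(this.href);
      this._thirdParty = hostname !== this.documentHostname;
    }
    return this._thirdParty;
  }
}

// One object is created per request and passed around the matcher.
const request = new SketchRequest("https://example.com/Example-Bar-Ad",
                                  "example.com");
```

Because both values are computed lazily and cached on the object, a request that never reaches a third-party-sensitive filter never pays for the third-party check.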

What to change

See patch.

Integration notes

The signature of the matches() public method of the RegExpFilter class has now changed:

-  matches(location, typeMask, docDomain, thirdParty, sitekey)
+  matches(request, typeMask, sitekey)

The first argument request should be of type URLRequest, which is exported from lib/url.js.

For example, the following call:

filter.matches("https://example.com/image.png", RegExpFilter.typeMap.IMAGE,
               "example.com", false, null)

should now look like this:

filter.matches(URLRequest.from("https://example.com/image.png", "example.com"),
               RegExpFilter.typeMap.IMAGE, null)

The JSDoc for the URLRequest class incorrectly marks it as @package. Please ignore this: the class is public, although there may be further changes to it.

After #7260 adblockpluschrome should no longer be calling the isThirdParty() function. For any other clients calling this function, the interface has changed: now the first argument should be the hostname of the request (type string) rather than the request URL (type URL|URLInfo).
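For clients that still hold a request URL, one way to adapt is to extract the hostname first. A minimal sketch, assuming the caller has the URL as a string or URL object; the helper name hostnameOf is hypothetical and not part of adblockpluscore:

```javascript
"use strict";

// Hypothetical helper: returns the hostname (type string) of a request URL
// given either as a string or as a URL object.
function hostnameOf(url)
{
  return new URL(String(url)).hostname;
}

// The result would be passed as the first argument to isThirdParty()
// in place of the URL that was passed before.
const requestHostname = hostnameOf("https://example.com/image.png");
```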

Hints for testers

This change affects all kinds of requests. Two areas are particularly worth focusing on: cases where a filter is case-insensitive and the URL's case does not match the filter's, and cases where a filter targets third-party requests and the request is not third-party (or vice versa).

For example, add the filters -foo- and -bar-; then the URLs https://example.com/example-bar-ad and https://example.com/Example-Bar-Ad should be blocked by the second filter. If the second filter is changed to -bar-$match-case, then only the first URL should be blocked.

Similarly, if the second filter is changed to -bar-$third-party, then the URL https://example.com/Example-Bar-Ad should be blocked if the request is made from a document loaded from http://localhost:8080/test.html but not if the document is loaded from https://example.com/test.html. It should be the opposite if the filter is changed to -bar-$~third-party.
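The expected outcomes above can be written down as a small table of checks. This is a toy model for reasoning about the test cases, not the adblockpluscore matcher: the function toyBlocks and its option names are hypothetical, filters are treated as plain substrings, and third-party is approximated by comparing hostnames.

```javascript
"use strict";

// Toy stand-in for a substring filter with $match-case and $third-party.
// thirdParty: null = option absent, true = $third-party, false = $~third-party.
function toyBlocks(url, documentUrl,
                   {pattern, matchCase = false, thirdParty = null})
{
  let haystack = matchCase ? url : url.toLowerCase();
  let needle = matchCase ? pattern : pattern.toLowerCase();
  if (!haystack.includes(needle))
    return false;

  if (thirdParty === null)
    return true;

  // Assumption: hostname comparison only, for illustration.
  let isThirdParty = new URL(url).hostname !== new URL(documentUrl).hostname;
  return isThirdParty === thirdParty;
}

const lower = "https://example.com/example-bar-ad";
const mixed = "https://example.com/Example-Bar-Ad";
const localDoc = "http://localhost:8080/test.html";
const sameDoc = "https://example.com/test.html";

const results = {
  plainLower: toyBlocks(lower, sameDoc, {pattern: "-bar-"}),
  plainMixed: toyBlocks(mixed, sameDoc, {pattern: "-bar-"}),
  matchCaseLower: toyBlocks(lower, sameDoc, {pattern: "-bar-", matchCase: true}),
  matchCaseMixed: toyBlocks(mixed, sameDoc, {pattern: "-bar-", matchCase: true}),
  thirdPartyFromLocal: toyBlocks(mixed, localDoc, {pattern: "-bar-", thirdParty: true}),
  thirdPartyFromSame: toyBlocks(mixed, sameDoc, {pattern: "-bar-", thirdParty: true}),
  firstPartyFromLocal: toyBlocks(mixed, localDoc, {pattern: "-bar-", thirdParty: false}),
  firstPartyFromSame: toyBlocks(mixed, sameDoc, {pattern: "-bar-", thirdParty: false})
};
```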

Attachments (0)

Change History (8)

comment:1 Changed on 04/03/2019 at 10:29:58 AM by mjethani

  • Description modified (diff)
  • Ready set
  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:2 Changed on 04/05/2019 at 07:57:06 AM by abpbot

A commit referencing this issue has landed:
Issue 7435 - Use request object for filter matching

comment:3 Changed on 04/05/2019 at 08:14:17 AM by mjethani

  • Cc sebastian greiner added
  • Description modified (diff)

comment:4 Changed on 04/05/2019 at 08:14:50 AM by mjethani

  • Cc kzar added

comment:5 Changed on 04/05/2019 at 08:15:21 AM by mjethani

  • Cc sergz added

comment:6 Changed on 04/12/2019 at 09:07:39 AM by mjethani

  • Cc jsonesen added
  • Description modified (diff)

comment:7 Changed on 04/13/2019 at 01:13:25 PM by mjethani

  • Description modified (diff)
  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:8 Changed on 07/25/2019 at 10:34:20 PM by Ross

  • Tester changed from Unknown to Ross
  • Verified working set

Done. Working as described.

ABP 0.9.15.2339
Microsoft Edge 44.17763.1.0 / Windows 10 1809

ABP 3.5.2.2340
Chrome 49.0.2623.75 / Windows 10 1809
Chrome 75.0.3770.142 / Windows 10 1809
Opera 36.0.2130.65 / Windows 10 1809
Opera 62.0.3331.72 / Windows 10 1809
Firefox 51.0 / Windows 10 1809
Firefox 68.0 / Windows 10 1809
Firefox Mobile 68.0 / Android 7.2.2
