Opened on 04/03/2019 at 10:06:51 AM
Closed on 04/13/2019 at 01:13:25 PM
Last modified on 07/25/2019 at 10:34:20 PM
#7435 closed change (fixed)
Use request object for filter matching
Reported by: | mjethani | Assignee: | mjethani |
---|---|---|---|
Priority: | P2 | Milestone: | |
Module: | Core | Keywords: | |
Cc: | sebastian, greiner, kzar, sergz, jsonesen | Blocked By: | |
Blocking: | #7000, #7355 | Platform: | Unknown / Cross platform |
Ready: | yes | Confidential: | no |
Tester: | Ross | Verified working: | yes |
Review URL(s): | https://gitlab.com/eyeo/adblockplus/adblockpluscore/merge_requests/45 |
Description (last modified by mjethani)
Background
Currently the matcher does the following:
- Lower-cases the request URL for each keyword
- Checks if a request is third-party for each request
There's no need to lower-case the URL for each keyword; it can be lower-cased once and the lower-case version cached.
For ~15% of the requests on the Alexa Top 50 (home pages only), I found that there was no need to check if the request was third-party.
All of this can be solved by using a single request object in the matcher and simply passing it around.
I find that this approach, if done correctly, can further speed up filter matching for request blocking filters by ~10% on popular sites.
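To make the idea concrete, here is a minimal sketch of a request object that lower-cases the URL once and computes the third-party flag only on demand. It is not the URLRequest class from the patch: the name SketchRequest, its properties, and the naive hostname comparison are all assumptions made for illustration.

```javascript
"use strict";

// Hypothetical sketch; NOT the URLRequest class from lib/url.js.
class SketchRequest
{
  constructor(href, documentHostname)
  {
    this.href = href;
    this.documentHostname = documentHostname;
    this._lowerCaseHref = null;
    this._thirdParty = null;

    // Counter kept only to show how often the check really runs.
    this.thirdPartyChecks = 0;
  }

  // Lower-cased on first access, then reused for every keyword.
  get lowerCaseHref()
  {
    if (this._lowerCaseHref == null)
      this._lowerCaseHref = this.href.toLowerCase();

    return this._lowerCaseHref;
  }

  // Computed only if some filter actually asks for it.
  get thirdParty()
  {
    if (this._thirdParty == null)
    {
      this.thirdPartyChecks++;

      // Assumption: a plain hostname comparison stands in for the real
      // third-party check, which would also have to handle subdomains.
      this._thirdParty =
        new URL(this.href).hostname != this.documentHostname;
    }

    return this._thirdParty;
  }
}
```

A matcher that passes one such object around pays for the lower-casing at most once per request, and skips the third-party check entirely when no candidate filter looks at it.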
What to change
See patch.
Integration notes
The signature of the matches() public method of the RegExpFilter class has now changed:
- matches(location, typeMask, docDomain, thirdParty, sitekey)
+ matches(request, typeMask, sitekey)
The first argument request should be of type URLRequest, which is exported from lib/url.js.
For example, the following call:
filter.matches("https://example.com/image.png", RegExpFilter.typeMap.IMAGE, "example.com", false, null)
should now look like this:
filter.matches(URLRequest.from("https://example.com/image.png", "example.com"), RegExpFilter.typeMap.IMAGE, null)
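Clients with many old-style call sites might bridge them with a small shim. The sketch below is an assumption, not part of the patch: matchesLegacy is a hypothetical helper, and URLRequest is passed in as a parameter so that the example does not depend on any particular module loader. It relies only on the URLRequest.from(url, docDomain) call and the new matches() signature shown above.

```javascript
"use strict";

// Hypothetical shim for call sites written against the old signature.
// The old thirdParty argument has no counterpart in the new signature,
// so the shim does not take it.
function matchesLegacy(URLRequestClass, filter, location, typeMask,
                       docDomain, sitekey)
{
  let request = URLRequestClass.from(location, docDomain);
  return filter.matches(request, typeMask, sitekey);
}
```

With the real classes this would be called as matchesLegacy(URLRequest, filter, "https://example.com/image.png", RegExpFilter.typeMap.IMAGE, "example.com", null). Building the request once and reusing it across several filters is likely the better long-term fix, since that reuse is where the caching pays off.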
The JSDoc for the URLRequest class incorrectly marks it as @package. Please ignore this: the class is public, although it may change further.
After #7260 adblockpluschrome should no longer be calling the isThirdParty() function. For any other clients calling this function, the interface has changed: now the first argument should be the hostname of the request (type string) rather than the request URL (type URL|URLInfo).
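For callers of isThirdParty() that still hold a URL, the change amounts to extracting the hostname first. The helper below is a hypothetical sketch: requestHostname is not part of adblockpluscore, and it assumes the object passed in exposes a hostname property the way the standard URL class does.

```javascript
"use strict";

// Hypothetical helper: accepts a URL string or a URL-like object and
// returns the hostname that isThirdParty() now expects as its first
// argument.
function requestHostname(urlOrString)
{
  let url = typeof urlOrString == "string" ?
              new URL(urlOrString) : urlOrString;
  return url.hostname;
}

// Before: isThirdParty(url, ...)
// After:  isThirdParty(requestHostname(url), ...)
// The ticket only describes a change to the first argument; the
// remaining arguments are elided here.
```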
Hints for testers
This change affects all kinds of requests. Two areas are particularly worth focusing on: cases where a filter is case-insensitive and the URL's case does not match the filter's, and cases where a filter targets third-party requests but the request is not third-party (or vice versa).
For example, add the filters -foo- and -bar-; then the URLs https://example.com/example-bar-ad and https://example.com/Example-Bar-Ad should be blocked by the second filter. If the second filter is changed to -bar-$match-case, then only the first URL should be blocked.
Similarly, if the second filter is changed to -bar-$third-party, then the URL https://example.com/Example-Bar-Ad should be blocked if the request is made from a document loaded from http://localhost:8080/test.html but not if the document is loaded from https://example.com/test.html. It should be the opposite if the filter is changed to -bar-$~third-party.
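The expectations above can be written down as a small executable model, which may help when turning them into test cases. This is a deliberately simplified stand-in, not the real matcher: parseFilter and isBlocked are hypothetical names, patterns are treated as plain substrings, and third-party status is decided by a naive hostname comparison.

```javascript
"use strict";

// Simplified model of the tester hints; NOT the adblockpluscore matcher.
function parseFilter(text)
{
  let [pattern, optionText = ""] = text.split("$");
  let options = optionText ? optionText.split(",") : [];

  return {
    pattern,
    matchCase: options.includes("match-case"),
    // null means the filter does not care about third-party status.
    thirdParty: options.includes("third-party") ? true :
                options.includes("~third-party") ? false : null
  };
}

function isBlocked(filter, requestURL, documentURL)
{
  // Case-insensitive filters compare lower-cased text on both sides.
  let haystack = filter.matchCase ? requestURL : requestURL.toLowerCase();
  let needle = filter.matchCase ?
                 filter.pattern : filter.pattern.toLowerCase();

  if (!haystack.includes(needle))
    return false;

  if (filter.thirdParty == null)
    return true;

  // Assumption: differing hostnames mean third-party.
  let thirdParty =
    new URL(requestURL).hostname != new URL(documentURL).hostname;
  return thirdParty == filter.thirdParty;
}
```

For instance, isBlocked(parseFilter("-bar-$third-party"), "https://example.com/Example-Bar-Ad", "http://localhost:8080/test.html") models the first third-party scenario described above.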
Attachments (0)
Change History (8)
comment:1 Changed on 04/03/2019 at 10:29:58 AM by mjethani
comment:2 Changed on 04/05/2019 at 07:57:06 AM by abpbot
comment:3 Changed on 04/05/2019 at 08:14:17 AM by mjethani
- Cc sebastian greiner added
- Description modified (diff)
comment:4 Changed on 04/05/2019 at 08:14:50 AM by mjethani
- Cc kzar added
comment:5 Changed on 04/05/2019 at 08:15:21 AM by mjethani
- Cc sergz added
comment:6 Changed on 04/12/2019 at 09:07:39 AM by mjethani
- Cc jsonesen added
- Description modified (diff)
comment:7 Changed on 04/13/2019 at 01:13:25 PM by mjethani
- Description modified (diff)
- Resolution set to fixed
- Status changed from reviewing to closed
comment:8 Changed on 07/25/2019 at 10:34:20 PM by Ross
- Tester changed from Unknown to Ross
- Verified working set
Done. Working as described.
ABP 0.9.15.2339
Microsoft Edge 44.17763.1.0 / Windows 10 1809
ABP 3.5.2.2340
Chrome 49.0.2623.75 / Windows 10 1809
Chrome 75.0.3770.142 / Windows 10 1809
Opera 36.0.2130.65 / Windows 10 1809
Opera 62.0.3331.72 / Windows 10 1809
Firefox 51.0 / Windows 10 1809
Firefox 68.0 / Windows 10 1809
Firefox Mobile 68.0 / Android 7.2.2
A commit referencing this issue has landed:
Issue 7435 - Use request object for filter matching