Opened on 01/10/2019 at 10:49:44 PM
Closed on 01/16/2019 at 06:29:48 PM
Last modified on 03/12/2019 at 05:58:49 AM
#7208 closed change (fixed)
Drop superfluous wildcards before processing pattern
Reported by: | mjethani | Assignee: | mjethani |
---|---|---|---|
Priority: | P2 | Milestone: | |
Module: | Core | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | #7000 | Platform: | Unknown / Cross platform |
Ready: | yes | Confidential: | no |
Tester: | Unknown | Verified working: | yes |
Review URL(s): |
Description (last modified by mjethani)
Background
In #7052 we started doing string-based matching for literal patterns. About half of all blocking/whitelist filters were found to match the criteria. It turns out there are still a number of filters (~1,500 in EasyList) that would match the criteria were it not for a superfluous trailing wildcard at the end of the pattern. e.g. /_/ads/*. The reason the wildcard exists is so that the pattern is not interpreted as a regular expression.
We can drop the superfluous wildcards before considering whether a pattern is literal or not.
What to change
In the RegExpFilter constructor strip out any leading and trailing wildcards before deciding how to treat the pattern.
Hints for testers
Make sure filters with a trailing wildcard work correctly. For example, the filter /foo/bar/* should block the URLs https://example.com/foo/bar/index.html and https://example.com/foo/bar/, but not the URL https://example.com/foo/bar (no slash at the end).
The same applies to filters with leading wildcards. The filter */foo/bar/ should also block the first two URLs but not the third one.
If multiple wildcards are present (e.g. ***/foo/bar/**), it should still work as if there is only one wildcard.
If there is no wildcard at the beginning nor at the end of the pattern, it should be treated as a regular expression correctly (provided it's surrounded by slashes). For example, the pattern /\bfoo\/ba[rt]\b/ should correctly block https://example.com/foo/bar/ and https://example.com/foo/bat?id=1 but not https://example.com/foo/ba[rt]/index.html.
This should also work for filters containing options (e.g. /foo/bar/*$third-party) and whitelist filters (e.g. @@*/foo/bar/).
Attachments (0)
Change History (8)
comment:1 Changed on 01/10/2019 at 10:50:06 PM by mjethani
- Status changed from new to reviewing
comment:2 Changed on 01/10/2019 at 11:05:51 PM by mjethani
- Blocking 7000 added
comment:3 Changed on 01/15/2019 at 02:19:37 PM by abpbot
comment:4 Changed on 01/16/2019 at 06:29:48 PM by mjethani
- Description modified (diff)
- Resolution set to fixed
- Status changed from reviewing to closed
comment:7 Changed on 02/07/2019 at 03:24:02 AM by abpbot
A commit referencing this issue has landed:
Issue 7208 - Drop superfluous wildcards before processing pattern
comment:8 Changed on 03/12/2019 at 05:58:49 AM by rscott
- Verified working set
A commit referencing this issue has landed:
Issue 7208 - Drop superfluous wildcards before processing pattern