Opened 4 months ago

Closed 4 months ago

Last modified 2 months ago

#7208 closed change (fixed)

Drop superfluous wildcards before processing pattern

Reported by: mjethani Assignee: mjethani
Priority: P2 Milestone:
Module: Core Keywords:
Cc: Blocked By:
Blocking: #7000 Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: yes
Review URL(s):

https://codereview.adblockplus.org/29978576/

Description (last modified by mjethani)

Background

In #7052 we started doing string-based matching for literal patterns. About half of all blocking/whitelist filters were found to match the criteria. It turns out there are still a number of filters (~1,500 in EasyList) that would match the criteria were it not for a superfluous trailing wildcard at the end of the pattern. e.g. /_/ads/*. The reason the wildcard exists is so that the pattern is not interpreted as a regular expression.

We can drop the superfluous wildcards before considering whether a pattern is literal or not.

What to change

In the RegExpFilter constructor strip out any leading and trailing wildcards before deciding how to treat the pattern.

Hints for testers

Make sure filters with a trailing wildcard work correctly. For example, the filter /foo/bar/* should block the URLs https://example.com/foo/bar/index.html and https://example.com/foo/bar/, but not the URL https://example.com/foo/bar (no slash at the end).

The same applies to filters with leading wildcards. The filter */foo/bar/ should also block the first two URLs but not the third one.

If multiple wildcards are present (e.g. ***/foo/bar/**), it should still work as if there is only one wildcard.

If there is no wildcard at the beginning nor at the end of the pattern, it should be treated as a regular expression correctly (provided it's surrounded by slashes). For example, the pattern /\bfoo\/ba[rt]\b/ should correctly block https://example.com/foo/bar/ and https://example.com/foo/bat?id=1 but not https://example.com/foo/ba[rt]/index.html.

This should also work for filters containing options (e.g. /foo/bar/*$third-party) and whitelist filters (e.g. @@*/foo/bar/).

Change History (8)

comment:1 Changed 4 months ago by mjethani

  • Status changed from new to reviewing

comment:2 Changed 4 months ago by mjethani

  • Blocking 7000 added

comment:3 Changed 4 months ago by abpbot

A commit referencing this issue has landed:
Issue 7208 - Drop superfluous wildcards before processing pattern

comment:4 Changed 4 months ago by mjethani

  • Description modified (diff)
  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:5 Changed 4 months ago by mjethani

  • Description modified (diff)

comment:6 Changed 4 months ago by mjethani

  • Description modified (diff)

comment:7 Changed 3 months ago by abpbot

A commit referencing this issue has landed:
Issue 7208 - Drop superfluous wildcards before processing pattern

comment:8 Changed 2 months ago by rscott

  • Verified working set
Note: See TracTickets for help on using tickets.