Opened on 01/10/2019 at 10:49:44 PM

Closed on 01/16/2019 at 06:29:48 PM

Last modified on 03/12/2019 at 05:58:49 AM

#7208 closed change (fixed)

Drop superfluous wildcards before processing pattern

Reported by: mjethani Assignee: mjethani
Priority: P2 Milestone:
Module: Core Keywords:
Cc: Blocked By:
Blocking: #7000 Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: yes
Review URL(s):

https://codereview.adblockplus.org/29978576/

Description (last modified by mjethani)

Background

In #7052 we started doing string-based matching for literal patterns. About half of all blocking/whitelist filters were found to match the criteria. It turns out there are still a number of filters (~1,500 in EasyList) that would match the criteria were it not for a superfluous trailing wildcard at the end of the pattern. e.g. /_/ads/*. The reason the wildcard exists is so that the pattern is not interpreted as a regular expression.

We can drop the superfluous wildcards before considering whether a pattern is literal or not.

What to change

In the RegExpFilter constructor strip out any leading and trailing wildcards before deciding how to treat the pattern.

Hints for testers

Make sure filters with a trailing wildcard work correctly. For example, the filter /foo/bar/* should block the URLs https://example.com/foo/bar/index.html and https://example.com/foo/bar/, but not the URL https://example.com/foo/bar (no slash at the end).

The same applies to filters with leading wildcards. The filter */foo/bar/ should also block the first two URLs but not the third one.

If multiple wildcards are present (e.g. ***/foo/bar/**), it should still work as if there is only one wildcard.

If there is no wildcard at the beginning nor at the end of the pattern, it should be treated as a regular expression correctly (provided it's surrounded by slashes). For example, the pattern /\bfoo\/ba[rt]\b/ should correctly block https://example.com/foo/bar/ and https://example.com/foo/bat?id=1 but not https://example.com/foo/ba[rt]/index.html.

This should also work for filters containing options (e.g. /foo/bar/*$third-party) and whitelist filters (e.g. @@*/foo/bar/).

Attachments (0)

Change History (8)

comment:1 Changed on 01/10/2019 at 10:50:06 PM by mjethani

  • Status changed from new to reviewing

comment:2 Changed on 01/10/2019 at 11:05:51 PM by mjethani

  • Blocking 7000 added

comment:3 Changed on 01/15/2019 at 02:19:37 PM by abpbot

A commit referencing this issue has landed:
Issue 7208 - Drop superfluous wildcards before processing pattern

comment:4 Changed on 01/16/2019 at 06:29:48 PM by mjethani

  • Description modified (diff)
  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:5 Changed on 01/16/2019 at 06:32:56 PM by mjethani

  • Description modified (diff)

comment:6 Changed on 01/16/2019 at 06:35:15 PM by mjethani

  • Description modified (diff)

comment:7 Changed on 02/07/2019 at 03:24:02 AM by abpbot

A commit referencing this issue has landed:
Issue 7208 - Drop superfluous wildcards before processing pattern

comment:8 Changed on 03/12/2019 at 05:58:49 AM by rscott

  • Verified working set

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from mjethani.
 
Note: See TracTickets for help on using tickets.