Opened 9 months ago

Closed 8 months ago

Last modified 6 months ago

#7179 closed change (fixed)

Add function to return all matching filters

Reported by: mjethani Assignee:
Priority: P2 Milestone:
Module: Core Keywords:
Cc: greiner, kzar, arthur, sebastian Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: yes
Review URL(s):

https://codereview.adblockplus.org/29990555/

Description (last modified by mjethani)

Background

After #7089, matching for CSP filters should be a lot faster, because they are kept in their own type-specific map. It should be more efficient now to find all matching CSP filters.

What to change

In lib/matcher.js, add a new search method to CombinedMatcher that returns two arrays of blocking and whitelist filters. The signature should be exactly same as the matchesAny method, but additionally it should take a filterType parameter, which can be "blocking" (only blocking filters), "whitelist" (only whitelist filters), or "all" (all filters, default).

The result cache could be reused by adding a * to the key.

Integration notes

The new search method can be used to find all CSP filters. The optimal way to do this would be in the following manner (pseudocode):

let match = defaultMatcher.matchesAny(...);
if (!match)
  return;

if (match instanceof WhitelistFilter)
{
  logRequest(...);
  return;
}

let {blocking} = defaultMatcher.search(..., "blocking");
for (let filter of blocking)
{
  responseHeaders.push(...);
  logRequest(...);
}

return responseHeaders;

i.e. if the first hit is a blocking filter, only then do a full search to find all matching blocking filters.

Hints for testers

This is not yet integrated into adblockpluschrome. Please see the dependency update issue for testing hints.

Change History (15)

comment:1 Changed 9 months ago by mjethani

  • Description modified (diff)

comment:2 follow-up: Changed 8 months ago by greiner

  • Cc greiner kzar arthur added

Considering the recent discussion about whether or not to show a whiltelist filter in the DevTools panel if no blocking filter applies, I was thinking that it could be quite useful to show the user all filters that match a given request rather than only the one that ended up winning over all the others.

For instance, for a request with the URL http://www.example.com/ad.gif it could show the following list of filters:

  • ^ad.gif^
  • ||example.com^$image
  • @@.gif^$image
  • @@||example.com^$document (applied)

@arthur Do you think such a feature would help filter authors when creating/debugging filters?

@mjethani Would such a feature become feasible with this change or is it only meant to work with CSP filters?
What order would those filters be in?

comment:3 Changed 8 months ago by arthur

@greiner
Sounds useful actually e. g. when you want to remove a filter and you don't know if another filter causes the same issue when it has been removed.

comment:4 follow-up: Changed 8 months ago by mjethani

@greiner in general filter matching is more efficient now, but I meant this to be used only for CSP filters, because there are very few of them, they are in their own list in memory (faster to iterate), and CSP filters are not like other blocking filters in that it makes sense to inject multiple CSP headers. We could also use this for other filters but only in the case when DevTools is open.

It also depends on how we define a filter hit and what is displayed in the DevTools panel. Should we show all matching filters or only a "hit"? If we go for the former, we should also show all whitelist filters even if there are no blocking filters. If we choose the latter, the current behavior seems right.

comment:5 Changed 8 months ago by mjethani

  • Cc sebastian added

comment:6 in reply to: ↑ 4 Changed 8 months ago by greiner

Replying to mjethani:

We could also use this for other filters but only in the case when DevTools is open.

Yep, was also thinking that. No need for that overhead when our DevTools panel is not loaded.

Should we show all matching filters or only a "hit"?

The ideal scenario would be showing an ordered list of all filters (incl. whitelist filters) where the first one is the one that we ended up picking (i.e. the filter hit). However, an unordered list with some marker to indicate which of those got picked would already be great, I think.

comment:7 in reply to: ↑ 2 ; follow-ups: Changed 8 months ago by mjethani

Replying to greiner:

What order would those filters be in?

Sorry I missed this part. The filters used to be in random order, but we have removed that randomization now (#7178). They should be in order now, as they appear in the subscriptions.

Having said this, the order in which the filters are matched is based on the order of the "keywords" in the URL. While this is deterministic, it would be hard for a filter list author to reason about it. Also I don't think Core should provide any guarantees about the order, because that would lock us in.

However, an unordered list with some marker to indicate which of those got picked would already be great, I think.

Technically there is no single filter that is picked if we're going to look up all matching filters. I don't know if the marker makes sense in this case.

comment:8 in reply to: ↑ 7 Changed 8 months ago by mjethani

Replying to mjethani:

Replying to greiner:

However, an unordered list with some marker to indicate which of those got picked would already be great, I think.

Technically there is no single filter that is picked if we're going to look up all matching filters. I don't know if the marker makes sense in this case.

Alright, we still can't count all the filters as "hits" even if we are displaying all of them. In that case, it should be the first whitelist filter or the first blocking filter in the list. I think then a marker makes sense.

comment:9 in reply to: ↑ 7 Changed 8 months ago by greiner

Replying to mjethani:

Also I don't think Core should provide any guarantees about the order, because that would lock us in.

Agreed. The only ordering we can guarantee is that whitelist filters always come before blacklist filters which is most likely sufficient to indicate to users what will happen (i.e. request will be blocked or not blocked) after the currently matching filter gets removed/disabled.

That way we may not need to indicate which filter got hit since that's an implementation detail anyway that users shouldn't care about.

comment:10 Changed 8 months ago by mjethani

  • Component changed from Unknown to Core

comment:11 Changed 8 months ago by mjethani

  • Review URL(s) modified (diff)

comment:12 Changed 8 months ago by abpbot

A commit referencing this issue has landed:
Issue 7179 - Add function to return all matching filters

comment:13 Changed 8 months ago by mjethani

  • Description modified (diff)
  • Priority changed from Unknown to P2
  • Ready set
  • Resolution set to fixed
  • Status changed from new to closed

comment:14 Changed 8 months ago by abpbot

A commit referencing this issue has landed:
Issue 7179 - Add function to return all matching filters

comment:15 Changed 6 months ago by rscott

  • Verified working set
Note: See TracTickets for help on using tickets.