Opened 14 months ago

Last modified 9 months ago

#6994 closed change

Use shortcut matching for filters with no content type, no domain/sitekey, and no third-party flag — at Version 8

Reported by: mjethani Assignee: jsonesen
Priority: P2 Milestone:
Module: Core Keywords:
Cc: jsonesen Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Ross Verified working: yes
Review URL(s):

https://codereview.adblockplus.org/29907586/
https://codereview.adblockplus.org/29926557/

Description (last modified by jsonesen)

Background

About half of all blocking/whitelist filters have (1) no content type, (2) no domain/sitekey, and (3) no third-party flag. As long as the request's location matches the pattern in the filter, it is a match. For these filters, there should be a "shortcut" version of the RegExpFilter.prototype.matches method that checks only the location. When such filters are added to the matcher, it should keep them separately so they can be processed separately (with the shortcut version) for each call to the _checkEntryMatch method.

What to change

Add a isLocationOnly property to the RegExpFilter class and a new locationFilterByKeyword to the constructor.

Change _checkEntryMatch in lib/matcher.js to initially loop through location only filters checking for matches using location.test(filter.regexp) Additionally, in _checkEntryMatch, if typeMask & RegExpFilter.prototype.contentType == 0 is true, these filters should be skipped entirely.

Change History (8)

comment:1 Changed 14 months ago by mjethani

  • Description modified (diff)

comment:2 Changed 14 months ago by jsonesen

This sounds like a great idea and something that I would like to work on since it will give me a good chance to get into the matcher code a bit more, which, I recently switched to classes.

I know it's an optimization and quite a hot spot so what do you think @mjethani?

comment:3 Changed 14 months ago by mjethani

@jsonesen yes, sounds like a good idea to me.

comment:4 Changed 14 months ago by mjethani

@ jsonesen we should be able to make this change without breaking compatibility (i.e. all unit tests should pass as they are), which makes it difficult but also a little less worrisome than #7003, since this is really about the internal logic of the module.

So I would go ahead and tackle this.

So as I said in the description, "About half of all blocking/whitelist filters have (1) no content type, (2) no domain/sitekey, and (3) no third-party flag", and we already know this at the time when the filter is added (filters are immutable). Let's call these "super fast" filters for the purpose of this discussion. For super fast filters, we can just directly call location.test(filter.regexp) instead of filter.matches(location, ...), which would skip three of the other checks. We just need to find an efficient way to maintain these filters separately. These filters would also have to be index by keyword.

comment:5 Changed 14 months ago by mjethani

  • Cc jsonesen added

comment:6 Changed 14 months ago by jsonesen

Awesome, I will go ahead and start on this.

comment:7 Changed 13 months ago by jsonesen

  • Owner set to jsonesen

comment:8 Changed 13 months ago by jsonesen

  • Description modified (diff)
Note: See TracTickets for help on using tickets.