Opened 5 years ago

Closed 3 months ago

#1920 closed change (rejected)

Named domain arrays for filter lists

Reported by: Lain_13
Assignee:
Priority: Unknown
Milestone:
Module: Core
Keywords: filtersyntax, closed-in-favor-of-gitlab
Cc: greiner, mapx, sergz, hfiguiere
Blocked By:
Blocking:
Platform: Unknown
Ready: no
Confidential: no
Tester: Unknown
Verified working: no
Review URL(s):

Description (last modified by greiner)

Problem:

We periodically need to maintain multiple different filters for the same list of domains. One of the best examples is the Acceptable Ads list, since more than 90% of it consists of rules repeated for the same domains. This is done to avoid whitelisting by a wildcard, which would almost certainly be abused.

Solution:

Declare a named array of domains and use it in filters.

Example:

@@||www.google.com^$elemhide,~third-party
||www.google.com/images/phd/px.gif
@@||www.google.ag^$elemhide,~third-party
||www.google.ag/images/phd/px.gif
@@||www.google.ba^$elemhide,~third-party
||www.google.ba/images/phd/px.gif
@@||www.google.ca^$elemhide,~third-party
||www.google.ca/images/phd/px.gif
@@||www.google.co.cr^$elemhide,~third-party
||www.google.co.cr/images/phd/px.gif
@@||www.google.co.uk^$elemhide,~third-party
||www.google.co.uk/images/phd/px.gif

...and so on for all known Google domains.

Replace with:

[google-first-level]$domain=com|ag|ba|ca|co.cr|co.uk|...
@@||www.google.[google-first-level]^$elemhide,~third-party
||www.google.[google-first-level]/images/phd/px.gif

Such arrays must also be available in $domain= lists and in hiding filters.

Example:

@@||google.com/uds/$subdocument,document,domain=opendi.at|opendi.de|opendi.ch|opendi.be|opendi.ca|opendi.cl|opendi.co|opendi.co.id|...

Replace with:

[opendi-first-level]$domain=at|de|ch|be|ca|cl|co|co.id|...
@@||google.com/uds/$subdocument,document,domain=opendi.[opendi-first-level]|stadtbranchenbuch.ch|...
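
For illustration, the expansion implied by the two examples above could be sketched roughly like this in Python. This is only a sketch under assumptions: the function names are hypothetical, the `[name]$domain=...` declaration syntax is taken from the examples, a placeholder in the URL pattern yields one filter per value, and a placeholder inside a domain= option is inlined into the same filter. Hiding filters get no special handling here.

```python
import re

# Assumed declaration syntax from the proposal: [name]$domain=a|b|c
DECLARATION = re.compile(r"^\[([\w-]+)\]\$domain=(.+)$")
PLACEHOLDER = re.compile(r"\[([\w-]+)\]")


def parse_arrays(lines):
    """Split lines into named arrays and ordinary filters."""
    arrays, filters = {}, []
    for line in lines:
        match = DECLARATION.match(line)
        if match:
            arrays[match.group(1)] = match.group(2).split("|")
        else:
            filters.append(line)
    return arrays, filters


def substitute(text, arrays):
    """Return every string obtained by replacing known [name] placeholders."""
    for match in PLACEHOLDER.finditer(text):
        values = arrays.get(match.group(1))
        if values is not None:
            head, tail = text[:match.start()], text[match.end():]
            return [head + value + rest
                    for value in values
                    for rest in substitute(tail, arrays)]
    return [text]  # no known placeholder: leave the text untouched


def expand_filter(text, arrays):
    """Expand one filter into plain filters that need no array support."""
    pattern, sep, options = text.partition("$")
    if sep:
        expanded = []
        for option in options.split(","):
            if option.startswith("domain="):
                entries = option[len("domain="):].split("|")
                inlined = [d for e in entries for d in substitute(e, arrays)]
                option = "domain=" + "|".join(inlined)
            expanded.append(option)
        options = ",".join(expanded)
    # A placeholder in the pattern produces one filter per array value.
    return [p + sep + options for p in substitute(pattern, arrays)]
```

Running `expand_filter` over every non-declaration line would give a list without any array syntax, which is one way to produce a version that older clients can still read.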

Impact:

Such a feature will break compatibility with old ABP versions and third-party extensions.

Solution:

Generate different versions of the filter list. Send the filter list syntax version in the download URL. If the syntax is extended again in the future, the number of lists to generate will grow linearly.

Example:
https://easylist-downloads.adblockplus.org/exceptionrules.txt?v=1

If a version is specified, serve the list with the features available for that version. If it is not specified, serve the maximally compatible version (filters auto-generated from arrays and so on).
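
A minimal Python sketch of that selection logic, assuming the version arrives as the `v` query parameter shown in the example URL. The variant file names, the meaning of version 1, and the rule of falling back to the newest variant not above the requested version are all hypothetical choices made for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping from syntax version to a pre-generated list variant.
VARIANTS = {
    1: "exceptionrules.v1.txt",  # assumed: the variant that uses named arrays
}
# Assumed name of the maximally compatible variant (arrays already expanded).
COMPATIBLE = "exceptionrules.txt"


def select_variant(url):
    """Pick which generated list to serve for a download URL."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("v", [None])[0]
    if raw is None:
        return COMPATIBLE  # no version sent: old client or third-party extension
    try:
        version = int(raw)
    except ValueError:
        return COMPATIBLE  # unparseable version: stay on the safe side
    # Serve the newest variant that the requested version can understand.
    supported = [v for v in VARIANTS if v <= version]
    return VARIANTS[max(supported)] if supported else COMPATIBLE
```

Each future syntax extension would add one entry to the mapping, which matches the linear growth mentioned above.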

We may want to consider maintaining a couple of globally available variables for popular domains to make them useful in all filter lists and user filters.

Change History (8)

comment:1 Changed 5 years ago by Lain_13

Additionally, this will make it possible to maintain large domain lists for filters without losing the hit statistics for those filters.

For example, I have the following filters:
/images/banners/*$domain=...
...##.banner

Each of them has a large list of domains in place of the ..., and the hit statistics are lost every time a domain is added or removed.
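
To illustrate why that happens, here is a small Python sketch. It assumes that hit statistics are keyed by the exact filter text, which would explain the loss described above; the `HitCounter` class, the `[banner-sites]` array name and the example domains are made up for the illustration.

```python
class HitCounter:
    """Toy hit statistics, assumed to be keyed by the exact filter text."""

    def __init__(self):
        self.hits = {}

    def record(self, filter_text):
        self.hits[filter_text] = self.hits.get(filter_text, 0) + 1

    def count(self, filter_text):
        return self.hits.get(filter_text, 0)


counter = HitCounter()

# Inline domain list: editing the list changes the filter text itself.
old_filter = "/images/banners/*$domain=a.example|b.example"
counter.record(old_filter)
counter.record(old_filter)
new_filter = old_filter + "|c.example"  # one domain added
lost = counter.count(new_filter)        # the edited filter starts from zero

# Named array: the filter text stays the same when the array changes.
array_filter = "/images/banners/*$domain=[banner-sites]"
counter.record(array_filter)
kept = counter.count(array_filter)      # survives edits to [banner-sites]
```

With the array version only the declaration line changes, so the filter that carries the statistics keeps its identity.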

comment:2 Changed 5 years ago by greiner

  • Component changed from Unknown to Core
  • Description modified (diff)
  • Keywords filtersyntax added
  • Owner set to trev

If I understand you correctly, you want to have support for filter text variables. Is that correct?

Currently, filters are independent of each other. Whitelisting is a case in which they're not: there, one filter can interfere with another. However, because whitelisting filters always have a higher priority than other filters, there's no conflict when different filters target the same resource, and the result doesn't depend on the order in which the filters were specified.

With variables, however, it's unclear which value a variable should have when it's declared multiple times (e.g. across different filter lists).

comment:3 Changed 5 years ago by greiner

  • Cc greiner added
  • Owner trev deleted

Not sure why it automatically set the owner to trev so I reverted that.

comment:4 Changed 5 years ago by Lain_13

Yes, I'd like to have variables of at least array type.

I guess variables with the same name should merge.

Example:
[var]$domain=ru|ua
[var]$domain=com|us

Should turn into:
[var]$domain=ru|ua|com|us
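
A rough Python sketch of that merge rule, in case it helps. The function name and the input shape (pairs of name and value list, in declaration order) are assumptions, and dropping duplicate values is an extra choice not stated above.

```python
def merge_declarations(declarations):
    """Merge same-named arrays, keeping values in first-seen order.

    `declarations` is an iterable of (name, values) pairs in the order
    they were declared, possibly coming from different filter lists.
    """
    merged = {}
    for name, values in declarations:
        bucket = merged.setdefault(name, [])
        for value in values:
            if value not in bucket:  # assumed: duplicates are dropped
                bucket.append(value)
    return merged


merged = merge_declarations([
    ("var", ["ru", "ua"]),
    ("var", ["com", "us"]),
])
# merged["var"] is now ["ru", "ua", "com", "us"]
```

Because merging only ever adds values, the result does not depend on which list declared the variable, only the order of the values does.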

Obviously, we shouldn't give them short names, so that they don't merge unless that's intentional. Intentional merging could be a good idea for a "##.banner" hiding rule, for example. So, just avoid short generic names like the one in the example above and there will be no problems. It could also be a good idea to give variables a prefix, like [el_varName] for EasyList variables, to make sure they won't be reused accidentally even if the rest of the name is short and generic.

BTW, extending the $domain functionality for that is probably the wrong idea. "[var]=list" should be enough.

Last edited 5 years ago by Lain_13

comment:5 Changed 5 years ago by mapx

  • Cc mapx added
  • Verified working unset

comment:6 Changed 23 months ago by sergz

  • Cc sergz hfiguiere kzar added
  • Tester set to Unknown

comment:7 Changed 23 months ago by kzar

  • Cc kzar removed

comment:8 Changed 3 months ago by sebastian

  • Keywords closed-in-favor-of-gitlab added
  • Resolution set to rejected
  • Status changed from new to closed

Sorry, but we switched to GitLab. If this issue is still relevant, please file it again in the new issue tracker.
