Opened 2 years ago

Last modified 12 months ago

#6923 closed change

Only parse metadata from special comments at the top of the file — at Version 8

Reported by: sebastian Assignee: sebastian
Priority: P2 Milestone:
Module: Core Keywords:
Cc: mjethani, kvas, Lain_13, dimisa, scuturic Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Ross Verified working: yes
Review URL(s):

https://codereview.adblockplus.org/29873570
https://codereview.adblockplus.org/29879940

Description (last modified by sebastian)

Background

In python-abp (the Python library used to parse and compile filter lists), we now have the problem that without hard-coding a redundant list of keys supported in special comments by Adblock Plus (which we'd like to avoid), there is no way to reliably tell if a comment is a special comment, since filter lists can contain comments like ! WARNING: or ! Note:. See #6920.

However, in practice filter lists put all special/metadata comments right after the filter list header at the top of the file. The only exception were ! Checksum: comments but those are no longer supported (see #6849).

Moreover, requiring special/metadata comments to be given at the top, not only makes it easier for tools like python-abp to reliably identify them, it also makes the parsing in Adblock Plus more efficient, since we can bail early and no longer need to check every single line in the filter list whether it might contain metadata.

What to change

Stop parsing special/metadata comments in lib/synchronizer.js as soon as we hit the first line after the header that isn't a non-empty comment.

Change History (8)

comment:1 Changed 2 years ago by sebastian

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:2 Changed 2 years ago by mjethani

  • Priority changed from Unknown to P2

comment:3 Changed 2 years ago by mjethani

@sebastian do you think it makes sense to ask the filter list authors before making this change?

comment:4 Changed 2 years ago by kvas

  • Cc kvas added

comment:5 Changed 2 years ago by sebastian

  • Cc Lain_13 dimisa scuturic added

Lain, dimesa and Sasa, any objections, requiring metadata to be given at the top of the filter list (like specified in this issue)? Feel free to add other filter list authors that might concern this?

comment:6 Changed 2 years ago by kvas

I we should define more precisely in the ticket what signals the end of the metadata for us (we have de facto definition from the code, but maybe we want to be a bit more relaxed there). The obvious options I see:

  1. Anything that doesn't look like a metadata comment (see the regexp in line 165),
  2. An empty line or a filter.

The currently proposed implementation is the first option but second option would allow having some comments interspersed with the metadata lines at the top. Does this sound useful to anyone?

comment:7 Changed 2 years ago by sebastian

  • Description modified (diff)

I made it explicit in the issue description. But no, I don't think that we should make it more relaxed. Looking at existing filter lists, I don't see any example of (non-special) comments in between metadata, but having comments follow the metadata (with no empty line in between) seem quite common.

comment:8 Changed 2 years ago by sebastian

  • Description modified (diff)
Note: See TracTickets for help on using tickets.