Changes between Version 17 and Version 19 of Ticket #6877


Timestamp:
09/19/2018 02:10:05 PM (2 years ago)
Author:
kvas
Comment:

I have added notes about implementation details and API impact to the description of the issue.

  • Ticket #6877

    • Property Status changed from reviewing to closed
    • Property Resolution set to fixed
  • Ticket #6877 – Description

    v17 v19  
    2424=== Expected behaviour === 
    2525Only the first line should be parsed as a header, the second header-like line should be considered a filter (this is how the extension behaves). 
     26 
     27=== Implementation notes === 
     28We have further observed that ABP: 
     291. treats any first line that contains a fragment that [https://hg.adblockplus.org/adblockpluscore/file/19020ded7d88/lib/synchronizer.js#l149 resembles a valid header] as a valid header and 
     302. rejects the filter list if the first line of it is not a valid header. 
     31 
     32This means that a line like "foo[Adblock]bar" is considered a valid header by ABP, so parsing it as a filter in python-abp is wrong (and it will break metadata parsing). A more relaxed approach to header parsing is needed to make such lines parse as headers. 
     33 
     34We also can't simply adopt (2) and reject files that don't start with a header, because fragments don't have headers. However, we would still like to be helpful and detect invalid headers that ABP would reject. We have decided to consider the first line of a file an attempt at a header if it contains a substring that starts with "[" and ends with "]". If such a line does not constitute a valid header, we raise an exception. 
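
The first-line rule described above can be sketched roughly as follows. This is an illustration only, not the python-abp implementation: the `VALID_HEADER` pattern is an assumed approximation of the check in the linked `synchronizer.js`, and `InvalidHeader` and `classify_first_line()` are hypothetical names made up for this example.

```python
import re

# Assumed approximation of the header fragment ABP looks for in the first line.
VALID_HEADER = re.compile(r'\[Adblock(?:\s*Plus\s*([\d.]+)?)?\]', re.IGNORECASE)

# Any substring that starts with "[" and ends with "]" counts as a header attempt.
HEADER_ATTEMPT = re.compile(r'\[.*\]')


class InvalidHeader(ValueError):
    """Hypothetical error: the first line looks like a header but is not valid."""


def classify_first_line(line):
    """Classify the first line of a file as 'header' or 'not-header'.

    Raises InvalidHeader when the line is an attempt at a header
    (contains "[...]") but does not contain a valid header fragment.
    """
    if VALID_HEADER.search(line):
        return 'header'
    if HEADER_ATTEMPT.search(line):
        raise InvalidHeader('invalid header: {!r}'.format(line))
    # No brackets at all: probably a fragment without a header.
    return 'not-header'
```

Under these assumptions, `classify_first_line('foo[Adblock]bar')` is treated as a header, a bracket-free first line such as `||example.com^` is left for normal parsing, and a line like `[Foo]` raises `InvalidHeader`.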
     35 
     36=== API implications === 
     37The parsing API of python-abp consists of `parse_filterlist()` and `parse_line()`. The first one simply does the right thing, but we also tried to preserve the ability of `parse_line()` to parse all parts of the filter list. It now has a second argument, `position`, which allows the caller to indicate which part of the filter list the line comes from, so that headers and metadata are only parsed where they can actually occur.
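
To illustrate how a caller might track `position`, here is a self-contained toy sketch. It does not use python-abp itself: `sketch_parse_line()` and `sketch_parse_filterlist()` are hypothetical stand-ins, the position names `'start'`, `'metadata'` and `'body'` are assumptions, and the patterns are simplified. The invalid-header exception is left out for brevity.

```python
import re

# Simplified, assumed patterns; the real parser is more thorough.
HEADER = re.compile(r'\[Adblock(?:\s*Plus\s*([\d.]+)?)?\]', re.IGNORECASE)
METADATA = re.compile(r'^!\s*(\w[\w\s]*?)\s*:\s*(.*)$')


def sketch_parse_line(line, position='body'):
    """Parse one line; headers and metadata are only recognized where allowed."""
    line = line.strip()
    # Headers can only appear on the very first line.
    if position == 'start' and HEADER.search(line):
        return ('header', line)
    # Metadata can only appear at the top, before the body starts.
    if position in ('start', 'metadata'):
        match = METADATA.match(line)
        if match:
            return ('metadata', match.group(1), match.group(2))
    if line.startswith('!'):
        return ('comment', line[1:].strip())
    if not line:
        return ('empty',)
    return ('filter', line)


def sketch_parse_filterlist(lines):
    """Parse all lines, moving from 'start' through 'metadata' to 'body'."""
    position = 'start'
    for line in lines:
        parsed = sketch_parse_line(line, position)
        yield parsed
        # Stay in the metadata section only while headers/metadata keep coming.
        position = 'metadata' if parsed[0] in ('header', 'metadata') else 'body'
```

With this sketch, a second header-like line is returned as a filter, and a `! Key: value` line that follows it is returned as a plain comment, because by then the position has moved on to the body.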