Opened on 09/24/2018 at 11:29:11 AM
Closed on 10/04/2018 at 02:54:58 PM
Last modified on 05/29/2019 at 10:52:07 AM
#6976 closed defect (fixed)
[python-abp] Filters that contain square brackets in the first line of a filter list fragment break rendering
Reported by: | kvas | Assignee: | kvas |
---|---|---|---|
Priority: | P2 | Milestone: | |
Module: | Sitescripts | Keywords: | |
Cc: | sebastian, atudor, arthur | Blocked By: | |
Blocking: | Platform: | Unknown / Cross platform | |
Ready: | yes | Confidential: | no |
Tester: | Unknown | Verified working: | no |
Review URL(s): |
Description
Environment
Easylist before this commit.
How to reproduce
Try to render easylist using flrender from python-abp.
Observed behaviour
$ flrender -i easylist=. easylist.template output/easylist.txt Traceback (most recent call last): File "/home/travis/virtualenv/python2.7.14/bin/flrender", line 11, in <module> load_entry_point('python-abp==0.0.1', 'console_scripts', 'flrender')() File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/render_script.py", line 67, in main for line in lines: File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/renderer.py", line 132, in _remove_checksum for line in lines: File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/renderer.py", line 102, in _process_timestamps for line in lines: File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/renderer.py", line 91, in _process_includes for line in all_included: File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/renderer.py", line 76, in _process_includes for line in lines: File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/parser.py", line 340, in parse_filterlist parsed_line = parse_line(line, position) File "/home/travis/virtualenv/python2.7.14/lib/python2.7/site-packages/abp/filters/parser.py", line 299, in parse_line raise ParseError('Malformed header', line) abp.filters.parser.ParseError: Malformed header in "@@/cdn-cgi/pe/bag2?r[]=*ads.exoclick.com$xmlhttprequest,domain=hd-porn.me" The command "flrender -i easylist=. easylist.template output/easylist.txt" exited with 1.
Expected behaviour
There should be no exception and the filter list should be generated.
Explanation and notes on the fix
With the fix for #6877 I have tried to be extra helpful to the filter authors and detect broken headers early (when parsing the fragment that has them). This seems to have backfired: if a filter list fragment starts with a filter that has square brackets, python-abp thinks that the line tries to be a header and throws an invalid header exception.
I think we should chill out a bit about being so helpful and just use the old regex with search. This way if the first line contains a valid header, it will be parsed as such, otherwise it will be considered a filter. A slightly wrong header would be parsed as a filter (and all the metadata following it would become comments) but I would argue that this doesn't create any serious problems.
There are two possibilities in the slightly broken header case:
- Slightly broken header is at the beginning of the root fragment, so it would be the first line in the output. In this case we would really like to detect that the header is broken to avoid producing a filter list that ABP will reject. The parser won't help us after the proposed changes but the renderer will catch it and still refuse to produce a broken filter list.
- Slightly broken header is at the beginning of a non-root fragment. In this case the output will include the slightly broken header because it will be considered a filter. The output will also include any following metadata because it will be considered normal comments and not stripped. We will end up with an extra filter and some metadata comments in the middle of the filter list. The metadata comments will be ignored by the extension because they come after a filter. As for the filter, it seems that it's very unlikely that an accidental filter that was intended to be a header will match anything. We will end up with some comments and one useless filter in the list which doesn't sound too worrying.
In either case there's no immediate danger. In addition to this, in the second case, any fragment that has a header at the beginning is likely to also be used as a standalone filter list. Otherwise why would it have a header? If the standalone list is rendered via python-abp, the rendering will break and if it's not rendered, ABP will ignore the list when loading it. Either way the broken header is likely to be discovered soon enough and fixed.
Attachments (0)
Change History (6)
comment:1 Changed on 09/24/2018 at 11:52:44 AM by sebastian
comment:2 Changed on 10/04/2018 at 01:14:58 PM by kvas
- Component changed from Unknown to Sitescripts
comment:3 Changed on 10/04/2018 at 02:11:30 PM by sebastian
- Owner set to kvas
comment:4 Changed on 10/04/2018 at 02:11:41 PM by sebastian
- Review URL(s) modified (diff)
- Status changed from new to reviewing
comment:5 Changed on 10/04/2018 at 02:50:16 PM by abpbot
A commit referencing this issue has landed:
Issue 6976 - Ignore invalid headers instead of crashing
comment:6 Changed on 10/04/2018 at 02:54:58 PM by kvas
- Resolution set to fixed
- Status changed from reviewing to closed
I don't have any better idea.