#6861 closed change (rejected)

Do not escape opening brackets in middle of filter text while serializing

Reported by: mjethani Assignee:
Priority: P4 Milestone:
Module: Core Keywords:
Cc: sergz, kzar, hfiguiere, sebastian Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29856560/

Description (last modified by mjethani)

Background

Currently the serializeFilters function in lib/subscriptionClasses.js escapes any occurrence of an opening bracket in a filter (e.g. ##a[href^="http://ads."]) with a backslash (\). At the time of parsing patterns.ini, the INIParser class unescapes the opening bracket before parsing the filter.

Escaping of an opening bracket is relevant only when it occurs at the beginning of a line optionally preceded only by whitespace; therefore, this escaping is mostly unnecessary.

What to change

In the serializeFilters function in lib/subscriptionClasses.js, replace filter.text.replace(/\[/g, "\\[") with filter.text.replace(/^(\s*)\[/, "$1\\[") or some equivalent.

After a few releases, modify INIParser to only unescape the opening bracket when it occurs at the beginning of the line (optionally preceded by whitespace).

(Question: Should we make both changes at the same time and bump the file format version number instead?)

Change History (11)

comment:1 Changed 15 months ago by mjethani

  • Description modified (diff)
  • Status changed from new to reviewing

comment:2 Changed 15 months ago by mjethani

  • Review URL(s) modified (diff)

comment:3 Changed 15 months ago by mjethani

  • Cc sergz kzar hfiguiere sebastian added

comment:4 Changed 15 months ago by mjethani

  • Description modified (diff)

comment:5 Changed 15 months ago by mjethani

  • Cc Ross rscott added

comment:6 Changed 15 months ago by mjethani

  • Blocking 6538 removed
  • Cc Ross rscott removed
  • Description modified (diff)
  • Keywords circumvention removed
  • Priority changed from P2 to P4

My bad! It's obviously not a problem for filters because the escaping and unescaping is symmetrical (I got confused by my local changes). Nevertheless we don't have to escape every opening bracket, only the ones that occur at the beginning of a line. This would likely speed up the loading of the initial subscriptions in INIParser.

comment:7 Changed 15 months ago by mjethani

  • Description modified (diff)

comment:8 follow-up: Changed 15 months ago by sergz

After a few releases, modify INIParser to only unescape the opening bracket when it occurs at the beginning of the line (optionally preceded by whitespace).

BTW, what will happen with custom filters?

(Question: Should we make both changes at the same time and bump the file format version number instead?)

It seems that the version should be bumped when we start to save and support filters in the new format.

Additionally FYI some of our partners have a very long update delay and if it's not a huge burden it would be great to keep the backward compatibility to a relatively longer time.

comment:9 in reply to: ↑ 8 Changed 15 months ago by mjethani

Replying to sergz:

After a few releases, modify INIParser to only unescape the opening bracket when it occurs at the beginning of the line (optionally preceded by whitespace).

BTW, what will happen with custom filters?

Sorry, what do you mean by custom filters?

(Question: Should we make both changes at the same time and bump the file format version number instead?)

It seems that the version should be bumped when we start to save and support filters in the new format.

Additionally FYI some of our partners have a very long update delay and if it's not a huge burden it would be great to keep the backward compatibility to a relatively longer time.

OK, in that case we can read the version 5 INI file using the old parsing logic and save it as version 6 the next time; this way everyone gets upgraded cleanly. We can leave the version 5 parsing in for a long time.

comment:10 Changed 15 months ago by mjethani

Based on the above discussion and some of my own exploration, it looks like the best way to make any change here is to bump the file format version number and then be able to read the old format and write it to the new format. If we're going to bump the version number, it also gives us the opportunity to make more performance-related changes to the file format, the exact details of which would be a separate discussion.

I am closing this issue now.

comment:11 Changed 15 months ago by mjethani

  • Resolution set to rejected
  • Status changed from reviewing to closed
Note: See TracTickets for help on using tickets.