Opened on 06/07/2016 at 03:19:39 PM

Closed on 08/29/2019 at 05:43:52 PM

Last modified on 10/08/2019 at 05:47:44 PM

#4128 closed change (rejected)

[emscripten] Convert parsing patterns.ini to C++

Reported by: trev Assignee: sergz
Priority: P2 Milestone:
Module: Core Keywords: closed-in-favor-of-gitlab
Cc: Blocked By: #5137, #5153
Blocking: #4122 Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29548581/

Description (last modified by trev)

Background

See #4122 for the rationale. We should parse patterns.ini in C++ for better performance.

What to change

Create compiled/storage/parser.cpp providing a Parser class (should be exposed to JavaScript as _FilterStorage_Parser). The parser should allocate a 64kB (?) input buffer, with the pointer to the buffer being returned by its buffer property and its size by the bufferSize property. FilterStorage.importData (JavaScript, located in lib/filterStorage.js) should copy the incoming lines to the buffer, followed by a line break. Once there is no longer enough space in the buffer for the next line it should call Parser.process(numChars). Upon receiving a null string (finalization), FilterStorage.importData should make the parser process data still in the buffer and call Parser.finalize() then - this will replace existing subscriptions.

Notes

  • FilterStorage.importData() needs to expose _FilterStorage_Parser.delete() - it should delete the parser automatically upon finalization but the parser needs to be deleted explicitly if parsing is abandoned. Maybe the callback needs an abort() method.
  • The parser needs to keep a list with references to filters it processed. With the file format currently listing filters before subscriptions, the filters might get released otherwise and the changes of filter properties will be lost.
  • It is no longer feasible to prevent changes of filter/subscription properties before finalization. This should be ok as long as the list of subscriptions stays unchanged until finalization.
  • Methods FilterStorage.loadFromDisk() and FilterStorage.restoreBackup() can stay in JavaScript, without any changes.

Attachments (0)

Change History (11)

comment:1 Changed on 04/14/2017 at 06:44:53 AM by trev

  • Blocked By 5137 added

comment:2 Changed on 04/14/2017 at 07:01:25 AM by trev

  • Description modified (diff)
  • Summary changed from [emscripten] Convert parsing/serializing patterns.ini to C++ to [emscripten] Convert parsing patterns.ini to C++

comment:3 Changed on 04/14/2017 at 07:12:04 AM by trev

  • Description modified (diff)

comment:4 Changed on 04/15/2017 at 02:33:15 PM by trev

  • Description modified (diff)

comment:5 Changed on 04/18/2017 at 01:12:54 PM by trev

  • Blocked By 5153 added

comment:6 Changed on 09/13/2017 at 02:40:29 PM by sergz

Replying to trev:

Note also that this isn't blocking #5137, it's the other way round.

It is indeed blocking #5137. FilterStorage is using the parser, so it's impossible to implement FilterStorage, thus accomplish #5137, without the parser. On the other hand, one can implement the parser without FilterStorage, use it anywhere, e.g. in tests, or not use, and close the current issue.
So, I think that what is Blocked By here should actually be Blocking.

comment:7 Changed on 09/13/2017 at 03:53:41 PM by sergz

If we still use INI-like file format and already have lines, then why should one fiddle with the input buffer of the parser? I would say that the INI-like parser should have void Process(const String& line) and void Finalize() methods.

FilterStorage.importData() needs to expose _FilterStorage_Parser.delete() - it should delete the parser automatically upon finalization but the parser needs to be deleted explicitly if parsing is abandoned. Maybe the callback needs an abort() method.

Could you please explain how the parsing can be abandoned? It's natural to expect that while an instance of FilterStorage exists it's responsible to not allow several parsing processes. If the parsing process is asynchronous then both on a new parsing request FilterStorage and on destroying of FilterStorage, the latter should stop the currently running parsing process.

The parser needs to keep a list with references to filters it processed. With the file format currently listing filters before subscriptions, the filters might get released otherwise and the changes of filter properties will be lost.
It is no longer feasible to prevent changes of filter/subscription properties before finalization. This should be ok as long as the list of subscriptions stays unchanged until finalization.

Could you please give more details for these both sections?

comment:8 Changed on 09/13/2017 at 06:54:56 PM by sergz

  • Owner set to sergz

comment:9 Changed on 09/18/2017 at 07:20:43 PM by sergz

  • Review URL(s) modified (diff)

comment:10 Changed on 03/06/2019 at 09:59:56 AM by roseepeter

spam

Last edited on 10/08/2019 at 05:47:44 PM by kzar

comment:11 Changed on 08/29/2019 at 05:43:52 PM by sebastian

  • Keywords closed-in-favor-of-gitlab added
  • Resolution set to rejected
  • Status changed from new to closed

Sorry, but we switched to GitLab. If this issue is still relevant, please file it again in the new issue tracker.

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from sergz.
 
Note: See TracTickets for help on using tickets.