Opened 3 months ago

Last modified 6 weeks ago

#5568 reviewing change

Improve string handling by splitting read strings into only ASCII and non-ASCII strings

Reported by: sergz Assignee: sergz
Priority: P2 Milestone:
Module: Libadblockplus Keywords:
Cc: fhd, hfiguiere Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29531696/

Description

V8 does support internally one byte character strings and we should use this feature in order to optimize a bit memory usage.

Currently when we read a file (see readFromFile in io.js) and pass its content into JS to go through each line V8 finds there non-ASCII characters and creates a regular UCS-2 JavaScript string, namely two bytes string. In addition when we work with a string, e.g. create a new one which is a substring of the original string but copied internally by V8, the new string has the same type, namely one byte string or two bytes string. It may be also valid for some transformations like changing the case of letters.

For example, in EasyList subscription there are only few lines with non-ASCII characters. So, we could save some memory if we managed to advice V8 when the line is one byte character string. Since we are not allowed to shuffle the lines in the read file it seems we should implement readFromFile from io.js in C++. However one should pay attention to the performance, e.g. one should consider whether the calling of listener from C++ is too expensive or not (e.g., profile how it performs on android, of course compiled in release).

In addition one should consider how to use it with downloaded data.

Change History (5)

comment:1 Changed 3 months ago by sergz

  • Owner set to sergz

comment:2 Changed 3 months ago by sergz

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:4 follow-up: Changed 6 weeks ago by oleksandr

Should this be closed? I understand it has landed already.

comment:5 in reply to: ↑ 4 Changed 6 weeks ago by sergz

Replying to oleksandr:

Should this be closed? I understand it has landed already.

I would consider it as not finished yet because the downloaded data are still one big 2 bytes per character string.

Note: See TracTickets for help on using tickets.