Opened on 08/24/2017 at 06:13:06 PM
Last modified on 05/23/2018 at 08:39:01 AM
#5568 reviewing change
Improve string handling by splitting read strings into only ASCII and non-ASCII strings
Reported by: | sergz | Assignee: | sergz |
---|---|---|---|
Priority: | P2 | Milestone: | |
Module: | Libadblockplus | Keywords: | |
Cc: | fhd, hfiguiere, mjethani | Blocked By: | |
Blocking: | Platform: | Unknown / Cross platform | |
Ready: | no | Confidential: | no |
Tester: | Unknown | Verified working: | no |
Review URL(s): |
Description
V8 does support internally one byte character strings and we should use this feature in order to optimize a bit memory usage.
Currently when we read a file (see readFromFile in io.js) and pass its content into JS to go through each line V8 finds there non-ASCII characters and creates a regular UCS-2 JavaScript string, namely two bytes string. In addition when we work with a string, e.g. create a new one which is a substring of the original string but copied internally by V8, the new string has the same type, namely one byte string or two bytes string. It may be also valid for some transformations like changing the case of letters.
For example, in EasyList subscription there are only few lines with non-ASCII characters. So, we could save some memory if we managed to advice V8 when the line is one byte character string. Since we are not allowed to shuffle the lines in the read file it seems we should implement readFromFile from io.js in C++. However one should pay attention to the performance, e.g. one should consider whether the calling of listener from C++ is too expensive or not (e.g., profile how it performs on android, of course compiled in release).
In addition one should consider how to use it with downloaded data.
Attachments (0)
Change History (6)
comment:1 Changed on 08/28/2017 at 01:55:54 PM by sergz
- Owner set to sergz
comment:2 Changed on 08/30/2017 at 05:14:22 PM by sergz
- Review URL(s) modified (diff)
- Status changed from new to reviewing
comment:3 Changed on 08/31/2017 at 01:33:12 PM by abpbot
comment:4 follow-up: ↓ 5 Changed on 10/14/2017 at 11:20:56 PM by oleksandr
Should this be closed? I understand it has landed already.
comment:5 in reply to: ↑ 4 Changed on 10/15/2017 at 05:06:04 PM by sergz
Replying to oleksandr:
Should this be closed? I understand it has landed already.
I would consider it as not finished yet because the downloaded data are still one big 2 bytes per character string.
comment:6 Changed on 05/23/2018 at 08:39:01 AM by mjethani
- Cc mjethani added
A commit referencing this issue has landed:
Issue 5568 - Improve string handling by splitting read strings into only ASCII and non-ASCII strings