Opened on 08/03/2018 at 02:38:05 AM

Closed on 08/29/2019 at 05:43:52 PM

Last modified on 10/08/2019 at 06:05:26 PM

#6833 closed change (rejected)

Add support for incremental filter list updates

Reported by: sebastian Assignee: erikvold
Priority: P2 Milestone:
Module: Core Keywords: closed-in-favor-of-gitlab
Cc: erikvold, kzar, rhowell, greiner, mjethani, sergz, sporz Blocked By:
Blocking: Platform: Unknown / Cross platform
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://gitlab.com/eyeo/adblockplus/adblockpluscore/merge_requests/18

Description (last modified by sebastian)

Background

Currently the whole filter list is downloaded when the filter list is updated (e.g. once a day in case of EasyList). We'd like to decrease bandwidth usage while updating filter lists at a higher frequency.

What to change

Implement the client-side support for incremental filter list updates as specified in ​this specification.

Attachments (0)

Change History (28)

comment:1 Changed on 08/06/2018 at 05:28:42 PM by greiner

  • Cc greiner added

comment:2 Changed on 08/07/2018 at 01:03:33 PM by mapx

comment:3 Changed on 08/07/2018 at 06:53:47 PM by erikvold

  • Owner set to erikvold

comment:4 Changed on 08/10/2018 at 09:49:15 AM by kzar

  • Cc mjethani sergz added
  • Description modified (diff)
  • Priority changed from P3 to P2

I think it's better to put some of the details about what we're implementing for this issue in the description. I've had a go, but Sebastian/Rosie feel free to tweak the description if you think I've got something wrong or missed something.

Once the FIXME comments in the description are addressed, I consider this issue ready. So if I'm not around to triage, feel free to mark this as Ready once you've addressed those.

It might make sense to break this down into multiple issues as well, I've not looked into how this will be implemented in practice so I'm not sure. Erik if some of these changes need to happen in adblockpluschrome for example, or if the changes make sense to be split up please file further issues which are blocked by / block this one.

comment:5 Changed on 08/10/2018 at 09:52:53 AM by kzar

  • Description modified (diff)

comment:6 Changed on 08/10/2018 at 09:57:32 AM by kzar

  • Description modified (diff)

comment:7 Changed on 08/10/2018 at 07:37:59 PM by sebastian

Thanks, Dave, but what we have in the issue description now, is a widely incomplete (and partly inaccurate) description of what should be implemented. In order to fix this, I would essentially have to copy the whole protocol section from the spec in here. That's why I decided to rather just refer to the spec, avoiding duplication and potential inconsistencies.

comment:8 Changed on 08/13/2018 at 04:38:32 PM by mjethani

I think it's OK to link to the document from the issue for the purpose of starting work on the issue. Eventually we can either copy the entire final specification into the description or link to the specification hosted at a place that is openly accessible (Google Docs requires a Google account?). I agree that for now it's not worth duplicating information.

In the long run, for features that have large specification documents, we should have a standard place to host the specifications. I don't think Trac is ideal for writing a detailed specification.

I know that the UI team has been using GitLab for this.

comment:9 Changed on 08/13/2018 at 09:39:04 PM by sebastian

  • Description modified (diff)
  • Ready set

Actually this spec document is public, i.e. you don't need to sign into Google when you click the link in order to view it (but in order to comment or edit). In the long-run we might want to convert it to markdown and host in some VCS though, for better version control, like the UI team is doing.

Anyway, since Dave is on vacation, and Manish (who is the module owner for Core) seems to agree, I reverted the issue description and set the issue to ready.

Also, FWIW, as far as I'm concerned, this issue is rather a P3 than a P2, as it's something we agreed to work on, but hasn't been given priority over other development efforts.

comment:10 Changed on 08/27/2018 at 06:13:03 PM by sporz

  • Cc sporz added

comment:11 follow-ups: Changed on 09/06/2018 at 09:06:02 AM by ameshkov

Hello guys,

Incremental updates will be useful to everyone so let me please chime in :)

I've read the spec, and I don't quite understand how are the client applications protected from accidentally downloading a wrong diff. For instance, there might be two different changes during the Diff-Expires interval, and as I understand, the diff file will contain changes for the second change only.

I suggest using something more foolproof. Here is an idea on how it can be improved.

  1. Introduce one more special comment - ! Diff-Head: HERE_GOES_THE_HEAD_COMMIT_OR_VERSION
  2. Use the regular Diff file syntax and indicate the before/after commits (just like the git-diff does)
  3. Let the diff file contain more than just the last change. For instance, it may include diffs of the previous 10-20 changes (concatenated using a special separator).
  4. In this case, the client will be able to locate the exact changes that occurred since the last update check.

It will be easy to build such diff file if filter maintainers use git or svn for version control.
Alternatively, they may use the filter version for the Diff-Head value.

comment:12 in reply to: ↑ 11 ; follow-up: Changed on 09/10/2018 at 06:46:22 PM by rhowell

Replying to ameshkov:

I've read the spec, and I don't quite understand how are the client applications protected from accidentally downloading a wrong diff. For instance, there might be two different changes during the Diff-Expires interval, and as I understand, the diff file will contain changes for the second change only.

I don't believe we will run into this case. While the server-side implementation hasn't been specified yet, there have been some discussions about it (e.g. in the workshop). One option is to keep the original/base filter lists on the server, probably in the same folder as the diff. Then, each time a new filter list is generated, the diffs will be updated to show the difference between the base version and the latest version. Maybe a file structure like:

  • current_filterlist.txt
  • version1/
    • v1_filterlist.txt
    • v1_to_current.diff
  • version2/
    • v2_filterlist.txt
    • v2_to_current.diff

The previous lists will be static, but the diffs will change each time the current filter list changes. So, no matter what previous version the client currently has, the result of applying the diff will be the latest version.

comment:13 in reply to: ↑ 11 Changed on 09/10/2018 at 07:12:34 PM by sebastian

Replying to ameshkov:

  1. Use the regular Diff file syntax and indicate the before/after commits (just like the git-diff does)

That wouldn't be possible with the current implementation of Adblock Plus since we don't save the original/unchanged filter text on the client-side but parse, normalize and store the filters in a slightly different format. Moreover, (without third-party libraries) the diff format we chose is simpler to implement (in particular on the client-side). Also it is more compact.

  1. Let the diff file contain more than just the last change. For instance, it may include diffs of the previous 10-20 changes (concatenated using a special separator).

The diff will include all changes between the last update and the latest version available.

comment:14 in reply to: ↑ 12 Changed on 09/12/2018 at 04:34:36 PM by ameshkov

I guess my confusion was caused by misreading some parts of the "diff discovery" section. For some reason, I was thinking that the diff file is a single static file which location is specified by ! Diff-URL. Clearly, that's not the case, thanks for the explanation.

comment:15 Changed on 10/14/2018 at 09:55:04 PM by erikvold

  • Review URL(s) modified (diff)

comment:17 Changed on 10/31/2018 at 01:33:05 AM by erikvold

  • Blocked By 7085 added

comment:18 Changed on 10/31/2018 at 01:34:42 AM by erikvold

  • Blocked By 7084 added

comment:19 Changed on 10/31/2018 at 01:36:07 AM by erikvold

  • Blocked By 7044 added

comment:20 Changed on 10/31/2018 at 01:36:52 AM by erikvold

  • Blocked By 5449 added

comment:21 Changed on 10/31/2018 at 01:59:02 AM by erikvold

  • Blocked By 7086 added

comment:22 Changed on 10/31/2018 at 12:02:08 PM by kzar

  • Blocked By 5449 removed

comment:23 follow-up: Changed on 10/31/2018 at 10:05:13 PM by sebastian

  • Blocked By 7044, 7084, 7085, 7086 removed

Thanks for filing those issues, however, none of them are blocking this change.

#7044: I left a comment on that issue for clarification, but in case you have to override/extend eslint-config-eyeo for some parts of the code, you can do that through an .eslintrc.json file in the respective folder or through an inline comment in the respective file, along your changes that require to relax the rules. No need to handle that upfront as a separate change.

#7084, #7085: Thanks for filing those issues, but no need to hold back the implementation of incremental filter list updates for some linter improvements.

#7086: Yes, you accidentally introduced a regression in your initial patch, which more complete tests could have avoided. But that change was unrelated and therefore shouldn't have been done in the first place. Anyway, while I think adding more tests for the semantics of metadata in filter lists is a good idea, I don't think that this necessarily has to be done before we can move forward with this issue.

comment:24 in reply to: ↑ 23 Changed on 11/01/2018 at 08:23:22 AM by erikvold

Replying to sebastian:

Thanks for filing those issues, however, none of them are blocking this change.

#7044: I left a comment on that issue for clarification, but in case you have to override/extend eslint-config-eyeo for some parts of the code, you can do that through an .eslintrc.json file in the respective folder or through an inline comment in the respective file, along your changes that require to relax the rules. No need to handle that upfront as a separate change.

I understand we can alter the rules for a project, we did that for the flattr-extension project, I don't think we should handle that change in this issue though. We can figure where the change makes most sense in #7044, and then make a separate commit to resolve it.

#7084, #7085: Thanks for filing those issues, but no need to hold back the implementation of incremental filter list updates for some linter improvements.

It's just much faster in the long run to deal with these issues as they come up instead of avoiding it.

#7086: Yes, you accidentally introduced a regression in your initial patch, which more complete tests could have avoided. But that change was unrelated and therefore shouldn't have been done in the first place. Anyway, while I think adding more tests for the semantics of metadata in filter lists is a good idea, I don't think that this necessarily has to be done before we can move forward with this issue.

I just think it's wise to make sure we have these tests in place before making the changes for this issue. This regression and others could have been caused and not caught, and I don't like landing a patch on that type of situation, this is why we have tests.

Last edited on 11/01/2018 at 04:38:31 PM by erikvold

comment:25 Changed on 01/30/2019 at 06:09:21 AM by erikvold

  • Review URL(s) modified (diff)

comment:26 Changed on 05/03/2019 at 11:23:00 AM by mjethani

  • Ready unset

Sorry it took me this long to look into this.

I am looking at the specification and I think there are questions that we need to resolve before we can move to the next step. I have unset the Ready flag for now.

comment:27 Changed on 06/07/2019 at 02:45:20 AM by suneel

spam

Last edited on 10/08/2019 at 06:05:26 PM by kzar

comment:28 Changed on 08/29/2019 at 05:43:52 PM by sebastian

  • Keywords closed-in-favor-of-gitlab added
  • Resolution set to rejected
  • Status changed from new to closed

Sorry, but we switched to GitLab. If this issue is still relevant, please file it again in the new issue tracker.

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from erikvold.
 
Note: See TracTickets for help on using tickets.