[meta] Filter Hit Statistics Tool
|Cc:||famlam, mapx, fiaefuzz||Blocked By:||#394, #395, #396, #2220|
Description (last modified by philll)
We currently have to ship a growing file with updated filter data. It contains a lot of unused or outdated filters, either because ad implementations changed or because websites have been shut down. We are carrying a lot of "waste" in this file, which slowly decreases performance.
What to change
We need a tool which extracts relevant data from a large enough sample of ABP users and then analyses filter hit statistics over the whole sample. In the end, the tool should show us which filters can be removed (because of near-zero usage).
Users should be able to opt in to this and send sufficiently anonymized hit statistics to us. We also need to decide whether/how we will ask users to opt in.
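To make the analysis step concrete, here is a minimal sketch of how hit counts from many submissions could be summed and filters with near-zero usage listed. Everything in it is an assumption for illustration: the submission format (a mapping from filter text to hit count), the function names `aggregate_hits` and `removal_candidates`, and the `threshold` parameter are hypothetical and not part of any existing ABP code.

```python
from collections import Counter


def aggregate_hits(submissions):
    """Sum per-filter hit counts over all submissions.

    Each submission is assumed to be a dict mapping filter text to a
    non-negative hit count (hypothetical format).
    """
    totals = Counter()
    for submission in submissions:
        for filter_text, hits in submission.items():
            totals[filter_text] += hits
    return totals


def removal_candidates(known_filters, totals, threshold=0):
    """Return the known filters whose total hit count is <= threshold."""
    return sorted(f for f in known_filters if totals.get(f, 0) <= threshold)


# Made-up sample data: two user submissions and three shipped filters.
submissions = [
    {"||ads.example.com^": 12, "##.banner": 3},
    {"||ads.example.com^": 7},
]
known_filters = ["||ads.example.com^", "##.banner", "||defunct.example^"]

totals = aggregate_hits(submissions)
print(removal_candidates(known_filters, totals, threshold=3))
```

Filters that never appear in any submission count as zero hits, which is why the list of shipped filters is passed in separately rather than derived from the submissions.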
Things to consider:
- We are currently only saving filter hit statistics in Firefox, so implementing this in Firefox first is the most logical choice.
- Private browsing mode: we aren't saving any hit statistics there, meaning that we probably don't want to show the opt-in option to users who use private browsing mode permanently. The same goes for users who disabled hit statistics altogether via the "Count filter hits" option in Firefox.
- Clearing browsing history also clears hit statistics, meaning that we probably don't want to show the opt-in option for users clearing history on shutdown either.
- Moving filter hit statistics out of patterns.ini and into a separate file might be a good idea; the hit counts are already responsible for much of the file size there.
- Sebastian suggested using the Nginx Upload Module to receive data. IMHO that's premature optimization; we are unlikely to get so many submissions that a regular FCGI script cannot handle them.
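As a rough illustration of the "regular script" option from the last point, the sketch below shows a small WSGI application that accepts a POSTed submission and hands it to a storage callback. It is only a sketch under stated assumptions: the JSON payload (an object mapping filter text to hit count), the size limit `MAX_BODY_BYTES`, and the names `parse_submission` and `make_app` are hypothetical. Running it behind FCGI would additionally need a WSGI-to-FastCGI adapter, which is not shown here.

```python
import json

MAX_BODY_BYTES = 1024 * 1024  # assumed upper bound for one submission


def parse_submission(body):
    """Decode and validate a JSON object of filter text -> hit count."""
    data = json.loads(body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for filter_text, hits in data.items():
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(hits, bool) or not isinstance(hits, int) or hits < 0:
            raise ValueError("invalid hit count for %r" % filter_text)
    return data


def make_app(store):
    """Build a WSGI app that passes valid submissions to store()."""

    def bad_request(start_response, message):
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [message]

    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed", [("Allow", "POST")])
            return [b""]
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0
        if length <= 0 or length > MAX_BODY_BYTES:
            return bad_request(start_response, b"bad content length\n")
        try:
            submission = parse_submission(environ["wsgi.input"].read(length))
        except ValueError:
            # Covers malformed JSON, invalid UTF-8, and failed validation.
            return bad_request(start_response, b"invalid submission\n")
        store(submission)
        start_response("204 No Content", [])
        return []

    return app
```

For local experiments, `make_app(received.append)` with a plain list `received` is enough to collect submissions in memory; a real backend would write them somewhere durable instead.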
|#394||reviewing||Collect filter hit statistics||Adblock-Plus-for-Firefox||saroyanm|
|#395||closed||fixed||Filter hit statistics backend||Sitescripts||kzar|
|#396||new||[abp-backend] Add a filter hit statistics backend server||Infrastructure|