Opened on 11/05/2015 at 03:54:49 PM
Closed on 11/10/2017 at 11:15:18 AM
#3273 closed change (rejected)
Extend telemetry data by additional information
Reported by: | mario | Assignee: | |
---|---|---|---|
Priority: | Unknown | Milestone: | |
Module: | Adblock-Plus-for-Firefox | Keywords: | 2016q1 |
Cc: | saroyanm, trev, matze | Blocked By: | #394 |
Blocking: | | Platform: | Firefox |
Ready: | no | Confidential: | no |
Tester: | Unknown | Verified working: | no |
Review URL(s): |
Description (last modified by saroyanm)
Background
#495 introduces "Telemetry", formerly known as "Filter Hit Statistics". #394 covers the client-side implementation, i.e. collecting and regularly sending telemetry data to our backend after an explicit opt-in.
As soon as #394 has landed, the following changes should be implemented on the client side.
By sending additional information (described in "What to change"), the following requirements can be met:
- Further improve anonymization of the data.
- Identify filter lists where none of the filters actually hit.
- Identify frequently visited domains with an unusually high number of filter hits in order to improve filter rules.
- Identify the environment's locale.
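As a rough illustration of how the second and third requirements might be evaluated on the backend, the sketch below works on a single payload in the format given under "What to change". The helper names, the second subscription URL, and the assumption that only filters with at least one hit appear under "filters" are made up for illustration and are not taken from #394.

```javascript
// Hypothetical backend-side helpers operating on one telemetry payload.

// Subscriptions the client has, but which no reported filter names as its source.
// Assumes "filters" only lists filters that were hit at least once.
function unusedSubscriptions(payload) {
  const used = new Set();
  for (const filter of Object.values(payload.filters || {})) {
    for (const url of filter.subscriptions || []) {
      used.add(url);
    }
  }
  return (payload.filterListSubscriptions || []).filter(url => !used.has(url));
}

// Filter hits per page impression for each visited domain.
function hitsPerPage(payload) {
  const hits = new Map();
  for (const filter of Object.values(payload.filters || {})) {
    for (const party of ["firstParty", "thirdParty"]) {
      for (const [domain, stats] of Object.entries(filter[party] || {})) {
        hits.set(domain, (hits.get(domain) || 0) + stats.hits);
      }
    }
  }
  const ratios = {};
  for (const [domain, info] of Object.entries(payload.domains || {})) {
    ratios[domain] = info.pages > 0 ? (hits.get(domain) || 0) / info.pages : 0;
  }
  return ratios;
}

// Sample payload; "https://example.net/unused.txt" is a made-up URL.
const samplePayload = {
  filterListSubscriptions: [
    "https://easylist-downloads.adblockplus.org/easylist.txt",
    "https://example.net/unused.txt"
  ],
  domains: { "example.com": { pages: 143 }, "example.org": { pages: 12 } },
  filters: {
    "||example.com^": {
      firstParty: {
        "example.com": { hits: 12, latest: 123456789 },
        "example.org": { hits: 4, latest: 987654321 }
      },
      thirdParty: { "example.com": { hits: 5, latest: 123455489 } },
      subscriptions: ["https://easylist-downloads.adblockplus.org/easylist.txt"]
    }
  }
};

console.log(unusedSubscriptions(samplePayload));
console.log(hitsPerPage(samplePayload));
```

What counts as an "unusually high" ratio is left open here; a threshold would have to be chosen from real data.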
What to change
- Add a new attribute to the JSON on root level called "filterListSubscriptions" which includes an array of all subscribed filter lists.
"filterListSubscriptions": ["https://easylist-downloads.adblockplus.org/easylist.txt", "https:// ..."]
- Add a new attribute to the JSON on root level called "domains" which includes an object of all visited domains containing the number of page impressions within this domain.
"domains": {
  "example.com": {
    "pages": 143 // Number of page impressions within this domain
  },
  "example.org": {"pages": 12}
}
- Add a new attribute to the JSON on root level called "appLocale" which describes the browser's locale.
"appLocale": "en-US",
This is an example of the full JSON format containing the changes described above.
{
  "version": 1, // For the server to recognize outdated clients
  "timeSincePush": 12345, // UTC Time interval (seconds in 1h-steps) since previous push
  "addonName": "adblockplus", // see require("info")
  "addonVersion": "2.3.4", // see require("info")
  "application": "firefox", // see require("info")
  "applicationVersion": "31", // see require("info")
  "platform": "gecko", // see require("info")
  "platformVersion": "31", // see require("info")
  "appLocale": "en-US", // see Utils.appLocale (actually ABP locale)
  "filterListSubscriptions": ["https://easylist-downloads.adblockplus.org/easylist.txt", "https:// ..."], // All filter list subscriptions
  "domains": {
    "example.com": {
      "pages": 143 // Number of page impressions within this domain
    },
    "example.org": {"pages": 12}
  },
  "filters": {
    "||example.com^": {
      "firstParty": {
        "example.com": {
          "hits": 12, // Number of hits
          "latest": 123456789 // UTC Time interval of last hit (in 1h-steps)
        },
        "example.org": {"hits": 4, "latest": 987654321}
      },
      "thirdParty": {
        "example.com": {"hits": 5, "latest": 123455489}
      },
      "subscriptions": ["https://easylist-downloads.adblockplus.org/easylist.txt", "https:// ..."] // Subscription source of filter
    },
    "example.com##foo > bar": { ... }
  }
}
Note: The format might change. For the original JSON format please consult #394.
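A minimal sketch of how the three new root-level attributes could be attached to an existing payload. The function `extendTelemetry`, its parameters, and the flat `pageCounts` map are assumptions made for illustration; they are not part of the #394 implementation.

```javascript
// Hypothetical helper: returns a copy of the telemetry payload extended with
// the three root-level attributes proposed in this ticket.
function extendTelemetry(payload, subscriptionUrls, pageCounts, appLocale) {
  const domains = {};
  for (const [domain, count] of Object.entries(pageCounts)) {
    // Wrap each counter in an object, matching the proposed "domains" format.
    domains[domain] = { pages: count };
  }
  return Object.assign({}, payload, {
    appLocale: appLocale,
    filterListSubscriptions: Array.from(subscriptionUrls),
    domains: domains
  });
}

const extendedPayload = extendTelemetry(
  { version: 1, filters: {} },
  ["https://easylist-downloads.adblockplus.org/easylist.txt"],
  { "example.com": 143, "example.org": 12 },
  "en-US"
);
console.log(JSON.stringify(extendedPayload, null, 2));
```

Returning a copy instead of mutating the input keeps the original payload usable if the extended fields have to be dropped again.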
Attachments (0)
Change History (12)
comment:1 in reply to: ↑ description ; follow-up: ↓ 5 Changed on 11/06/2015 at 12:22:56 PM by Kirill
comment:2 Changed on 11/06/2015 at 12:28:40 PM by mario
- Description modified (diff)
You're right. Changed "number of pages loaded" to "number of page impressions" to make this more clear.
comment:3 Changed on 11/19/2015 at 05:58:51 PM by saroyanm
- Cc saroyanm added
comment:4 in reply to: ↑ description ; follow-up: ↓ 6 Changed on 11/19/2015 at 06:12:30 PM by saroyanm
- Cc trev matze added
Replying to mario:
- Add a new attribute to the JSON on root level called "domains" which includes an object of all visited domains containing the number of page impressions within this domain.
"domains": {
  "example.com": {
    "pages": 143 // Number of page impressions within this domain
  },
  "example.org": {"pages": 12}
}
Is there a reason for storing the page views as a separate object? Also, why does one object have the "pages" key on its own line and the other not? What about:
"domainViews": {
  "example.com": 143, // Number of page impressions within this domain
  "example.org": 12
}
comment:5 in reply to: ↑ 1 ; follow-up: ↓ 7 Changed on 11/19/2015 at 06:15:30 PM by saroyanm
Replying to Kirill:
Replying to mario:
"Number of pages" might be misleading. We don't only need to know how many pages a user opened on a domain, but how often he did so. Maybe this is what you meant and what is clear to everyone else, but I wanted to point it out to prevent misunderstandings...
Not sure if I understand what you mean and how impressions should be calculated. Can you please describe a bit what exactly we need to calculate?
comment:6 in reply to: ↑ 4 Changed on 11/20/2015 at 08:25:59 AM by Kirill
Replying to saroyanm:
Is there a reason for storing the page views as a separate object? Also, why does one object have the "pages" key on its own line and the other not? What about:
"domainViews": {
  "example.com": 143, // Number of page impressions within this domain
  "example.org": 12
}
The reason is that we had another parameter in there which got removed, but the structure stayed. I like your suggestion, but if we add parameters to domains (like last visited or something different), then we would have to change the structure back to the originally proposed one. I frankly don't know what is better here, a simple or an extensible format...
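One way to defer the simple-versus-extensible decision would be for the receiving side to accept both shapes. The sketch below is only an illustration of that idea; `normalizeDomains` is a hypothetical helper and nothing in this ticket says the backend works this way.

```javascript
// Hypothetical normalizer: accepts either the flat form ("example.com": 143)
// or the extensible form ("example.com": {"pages": 143}) and always returns
// the extensible form.
function normalizeDomains(domains) {
  const result = {};
  for (const [domain, value] of Object.entries(domains)) {
    result[domain] = typeof value === "number"
      ? { pages: value }            // flat counter, wrap it
      : Object.assign({}, value);   // already an object, copy it
  }
  return result;
}

console.log(normalizeDomains({ "example.com": 143, "example.org": { pages: 12 } }));
```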
comment:7 in reply to: ↑ 5 Changed on 11/20/2015 at 08:37:38 AM by Kirill
Replying to saroyanm:
Not sure if I understand what you mean and how impressions should be calculated. Can you please describe a bit what exactly we need to calculate?
What we want is some kind of count of visits to a domain. I'm not really sure how to do it technically.
comment:8 follow-up: ↓ 9 Changed on 11/20/2015 at 12:07:18 PM by trev
Actually, there just shouldn't be a separate "domains" object - this is still data related to filter hits, not some general behavior tracking. In other words:
"||example.com^": {
  "firstParty": {
    "example.com": {
      "hits": 12, // Number of hits
      "latest": 123456789, // UTC Time interval of last hit (in 1h-steps)
      "pages": 12 // Number of page impressions
    },
    "example.org": {"hits": 4, "latest": 987654321, "pages": 2}
  },
},
comment:9 in reply to: ↑ 8 Changed on 11/23/2015 at 03:23:58 PM by saroyanm
Replying to trev:
Actually, there just shouldn't be a separate "domains" object - this is still data related to filter hits, not some general behavior tracking. In other words:
"||example.com^": {
  "firstParty": {
    "example.com": {
      "hits": 12, // Number of hits
      "latest": 123456789, // UTC Time interval of last hit (in 1h-steps)
      "pages": 12 // Number of page impressions
    },
    "example.org": {"hits": 4, "latest": 987654321, "pages": 2}
  },
},
What about page impressions on pages where we don't have a hit?
E.g.: a user visits the example.com/no-ad page, where there is no ad. Should we update the page impression for each filter that has been hit previously on other pages?
If so, that doesn't sound efficient with the current implementation. I would say we will need to change the data structure to make it efficient in that case, e.g.:
"example.com": {
  "firstParty": {
    "||example.com^": {
      "hits": 12, // Number of hits
      "latest": 123456789 // UTC Time interval of last hit (in 1h-steps)
    }
  },
  "thirdParty": { ... },
  "impression": 20
},
"example.org": {
  "firstParty": {
    "||example.com^": { "hits": 6, "latest": 12345678 }
  },
  "impression": 30
}
The question is when we need to update the impression: only in case we had a filter hit on the specific domain?
Maybe I just don't understand your proposed structure.
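To make the efficiency point concrete, here is a minimal sketch of an in-memory recorder using the domain-keyed layout above. The class `DomainStats` and its method names are hypothetical, and the sketch assumes that an impression is counted on every page load, which is exactly the open question.

```javascript
// Hypothetical recorder for the domain-keyed layout.
class DomainStats {
  constructor() {
    // Prototype-less objects, so keys like "constructor" cannot collide.
    this.data = Object.create(null);
  }

  entry(domain) {
    if (!this.data[domain]) {
      this.data[domain] = {
        firstParty: Object.create(null),
        thirdParty: Object.create(null),
        impression: 0
      };
    }
    return this.data[domain];
  }

  // A single counter update per page load, whether or not any filter hits.
  recordImpression(domain) {
    this.entry(domain).impression++;
  }

  // hourStamp is the time of the hit, already rounded to 1h-steps by the caller.
  recordHit(domain, filterText, isThirdParty, hourStamp) {
    const bucket = this.entry(domain)[isThirdParty ? "thirdParty" : "firstParty"];
    if (!bucket[filterText]) {
      bucket[filterText] = { hits: 0, latest: 0 };
    }
    bucket[filterText].hits++;
    bucket[filterText].latest = Math.max(bucket[filterText].latest, hourStamp);
  }
}

const recorder = new DomainStats();
recorder.recordImpression("example.com");                         // page with an ad
recorder.recordHit("example.com", "||example.com^", false, 123456789);
recorder.recordImpression("example.com");                         // example.com/no-ad
console.log(JSON.stringify(recorder.data, null, 2));
```

With this layout a page without hits touches one counter, whereas the filter-keyed layout would require updating "pages" under every filter previously hit on that domain.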
comment:10 Changed on 02/15/2016 at 04:51:20 PM by mario
- Keywords 2016q1 added; 2015q4 removed
comment:11 Changed on 02/29/2016 at 03:52:27 PM by saroyanm
- Description modified (diff)
Removed the first point since we decided to implement that in the initial review.
comment:12 Changed on 11/10/2017 at 11:15:18 AM by trev
- Resolution set to rejected
- Status changed from new to closed
Mass-closing all bugs in the Adblock Plus for Firefox module; the codebase of Adblock Plus 3.0 belongs in the Platform and User-Interface modules. Old bugs are unlikely to still apply.