Opened on 11/05/2015 at 03:54:49 PM
Closed on 11/10/2017 at 11:15:18 AM
#3273 closed change (rejected)
Extend telemetry data by additional information
| Reported by: | mario | Assignee: | |
|---|---|---|---|
| Priority: | Unknown | Milestone: | |
| Module: | Adblock-Plus-for-Firefox | Keywords: | 2016q1 |
| Cc: | saroyanm, trev, matze | Blocked By: | #394 |
| Blocking: | | Platform: | Firefox |
| Ready: | no | Confidential: | no |
| Tester: | Unknown | Verified working: | no |
| Review URL(s): | | | |
Description (last modified by saroyanm)
Background
#495 introduces "Telemetry", formally known as "Filter Hit Statistics". #394 covers the client-side implementation, i.e. collecting and regularly sending telemetry data to our backend after an explicit opt-in.
As soon as #394 has landed, the following changes should be implemented on the client side.
By sending additional information (described in "What to change"), the following requirements can be met:
- Further improve anonymization of the data.
- Identify filter lists where none of the filters actually hit.
- Identify frequently visited domains with an unusually high number of filter hits in order to improve filter rules.
- Identify the environment's locale.
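As a rough illustration of how the backend could evaluate two of these requirements, here is a minimal sketch. It assumes a payload that carries the "filterListSubscriptions", "domains" and "filters" attributes proposed in this ticket, and it assumes that the keys under "firstParty" are the domains of the pages on which the filter hit. The helper names findUnusedSubscriptions and hitsPerImpression and the list URLs in the sample data are made up for illustration; none of this is existing code.

```javascript
// Hypothetical helper: subscriptions the user has, but whose filters
// never show up as a source under "filters".
function findUnusedSubscriptions(payload)
{
  let used = new Set();
  for (let text of Object.keys(payload.filters))
    for (let url of payload.filters[text].subscriptions || [])
      used.add(url);
  return payload.filterListSubscriptions.filter(url => !used.has(url));
}

// Hypothetical helper: first-party filter hits per page impression,
// for each visited domain.
function hitsPerImpression(payload)
{
  let hits = {};
  for (let text of Object.keys(payload.filters))
  {
    let firstParty = payload.filters[text].firstParty || {};
    for (let domain of Object.keys(firstParty))
      hits[domain] = (hits[domain] || 0) + firstParty[domain].hits;
  }

  let result = {};
  for (let domain of Object.keys(payload.domains))
    result[domain] = (hits[domain] || 0) / payload.domains[domain].pages;
  return result;
}

// Sample data; the two list URLs are placeholders.
let samplePayload = {
  filterListSubscriptions: ["https://a.example/list.txt", "https://b.example/list.txt"],
  domains: {"example.com": {pages: 4}, "example.org": {pages: 12}},
  filters: {
    "||example.com^": {
      firstParty: {"example.com": {hits: 12, latest: 123456789}},
      thirdParty: {},
      subscriptions: ["https://a.example/list.txt"]
    }
  }
};
let unusedLists = findUnusedSubscriptions(samplePayload);
let hitRatios = hitsPerImpression(samplePayload);
console.log(unusedLists, hitRatios);
```

A domain with a high ratio would be a candidate for a closer look at its filter rules, and a list that shows up in unusedLists for many users might not be pulling its weight.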
What to change
- Add a new attribute to the JSON on root level called "filterListSubscriptions" which includes an array of all subscribed filter lists.
"filterListSubscriptions": ["https://easylist-downloads.adblockplus.org/easylist.txt", "https:// ..."]
- Add a new attribute to the JSON on root level called "domains" which includes an object of all visited domains containing the number of page impressions within this domain.
"domains": {
"example.com": {
"pages": 143 // Number of page impressions within this domain
},
"example.org": {"pages": 12}
}
- Add a new attribute to the JSON on root level called "appLocale" which describes the browser's locale.
"appLocale": "en-US",
This is an example of the full JSON format containing the changes described above.
{
"version": 1, // For the server to recognize outdated clients
"timeSincePush": 12345, // UTC Time interval (seconds in 1h-steps) since previous push
"addonName": "adblockplus", // see require("info")
"addonVersion": "2.3.4", // see require("info")
"application": "firefox", // see require("info")
"applicationVersion": "31", // see require("info")
"platform": "gecko", // see require("info")
"platformVersion": "31", // see require("info")
"appLocale": "en-US", // see Utils.appLocale (actually ABP locale)
"filterListSubscriptions": ["https://easylist-downloads.adblockplus.org/easylist.txt", "https:// ..."], // All filter list subscriptions
"domains": {
"example.com": {
"pages": 143 // Number of page impressions within this domain
},
"example.org": {"pages": 12}
},
"filters": {
"||example.com^": {
"firstParty": {
"example.com": {
"hits": 12, // Number of hits
"latest": 123456789 // UTC Time interval of last hit (in 1h-steps)
},
"example.org": {"hits": 4, "latest": 987654321}
},
"thirdParty": {
"example.com": {"hits": 5, "latest": 123455489}
},
"subscriptions": ["https://easylist-downloads.adblockplus.org/easylist.txt", "https:// ..."] // Subscription source of filter
},
"example.com##foo > bar": {
...
}
}
}
Note: The format might change. For the original JSON format please consult #394.
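For illustration, the three new root-level attributes could be attached to an existing payload roughly as follows. This is only a sketch: the function name extendTelemetry and its parameters are hypothetical, and how the client actually collects subscriptions, page impressions and the locale is not specified in this ticket.

```javascript
// Hypothetical helper: returns a copy of the base payload with the
// three root-level attributes proposed in this ticket added.
function extendTelemetry(payload, subscriptionUrls, pageCounts, locale)
{
  // Wrap each count in an object, as in the proposed "domains" format.
  let domains = {};
  for (let domain of Object.keys(pageCounts))
    domains[domain] = {pages: pageCounts[domain]};

  return Object.assign({}, payload, {
    appLocale: locale,
    filterListSubscriptions: subscriptionUrls.slice(),
    domains: domains
  });
}

let extended = extendTelemetry(
  {version: 1, filters: {}},
  ["https://easylist-downloads.adblockplus.org/easylist.txt"],
  {"example.com": 143, "example.org": 12},
  "en-US"
);
console.log(JSON.stringify(extended, null, 2));
```

Note that the comments in the format above are explanatory only; a payload serialized with JSON.stringify contains none.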
Attachments (0)
Change History (12)
comment:1 in reply to: ↑ description ; follow-up: ↓ 5 Changed on 11/06/2015 at 12:22:56 PM by Kirill
comment:2 Changed on 11/06/2015 at 12:28:40 PM by mario
- Description modified (diff)
You're right. Changed "number of pages loaded" to "number of page impressions" to make this more clear.
comment:3 Changed on 11/19/2015 at 05:58:51 PM by saroyanm
- Cc saroyanm added
comment:4 in reply to: ↑ description ; follow-up: ↓ 6 Changed on 11/19/2015 at 06:12:30 PM by saroyanm
- Cc trev matze added
Replying to mario:
- Add a new attribute to the JSON on root level called "domains" which includes an object of all visited domains containing the number of page impressions within this domain.
"domains": {
"example.com": {
"pages": 143 // Number of page impressions within this domain
},
"example.org": {"pages": 12}
}
Is there a reason for storing the page views as a separate object? Also, why does one object have the "pages" key on its own line and the other not? What about:
"domainViews": {
"example.com": 143, // Number of page impressions within this domain
"example.org": 12
}
comment:5 in reply to: ↑ 1 ; follow-up: ↓ 7 Changed on 11/19/2015 at 06:15:30 PM by saroyanm
Replying to Kirill:
Replying to mario:
"Number of pages" might be misleading. We don't only need to know how many pages a user opened on a domain, but how often he did so. Maybe this is what you meant and what is clear to everyone else, but I wanted to point it out to prevent misunderstandings...
I'm not sure I understand what you mean and how impressions should be calculated. Can you please describe what exactly we need to calculate?
comment:6 in reply to: ↑ 4 Changed on 11/20/2015 at 08:25:59 AM by Kirill
Replying to saroyanm:
Is there a reason for storing the page views as a separate object? Also, why does one object have the "pages" key on its own line and the other not? What about:
"domainViews": {
"example.com": 143, // Number of page impressions within this domain
"example.org": 12
}
The reason is that we had another parameter in there which got removed, but the structure stayed. I like your suggestion, but if we add parameters to domains (like last visited or something different), then we would have to change the structure back to the originally proposed one. I frankly don't know what is better here, a simple or an extensible format...
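One way to soften the simple-versus-extensible question, at least on the receiving side, would be to accept both shapes. The sketch below is only an illustration: normalizeDomains is a hypothetical name, and nothing in this ticket says the backend works this way.

```javascript
// Hypothetical helper: accepts either {"example.com": 143} or
// {"example.com": {"pages": 143}} and always returns the object form,
// which leaves room for additional per-domain fields later.
function normalizeDomains(domains)
{
  let result = {};
  for (let domain of Object.keys(domains))
  {
    let value = domains[domain];
    result[domain] = typeof value == "number" ? {pages: value} : value;
  }
  return result;
}

let fromSimple = normalizeDomains({"example.com": 143, "example.org": 12});
let fromExtensible = normalizeDomains({"example.com": {pages: 143}});
console.log(fromSimple, fromExtensible);
```

With something like this, the client could start with the compact form and switch to the object form once a second per-domain parameter is needed.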
comment:7 in reply to: ↑ 5 Changed on 11/20/2015 at 08:37:38 AM by Kirill
Replying to saroyanm:
I'm not sure I understand what you mean and how impressions should be calculated. Can you please describe what exactly we need to calculate?
What we want is some kind of count of visits to a domain. I'm not really sure how to do it technically.
comment:8 follow-up: ↓ 9 Changed on 11/20/2015 at 12:07:18 PM by trev
Actually, there just shouldn't be a separate "domains" object - this is still data related to filter hits, not some general behavior tracking. In other words:
"||example.com^": {
"firstParty": {
"example.com": {
"hits": 12, // Number of hits
"latest": 123456789, // UTC Time interval of last hit (in 1h-steps)
"pages": 12 // Number of page impressions
},
"example.org": {"hits": 4, "latest": 987654321, "pages": 2}
},
},
comment:9 in reply to: ↑ 8 Changed on 11/23/2015 at 03:23:58 PM by saroyanm
Replying to trev:
Actually, there just shouldn't be a separate "domains" object - this is still data related to filter hits, not some general behavior tracking. In other words:
"||example.com^": {
"firstParty": {
"example.com": {
"hits": 12, // Number of hits
"latest": 123456789, // UTC Time interval of last hit (in 1h-steps)
"pages": 12 // Number of page impressions
},
"example.org": {"hits": 4, "latest": 987654321, "pages": 2}
},
},
What about page impressions on pages where we don't have a hit?
E.g. a user visits the example.com/no-ad page, where there is no ad: should we update the page impressions for each filter that has been hit previously on other pages?
If so, that doesn't sound efficient with the current implementation. I would say we would need to change the data structure to make it efficient in that case, e.g.:
"example.com": {
"firstParty": {
"||example.com^": {
"hits": 12, // Number of hits
"latest": 123456789 // UTC Time interval of last hit (in 1h-steps)
}
},
"thirdParty": {
...
},
"impression": 20
},
"example.org": {
"firstParty": {
"||example.com^": { "hits": 6, "latest": 12345678 }
},
"impression": 30
}
The question is when we need to update the impression count: only when we had a filter hit on the specific domain?
Maybe I just don't understand your proposed structure.
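To make the efficiency argument concrete, here is a minimal sketch of how recording could look with a domain-keyed structure like the one above. All function names are hypothetical and this is not the actual implementation; it only shows that a page impression touches a single counter per domain, regardless of how many filters have hit there before.

```javascript
// Hypothetical helper: returns the entry for a domain, creating it if needed.
function getDomainEntry(stats, domain)
{
  if (!Object.prototype.hasOwnProperty.call(stats, domain))
    stats[domain] = {firstParty: {}, thirdParty: {}, impression: 0};
  return stats[domain];
}

// One counter per domain, independent of the filters that hit on it.
function recordImpression(stats, domain)
{
  getDomainEntry(stats, domain).impression++;
}

// Counts a filter hit on the given page domain; "hour" stands for the
// UTC time interval in 1h-steps used elsewhere in this ticket.
function recordHit(stats, domain, filterText, isThirdParty, hour)
{
  let entry = getDomainEntry(stats, domain);
  let party = isThirdParty ? entry.thirdParty : entry.firstParty;
  if (!Object.prototype.hasOwnProperty.call(party, filterText))
    party[filterText] = {hits: 0, latest: 0};
  party[filterText].hits++;
  party[filterText].latest = hour;
}

let stats = {};
recordImpression(stats, "example.com"); // page with an ad
recordHit(stats, "example.com", "||example.com^", false, 123456789);
recordImpression(stats, "example.com"); // example.com/no-ad, no filter hit
console.log(JSON.stringify(stats, null, 2));
```

With the filter-keyed structure, the second impression would instead require updating "pages" in every filter entry that mentions example.com.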
comment:10 Changed on 02/15/2016 at 04:51:20 PM by mario
- Keywords 2016q1 added; 2015q4 removed
comment:11 Changed on 02/29/2016 at 03:52:27 PM by saroyanm
- Description modified (diff)
Removed the first point, since we decided to implement that in the initial review.
comment:12 Changed on 11/10/2017 at 11:15:18 AM by trev
- Resolution set to rejected
- Status changed from new to closed
Mass-closing all bugs in the Adblock Plus for Firefox module; the codebase of Adblock Plus 3.0 belongs in the Platform and User-Interface modules. Old bugs are unlikely to still apply.

Replying to mario:
"Number of pages" might be misleading. We don't only need to know how many pages a user opened on a domain, but how often he did so. Maybe this is what you meant and what is clear to everyone else, but I wanted to point it out to prevent misunderstandings...