Changes between Initial Version and Version 1 of Ticket #395, comment 28


Timestamp:
10/21/2014 02:10:59 PM (4 years ago)
Author:
kzar
Comment:

  • Ticket #395, comment 28

    initial v1  
    1 Going to edit this comment to avoid filter. 
     1 So quick update for everyone, as it's been a while: 
     2 
     3 - I've made [a simple API that logs the data](https://github.com/kzar/adblockplus-hitstats-backend) as specced in the other ticket. @manvel, if you want to play with it so far to test that data is sent OK, [instructions for setting up a VM are here](https://github.com/kzar/adblockplus-infrastructure/tree/495-hitstats). 
     4 - I wasted several days naively trying to put _all_ the raw data into Elasticsearch to allow @kirill to use the data freely as requested. Unfortunately there's just too much data. I generated some dummy data, a fraction of the expected volume, and Elasticsearch died quite a death! 
     5 - In case it helps anyone, I've attached a little bit of dummy data above ^^ which may be useful for messing around with. 
     6 - I'm now working on a script to aggregate the data. To start with I'll try the geometric mean approach @trev described. @kirill, if you have better ideas for how to aggregate the data in ways that will be useful for you, let me know and I can try to incorporate them into the script too. I'm hoping this could still be useful to you even though the raw data is not directly query-able as you originally wanted. 
     7 - @c.dommers I want to make this easy to use for you guys, so if you have any suggestions I'd like to hear them. Unfortunately, as I mentioned, making all the raw data query-able is just not feasible for now, but I'm still aiming to make something that benefits biz-dev.
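
For anyone curious what aggregating with a geometric mean might look like, here is a rough sketch. It is not the actual script, and the approach @trev described isn't spelled out in this comment, so the record shape (pairs of filter text and hit count) and the names `geometric_mean` and `aggregate_hits` are assumptions made up for illustration.

```python
import math
from collections import defaultdict


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logs to avoid overflow."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean is only defined for positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def aggregate_hits(records):
    """Group (filter_text, hits) pairs by filter and reduce each group.

    The record shape is hypothetical; the real hit stats format may differ.
    """
    by_filter = defaultdict(list)
    for filter_text, hits in records:
        by_filter[filter_text].append(hits)
    return {f: geometric_mean(h) for f, h in by_filter.items()}
```

Compared with an arithmetic mean, the geometric mean is pulled around less by a single client reporting a huge hit count, which is presumably part of its appeal for this kind of data.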