Changes between Version 12 and Version 13 of Ticket #395


Timestamp:
10/07/2014 03:05:38 PM
Author:
Kirill
Comment:

I strongly recommend using MongoDB and dumping the JSON files there. The main advantages are:

  • it is scalable
  • you can just dump JSON there
  • it provides a query and aggregation framework
  • you can do the usual database work on the dumped JSON files (e.g. indexing)
  • it can easily run in a cluster of servers and it supports map-reduce
  • it's open source and very easy to implement
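
To make the points above a bit more concrete, here is a minimal sketch of dumping one submission and aggregating over it with pymongo. Everything specific in it is an assumption made for illustration: the ticket has not settled the wire format, so the `filters` mapping of filter text to hit count, the `hitstats` database name and the `filter_hits` collection name are all hypothetical. The pymongo calls used (`MongoClient`, `create_index`, `insert_many`, `aggregate`) are part of the standard driver API.

```python
import json

# Hypothetical payload shape: the wire format is still an open question,
# so "filters" (filter text -> hit count) is assumed for illustration only.
SAMPLE_DUMP = '{"version": 1, "filters": {"||ads.example.com^": 3, "##.banner": 1}}'


def to_documents(raw_json):
    """Flatten one submitted dump into one document per filter."""
    payload = json.loads(raw_json)
    return [
        {"filter": text, "hits": hits}
        for text, hits in sorted(payload["filters"].items())
    ]


def total_hits_pipeline(limit=10):
    """Build an aggregation pipeline: sum hits per filter, most-hit first."""
    return [
        {"$group": {"_id": "$filter", "hits": {"$sum": "$hits"}}},
        {"$sort": {"hits": -1}},
        {"$limit": limit},
    ]


def store_and_query(raw_json, uri="mongodb://localhost:27017"):
    """Insert a dump and run the aggregation.

    Needs the third-party pymongo package and a reachable MongoDB server.
    """
    from pymongo import MongoClient  # imported lazily so the helpers work without it

    hits = MongoClient(uri)["hitstats"]["filter_hits"]  # hypothetical names
    hits.create_index("filter")  # the "indexing" point from the list above
    hits.insert_many(to_documents(raw_json))
    return list(hits.aggregate(total_hits_pipeline()))
```

Calling `store_and_query(SAMPLE_DUMP)` against a local server would return the summed hit counts per filter. Storing one document per filter rather than one per submission is only one possible layout; it keeps the `$group` stage simple at the cost of more documents.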

  • Ticket #395 – Description

    v12  v13
    1    1    === Background ===
    2         This ticket is part of the [Filter Hit Statistics Tool #495](https://issues.adblockplus.org/ticket/495) group. We're aiming to cut down on unused filters by allowing users to opt-in and share their filter usage data with us. This ticket is related to the back-end database and related APIs.
         2    See #495.
    3    3
    4    4    === What to change ===

    19   19
    20   20   Server should be set up in the infrastructure repository with Puppet scripts etc. It will be a dedicated server called "hitstats".
    21
    22        === Questions ===
    23         - What storage should we use for the raw data? We want to balance simplicity and speed against the usefulness of the format. E.g. MongoDB might be more useful for querying but be complex to maintain or struggle under a large load.
    24         - What kind of authentication are we going to use to prevent outside people from performing the queries?
    25         - What exact format will the browser send the data to the server in?
    26         - What querying of the raw data would be useful?