Changes between Version 7 and Version 10 of Ticket #6647

05/06/2018 02:41:33 PM (17 months ago)


  • Ticket #6647

    • Property Type changed from defect to change
    • Property Component changed from Core to Platform
    • Property Owner changed from mjethani to sebastian
    • Property Summary changed from Filters containing Unicode variation selectors do not match to Stop converting domains from punycode to unicode
  • Ticket #6647 – Description

    v7 v10  
    1 === Environment === 
    2 Adblock Plus 3.0.3 
     1=== Background === 
    4 === How to reproduce === 
    5  1. Add the filter `❤️` 
    6  2. Visit 
     3In Adblock Plus we go to considerable lengths to convert the domains in the URLs reported by the browser from punycode (e.g. ``) to unicode (e.g. `i❤.ws`), so that filters can be written using the unicode representation (rather than punycode). This, however, comes with a performance penalty, while the benefits for filter list authors are questionable. 
    8 === Observed behaviour === 
    9 Subresources like are not blocked 
     5The original idea was that it feels more natural for filter list authors to spell out IDN domains in their native alphabet (rather than bothering with an obscure representation like punycode). However, while the address bar may (or may not) render the domain in the native alphabet, all domains appear in punycode encoding as soon as one inspects the DOM, the source code or the HTTP requests. 
    11 === Expected behaviour === 
    12 Subresources like should get blocked 
     7Furthermore, things become particularly confusing with unicode characters that can be composed in different ways, resulting in different punycode, but looking the same when rendered as unicode (e.g. `❤️` vs `❤`). 
    14 === Notes === 
    15 Unlike the blue heart, the green heart, and others, the red heart is composed of two Unicode code points for historical reasons. The actual domain name is in fact i❤.ws. 
     9=== What to change === 
     11Replace `stringifyURL(url)` with `url.href`, and replace `getDecodedHostname(url)` with `url.hostname`. As a result, IDN (non-ASCII) domains given in a filter's pattern or `$domain` option are expected to be in punycode encoding.
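
To illustrate the effect of the proposed change, here is a minimal sketch assuming `url` is a standard WHATWG `URL` object, as the use of `url.href` and `url.hostname` above suggests. The helpers `hostnameForMatching` and `isAscii` are hypothetical names made up for this example and are not part of the Adblock Plus code base; the domain `bücher.example` is likewise only an illustration.

```javascript
// Hypothetical helper (not an Adblock Plus function): returns the hostname
// exactly as the URL parser reports it, without decoding it to unicode.
function hostnameForMatching(urlString) {
  return new URL(urlString).hostname;
}

// Hypothetical helper: true if the text consists of ASCII characters only.
function isAscii(text) {
  return /^[\x00-\x7F]*$/.test(text);
}

// The URL parser already converts IDN hostnames to their punycode form.
const url = new URL("https://bücher.example/path?q=1");
console.log(url.hostname); // "xn--bcher-kva.example"
console.log(url.href);     // "https://xn--bcher-kva.example/path?q=1"

// A filter domain written in unicode no longer equals the reported hostname,
// whereas the punycode spelling does.
console.log(url.hostname === "bücher.example");        // false
console.log(url.hostname === "xn--bcher-kva.example"); // true
console.log(isAscii(hostnameForMatching(url.href)));   // true
```

Under these assumptions, a filter that spells the domain in its native alphabet would stop matching, so such filters would need to use the punycode spelling instead.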