Opened on 07/11/2017 at 12:26:52 AM

Closed on 08/15/2017 at 05:25:35 PM

#5402 closed change (fixed)

Redirect pages based on ?fb_locale

Reported by: juliandoucette Assignee: matze
Priority: P3 Milestone:
Module: Infrastructure Keywords:
Cc: kvas, jsonesen, wspee, ire, saroyanm, matze, ferris, rraceanu Blocked By:
Blocking: #5392 Platform: Unknown / Cross platform
Ready: yes Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29487668

Description (last modified by juliandoucette)

Background

When shared on Facebook, our non-English web pages display English metadata because our canonical URLs and og:urls (rightfully) point to language-neutral paths, which default to the English pages.

I don't know if this is a bug because I don't know if Facebook's scraper is correctly passing the Accept-Language header. But I can provide the following (primitive) test results:

Pages

index.html

<!DOCTYPE html>
<html>
  <head>
    <title>English</title>
    <meta property="og:url" content="http://REDACTED/">
    <meta property="og:locale" content="en_US">
    <meta property="og:locale:alternate" content="de_DE" />
  </head>
  <body>
    English
  </body>
</html>

de/index.html

<!DOCTYPE html>
<html>
  <head>
    <title>German</title>
    <meta property="og:url" content="http://REDACTED/">
    <meta property="og:locale" content="de_DE">
    <meta property="og:locale:alternate" content="en_US">
  </head>
  <body>
    German
  </body>
</html>

Server log

Served by http-server and scraped via Facebook Sharing Debugger.

REDACTED:/var/www# hs -p 80
Starting up http-server, serving ./
Available on:
  http://127.0.0.1:80
  http://REDACTED:80
  http://REDACTED:80
Hit CTRL-C to stop the server
[Tue Jul 11 2017 00:09:19 GMT+0000 (UTC)] "GET /de/" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
[Tue Jul 11 2017 00:09:19 GMT+0000 (UTC)] "GET /" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
[Tue Jul 11 2017 00:09:19 GMT+0000 (UTC)] "GET /?fb_locale=de_DE" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"

Based on this, it seems that we can at least redirect based on the ?fb_locale parameter.

What to change

Redirect web pages based on ?fb_locale parameter.

e.g.

/chrome?fb_locale=de_DE => /de/chrome

Note the difference between de_DE and /de/ (see #5392).
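
For illustration, the mapping in the example above could be sketched as follows. This is a minimal sketch and not the deployed server configuration: the function name redirect_target is hypothetical, and it assumes the site's language directory is simply the part of fb_locale before the underscore, which may not hold for every locale (see #5392).

```python
from urllib.parse import urlsplit, parse_qs


def redirect_target(url):
    """Return the language-specific path for a URL carrying ?fb_locale, else None."""
    parts = urlsplit(url)
    values = parse_qs(parts.query).get("fb_locale")
    if not values:
        return None
    # Assumption: "de_DE" maps to the "/de/" directory by dropping the region part.
    language = values[0].split("_", 1)[0].lower()
    if not language.isalpha():
        return None
    return "/" + language + parts.path


print(redirect_target("/chrome?fb_locale=de_DE"))  # /de/chrome
```

A request without the parameter yields None, so such requests would be handled as before.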

Attachments (0)

Change History (16)

comment:1 Changed on 07/11/2017 at 12:29:57 AM by juliandoucette

  • Blocking 1274 added

comment:2 Changed on 07/11/2017 at 06:55:25 AM by matze

If "our Non-English web pages are displaying English meta data", how does establishing a redirect like /en/chrome?fb_locale=de_DE => /de/chrome help?

comment:3 Changed on 07/11/2017 at 11:04:04 AM by juliandoucette

  • Description modified (diff)

If "our Non-English web pages are displaying English meta data", how does establishing a redirect like /en/chrome?fb_locale=de_DE => /de/chrome help?

I'm sorry. I meant /chrome?fb_locale=de_DE (without en).

But a "redirect" may not be exactly what I'm after. I hope that you can tell me.

The canonical URL that we set on adblockplus.org is language neutral so that search engines point all languages to the same location and let our servers handle redirection based on the Accept-Language header. og:url is supposed to work the same way in combination with og:locale and og:locale:alternate. For example, you share adblockplus.org/de/chrome, and Facebook scrapes the og:url provided on that page, passing that page's og:locale via ?fb_locale. In this way, we can provide language-specific pages to language-neutral requests from Facebook using a canonical og:url.

Of course, we could just include our locale code in the og:url. But then our og:url would not be a canonical URL? And our pages' likes and shares would be language specific (and harder to trend?).

comment:4 follow-up: Changed on 07/11/2017 at 02:04:00 PM by matze

Thank you for the clarification! So you would use og:url similar to a canonical URL designation? That would work quite well. Below please find the list of language and region determining factors in the order of precedence:

1. URL Path
2. HTTP Accept Header
3. Default

When the URL path is canonical, it does not contain a language-designating part. So we obviously end up with English, because either Facebook's Accept header or our default is used, and both are probably the same.

If I understand correctly you want to insert a step that examines the fb_locale query string parameter, if any, at some point. When exactly, considering the order above?

As a side note: doing so before the URL Path is a bit tricky, as the underlying Nginx hacks would not really allow for simple removal of the language-designating part. This is currently a one-way process, and establishing the other direction with built-in tools alone would probably be quite error-prone, i.e. produce false positives.
Examining the fb_locale parameter after the URL parts but before the HTTP header, on the other hand, is rather trivial and would work quite well with the corrected example. Then again, this requires Facebook and similar to visit the canonical path.
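
To make the second option concrete, here is a rough sketch of the precedence with fb_locale examined after the URL path but before the HTTP header. It is an illustration only, not the Nginx configuration: the SUPPORTED set, the DEFAULT value and the function name resolve_language are assumptions, and the header parsing is simplified (it takes languages in listed order and ignores q-values).

```python
from urllib.parse import urlsplit, parse_qs

SUPPORTED = {"en", "de", "fr"}  # hypothetical set of available languages
DEFAULT = "en"                  # assumed default language


def resolve_language(url, accept_language=""):
    """Pick a language: URL path, then ?fb_locale, then header, then default."""
    parts = urlsplit(url)

    # 1. URL path: a leading language directory such as /de/ wins.
    first_segment = parts.path.strip("/").split("/", 1)[0].lower()
    if first_segment in SUPPORTED:
        return first_segment

    # 2. fb_locale query parameter, e.g. de_DE -> de.
    for value in parse_qs(parts.query).get("fb_locale", []):
        language = value.split("_", 1)[0].lower()
        if language in SUPPORTED:
            return language

    # 3. HTTP header, simplified: first supported language in listed order.
    for item in accept_language.split(","):
        language = item.split(";", 1)[0].strip().split("-", 1)[0].lower()
        if language in SUPPORTED:
            return language

    # 4. Default.
    return DEFAULT


print(resolve_language("/chrome?fb_locale=de_DE", "en-US,en;q=0.9"))  # de
```

With this ordering, a request to a path that already contains a language part is unaffected by fb_locale, so only visits to the canonical path change behavior.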

comment:5 in reply to: ↑ 4 Changed on 07/12/2017 at 01:06:25 PM by juliandoucette

Replying to matze:

Thank you for the clarification! So you would use og:url similar to a canonical URL designation?

Yes. og:url should be the same as canonical url.

If I understand correctly you want to insert a step that examines the fb_locale query string parameter, if any, at some point. When exactly, considering the order above?

Yes. Before the HTTP Accept header.


Note: It may matter how we serve/redirect. If we redirect to the same page being served then we could create an infinite loop (depending on how Facebook's scraper works).

Last edited on 07/12/2017 at 01:11:15 PM by juliandoucette

comment:6 Changed on 07/12/2017 at 03:06:28 PM by matze

  • Priority changed from Unknown to P3
  • Ready set

comment:7 Changed on 07/12/2017 at 03:34:59 PM by matze

  • Owner set to matze

comment:8 Changed on 07/12/2017 at 03:40:26 PM by matze

  • Review URL(s) modified (diff)
  • Status changed from new to reviewing

comment:9 Changed on 07/13/2017 at 07:35:12 AM by abpbot

A commit referencing this issue has landed:
Issue 5402 - Extract language and region from ?fb_locale

comment:10 Changed on 07/13/2017 at 07:53:12 AM by matze

  • Resolution set to fixed
  • Status changed from reviewing to closed

comment:11 Changed on 08/08/2017 at 10:07:58 PM by juliandoucette

Note: The way this was implemented causes ?fb_locale=de_DE to serve the de page instead of redirecting to /de/. I think that this was the correct way to implement this feature.

comment:12 Changed on 08/09/2017 at 10:51:26 PM by juliandoucette

  • Blocking 1274 removed

comment:13 Changed on 08/12/2017 at 05:33:47 PM by juliandoucette

This doesn't seem to be working for adblockbrowser.org. I'm guessing the commit above only changed adblockplus.org. Can you please apply this change to all websites?

comment:14 Changed on 08/12/2017 at 05:33:58 PM by juliandoucette

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:15 Changed on 08/14/2017 at 08:22:00 PM by paco

web2:

role: web/adblockplus
dry-run: no problems
provision: already has the change

web-abb-org-1:

role: web/adblockbrowser
dry-run: no problems
provision: good!

web-subscribe-abp-org-1:

role: web/subscribe
dry-run: nginx and logrotate errors
provision: all good!

eyeo-com-1:

role: web/eyeo
dry-run: no problems!
provision: all good!

acceptableads-com-1:

role: web/acceptableads
dry-run: no change needed
provision: not provisioned

acceptableads-org-1:

role: web/acceptableads
dry-run: no change needed
provision: not provisioned

facebook-me-1:

role: web/facebook
dry-run: nginx and logrotate errors
provision: all good!

testpages-1:

role: web/testpages
dry-run: nginx and logrotate errors
provision: all good!

youtube-me-1:

role: web/youtube
dry-run: nginx and logrotate errors
provision: all good!

share-1:

role: web/share
dry-run: nginx and logrotate errors
provision: all good!

subscribe-1:

role: web/subscribe
dry-run: nginx and logrotate errors
provision: all good!

easylist-1:

role: web/easylist
dry-run: nginx and logrotate errors
provision: all good!

eyeo-to-1:

role: web/redirect/eyeo
dry-run: didn't change anything
provision: not provisioned

adblockbrowser-to-1:

role: web/redirect/adblockbrowser
dry-run: didn't change anything
provision: not provisioned

adblockplus-to-1:

role: web/redirect/adblockplus
dry-run: didn't change anything
provision: not provisioned

adblockplus-org-1:

role: web/adblockplus
dry-run: didn't change anything
provision: not provisioned

These are the results after provisioning all web servers. The nginx and logrotate errors are normal in a dry run; the dry run was made only to determine whether provisioning was needed.

Cheers!

comment:16 Changed on 08/15/2017 at 05:25:35 PM by paco

  • Resolution set to fixed
  • Status changed from reopened to closed
