Opened on 03/12/2014 at 01:49:20 PM

Closed on 09/05/2019 at 02:55:30 PM

#104 closed change (duplicate)

Update locale validation

Reported by: trev Assignee:
Priority: Unknown Milestone:
Module: User-Interface Keywords:
Cc: sebastian, wspee, erick, tlucas, erikvold, greiner Blocked By: #7127
Blocking: Platform: Unknown
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29587910/
https://codereview.adblockplus.org/29690867/

Description (last modified by sebastian)

Background

Currently locale validation is being performed by test_locales.pl in the adblockplus repository in combination with LocaleTester.pm in buildtools repository. It only works for our legacy Firefox extension.

What to change

There is little point updating this script, we rather need something in Python which would be integrated with the build tools:

  • We need to expand the JSON format for translations to include optional translationOptional (boolean) and maxLength (integer) keys.
  • Our custom translation keys should be removed when building or uploading to Crowdin. The maxLength field should be converted to "Note: this cannot be longer than N characters" description when uploading to Crowdin.
  • The new build.py checktranslations command should load the locales and perform the following checks:
    • Make sure the JSON data is UTF-8-encoded and parses correctly
    • Validate the structure of the JSON data.
    • Make sure that all declared placeholders are present in all translations.
    • Warn on strings where the source and target translation are identical, except for strings marked with translationOptional key or en-* locales.
    • Warn for overlong translations if maxLength is set for a translation.

Attachments (1)

test_vars.py (730 bytes) - added by trev on 05/07/2014 at 08:19:23 AM.
Chrome placeholder checking script

Download all attachments as: .zip

Change History (22)

comment:1 Changed on 03/13/2014 at 06:24:05 AM by trev

  • Reporter changed from philll to trev

comment:2 Changed on 03/13/2014 at 07:38:26 AM by trev

  • Component changed from Unknown to Build-and-Release-Tools
  • Priority changed from Unknown to P3

Changed on 05/07/2014 at 08:19:23 AM by trev

Chrome placeholder checking script

comment:3 Changed on 05/07/2014 at 08:20:27 AM by trev

  • Ready unset

I wrote a simple script meant to be run from the root of the adblockpluschrome repository to check placeholder usage, attached it here. This is merely a stopgap solution until we have a better solution in place.

comment:4 Changed on 07/09/2014 at 12:38:11 PM by philll

  • Platform set to Firefox

comment:5 Changed on 07/09/2014 at 01:13:44 PM by philll

  • Platform changed from Firefox to Unknown

comment:6 Changed on 04/20/2015 at 10:06:48 PM by trev

  • Description modified (diff)

comment:7 Changed on 04/21/2015 at 09:34:14 AM by sebastian

  • Cc sebastian added
  • Ready set

comment:8 Changed on 08/16/2017 at 02:23:15 PM by sebastian

  • Cc wspee erick added
  • Ready unset
  • Tester set to Unknown

I'd like to revive this issue, since Erick is interested in picking it up. However, the issue description seems out-of-date. Specifically the Gecko-specific features/syntax (including access keys) aren't relevant anymore, as legacy Gecko extensions are phased out.

Also, since we only need to target the Chrome-style translation format, I'd rather parse any metadata for the validation from the description field in the source translation's locales/en_US/*.json file. That way we avoid redundancies (for example max length is usually already documented there) and have all translation data in one place.

Then again, max length is already enforced by CrowdIn, and flagging strings which may be equal to the source translation might be tricky since some (in particular short) translations might match the English string in some but not all translations. But what we'd still need at least, is validation of placeholders.

Last edited on 08/16/2017 at 03:31:50 PM by sebastian

comment:9 Changed on 08/17/2017 at 12:00:06 PM by trev

  • Description modified (diff)

comment:10 Changed on 08/17/2017 at 12:01:44 PM by trev

  • Description modified (diff)

comment:11 Changed on 08/17/2017 at 01:44:13 PM by tlucas

  • Cc tlucas added

I propose a separation of this issue, in order to fully embrace the scope of an apprentice's project (which is the reason why Erick wants to tackle this).

This issue should be stripped down to validating the current format, which could be defined as following (and be fixed as the project):

  • Implement a new "build.py checktranslations" functionality, which handles the following (as the apprentice's project):
    • Make sure the JSON data is UTF-8-encoded and parses correctly
    • Validate the structure of the JSON data.
    • Make sure that all declared placeholders are present and formatted correctly in all translations.
    • Warn on untranslated strings. While this check is prone to false positives, it is also a warning sign allowing to detect bad translations.

Additionally, a new issue could be filed, handling the rest of this issue:

  • Expand the JSON format for translations to include optional translationOptional (boolean) and maxLength (integer) keys.
  • Our custom translation keys should be removed when building or uploading to Crowdin. The maxLength field should be converted to "Note: this cannot be longer than N characters" description when uploading to Crowdin.
  • Additionally, adjust the in #104 introduced functionality to:
    • Warn for overlong translations if maxLength is set for a translation.
    • Ignore untranslated strings for strings marked with translationOptional key or en-* locales.

comment:12 Changed on 08/17/2017 at 04:06:03 PM by sebastian

I don't see why we would want to warn about untranslated strings. Incomplete translations are quite normal, in particular for less common languages and rarely seen strings, we don't care much about missing translations. Also CrowdIn already gives a good overview of the translation status, anyway. Perhaps without this feature, it would also better fit the scope of an apprentice project.

comment:13 Changed on 08/17/2017 at 08:13:28 PM by trev

By "untranslated" I mean translations that are identical to English, not missing translations.

comment:14 Changed on 08/18/2017 at 07:46:47 AM by sebastian

  • Description modified (diff)
  • Ready set

comment:15 Changed on 02/06/2018 at 02:24:47 PM by tlucas

  • Review URL(s) modified (diff)

comment:16 Changed on 02/06/2018 at 02:25:21 PM by tlucas

Half of this was done by Erick in his project. We need need to figure out, what is left to be done.

comment:17 Changed on 11/15/2018 at 03:50:58 AM by erikvold

  • Cc erikvold added

comment:18 Changed on 11/15/2018 at 03:57:19 AM by erikvold

In Issue #7127 we are trying to replace the tools provided in the buildtools repo with npm scripts, so I think this ought to be blocked by that issue.

comment:19 Changed on 11/27/2018 at 12:32:21 PM by greiner

  • Blocked By 7127 added
  • Cc greiner added
  • Component changed from Automation to User-Interface
  • Priority changed from P3 to Unknown
  • Ready unset

Agreed. Also changing its module to User-Interface as that's where the new script will be located for now.

comment:20 Changed on 09/05/2019 at 02:53:16 PM by greiner

Closing this ticket in favor of ui#569.

comment:21 Changed on 09/05/2019 at 02:55:30 PM by greiner

  • Resolution set to duplicate
  • Status changed from new to closed

Add Comment

Modify Ticket

Change Properties
Action
as closed .
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from (none).
 
Note: See TracTickets for help on using tickets.