Opened 18 months ago

Last modified 18 months ago

#6471 new change

Allow small tags in translation strings

Reported by: juliandoucette
Assignee:
Priority: Unknown
Milestone:
Module: Sitescripts
Keywords:
Cc: kvas, jsonesen, ire
Blocked By:
Blocking:
Platform: Unknown / Cross platform
Ready: no
Confidential: no
Tester: Unknown
Verified working: no
Review URL(s):

Description (last modified by juliandoucette)

Background

We currently allow anchor tags and span tags inside translation strings, e.g. {{ string_id[string_context] This string has an <a href="example.com">anchor</a> in it. }}

What to change

Allow <small> tags inside strings too.


FWIW we could probably allow all phrasing content inside these strings.

This is relevant to the adblockplus.org homepage. I'm currently implementing <small> using the semantically inferior <span class="small"> because <small> will throw an error.
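
For illustration, the kind of check that rejects <small> today could look roughly like the sketch below. This is not the CMS's actual code: ALLOWED_TAGS, TagCollector and disallowed_tags are hypothetical names, and the allowlist simply combines the two tags mentioned in the background with the proposed <small>.

```python
from html.parser import HTMLParser

# Assumption: <a> and <span> are allowed today (see Background);
# <small> is the tag this ticket proposes to add.
ALLOWED_TAGS = {"a", "span", "small"}


class TagCollector(HTMLParser):
    """Collects the (lowercased) names of all start tags in a string."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def disallowed_tags(text, allowed=ALLOWED_TAGS):
    """Return a sorted list of tag names in text that are not allowed."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return sorted(set(collector.tags) - set(allowed))
```

With this sketch, a string containing only <a> and <small> yields an empty list, while a string containing <h1> yields ["h1"].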

Change History (11)

comment:1 Changed 18 months ago by kvas

Sounds reasonable in general, but this is probably a bit of an overkill:

FWIW you should probably allow all phrasing content inside these strings.

Does anyone actually need things like <script>, <iframe>, <input>, <canvas>, etc. in the translation strings?

We can add more tags to this ticket now at pretty much no cost, but I'd rather discuss which ones we are likely to need instead of delegating to the phrasing content list (I see why it makes sense logically, but the actual list seems too big for our purposes).

Alternatively, we could think about making this configurable per site, but that will probably not come at "pretty much no cost".

comment:2 Changed 18 months ago by juliandoucette

Agreed. I trust your judgement regarding which tags are or are not necessary. You may proceed using your judgement, or ask for a list if you prefer.

comment:3 Changed 18 months ago by juliandoucette

  • Description modified

comment:4 Changed 18 months ago by kvas

I suppose we can also allow the following tags inside translation strings: <abbr>, <b>, <bdo>, <br>, <cite>, <code>, <data>, <del>, <dfn>, <em>, <i>, <ins>, <kbd>, <mark>, <math>, <meter>, <output>, <q>, <ruby>, <samp>, <small>, <span>, <strong>, <sub>, <sup>, <svg>, <time>, <var>, <wbr>.

I excluded the following ones (but I'm open to input):

  • <audio>, <embed>, <iframe>, <img>, <object>, <video> - Need to reference an external resource; too multimedia-ish and complicated (although maybe <img> could work in limited cases).
  • <button>, <datalist>, <input>, <label>, <select>, <textarea> - Are for forms. I don't think we want to translate forms in such a way that the controls end up inside translation strings. I'm not 100% sure about this, though. Opinions?
  • <canvas> - Used for JS animation and image editing. Unlikely to be in a translation string.
  • <command>, <keygen> - deprecated.
  • <noscript>, <script> - related to JavaScript.
  • <progress> - for dynamic UI features.

comment:5 follow-up: Changed 18 months ago by ire

I think a good starting point could be to look at our website-defaults reset file (specifically lines 24 to 57). That technically contains all the elements that we have basic support for.

We can further restrict that larger subset to just the typography-related elements, which will leave us with the following:

h1, h2, h3, h4, h5, h6,
a, p, span,
em, small, strong, sub, sup,
strike, s, mark, del, ins,
abbr, dfn,
blockquote, q, cite,
code, pre,
kbd, samp, var, output, ruby,

Even then, we ignore the last line.

comment:6 in reply to: ↑ 5 ; follow-up: Changed 18 months ago by juliandoucette

Replying to ire:

I like your idea about basing the list off of website-defaults. But I'm skeptical about including elements that are usually block-level in this list.

Replying to kvas:

I would add the following:

  • img (could be an icon)
  • button (we could want to place text before or after a button in different languages)
  • input (same reason as button because input can be button)
  • select (same reason as button)

comment:7 in reply to: ↑ 6 Changed 18 months ago by ire

Replying to juliandoucette:

I like your idea about basing the list off of website-defaults. But I'm skeptical about including elements that are usually block-level in this list.

What's your reservation?

comment:8 follow-up: Changed 18 months ago by juliandoucette

What's your reservation?

I don't expect blocks to occur in the middle of strings e.g.

{{ string_id[string context] This is some text

<h1>With a heading in the middle</h1>

Woo! }}

(I expect each block to contain one or more strings.)

Last edited 18 months ago by juliandoucette

comment:9 in reply to: ↑ 8 ; follow-up: Changed 18 months ago by ire

Replying to juliandoucette:

What's your reservation?

I don't expect blocks to occur in the middle of strings e.g.

{{ string_id[string context] This is some text

<h1>With a heading in the middle</h1>

Woo! }}

(I expect each block to contain one or more strings.)

I understand.

I would think the same logic would apply to the form elements though, e.g.:

{{ string_id[string context] This is some text

<select>
    <option>Option 1</option>
</select>

Woo! }}

I think in both examples, it would make more sense to separate the strings rather than try to join them, even though the select is an inline element.

That said, I think the CMS should allow as much as possible, and let us (frontend devs) decide what makes sense on a case-by-case basis. Is there a reason that the translation strings need to be so restrictive in the first place?

comment:10 in reply to: ↑ 9 ; follow-up: Changed 18 months ago by juliandoucette

Replying to ire:

Good point. And good question.

On second thought, I don't think select should be added to this list.

I think that we want to keep strings as simple as possible so that translators do not screw them up. HTML tags are replaced with variables in Crowdin and can be forgotten, IIRC.

(kvas will you confirm and weigh in?)
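
To make the concern concrete, here is a rough sketch of how tags could be swapped for variables before a string goes to translators, and how a dropped variable could then be detected. The {0}, {1} placeholder format and the function names are assumptions made for this example only; they are not taken from Crowdin or from the CMS.

```python
import re

# Matches opening and closing tags such as <a href="..."> or </a>.
TAG_RE = re.compile(r"</?[a-zA-Z][^>]*>")
PLACEHOLDER_RE = re.compile(r"\{\d+\}")


def to_placeholders(text):
    """Replace each tag with a numbered placeholder ({0}, {1}, ...).

    Returns the rewritten text and the list of original tags, in order.
    """
    tags = []

    def repl(match):
        tags.append(match.group(0))
        return "{%d}" % (len(tags) - 1)

    return TAG_RE.sub(repl, text), tags


def missing_placeholders(source, translation):
    """Return the placeholders that occur in source but not in translation."""
    return [p for p in PLACEHOLDER_RE.findall(source) if p not in translation]
```

Every additional tag in a string adds placeholders that a translator has to carry over, which is one argument for keeping the allowed list short.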

comment:11 in reply to: ↑ 10 Changed 18 months ago by kvas

Replying to juliandoucette:

Replying to ire:

Good point. And good question.

On second thought, I don't think select should be added to this list.

I think that we want to keep strings as simple as possible so that translators do not screw them up. HTML tags are replaced with variables in Crowdin and can be forgotten, IIRC.

(kvas will you confirm and weigh in?)

Perhaps we could go with my list from comment 4 + all non-block tags from Ire's comment 5 + <img>, <button>, <input>. I'm also happy to exclude anything from this list if we're concerned that this could lead to issues in translation, or we can add something. In general anything that's not likely to cause translation issues or be a security vulnerability seems fine to me.
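
Spelled out, that proposal could be computed as in the sketch below. The two source lists are copied from comments 4 and 5; which of Ire's tags count as block-level (headings, p, blockquote, pre) is my assumption, and all names here are made up for the example.

```python
# Tags proposed in comment 4.
COMMENT_4 = {
    "abbr", "b", "bdo", "br", "cite", "code", "data", "del", "dfn", "em",
    "i", "ins", "kbd", "mark", "math", "meter", "output", "q", "ruby",
    "samp", "small", "span", "strong", "sub", "sup", "svg", "time", "var",
    "wbr",
}

# Typography-related tags listed in comment 5.
COMMENT_5 = {
    "h1", "h2", "h3", "h4", "h5", "h6",
    "a", "p", "span",
    "em", "small", "strong", "sub", "sup",
    "strike", "s", "mark", "del", "ins",
    "abbr", "dfn",
    "blockquote", "q", "cite",
    "code", "pre",
    "kbd", "samp", "var", "output", "ruby",
}

# Assumption: these are the tags from comment 5 treated as block-level.
BLOCK_LEVEL = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "blockquote", "pre"}

# Extra tags suggested in comment 6, minus <select> (dropped in comment 10).
EXTRA = {"img", "button", "input"}

ALLOWED = COMMENT_4 | (COMMENT_5 - BLOCK_LEVEL) | EXTRA

# Tags that the combined list adds on top of comment 4.
print(sorted(ALLOWED - COMMENT_4))
```

Under that block-level assumption, the only tags added on top of comment 4 are a, s and strike from comment 5, plus img, button and input.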
