
Marking texts to translate

In document The Flask Mega-Tutorial (Page 102-105)

Now comes the most tedious part of this task. We need to review all our code and templates and mark every English text that needs translating so that Babel can find it. For example, take a look at this snippet from the after_login function:

if resp.email is None or resp.email == "":
    flash('Invalid login. Please try again.')
    return redirect(url_for('login'))

Here we have a flash message that we want to translate. To expose this text to Babel we just wrap the string in Babel's gettext function:

from flask.ext.babel import gettext

# ...

if resp.email is None or resp.email == "":
    flash(gettext('Invalid login. Please try again.'))
    return redirect(url_for('login'))
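To see what wrapping a string does at runtime, here is a minimal sketch that uses the gettext module from Python's standard library instead of Flask-Babel (an assumption made so that the example runs without a Flask application). With no translation catalog loaded the call returns the original English text, so marking strings does not change what the application displays until translations exist. Flask-Babel is expected to fall back to the original text in the same way.

```python
import gettext

# NullTranslations is the standard library's "no catalog loaded" fallback:
# its gettext() method returns the message unchanged.
translations = gettext.NullTranslations()
_ = translations.gettext

message = _('Invalid login. Please try again.')
print(message)
```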

In a template we have to do something similar, but we have the option to use _() as a shorter alias to gettext(). For example, the word Home in this link from our base template:

<li><a href="{{ url_for('index') }}">Home</a></li>

can be marked for translation as follows:

<li><a href="{{ url_for('index') }}">{{ _('Home') }}</a></li>

Unfortunately not all texts that we want to translate are as simple as the above. As an example of a tricky one, consider the following snippet from our post.html subtemplate:

<p><a href="{{ url_for('user', nickname=post.author.nickname) }}">{{ post.author.nickname }}</a> said {{ momentjs(post.timestamp).fromNow() }}:</p>

Here the sentence that we want to translate has this structure: "<nickname> said <when>:". One would be tempted to just mark the word "said" for translation, but we can't really be sure that the order of the name and the time components in this sentence will be the same in all languages. The correct thing to do here is to mark the entire sentence for translation using placeholders for the name and the time, so that a translator can change the order if necessary. To complicate matters more, the name component has a hyperlink embedded in it!

There isn't really a nice and easy way to handle cases like this. The gettext function supports placeholders using the syntax %(name)s and that's the best we can do. Here is a simple example of a placeholder in a much simpler situation:

gettext('Hello, %(name)s', name=user.nickname)

The translator will need to be aware that there are placeholders and that they should not be touched.

Clearly the name of a placeholder (what's between the %( and )s) must not be translated or else the connection to the actual value would be lost.
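The following sketch shows why named placeholders let a translator reorder a sentence. It is not how Flask-Babel is implemented: CATALOG and translate() are hypothetical stand-ins for the translation files and for gettext, and the Spanish text is only an illustrative translation. The placeholder names stay exactly as they are in the English text, so the values still land in the right place.

```python
# Hypothetical in-memory catalog that maps English text to a translation.
# A real application would load translations from the files Babel generates.
CATALOG = {
    '%(nickname)s said %(when)s:': '%(when)s, %(nickname)s dijo:',
}

def translate(text, **variables):
    """Look up text in CATALOG, then fill in the named placeholders."""
    translated = CATALOG.get(text, text)  # fall back to the English text
    return translated % variables

# The translation puts the time first, yet each value fills the right slot.
reordered = translate('%(nickname)s said %(when)s:',
                      nickname='susan', when='hace 2 horas')
print(reordered)  # hace 2 horas, susan dijo:

# Text that has no catalog entry is returned in English.
untranslated = translate('Hello, %(name)s', name='susan')
print(untranslated)  # Hello, susan
```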

But back to our post template example, here is how it is marked for translation:

{% autoescape false %}

<p>{{ _('%(nickname)s said %(when)s:', nickname = '<a href="%s">%s</a>' % (url_for('user', nickname=post.author.nickname), post.author.nickname), when=momentjs(post.timestamp).fromNow()) }}</p>

{% endautoescape %}

The text that the translator will see for the above example is:

%(nickname)s said %(when)s:

This is quite readable. The values for nickname and when are what give this translatable sentence its complexity, but they are passed as additional arguments to the _() wrapper function and are not seen by the translator.

The values passed for the nickname and when placeholders are fairly involved. In particular, for the nickname we had to build an entire HTML link because we want the nickname to be clickable.

Because we are putting HTML in the nickname placeholder, we need to turn off autoescaping to render this portion of the template; otherwise Jinja2 would render our HTML elements as escaped text. But rendering a string without escaping is a security risk: it is unsafe to render text entered by users without escaping it.

The text assigned to the when placeholder is safe because it is generated entirely by our momentjs() wrapper function. What goes in the nickname argument, however, comes from the nickname field of our User model, which in turn comes from our database, and that value can be entered by the user in a web form. If someone registers with our application using a nickname that contains embedded HTML or JavaScript and we then render that malicious nickname unescaped, we are effectively opening the door to an attacker. We certainly do not want that, so we are going to take a quick detour and remove the risk.
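To make the risk concrete, here is a small sketch of the difference between escaped and unescaped output. It uses html.escape from the standard library as a rough stand-in for Jinja2's autoescaping (an assumption; Jinja2 has its own escaping routine), and the nickname is a made-up example.

```python
from html import escape

# A made-up nickname that carries a script tag.
malicious = '<script>alert(1)</script>'

# Roughly what autoescaping does: the markup becomes inert text.
escaped = escape(malicious)

# With escaping turned off the string reaches the page unchanged,
# and a browser would treat it as a real script element.
unescaped = malicious

print(escaped)    # &lt;script&gt;alert(1)&lt;/script&gt;
print(unescaped)  # <script>alert(1)</script>
```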

The solution that makes most sense is to prevent any attacks by restricting the characters that can be used in a nickname. We'll start by creating a function that converts an invalid nickname into a valid one (file app/models.py):

import re

class User(db.Model):
    #...
    @staticmethod
    def make_valid_nickname(nickname):
        return re.sub(r'[^a-zA-Z0-9_\.]', '', nickname)

Here we just take the nickname and remove any characters that are not letters, numbers, the dot or the underscore.
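As a quick check of what this substitution does, the sketch below copies the same pattern into a standalone function (outside the User model, so that it runs without a database) and applies it to a few made-up nicknames.

```python
import re

def make_valid_nickname(nickname):
    # Same pattern as User.make_valid_nickname: drop every character
    # that is not a letter, a digit, an underscore or a dot.
    return re.sub(r'[^a-zA-Z0-9_\.]', '', nickname)

print(make_valid_nickname('john.doe_99'))                # john.doe_99
print(make_valid_nickname('<b>susan</b>'))               # bsusanb
print(make_valid_nickname('<script>alert(1)</script>'))  # scriptalert1script
```

Note that the angle brackets, slashes and parentheses are removed, so the result can no longer form an HTML tag.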

When a user registers with the site we receive his or her nickname from the OpenID provider, so we make sure we convert this nickname to something valid (file app/views.py):

@oid.after_login
def after_login(resp):
    #...
    nickname = User.make_valid_nickname(nickname)
    nickname = User.make_unique_nickname(nickname)
    user = User(nickname=nickname, email=resp.email)
    #...

And then in the Edit Profile form, where the user can change the nickname, we can enhance our validation to not allow invalid characters (file app/forms.py):

class EditForm(Form):
    #...
    def validate(self):
        if not Form.validate(self):
            return False
        if self.nickname.data == self.original_nickname:
            return True
        if self.nickname.data != User.make_valid_nickname(self.nickname.data):
            self.nickname.errors.append(gettext('This nickname has invalid characters. Please use letters, numbers, dots and underscores only.'))
            return False
        user = User.query.filter_by(nickname=self.nickname.data).first()
        if user is not None:
            self.nickname.errors.append(gettext('This nickname is already in use. Please choose another one.'))
            return False
        return True

With these simple measures we eliminate any possible attacks resulting from rendering the nickname to a page without escaping.
