
Language codes for Twitter

How to find tweets in a specific language?

That’s an issue for many Twitter users, including language learners and native speakers of other languages. Because of the dominance of English on the web, it’s easy to find English tweets. But finding tweets in other languages is not so straightforward.

One solution could be to tag each tweet with a language code. Using IANA’s existing language codes seems ideal for compatibility and ease of recognition. This coding system is used widely on the web and in its basic form uses a two-letter code for each language. For example, the code for English is en and the code for Maori is mi.

It would be possible to use a tag prefix symbol in front of each code so that we could search for tweets in that language. But we need to use a different tag prefix than is currently used for tweet topics. Ideally, we could use just a single non-alpha character: where # is used for topic tags, we could use something like the percent sign. So, a tweet in modern Greek might be tagged with %el.

It might also be useful to flag tweets that are not written in the language’s standard character set. For example, because of the limitations of some Twitter clients, we might want an additional symbol to mark a tweet that is in Greek but transliterated into the English (Latin) alphabet, e.g. %el!
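To make the proposal concrete, here is a minimal sketch of how a client or script might pick such a tag out of a tweet. It assumes the scheme suggested above (a percent sign, a two-letter lowercase code, and an optional trailing ! for transliterated text); the names LanguageTag and find_language_tag are invented for this illustration and are not part of any Twitter API.

```python
import re
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    code: str             # two-letter language code, e.g. "el"
    transliterated: bool  # True when the tag ends with "!"


# "%" at the start of a word, a two-letter code, an optional "!",
# and no further word character straight after (so "%elk" is rejected).
_TAG_PATTERN = re.compile(r"(?<!\S)%([a-z]{2})(!?)(?!\w)")


def find_language_tag(tweet: str) -> Optional[LanguageTag]:
    """Return the first language tag in the tweet, or None if there is none."""
    match = _TAG_PATTERN.search(tweet)
    if match is None:
        return None
    return LanguageTag(code=match.group(1), transliterated=match.group(2) == "!")
```

With this pattern, "Καλημέρα %el" gives the code el, "Kalimera %el!" gives el with the transliteration flag set, and "50% off" gives no tag because the percent sign does not start a word. Whether Twitter's own search would treat % this way is exactly the open question that needs testing.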

Since there is a lack of documentation on which specific characters are distinguished by Twitter’s search function, any use of a new tag prefix to denote language will require some trial-and-error testing to ensure it works effectively. If the polyglot community of Twitter users could agree on such a coding system, it would make it much easier to find relevant posts in languages other than English.

Image: Brueghel’s Tower of Babel

Tweetdeck problem with Greek text

The latest version of Tweetdeck (0.30.3) for the Mac does not appear to work properly with Greek characters. I have a simple ‘Greek verb of the day’ service set up using the Twitter API. It automatically posts one entry each day from a database of 1400 Greek verbs, giving the three main tenses and an English translation.
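For readers curious how a service like this might choose and format its daily entry, here is a rough sketch. It is not the actual code behind the service: the VerbEntry structure, the function names, the choice of tenses in the sample data and the output layout are all assumptions made for illustration, and the step that posts to Twitter is left out.

```python
from datetime import date
from typing import NamedTuple, Sequence, Tuple


class VerbEntry(NamedTuple):
    forms: Tuple[str, str, str]  # three tense forms (which tenses is an assumption)
    translation: str             # English translation


def verb_for_day(verbs: Sequence[VerbEntry], day: date) -> VerbEntry:
    """Pick one entry per calendar day, cycling through the list in order."""
    if not verbs:
        raise ValueError("the verb list is empty")
    return verbs[day.toordinal() % len(verbs)]


def format_update(entry: VerbEntry) -> str:
    """Build the text of the update: the three forms, then the translation."""
    return f"{', '.join(entry.forms)}: {entry.translation}"


# Hypothetical sample data standing in for the real database.
SAMPLE_VERBS = [
    VerbEntry(("γράφω", "έγραψα", "θα γράψω"), "to write"),
    VerbEntry(("διαβάζω", "διάβασα", "θα διαβάσω"), "to read"),
]
```

Calling format_update(verb_for_day(SAMPLE_VERBS, date.today())) would produce the text for today's update. Indexing by the day number means the same verb comes up every time the script runs on a given day, and the list repeats only after every entry has been used.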

In Seesmic Desktop or a browser it appears correctly:

Image: Seesmic in Greek

But in Tweetdeck all Greek words are replaced with a ‘pi’ symbol (∏) like this:

Image: Tweetdeck in Greek

Trying to diagnose the problem, I found that Tweetdeck does not recognise Greek characters at all when composing an update. That is, you can’t type anything on the keyboard when in Greek text entry mode: this is most unusual for a Mac application. It’s not a problem with AIR, as Seesmic Desktop seems to handle Greek characters perfectly well.

Whether this is a problem with all non-English character sets I don’t know: this needs some further investigation. But anyone wanting to use Twitter for language teaching and learning will need to check whether it works with Tweetdeck. If you use the Twitter API to send automated updates, check that these are readable in Tweetdeck. And check that students can use Tweetdeck to post updates. I’ll be posting a note on my rimata homepage advising users that the ‘verb of the day’ may not be accessible using Tweetdeck on the Mac.
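If you send automated updates, it may help to know which scripts each update contains, so you know which ones to check by eye in Tweetdeck. The sketch below is one rough way to do that. It uses the first word of each letter's Unicode character name as a stand-in for its script, which is an approximation rather than a proper script lookup, and the function name scripts_used is invented here. It cannot tell you whether a client displays the text correctly; it only flags what needs checking.

```python
import unicodedata
from typing import Set


def scripts_used(text: str) -> Set[str]:
    """Return rough script labels (e.g. "GREEK", "LATIN") for the letters in text."""
    scripts = set()
    for char in text:
        if not char.isalpha():
            continue  # skip digits, spaces and punctuation
        name = unicodedata.name(char, "")
        # Character names look like "GREEK SMALL LETTER ALPHA";
        # the first word serves as an approximate script label.
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts
```

For an update such as "γράφω: to write" this returns the labels GREEK and LATIN, so that update would go on the list to test.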