How to find tweets in a specific language?
That’s an issue for many Twitter users, including language learners and native speakers of other languages. Because of the dominance of English on the web, it’s easy to find English tweets. But finding tweets in other languages is not so straightforward.
One solution could be to tag each tweet with a language code. Using IANA’s existing language codes seems an ideal solution for compatibility and ease of recognition. This coding system is used widely on the web and in its basic form uses a two-letter code for each language. For example, the code for English is en, the code for Maori is mi.
It would be possible to use a tag prefix symbol in front of each code so that we could search for tweets in that language. But we need to use a different tag prefix than is currently used for tweet topics. Ideally, we could use just a single non-alpha character: where # is used for topic tags, we could use something like the percent sign. So, a tweet in modern Greek might be tagged with %el.
It might also be useful to flag tweets that are not written in the language’s standard character set. For example, because of the limitations of some Twitter clients, we might want an additional symbol to mark a tweet that is in Greek but has been transliterated into Latin (English) characters, e.g. %el!
Since there is a lack of documentation on which specific characters are distinguished by Twitter’s search function, any use of a new tag prefix to denote language will require some trial-and-error testing to ensure it works effectively. If the polyglot community of Twitter users could agree on such a coding system, it would make it much easier to find relevant posts in languages other than English.
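To make the proposal a little more concrete, here is a minimal sketch in Python of how a client or script might pick out such tags. Everything in it is an assumption for illustration: the % prefix and the trailing ! are just the suggestions made above, the function names are made up, and it only handles basic two-letter codes. It says nothing about how Twitter’s own search treats these characters.

```python
import re

# Hypothetical tag format from the proposal above: "%" plus a two-letter
# language code, optionally followed by "!" to mark transliterated text.
# The tag must start a word and must not run on into further letters.
LANG_TAG = re.compile(r"(?<!\S)%([a-z]{2})(!?)(?![\w!])", re.IGNORECASE)


def find_language_tags(tweet):
    """Return a list of (code, transliterated) pairs found in the tweet."""
    return [
        (match.group(1).lower(), match.group(2) == "!")
        for match in LANG_TAG.finditer(tweet)
    ]


def filter_by_language(tweets, code, include_transliterated=True):
    """Keep only the tweets tagged with the given language code."""
    wanted = code.lower()
    selected = []
    for tweet in tweets:
        for tag_code, transliterated in find_language_tags(tweet):
            if tag_code == wanted and (include_transliterated or not transliterated):
                selected.append(tweet)
                break
    return selected


if __name__ == "__main__":
    sample = [
        "Καλημέρα σε όλους %el",
        "Kalimera se olous %el!",
        "Kia ora koutou %mi",
        "Sale now on, 50%off everything",
    ]
    print(filter_by_language(sample, "el"))
    print(filter_by_language(sample, "el", include_transliterated=False))
```

Requiring whitespace (or the start of the tweet) before the % is a deliberate choice here, so that something like “50%off” is not mistaken for a language tag.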
Image: Brueghel’s Tower of Babel
The latest version of Tweetdeck (0.30.3) for the Mac does not appear to work properly with Greek characters. I have a simple ‘Greek verb of the day’ service set up using the Twitter API. It automatically posts an entry each day from a database of 1400 Greek verbs, giving the three main tenses and a translation in English.
In Seesmic Desktop or a browser it appears correctly:
But in Tweetdeck all Greek words are replaced with a ‘pi’ symbol (∏) like this:
Trying to diagnose the problem, I found when using Tweetdeck to compose an update that Greek characters are not recognised at all. That is, you can’t type anything on the keyboard when in Greek text entry mode: this is most unusual for a Mac application. It’s not a problem with AIR, as Seesmic Desktop seems to handle Greek characters perfectly well.
Whether this is a problem with all non-English character sets I don’t know: this needs some further investigation. But anyone wanting to use Twitter for language teaching and learning will need to check whether it works with Tweetdeck. If you use the Twitter API to send automated updates, check that these are readable in Tweetdeck. And check that students can use Tweetdeck to post updates. I’ll be posting a note on my rimata homepage advising users that the ‘verb of the day’ may not be accessible using Tweetdeck on the Mac.
The beta version of bing.com, Microsoft’s new search engine, looks fast but it does seem to have a bug in handling language settings in the preferences.
When I first went to the bing site, it quite naturally provided radio buttons to either search the whole web or just in New Zealand. Very sensible.
However, when I went into my preferences and chose to search only in English and Greek, it seems to have decided I’m in Greece and forgotten I’m in New Zealand:
I fixed this by deleting the two bing cookies _FS and _FP in my Safari preferences. So far the faulty setting has not reappeared, so perhaps it only happens when creating a new set of preferences. I’ve since found that changing the search settings also rewrites the cookies, so it is easy to correct.
Another sign that there is some slight confusion between location and language comes when you set the search for Greece and the interface language changes to Greek. You can then change the interface language back to English but this is not properly handled – e.g. clicking on ‘Help’ or ‘Legal’ results in a page of Greek text with no option to view it in English.
One thing that seems to bother quite a few WordPress users who live out here on the fringes of the civilised world is the greeting in the headers of admin pages: for some reason, ‘Howdy, Paul Left’ seems out of place. Apparently, the greeting is justified because WordPress creator Matt Mullenweg is from Texas. It’s never bothered me too much, but it does seem to bother some users. Perhaps it’s tied up with the sense of annoyance that many users feel in relation to the dominance of English on the web. Certainly, educational users of WordPress (such as teachers incorporating blogging into their courses) might want to customise the software to include a greeting more culturally familiar to their students.
The trouble is, the greeting cannot be changed in the settings as it’s hard-coded into the WordPress source code. It used to be fairly easy to change this text by editing the code, but a couple of versions ago the File Editor was removed from WordPress, along with the ability to edit WordPress code from the admin page. As far as I know the only way to do this now is to edit the code using FTP. If you need to do this, you’ll need FTP access to the WordPress server. These steps work with Fetch on a Mac, but should apply pretty well to other platforms and FTP clients:
- Using the FTP client, go to the folder ‘wp-admin’ and locate the file ‘admin-header.php’.
- To be safe, take a copy of this file by downloading it (‘get’) and saving it to your hard disk.
- Right-click on the file name and choose ‘Edit’. This should download the file and open it in a text editor.
- Find the text ‘Howdy’ and replace it with the desired greeting. For example, here in New Zealand I changed the greeting to ‘Kia ora’. In WordPress 2.7.1 this should be in line 108.
- Do not change any other text in the file, just that one word! If you make a mistake, use undo to put it back the way it was, and do not save it back to the server unless you are absolutely sure you have made the correct change.
- Try reloading the admin page – hopefully you will see your new greeting. If not, and you’re not sure how to fix it, you will have to undo what you’ve done. This is where the backup copy of the file you downloaded earlier will be handy :-)
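If you would rather not make the change by hand, the backup-and-replace part of these steps can be sketched as a short Python script that you run on a downloaded copy of the file before uploading it again with your FTP client. This is only a sketch under assumptions: the function name is made up, it knows nothing about FTP or WordPress, and it deliberately refuses to touch the file unless the old greeting appears exactly once, since I can’t promise that holds for every WordPress version.

```python
import shutil
from pathlib import Path


def replace_greeting(path, new_greeting, old_greeting="Howdy"):
    """Back up the file, then swap the greeting if it occurs exactly once.

    Returns the path of the backup copy. Works on raw bytes so that
    nothing else in the file (including line endings) is altered.
    """
    source = Path(path)
    data = source.read_bytes()
    old = old_greeting.encode("utf-8")
    new = new_greeting.encode("utf-8")

    count = data.count(old)
    if count != 1:
        # Safer to stop than to guess which occurrence to change.
        raise ValueError(f"expected exactly one {old_greeting!r}, found {count}")

    # Keep an untouched copy next to the original, e.g. admin-header.php.bak
    backup = source.with_name(source.name + ".bak")
    shutil.copy2(source, backup)

    source.write_bytes(data.replace(old, new))
    return backup


if __name__ == "__main__":
    # Assumes admin-header.php has already been downloaded to this folder.
    saved = replace_greeting("admin-header.php", "Kia ora")
    print(f"Greeting replaced; original kept as {saved}")
```

If the new greeting doesn’t look right after uploading, the .bak copy can simply be renamed and uploaded in its place.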
By the way, there’s no easy way to include non-English characters using this method: while WordPress can display text such as Καλή μέρα in a post, it can’t easily be included in the source code. There are ways to include such characters but that’s beyond the scope of this post.
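For the curious, one general approach (which I have not tested in the WordPress admin header, so treat it as an assumption) is to write each non-English character as an HTML numeric character reference, which keeps the source file in plain ASCII. A small Python sketch of the conversion, with a made-up function name:

```python
import html


def to_character_references(text):
    """Replace every non-ASCII character with an HTML numeric reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    greeting = "Καλή μέρα"
    encoded = to_character_references(greeting)
    print(encoded)
    # A browser decodes the references again; html.unescape mimics that.
    print(html.unescape(encoded) == greeting)
```

The encoded string contains only ASCII characters, so it can be pasted into a file with an ordinary text editor; whether it displays correctly will depend on where in the page the text ends up.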
Disclaimer: this worked for me, but editing source code is risky and a mistake can render your site unusable. Please be careful – I accept no responsibility for anything going wrong!