Saturday, 23 February 2013

Ancient Greek dictionary (LSJ)

A new implementation of the Liddell, Scott, Jones (LSJ) Ancient Greek dictionary.

A number of features have been added to enhance translatum's wiki implementation, for example diacritics-insensitive live search, random Ancient Greek quotes with English translations, semantically arranged data, and indexes with different transliterations and forms to suit every taste, from the scholar to the layman.

Thursday, 7 February 2013

Export MediaWiki blobs as text



MediaWiki stores page content in blob columns by default, even when the content is plain text. This can be an issue when you want to export and process the textual data.

Luckily, there is a query that lets you export the blobs as text. All you need is a database management tool like phpMyAdmin. To export MediaWiki articles from blob to text, run the following query:

SELECT CAST(old_text AS CHAR CHARACTER SET utf8) FROM text

Magic provided by courtesy of the MySQL function CAST().
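If you would rather do the conversion outside the database, the same idea can be sketched in Python: the blob is just a sequence of bytes, and turning it into text means decoding those bytes with the right character set. This is only a rough illustration, assuming the blobs hold UTF-8 text; the sample rows and the function name blob_to_text are made up for the example and are not part of MediaWiki or MySQL.

```python
# Hypothetical sample of raw blob values, as a database driver might return them.
rows = [
    "Ancient Greek: λόγος".encode("utf-8"),
    b"Plain ASCII wikitext",
]


def blob_to_text(blob: bytes, encoding: str = "utf-8") -> str:
    """Decode a blob to text, replacing undecodable bytes instead of failing."""
    return blob.decode(encoding, errors="replace")


# Decode every blob in the sample.
texts = [blob_to_text(blob) for blob in rows]

for text in texts:
    print(text)
```

Using errors="replace" means a stray invalid byte shows up as a replacement character rather than stopping the whole export.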

First published in translatum

Convert blobs in an Access database into text


I had an issue with a database whose text was stored as blobs (all I could see was "long binary data" in each cell), so I had to find a solution. Just a note here: if your blobs are in fact images or other types of non-textual binary data, then the proposed solution will simply give you a string of meaningless characters.
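If you are not sure whether a blob holds text or something like an image, a quick check in Python can help before you go any further. The sketch below is a simple heuristic, assuming the text would be UTF-8; the function name looks_like_text is hypothetical, and the check can be fooled by unusual data.

```python
def looks_like_text(blob: bytes) -> bool:
    """Heuristic: True if the blob decodes as UTF-8 and has no odd control characters."""
    try:
        text = blob.decode("utf-8")
    except UnicodeDecodeError:
        # Images and other binary data usually contain invalid UTF-8 sequences.
        return False
    # Allow printable characters plus tab, newline and carriage return.
    return all(ch.isprintable() or ch in "\t\n\r" for ch in text)


# A UTF-8 string passes; the first bytes of a PNG file do not.
print(looks_like_text("λόγος".encode("utf-8")))
print(looks_like_text(b"\x89PNG\r\n\x1a\n"))
```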

One easy way is to open the .mdb database with a good text editor (like Notepad++; I personally use EmEditor, but it is commercial). Make sure you guess the encoding right (UTF-8 would be a good one to try). What you might get is text separated by little square symbols. If this is good enough, you can simply copy the text. If it is not, you need to follow a different procedure.

Step 1. Convert the .mdb database to a MySQL database. Use the free tool Access To MySQL.
Step 2. Open the converted database in a tool like phpMyAdmin (available in most server configurations or in local server installs like Wamp).

Construct your query:
SELECT CAST(column_name AS CHAR CHARACTER SET utf8) FROM table_name

Replace column_name and table_name with the column and table that hold the blobs.
Run your query, then use the CSV export option to save the results.
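The CSV export can also be done in a script if you have the blob values at hand. The following is a minimal sketch in Python, assuming the blobs are UTF-8 text; the function name blobs_to_csv and the "text" column header are invented for the example.

```python
import csv
import io


def blobs_to_csv(blobs, encoding="utf-8"):
    """Decode each blob and return the results as CSV text with one column."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["text"])  # header row (hypothetical column name)
    for blob in blobs:
        # Undecodable bytes become replacement characters instead of raising.
        writer.writerow([blob.decode(encoding, errors="replace")])
    return buffer.getvalue()


# Hypothetical sample values; a comma inside a value gets quoted by the csv module.
csv_text = blobs_to_csv([b"hello", b"a,b"])
print(csv_text)
```

To save the result to a file, open it with newline="" and encoding="utf-8" and write csv_text to it.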

Magic provided by courtesy of the MySQL function CAST().

First published in translatum