Friday, October 10, 2008

Crawling Australian English

Vincent Ooi, Associate Professor, National University of Singapore, gave an entertaining and informative seminar on "The 5-Concentric Circles Model & the Australian English Dictionary", at the ANU in Canberra, 10 October 2008. I was a little intimidated when I arrived, as everyone else at the seminar had a mug in front of them with "Oxford Dictionary Project" on it. As an IT person who can't write without a computer spell checker, it was daunting to be surrounded by the people who define the English language. However, Vincent kept the talk relatively jargon-free and very relevant to everyday life.

As I understood it, the argument was that dictionary makers tend to use pejorative terms such as "Slang" for Australian English words. In many cases, words which are labelled as "colloquial" are in use by respected sources such as the editorials of broadsheet newspapers. Vincent illustrated a diglossia as a series of concentric circles, with core English in the middle and the Australian expressions in two circles around it: the more accepted Australian English words in the middle region and the more colloquial ones around the outside.

However, I would prefer to see something like overlapping clouds of statistical probabilities, rather than concentric circles. A cloud would represent the likelihood of a particular word being used by a particular person. At the large scale the clouds would form groups, which might relate to nationality, but would also take into account other factors.

People such as David Hawking, Chief Scientist at Funnelback, have spent decades analysing large amounts of online (and offline) text for use in search engines. It seems to me that this approach could be applied to dictionary making through large-scale analysis of words, in a similar way to the use of automated machines for sequencing the human genome. Rather than people spending decades carrying out manual analysis of a few thousand occurrences of words, the computers could work on billions of words in days.

Current dictionary work appears to be organised as a cottage industry, much as human genome sequencing once was. However, whereas the human genome does not change quickly, the English language is probably changing faster than the dictionary makers can cope with using manual processes.

At the practical level, I would like to have a dictionary which could advise me what words to use when writing to a specific group of people. As an example, if my ANU class is 30% Chinese mainland educated and 70% Australian secondary school educated, then what specific set of words can I use that they have in common, to a specific level of probability?

At present I am designing desks for computer classrooms, and I can refer to analysis to show the size the desk needs to be to fit the 99th percentile of a particular population of students. Can I do something similar for the language to suit that group of people? Can the dictionary tell me the probability that a particular population will understand what I have written?
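
A minimal sketch of what such a calculation might look like, assuming a dictionary could supply, for each group of readers, the probability that a member of that group knows a given word. The probabilities below are invented purely for illustration, and the function names and the weighted-average approach are assumptions rather than anything proposed in the seminar.

```python
# Made-up illustrative figures: the chance that a reader from each
# group knows each word. A real dictionary would have to supply these.
KNOWN = {
    "chinese_mainland": {"desk": 0.99, "fortnight": 0.60, "arvo": 0.05},
    "australian": {"desk": 1.00, "fortnight": 0.98, "arvo": 0.95},
}

# The class mix from the example above: 30% and 70%.
CLASS_MIX = {"chinese_mainland": 0.3, "australian": 0.7}


def understanding_probability(word, mix, known):
    """Chance that a randomly chosen reader knows the word.

    This is the average of the per-group probabilities, weighted by each
    group's share of the audience. Words missing from a group's table
    are treated as unknown to that group (probability 0).
    """
    return sum(share * known[group].get(word, 0.0) for group, share in mix.items())


def common_vocabulary(mix, known, threshold):
    """Words that meet the required probability for this audience."""
    words = set().union(*(table.keys() for table in known.values()))
    return sorted(
        w for w in words if understanding_probability(w, mix, known) >= threshold
    )


if __name__ == "__main__":
    for word in ("desk", "fortnight", "arvo"):
        p = understanding_probability(word, CLASS_MIX, KNOWN)
        print(f"{word}: {p:.3f}")
    print(common_vocabulary(CLASS_MIX, KNOWN, 0.85))
```

With these invented figures, "arvo" would fall below an 85% threshold for the mixed class while "desk" and "fortnight" would pass. A stricter reading of "in common" might instead require every group to meet the threshold separately, which would mean taking the minimum across groups rather than the weighted average.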

Some years ago I attempted to find a basic English dictionary which I could use to write courses which could be understood by English speakers internationally. What surprised me was that while there was talk of such basic English, there seemed to be no common agreement as to what it was, nor any analytical basis for a set of criteria to decide on it, outside a few very specific technical areas, such as Simplified English for aerospace. If dictionary makers don't do this, it is likely that those who create web search engines will, and we will end up speaking Google English. ;-)

Dr. Ooi mentioned the concentric circle designs of Walter Burley Griffin in his talk as an analogy for word diagrams. Griffin designed Canberra as a series of circles joined by triangles. Perhaps that would make a useful model, where the circles are words and the lines of the triangles are the relationships of the people using them. In the end it is not the words themselves which are interesting, but what they say about relationships between people.
