
Danny's Weblog

Asian languages -- miscellaneous

This section is for my observations about Asian languages other than those for which I have a separate individual folder, such as Cambodian and Thai, or for observations which are relevant to multiple Asian languages.

Additionally, there are some articles on the study of English as a foreign language for Asian students.


2007 Nov 11 [ Sun ]

Interesting stress issue and "Rosetta Stone" language-learning products

A couple of minutes ago I was randomly searching Slashdot and heard an ad for a language-learning product on TV. I was very interested of course, if only because I've never heard such an ad before.

What really caught my attention was that the gamma presenting the ad pronounced the name of the product – "Rosetta Stone" – with the wrong intonation; with the stress on "Stone", not on the second syllable of "Rosetta" – as if it were a name like "Carmen Sandiego".

I'm too lazy to check out the product's website when it seems to have such a braindead attitude to English. Who knows, perhaps some marketing wizard thought it would be fun to appropriate the name of one of mankind's most important archaeological finds as an animated character to personify the product, like Clippy.

Responses: 2
Name/Blog: Andrew
URL: http://www.PacificPrime.com
Comment/Excerpt: Hilarious...I've lived in Hong Kong, Tokyo, Taiwan, and China. In all of these places, it's very easy to see this phenomenon: advertisements for English language learning products when the advertisement itself is in broken English.
Name/Blog: The Boss
URL: http://www.panix.com/~dannyw/weblog/
Comment/Excerpt: Andrew: You may have been misled by finding the above posting under Asia/Language-misc/. I didn't say so in the posting, but I was in England at the time I saw the TV ad, so the mispronunciation was particularly egregious. It makes me wonder whether the management of the "Rosetta Stone" company are not themselves native English speakers. Your own point is certainly true of Cambodia. It was common to see display ads for language schools in the Cambodian-language newspapers that were mostly in bad English. It was often possible to see that the school had let a native speaker fix most of the text on some previous occasion, but had inserted some necessary updates which were unedited.

2007 Jul 05 [ Thu ]

French gender issues

This posting is of interest only to French speakers or students of the language.

Here are a few examples of interesting gender choices that I jotted down from recent programs on TV5. I am not much of an expert on French, but I've gotten into it lately.

1. aucun doute c'est elle le chef

My impression is that French usage has been tending towards some sort of politically-correct development: female people's roles are stated using masculine nouns, eg "elle est acteur", not "elle est actrice". I can't find a feminine form of "chef", but I don't think it would have been used.

Actually this example is mostly interesting because of the convoluted word order, plus the omission of "que". Presumably the idea they wanted to stress was the "elle", and the normal way to achieve that in spoken French (since they don't use intonation for that purpose) is to form the sentence such that the stressed item is at the end, thus:

"il n'y a aucun doute que le chef, c'est elle"

I don't know why the speaker chose the word order he did. Maybe I don't understand French stress specification properly. Or maybe they feel that the "que" (even if omitted) needs to introduce a well-formed sentence, and the form "le chef, c'est elle" is poorly-formed.

2. les equipes de joueurs feminines

I'm pretty sure I've noted that down correctly, although it would still be interesting if the actual form had been "les equipes de joueurs feminins".

In other words, the speaker was trying to be PC by referring to these players as "joueurs" not "joueuses", but then got mixed up (it seems to me) in picking the gender of the adjective "feminin". After a moment's thought, it's clear to me that it's sometimes indeed required to use the form "feminin", for instance "joueuse, c'est un mot feminin".

But perhaps there's some other usage in French that I'm not aware of: if the person referred to is female, then the adjective is female, even if the word for that person is male gender.

Or possibly the speaker was trying to apply the adjective to the female-gender word "equipe".

3. quelque chose de plus serieux

"chose" of course is feminine, and the adjective "serieux" is masculine, so what's going on here? I think the "de" is seen as introducing some separate idea from the "quelque chose" and this new idea defaults to masculine gender even though no noun is actually stated.

2007 May 17 [ Thu ]

Interesting intonation issue in French

Because the French was so easy to understand, I spent an hour or so watching the inauguration of the new President of France, Nicolas Sarkozy, yesterday. (How does a man who looks like he's typecast as the Prince of Darkness ever get elected? Oh well, you can't go by appearances. George Bush looks like Alfred E. Neuman in a suit.)

I was struck by several things. One – as the commentators referred to explicitly several times – was that the security on Sarkozy seemed to be a lot less tight than on Bush. (When I lived in DC, Bush had more security – eg multiple helicopter gunships – than Lord Vader.)

Another was that in referring to the relationship between I think Chirac and Angela Merkel, they said "ils se tutoyaient en anglais". That seemed a funny usage to me, although it wasn't picked up on. That phrase means "In English, they spoke to each other using the French familiar "tu" form for "you"." Presumably, that word "tutoyer" must be intended to mean "use familiar forms" like "Jacques, you dock, how's it hengink??" "Angela, sweet sing, oo's yo daddee?"

Incidentally, in a subsequent programme I heard Merkel speak, and she seemed to speak in a rather colloquial German even in a major public speech: she pronounced "Aufgaben haben" as "Aufgam ham", which is the sort of thing one expects from the inhabitants of Entenhausen.

But the most startling thing – one of the things that remind me that after all French is a different language – was the intonation used on the following sentence (marked with underscores):

Un president _part_, un autre s'in_stalle_.

Of course, French is famous for having very limited use of intonation, but this takes the cake. In English, this phrase would be:

_One_ president _leaves_, the _oth_er moves _in_.

...Hmmm... now that I write this down the preferred intonation in English doesn't seem as clear as it was when I was watching TV. Anyway, the point is that French just puts prominence on the final syllable of various types of word group, and so, it would seem, cannot naturally put prominence on the subject of either clause in the example (without adding "lui" two times, which would sound pretty lame).

2007 May 01 [ Tue ]

Word-final consonant pronunciation in French, English, Cambodian...

About 50 years ago I started learning French. Despite being one of the best in the class, I never really was able to follow conversation in French. I could read it, write it and speak it OK, but my comprehension was very poor.

After a while I realized that most of the problem was probably the very frequent elisions and liaisons: ie, they don't say "le homme" they say "l'homme", and they don't say "les femmes" pronouncing the "s" on the end of "les", but they do pronounce the "s" on the end of the "les" in "les hommes". en.wikipedia.org [http://en.wikipedia.org/wiki/Liaison_(linguistics)]

I eventually got much, much better at German conversation and decided I wasn't cut out for French. For years I used to say "Je regrette, monsieur, j'parle pas francais".

Since I moved to my current apartment, though, I've been watching a lot of French tv, because I'm too cheap to pay the landlord's gouging price for cable and French TV5 is the only European channel that's broadcast in PP.

Initially I didn't bother watching anything without subtitles (I actually preferred subtitles *in French* because I felt a little bit virtuous about watching tv if I was learning something). But I started trying to follow more and more programs without them. And quite recently something clicked; although my vocabulary is still rather unsatisfactory (for instance I just learned today that you can use "faire" to mean "look", eg "cette jupe fait bien") I am quite suddenly able to detect the elisions and liaisons in real time and at least understand the words I already know.

I have started to realize that the issue of changing the pronunciation at the end of words is common to learning many different languages. The most obvious case is Thai, where you just have to learn that a final (written) L is pronounced N, along with many other cases more or less wacky.

Cambodian has the problem in a different way: final consonants are pronounced "unreleased" (which is why I recently added the symbol for that to my phonetic font) and in theory that should allow you to generate the correct sound for each final consonant (although I have my doubts about final affricates, which to my ear sound exactly like "k"... and when I was completely mispronouncing one word as ending in "t" when it actually ended in "k" my pronunciation was immediately accepted with no comment at all, so I don't think *their* hearing is all that magical either).

German indeed alters the sound of final consonants, but it does it by a clear phonological rule (converting voiced consonants to their unvoiced form) so it was never a problem for me.

The most complex case I know is probably English, because English has much stronger intonation effects than other languages. Leaving aside some major terminology and definition issues with the concept of "tone units", I can say that when we pronounce English words continuously – without gaps – we pronounce word-final consonants in an unreleased form, but before a pause we release the consonant. (I am using the term "release" to cover all such effects, although I think most texts use it to refer only to plosives. For instance, try saying "this man is a fool" and then just "man" by itself.)

Since the unreleased and released versions of words seem to my ear to sound so very different, it is a mystery to me how foreign learners manage to correctly analyze English words when they have no understanding of this effect much less the practical ability to generate or hear it. (It seems to me it should be at least as big a problem for them as the absence of gaps between words in written Thai and Cambodian is for me.)

Oh well. My conscious grasp of the elision/liaison issue seems to have helped me not at all to grasp French. What actually helped was just a lot (hundreds of hours!) of watching TV, plus a tip I'm happy to pass on: turning up the treble to max and the bass to min improves the clarity of any speech I've tried it on. On the other hand, it may not make any difference to you if you're a few decades younger than me!

2005 Oct 14 [ Fri ]

Blackboard and Web CT merge

These are the two biggest commercial on-line-learning packages. The merger was announced on Oct 12.

I've talked about this kind of software, in the context of Asian-located schools teaching English, a couple of times already:

www.panix.com [http://www.panix.com/~dannyw/weblog/Asia/Language-misc/onlinelearn02.html]

www.panix.com [http://www.panix.com/~dannyw/weblog/Asia/Language-misc/onlinelearn01.html]

Here's the Slashdot discussion: slashdot.org [http://slashdot.org/article.pl?sid=05/10/12/2053257]

I can certainly sympathize with the posters who complained about Blackboard software. When I used it (back in 2001) it was slow, even when I was the only user on a test server. On a real server with perhaps 200 simultaneous users, even with recommended hardware like a 1-GB webserver and a 1-GB db server, it was horrendously slow. (I remember how surprised I was when I discovered that real web apps are not re-entrant, so every time a webpage is loaded 20-30 MB of RAM is gulped up, and then it sits around.) (I haven't used BB since that time but I believe these remarks still hold.)

Blackboard was trying to avoid running code on the client as much as possible (even though they didn't really support random browsers) so everything required a roundtrip to the server and back. They were always wiffling about providing a special local client, and I saw one demo'd, but as far as I know never released it.

I happened to see a new Microsoft product in a Khmer magazine, Microsoft Student: www.microsoft.com [http://www.microsoft.com/student/default.mspx]

Blackboard has had an "alliance" with Microsoft for years: slashdot.org [http://slashdot.org/comments.pl?sid=165110&cid=13782132]

Although the description of MS Student does not make it clear, it uses "templates" for homework assignments. Presumably course designers could create these templates for on-line courses, but since the templates *rely on Microsoft Office* they would lock every student into Microsoft products.

It's not clear what the online functionality of MS Student really is. But it could easily get some sort of "MSN Messenger" type of module which would handle all kinds of server interactions a lot more smoothly than the web/cgi stuff Blackboard relies on.

In other words, the usual "embrace and extend" strategy.

2005 May 28 [ Sat ]

More useful links for on-line learning/content management systems

I haven't had time to read these links yet but they all look good. In particular, I was not aware that there was some sort of standard for courseware format: SCORM. It would be great if the courseware you develop, or buy, could easily be ported to another system if you wind up wanting to migrate.

A survey of various systems and how to select one. Boringly written but worthwhile and good links. eduforge.org [http://eduforge.org/wiki/wiki/eduforge/wiki?pagename=CriteriaforSelectingLearningContentManagementSystems]

Page comparing interoperability – if you develop a SCORM (or IMS) courseware package with tool A on CMS X, will it really work on CMS Y? Or be editable in tool B? www.reload.ac.uk [http://www.reload.ac.uk/interop.html]

Apparently UNESCO has a "free software portal", and as you might imagine it lists "courseware tools" software (not just authoring tools, but also CMS systems such as Moodle, and on-line collaboration stuff): www.unesco.org [http://www.unesco.org/cgi-bin/webworld/portal_freesoftware/cgi/page.cgi?g=Software/Courseware_Tools/index.shtml&d=1]

A slightly muddled (broken links, missing paras) overview of E-Learning: psychcentral.com [http://psychcentral.com/psypsych/E-Learning]

Article on changing between course mgmt systems: www.educause.edu [http://www.educause.edu/apps/eq/eqm05/eqm05210.asp?bhcp=1]

A very good roundup of articles on choosing a CMS (with eg the link above): www.ibritt.com [http://www.ibritt.com/resources/dc_management.htm]

Blog-style chronological listing of interesting articles on CMS: www.edtechpost.ca [http://www.edtechpost.ca/mt/archive/cat_course_management_systems.html]

Choose Blackboard or Moodle? (They chose Moodle): www.humboldt.edu [http://www.humboldt.edu/~jdv1/moodle/all.htm]

Play with setting up a Blackboard course, to see if you like it: www.blackboard.com [http://www.blackboard.com/courses/index.htm] (they lock it after 60 days, though). Warning: it's shown in teeny tiny type; I think some fairy designed it on a Mac.

A few general caveats:

1. I have not checked, but I imagine few of these packages would even support Thai, much less a very-recently-standardized language like Cambodian. Since I am only considering schools that teach English, I can gloss over that point.

2. Claims should be taken with a pinch of salt, as with all software. I know some courseware sold for Blackboard just never worked right with any version of Blackboard that we had installed.

3. Content creators – ie teachers – may well feel that any effort they put into optimizing their on-line material is substantially eroding their job security.

2005 May 25 [ Wed ]

Open-source packages for Asian schools teaching English?

I have been vaguely thinking about installing some sort of content management system in order to play with the features and maybe move from the Blosxom software which currently runs this blog. As I was looking at some of the online learning sw packages, it occurred to me that local English schools in Cambodia, Thailand and elsewhere might use some of this software too.

Of course local schools have the same problem that Western schools have with such things: they don't have technical people to throw at the administration of such a system. So I'm currently looking at how easy they are to set up with some sort of minimal set of features that would be highly useful for such a school without needing much admin.

This Slashdot discussion is worth looking at: ask.slashdot.org [http://ask.slashdot.org/askslashdot/05/05/18/1429257.shtml] It's about "Blackboard-type" software, since Blackboard produces the best-known commercial package, which costs a university typically tens of thousands of dollars per year.

All of the following are at least currently open source.

www.dokeos.com [http://www.dokeos.com/]

Requires a webserver, PHP 4.1 and MySQL. Warning: it doesn't just require PHP 4.1 and above – it relies on "register_globals=on".

moodle.org [http://moodle.org/]

Developed on LAMP (Linux, Apache, MySQL and PHP) but has also been installed on XP, Mac and Netware.

The install info was easier to find than other products and seemed more organized.

dotlrn.org [http://dotlrn.org/]

The "dotlrn.org" package is produced by MIT, and I would be surprised if they ever changed it to a commercial product. On the other hand, MIT is the kind of place that has rocket scientists up the wazoo so the documentation probably assumes you write text documents in EMACS...

Based on AOLServer, TCL and OpenACS (the download is hosted on the ACS server). I could not find documentation on how to install dotlrn itself – only OpenACS (but maybe I looked in the wrong places). Includes a portal system.

My general comments on such systems (based on experience with Blackboard):

1. They wind up putting a very heavy load on hardware. I've seen delays of several seconds when I was the only user on a system. Also, bear in mind that the university environment implies huge spikes in usage (eg right before each test has to be completed).

2. When you develop your own course material, you get locked into that product *very heavily*. You also find that the course developers (teachers) become dependent on vendor-specific features.

3. Even universities with their own IT departments are often happy to go with a fully-hosted package, because web access implies 24x7 reliability which is alien to the more laid-back campus environment.

4. Likewise, administering a database server, which all of these systems rely on, is a whole extra job description by itself. I don't mean administering your CD database: I mean thousands of students doing millions of accesses a day, with the integrity and failsafe features that implies. (One user tried to reinstall Blackboard to fix a bug. He ignored the warning "this will destroy your existing data". Result: a major disturbance in the Force!)

5. When you encourage your staff to develop courses, you are actually getting into a system which is essentially software development. You find yourself wanting CVS-like features so that you can track changes and roll back to previous versions.

6. Once the courses are developed, who actually owns them? Typically they will have been assembled by a succession of professors and graduate students over years. Can one of the professors pull the material into a book? Or plug it right into a course at his *new* job? Suppose you try to sell the course, but one of the professors carelessly used copyrighted material?

7. Before you buy a product, ask the salesman how you back up *and* restore the courses. (Do you need to stop the server first? Hmmm. When do you want to do that? for how long?)

8. Far more than most environments, an educational institution has huge turnover in authorized users. At the very least, you need to have some policy for how to authorize and de-authorize students. Ideally, the software should make it easy to handle this reliably in a single mass update, while allowing any errors to be fixed without massive disruption and overtime.

9. If you use the product a long time you may have an idea for a nifty way to extend it. If you are a university you may have people who do that kind of thing. If you are using an open-source product you can look at the code and try it yourself. If you are using Blackboard, the licence agreement specifically forbids you to examine the database or the code.

2005 Mar 29 [ Tue ]

Countable nouns in English, and dictionaries

Using the definite and indefinite articles correctly in English is very difficult. Learners make mistakes all the time, and most native English speakers are aware that they can't articulate the rules (although presumably they can grok the correct form in individual cases).

One poster on Slashdot (in a thread about the grammar checker in Microsoft Word) stated:

The ground rule is: always put "a" or "the" in front of a singular noun.

The poster may well be aware of this, but (in addition to many other exceptions such as headlines) a major exception is "uncountable nouns", ie nouns like "information", "sugar", "tenacity", which we cannot have 0, 1 or 2... of. (Actually, it's even more complicated: we might use "two sugars" to mean "two samples of sugar from different suppliers", or more likely "two servings of sugar", but that is really forming a plural of *a different sense of the noun*.)

This is non-trivial. When I was working in a translation department in Germany, it was very hard to get the Germans to understand that "Informations" sounds silly, because in German you can easily say "Informationen". Even in English, I find it hard to get people to agree with my point that "data", even though formed from a Latin plural, should be considered a singular noun precisely because it is uncountable: we never speak of "three data". One marker that a text has been created by a non-technical person is the use of "codes" to mean "pieces of software". Technical people use "code" as a non-countable noun but marketing, government and military types have no idea. (Foreigners even say "three softwares" but I've never heard a native English speaker say that.)

Anyway, my point is just that I have *never* seen an English dictionary which specified whether a noun is countable or not: apparently everybody is just supposed to know innately. If I write a book about English for Cambodians, I hope I have the grace to remember this point.

2005 Jan 21 [ Fri ]

Article on learning French may be useful for those learning Asian languages

www.kuro5hin.org [http://www.kuro5hin.org/story/2004/12/29/15258/287]

It's at least interesting because it talks about *why* you should try to learn in a certain way. Most books just shove a certain procedure at you and give no justification at all for why it was chosen (because there isn't one).

As far as I can tell, the best is to repeat what you have learned at these intervals: after 1 day, after 3 days, after 1 week, after 2 weeks, and then again after 2 more weeks, and by this I mean wait one day, relearn, wait three days, relearn, wait a week, relearn, etc. - the intervals start from the last time you touched the cards, not from the initial memorization. This works with any sort of drilling: I used the same technique when doing written exercises with same results - I was usually able to remember whatever it was I was trying to learn after re-doing the exercise several times over, observing the timing rules I've outlined above. I understand that this is the method that Pimsleur uses, too, but I've never actually used any of their products, so I can only judge from the blurbs on the backs of their cassette-tape audiobooks I looked at while in Barnes and Noble.
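
The timing rule in that passage is easy to get wrong by a day or two, so here is a rough Python sketch of it. It is only my reading of the quoted text: I take "2 weeks, and then again after 2 more weeks" to mean two successive 14-day gaps, and the names REVIEW_GAPS_DAYS, review_dates and next_review are made up for this illustration. It is not taken from the article, from Pimsleur or from SuperMemo.

```python
from datetime import date, timedelta

# Gaps between successive reviews, in days, as I read the quoted passage:
# 1 day, 3 days, 1 week, 2 weeks, then 2 more weeks (assumed to be 14 days).
REVIEW_GAPS_DAYS = [1, 3, 7, 14, 14]


def review_dates(first_learned, gaps=REVIEW_GAPS_DAYS):
    """Planned review dates if every review happens exactly on schedule.

    Each gap is counted from the previous review, not from the day
    the material was first memorized.
    """
    dates = []
    last = first_learned
    for gap in gaps:
        last = last + timedelta(days=gap)
        dates.append(last)
    return dates


def next_review(last_touched, reviews_done, gaps=REVIEW_GAPS_DAYS):
    """Next due date, counted from the day the cards were last touched.

    reviews_done is how many reviews have been completed so far.
    Returns None once the whole schedule has been worked through.
    """
    if reviews_done >= len(gaps):
        return None
    return last_touched + timedelta(days=gaps[reviews_done])
```

For example, review_dates(date(2005, 1, 21)) gives Jan 22, Jan 25, Feb 1, Feb 15 and Mar 1. If a review slips, next_review takes the actual date of the late review, which matches the remark that the intervals start from the last time you touched the cards.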

It has often occurred to me *too* that the process of forgetting is indeed central to the process of forming long-term memories. I feel that the brain, for whatever reason, is mightily disinclined to form long-term memories, and only does so if it detects that you have *tried and failed* to form them on multiple previous occasions.

The above link has several interesting links, in particular to SuperMemo: a piece of software which seems to provide you with flashcards *and* monitor your success rate, so that it tunes the flashcard presentations to the parameters of your particular memory: www.supermemo.com [http://www.supermemo.com/english/smintro.htm]

It also reminds me a little bit of Van-Vogt-style flimflam. If I could make it work with Asian character sets I'd probably try it anyway.

2004 Dec 18 [ Sat ]

Slashdot article on uses for computers in language schools

ask.slashdot.org [http://ask.slashdot.org/askslashdot/04/12/17/1842229.shtml]

This thread was a little weak, but if you're thinking about language schools you should look at it.

1. Several posters were against trying to use computers for actual instruction.

2. Several posters suggested using IM, perhaps with audio/video, to connect to a remote language school: ie, if you're teaching Japanese in the US, connect to a Japanese school teaching English.

3. Computers can certainly be used for administration, eg for posting class times etc on the website.

My own idea, which I haven't seen on the thread yet, is that in many cases it is not trivial to set up a computer to use another language, so you should certainly have a part of the course on doing just that. (Many posters to the Thai forums, for instance, plead that although their girlfriend can of course read and write Thai neither of them can figure out how to set up their PC for Thai email.)

2004 Nov 11 [ Thu ]

Article at kuro5hin.org about learning a foreign language

This article is slightly pedestrian and insightless, but still addresses a topic near and dear to my heart quite well: www.kuro5hin.org [http://www.kuro5hin.org/story/2004/11/9/195744/646]

Living in Thailand and Cambodia I have met a lot of English teachers, of whom so far none has convinced me that there are any scientifically plausible studies to *support* any of the confident assertions one sees about how best to teach (or learn) a foreign language.

For instance, one often sees assertions that it is essential to do English instruction in English. Perhaps I'm being naive – perhaps everybody really knows the real reasons anyway, but politely casts a veil over them: the expat teachers are too lazy and dim to learn the local language sufficiently to teach in it, and the local teachers are so clueless that beyond a low level the hapless learners would be wasting their time. (If the local teachers could speak English halfway adequately – say about ten times better than I can speak Thai – they could get a much better job than teaching.)

The kuro5hin article mentions a product I had never heard of: "Pimsleur" tapes, with well-chosen repetition cycles. I tried creating such recordings and they're darned hard work to create, but they were indeed successful. Still, they remind me that one of the problems you have to deal with on minority languages like Thai and Cambodian is that the products available are so limited.

2004 Sep 20 [ Mon ]

The "British National Corpus" -- database of samples of English usage

I've posted before about the rather poor source material available for people choosing a basic set of words for English, or other, language instruction.

Here's a link to a site which is selling a large database of (British) English usage, both in text and spoken form: www.natcorp.ox.ac.uk [http://www.natcorp.ox.ac.uk/using/index.html]

It came up in a Slashdot discussion of a site which provides an interactive interface to the "Corpus". Irritatingly, it uses Flash, for (as usual) no sensible purpose: www.wordcount.org [http://www.wordcount.org/index2.html]

A better link for a facility which searches through the "corpus": pie.usna.edu [http://pie.usna.edu/]

I liked the explanatory text from the above site:

The corpus totals over 100 million words and covers a representative range of domains, genres and registers. The entire corpus has been analyzed and marked up with part of speech (POS) tags. Provenance and other attributes are carefully documented for each text.

I really liked the terms "domains, genres and registers". I should probably know exactly what they mean before I put together my Khmer/English vocab lists. For instance, it occurs to me that certain *phrases* may occur much more often than some of the individual words *in* the phrases.
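
One way to look at the phrase question on any text sample is to count single words and short phrases side by side, so that a common phrase can be ranked against single words in the same list. The following is a minimal Python sketch under my own assumptions: the tokenizer is a crude one for plain English text, the function names are hypothetical, and none of this reflects how the BNC or its search tools actually work.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into rough word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_words_and_phrases(text, phrase_len=2):
    """Count single words and runs of phrase_len consecutive words.

    Simplification: phrases are allowed to run across sentence
    boundaries, which a serious tool would probably avoid.
    """
    words = tokenize(text)
    word_counts = Counter(words)
    phrase_counts = Counter(
        " ".join(words[i:i + phrase_len])
        for i in range(len(words) - phrase_len + 1)
    )
    return word_counts, phrase_counts
```

Calling word_counts.most_common(20) and phrase_counts.most_common(20) on a large enough sample would give two lists to compare when deciding whether a phrase deserves its own entry in a vocab list.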

2004 Sep 18 [ Sat ]

Chinese TV and learning Chinese; social forces and vocabulary

Watching cable TV in a sober stupor, I happened on a series of programs on CCTV (that means the Chinese-language cable channel) purporting to help English-speaking foreigners learn Chinese.

I was very struck by the choice of vocabulary in the programs I saw. Here's an example:

This memorial archway reflects the virtue of chastity.

The program segment was a visit to some sort of cemetery or other historic area. "Chastity" means women doing what they're told and serving the family of their husband.

There's two issues here. One is obvious: is this really the sort of vocabulary which a beginner should be expected to memorize? I remember watching a show for English learners on Australian TV which seemed equally poorly conceived: all kinds of absurdly specialized and/or irrelevant vocabulary.

Another, much more interesting issue is: maybe this really *is* the sort of vocabulary the learner needs, because it reflects the fact that Chinese society is fundamentally different from Western society and therefore concepts which are bizarre or alien need to be communicated by even beginners. In the instance above, of course, the critical concepts are the importance of loyalty to the family and to one's ancestors. Thus, I imagine, a Chinese family, with whom the hapless student is staying, might frequently refer to various memorials to their ancestors (as a guide to action) or complain about actions which violate "chastity", much as those staying with a Russian family find that their knowledge of swearwords widens remarkably fast.

When I was a lad I needed to tell a German girl that I didn't want to see her any more. Without much style or good taste, I chose to do so while we were on a subway. I stepped out and as I turned to face her she said "Du tust mir Leid". I had a couple of seconds to reply and could have done so, but I simply wasn't sure exactly what she meant. Of course I was familiar with the phrase "Das tut mir Leid" – "that does me sorrow", meaning "I'm sorry". However, her usage gave me the impression that she was saying "You have made me feel sad".

Later I checked and as far as I can tell what she said means exactly the same as "I feel sorry for you" – ie, "I pity you". But because of a slight variation in the grammar, I still don't feel really sure. And that's for a language I know well, which is very closely related to English, with practically the same social structure. How long would it take me to grok the real significance of words like the Chinese for "chastity"?

Oh well. The Westerner in the show who was chatting away in Chinese nodded solemnly as the Chinese gal gave her utterly inadequate explanations to him. He must have understood... right?

A few other quotes from the show:

Nothing can be done properly without established standards.

learn to speak Chinese and make friends everywhere!

This issue also makes me think about Commander Data (the humanoid robot character on Star Trek: The Next Generation). Whenever the show dealt with his attempts to become more human and to fit into human society, I always thought "why isn't there a show where the humans try to figure out the good stuff in Data?". Of course, Data was (at least for most of the series) one of a kind, so he didn't have any kind of "society" of his own. But the real reason is that while the USA has accepted immigrants, and the immigrants have succeeded in establishing their own customs to some extent, the USA never had the concept that it really had anything to learn from the immigrants: it expected them to conform totally, and there has been friction with each new wave until it establishes a power base.

2004 Sep 03 [ Fri ]

Learning to read

A recent Slashdot discussion on a Microsoft document about learning to read had many postings which are relevant to my questions about how to learn a new character set like Thai or Cambodian: science.slashdot.org [http://science.slashdot.org/article.pl?sid=04/09/02/0213247]

For instance, I liked this posting: science.slashdot.org [http://science.slashdot.org/comments.pl?sid=120296&cid=10137106]

2004 Jul 26 [ Mon ]

Puzzling improvement in my Thai

While in Cambodia I have done very little to improve my Thai. Indeed, I noticed that while I was in Cambodia I could hardly summon up the Thai for the simplest of phrases, or write a single word – *any* word. I assumed that I would have a very difficult time on my current trip to Thailand.

To my great surprise I have found my spoken and reading Thai to be almost unnervingly improved. Things like my ex-girlfriend's favorite reading matter "Sell Laugh", a small-format mostly B/W cartoon book reminiscent in tone of Mad Magazine – before I had to struggle to read a few words, and it was months before I could make out the joke of a *single* text-based cartoon: now, although my vocabulary is nowhere near good enough to read most sentences without referring to the dictionary, I can basically scan the sentence into individual words in most cases. If you have not had to learn a language without spaces between words, that may seem utterly trivial to you, but believe me it was a revelation to me.

So I can really use these cartoons to build vocabulary and facility with reading Thai. It's actually fun now to just flip through the book and pick out a cartoon to figure out.

So of course the really interesting thing is *why*? Why so improved? And why *now*?

It's conceivable that there is some sort of threshold of practice and effort in Thai beneath which improvement is imperceptible. But I don't know why I would have gone through that threshold while in Cambodia.

My guess is that the critical thing was actually *reading Cambodian*. If you leave out the matter of figuring out the *tone* of the words you read, a lot of the Cambodian spelling system maps to Thai. Also, I think I may have actually trained my visual system to have a *globally* improved ability to detect differences between – *and* family resemblances among – non-Roman characters.

Another possibility is that while in Cambodia I *did* have a practice of watching Thai TV channels. Now I still don't really understand spoken Thai on TV, but I have been letting it "wash over" me a lot, which may have somehow trained my auditory system to establish a suitable set of phonemes. Additionally, I would pay attention to the Thai subtitles on English-soundtrack movies. Surprisingly often I would find myself able to pick out two or three words at the start of each line, so presumably this was adding up to useful practice.

Recently I have been wondering whether it is actually a waste of effort to try to memorize vocabulary in "one shot" – ie, read a chapter of the textbook with certain words, repeatedly attempt to memorise them until I can reproduce all of them, then move on to the next chapter. Memorization, it seems to me, requires that a datum be *partially or entirely forgotten first* before *permanent* memorization takes place. So maybe it's much more efficient to *eschew* the brutally unpleasant process of *trying* to memorize in one shot, and just do what comes (relatively) naturally: read through the words one time, *expecting* to forget them, and then repeatedly *review* the words over the next *several months*.

This does not address the issue of why certain words seem to be so *particularly* difficult to memorize. For instance, I must have tried to memorize the word "sontagia" (hotel) in Cambodian literally twenty or thirty times before it "took". (Perhaps every time I tried to memorize it, I did so under the wrong circumstances: I needed to use it right away, so I had to pay attention to the overall conversation rather than having spare effort available for memorization.)

One thing I've certainly noticed about Cambodian and Thai is that *all the words tend to sound the same* – they're all (or mostly) monosyllabic and all sound Asian, and I often find that I'm pretty sure the word I want is *one of two or three but I can't remember which*. Also, of course, I find myself using a Thai word in Cambodian and vice versa! One time a Cambodian girl smilingly spotted that I had used a Thai word and pointed it out; I said "wow, so you speak Thai?" and she said no – but apparently they watch enough Thai-language TV to pick up a lot of different usages. (The word was "people" or "person" – "phrachaachon" in Thai and (I think) "phrajeeajon" in Cambodian.) I think over time my skills in that area have improved – my brain is much better at retaining more features of a word than just "sounds Asian".

2004 Jun 19 [ Sat ]

Comparing the efficiency of one language with another

There is, of course, no standard way to *measure* the efficiency of a language. However, I think it's interesting to think about various possible comparisons.

For instance, I'm the kind of boring person who actually *reads* the safety card on an airplane. Before it was reduced to incomprehensible pictograms, it used to have little paragraphs of presumably identical text in several languages. (I say "presumably" because the Arabic version, let's say, might well have included instructions on identifying Jews and thrusting them back into the flames.)

Anyhow it was striking that while the French version was only slightly larger, the Spanish was *way* larger, as was the German.

These languages all have a larger character set than English, so you would think that they would need fewer characters to encode the same information than English, at least if the same font size is used. Of course French has the handicap that they want to use prepositions inside phrases; and German has the handicap of needing declined endings on adjectives.

Another issue is that the *source* language is very probably English. In general, one would certainly expect a translation which is both accurate and colloquial to take more text in the new language than the original did. That's why the very slight increase in the size of the French text was impressive to me. The increase in the size of the Spanish text was baffling.

I definitely wonder about the space-efficiency of Cambodian. I find it very hard to positively identify the diacritics in normal text, eg inside a dictionary. A Cambodian, I imagine, must recognize the words from overall context, and very seldom need to peer closely at a single diacritic to resolve between two possible readings. Worse, lower diacritics on one line *regularly* collide with the upper diacritics on the next line. This would be almost unthinkable in a professionally-set book in the West (not of course involving Cambodian-style diacritics, but probably a lower-case "g" and a parenthesis – hmmm, it just occurred to me why the French probably don't print the accent marks on capital letters).

When I laid out a cheat-sheet of Cambodian characters in uksor jriang and uksor kham fonts, I had to set the font size to 36 pts to be able to see the diacritics reasonably clearly. Of course, the English text was perfectly clear at 9 pts. By the way, this exercise helped me sympathize with the people who allow their page designs to have colliding lines – if you use the default line spacings which are (presumably) guaranteed to avoid collisions, the page layout is appallingly wasteful. I wound up having to use Corel Draw and manually overlay layers.

Certainly, for a learner most of the Cambodian character set seems bafflingly similar. It seems that little thought was paid to making the characters maximally different, to allow the characters to be shrunk to the very minimum for readability. This was probably not uppermost in the minds of the original designers of English letterforms, but legibility at a range of point sizes is certainly a specific aim for Western font designers today.

On the other hand, maybe this is what you get when you have a large character set. It seems like you would be able to encode more words using the same number of characters, but each character needs to be so much larger to maintain legibility that there is no net gain.

It reminds me of a short sf story by Robert A. Heinlein, in which the hero is recruited by a secret society of geniuses, and is introduced to their invented language, which is tonal; Heinlein has one of his characters specify that the reason is that by encoding tone on top of other features the language can express more ideas in the same time.

But in my (not very expert at all) experience of Thai, it is *not* any more space or time efficient than English. On the contrary: the necessity of encoding the tone accurately – as well as maintaining the essential differentiation between short and long vowels – means that the syllable rate in Thai is rather slow. Additionally, the text representation in Thai of English text does not seem to be markedly smaller than English, despite (again) the much bigger character set. (Sometimes notices may give the impression the Thai *is* smaller, but if you examine them you will see that the Thai text is on the edge of illegibility at a distance for which the English text would be clear at 80% the m-size.)

I don't have a feel yet for the time-efficiency of spoken Cambodian. One thing I *can* say is that the presence of so many glottal stops can give a weird overall tone to spoken Cambodian. There seems to be one particular dubbing voice on Cambodian TV who reminds me strongly of the Martians in the recent movie "Mars Attacks". I get the impression his voice is badly recorded – overlimited – but it must mean something that the way his voice sounds is *acceptable*. When I listen to spoken Cambodian in a live situation, it sounds much more mellifluous, although the staccato clipping of final consonants certainly can make it hard to actually follow.

2004 May 08 [ Sat ]

Lists of basic words a beginner should know

It recently occurred to me that anyone who writes a guide for people learning a language should take some trouble to get a list of common words together. Books written by Thais and Cambodians are notorious for including ludicrously useless vocabulary.

In other cases, a fundamental mistake is made at some level. For instance, I have bought English-to-Khmer dictionaries which were clearly based on an English-only original. Not only did they have the pronunciation only of the *English* words (useless to me of course), but they also naturally left out words which are only common outside England – words like "stupa", "moto", "gecko" are absent.

Actually, I am not aware of *any* list, even of English words, which is based on scientifically respectable principles. "Basic English", for instance, popular in the thirties, supposes that if you teach someone "make" and "good" they will know how to express "succeed": www.encyclopedia4u.com [http://www.encyclopedia4u.com/b/basic-english.html]

I think I remember reading a book by Shaw which was rendered in Basic English, as well as using a new simplified English spelling system, and the same objection occurred to me as I was reading it (as a child).

A more recent attempt was made by the Voice of America – "Special" English: www.voanews.com [http://www.voanews.com/specialenglish/article.cfm?objectID=EABF9130-6A40-11D5-841A00508BF9712A]

Overall, it is stunningly discouraging to look through such a list and realize what a small fraction of the words I know in Cambodian, even allowing vague synonyms.

Btw, Smyth's "Colloquial Cambodian" has quite a good selection of vocabulary. He seems to have a good sense of what words to select. Unfortunately the English-Khmer, Khmer-English sections at the back of the book omit many of the words in the Khmer exercises – and are not even consistent with each other. It helps somewhat to buy his also valuable "Practical" dictionary, but since it does not have a Khmer-script section it's hard (or perhaps I should say even harder) to look up a Khmer word to English. www.amazon.com [http://www.amazon.com/exec/obidos/tg/detail/-/0415100089/103-7088114-6834232?v=glance]

The reviews in the above Amazon link dislike his romanization. Personally, I think it's not terrible. They may be confused by the fact that it's based on English, not American English.

Anyway, perhaps the Basic and Special English vocabulary lists would be a fair starting point for someone teaching English or Cambodian. I say starting point because, for example, "Basic" does not include "diarrhea". Or "toilet". Or "bathroom". Oh well, it does have "bath".

2004 Apr 09 [ Fri ]

Learning the Cambodian alphabet, and thinking about literacy

There's a cute expression "ontogeny recapitulates phylogeny". What it means is that when a fetus develops in the womb, it goes through recognisable stages in which its form reflects the progression of evolution.

It has, of course, been shown to be mistaken.

Still, I think it's an interesting analogy to learning a *new* alphabet. That is, having learned *one* alphabet, I am in a sense like an evolved genus, but when I have to learn a new alphabet I recapitulate the forms that the genotype passes through.

Hmm... that analogy is starting to jump the shark.

Anyway, learning Cambodian has certainly made me think about the general issue of literacy and how to promote it in children.

It seems to me most of the argument has been between people who believe that children should be taught individual skills in separate stages, and those (the "look and say" crowd) who believe that children should be taught to recognize complete wordforms.

From learning Cambodian, I would suggest the following:

1. It is essential to have a reasonable vocabulary *before* trying to learn to read. This is of course difficult for the foreigner, for whom a full-immersion course is usually impractical and who winds up having to dedicate a lot of time to some sort of romanized version which will be useless later. But it is not half as difficult as trying to *simultaneously* learn new sounds, a new alphabet and new words.

2. Once a *minimum* number of new words have been learned you can start teaching the alphabet. This is where I start to part company from the two major parties in the literacy debate. I think you *can* and *should* start teaching the alphabet before real spoken fluency.

3. Likewise, once the alphabet has been *introduced*, you should start teaching words, and presenting longer sections of text. The reason is that – as the "look and say" people perceived – it is not essential to have digested all the thousand-and-one rules about English spelling in order to learn to recognize words like "this". On the other hand, it is sadistic to try to make a learner learn hundreds of English words as if they were Chinese pictograms. Relatively few spelling rules need to be learned in order to make acquiring and retaining new words far easier.

4. On the other other hand, it is clearly highly inefficient to require the learner to visually puzzle out every single character one at a time in order to read a sentence. That's where I am right now: every time I successfully read a word like "la-or" my tongue protrudes from between my teeth and my brows knit. What I should be doing is developing the ability to scan a line of text and recognize each word from the Gestalt.

5. Incidentally, that perception is a large part of why I am very interested in the idea that the *lack of spaces between words* in languages like Thai and Cambodian may be a major impediment for literacy acquisition.

6. In general then I feel that while *some* capability must be acquired at each level in logical order, *each skill actually needs the subsequent skills to come to full fruition*. In other words, until you become a *fluent* reader – one who can easily recognize entire words without effort – you can't develop the intuitive grasp of spelling rules which allows you to make a fair stab at pronouncing, or even writing, a new word correctly.

7. Another issue which I am strongly aware of from studying Cambodian is the variations between fonts. I still have almost no success reading uksor mool. The strange thing is I really can't remember ever encountering this issue in English, either from my own childhood or in discussions of promoting literacy, but looking at English fonts objectively I'd say they show at least as much variation as between uksor mool and uksor jriang. Perhaps teachers are so used to font variations that it doesn't occur to them that it could be part of the problem when a learner blanks on a word.

Hmmm... it just occurred to me that when literacy is taught in English, it's taught in stages: printed text, block handwriting and "joined-up" writing. That could be said to be similar to a font variation (but not very).

2004 Mar 06 [ Sat ]

Is "appropriacy" really appropriate?

I was chatting in a bar a few nights ago to a chap who mentioned he'd written a thesis on "appropriacy" in English usage by EFL students. With my inimitable tact, I told him it wasn't a word.

Having done a Google search I guess I have to concede that the word is in use, with "about 4,190" hits.

On the other hand, "appropriateness" has "about 1,080,000".

I did a search for +appropriateness +appropriacy, and I got the impression that "appropriacy" has slithered into English via French, where "appropriete" is standard. It may be struggling to establish a beachhead in the field of language teaching, by carving out some territory from "appropriateness" in the narrow sense of "linguistically correct under the circumstances", as opposed to "while not picking one's nose".

Well, this is about as close to an apology as he can expect.

2004 Mar 01 [ Mon ]

Data structures used for dictionaries

I have repeatedly encountered cases where words can be found in one half of a dictionary but not the other half. For instance, Tuttle's "Practical Cambodian Dictionary" (Smyth and Kien) shows "k'yol" as meaning "wind, breeze" but has no entry under "breeze".

It seems to me that I would endeavor to set up a *single* database from which the English-Cambodian and Cambodian-English sections could be generated with *identical* information.

Ie, suppose we have English words A-D and Khmer words K-P.

We might have the following "relationships":

A-101-K

A-102-L

A-103-P

B-104-K

C-105-L

C-106-P

C-107-M

D-108-N

D-109-O

and it should be possible to invert them in software to reach:

K-101-A

K-104-B

L-102-A

L-105-C

M-107-C

N-108-D

O-109-D

P-103-A

P-106-C

I've numbered each "relationship" to make it clear that some data needs to be attached to the *relationship* not just the word, eg for the English word "bridge" we would need to have different relationships for the sense "structure spanning a gap", "card game" and "dental attachment", etc.

This is not trivial – in fact as I write the above paragraph I have become aware that there is not a one-to-one relationship between a "relationship" in my above sense and "sense" in my above sense. And in writing that sentence I am reminded of a TV play by I think Tom Stoppard, in which a philosopher is droning on at an international conference, and the harassed interpreters have to convey in French the distinction he is trying to draw between "what I mean" and "what I want to say".

Anyway, I still think it's doable.
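To convince myself, here is a minimal Python sketch of the single-database idea, using the same placeholder words and relationship numbers as in the lists above. The names `RELATIONSHIPS` and `build_section` are my own inventions for illustration, and a real dictionary would need to hang senses and explanatory notes on each relationship, plus a proper collation for Khmer headwords.

```python
from collections import defaultdict

# Each relationship is (english_word, relationship_id, khmer_word),
# using the placeholder words A-D and K-P from the lists above.
RELATIONSHIPS = [
    ("A", 101, "K"), ("A", 102, "L"), ("A", 103, "P"),
    ("B", 104, "K"),
    ("C", 105, "L"), ("C", 106, "P"), ("C", 107, "M"),
    ("D", 108, "N"), ("D", 109, "O"),
]

def build_section(relationships, khmer_first=False):
    """Group the relationships by headword for one half of the dictionary."""
    section = defaultdict(list)
    for english, rel_id, khmer in relationships:
        head, target = (khmer, english) if khmer_first else (english, khmer)
        section[head].append((rel_id, target))
    # Sort the headwords, and the entries under each one by relationship id.
    return {head: sorted(entries) for head, entries in sorted(section.items())}

# Both halves come from the same list, so neither can have an entry
# that is missing from the other.
english_khmer = build_section(RELATIONSHIPS)
khmer_english = build_section(RELATIONSHIPS, khmer_first=True)
```

The point of the design is simply that there is only one list to maintain: a word like "breeze" cannot go missing from one half, because both halves are generated rather than typed in.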

Another interesting possibility which I have never seen in the wild is a *three-language* dictionary. What do I mean by *that*?

I have been struck by the problems of people whose mother tongue is a minority language. For instance, I recently showed a Spanish guy who was translating between Russian and Khmer my copy of Tuttle, and he was very interested, because of course nothing comparable exists in Spanish.

It seems to me that if you had dictionaries for multiple languages set up by this technique, you could automatically generate dictionaries for the other pairs – except that the *explanatory* text would have to remain in English, unless it were manually translated. (That's why I call this idea the "three-language dictionary" – eg for Spanish people to look up Khmer, but with explanations in English.) Of course this would multiply the possibilities for confusion and error, but it might well be far better than nothing.
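As a rough illustration of generating a new pair by going through English, here is a small Python sketch. The sample entries and the `pivot` function are hypothetical, made up for this example (only "k'yol" comes from the Tuttle entry mentioned earlier), and matching on the bare English word is a deliberate oversimplification.

```python
# Hypothetical sample data: (word, relationship_id, english_word).
SPANISH_ENGLISH = [
    ("viento", 1, "wind"),
    ("brisa", 2, "breeze"),
    ("puente", 3, "bridge"),
]
KHMER_ENGLISH = [
    ("k'yol", 101, "wind"),
    ("k'yol", 102, "breeze"),
]

def pivot(source_to_english, target_to_english):
    """Pair source and target words that share an English word."""
    by_english = {}
    for target, rel_id, english in target_to_english:
        by_english.setdefault(english, []).append((rel_id, target))
    pairs = []
    for source, src_id, english in source_to_english:
        for tgt_id, target in by_english.get(english, []):
            # Keep the English word and both ids so the English
            # explanatory text can still be looked up.
            pairs.append((source, target, english, (src_id, tgt_id)))
    return sorted(pairs)

spanish_khmer = pivot(SPANISH_ENGLISH, KHMER_ENGLISH)
```

Note that "puente" produces nothing because the Khmer sample has no entry for "bridge", and that matching on the English word alone would happily pair the card game with the river crossing. That is exactly the multiplied confusion I mean, and it is why the relationship ids are carried along.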

2004 Feb 25 [ Wed ]

Observations in an English-language class

Last night I sat in with an English-language teacher at a school which will remain nameless. (If I didn't think these notes represented most schools I wouldn't post them.)

1. The teacher wore a short-sleeve check shirt, black sneakers and no tie. (And pants – don't worry.) He was uniformly positive, enthusiastic and engaging: I'm sure the class loved him.

2. The room was small and very echoey. I found the teacher's words a little garbled myself.

3. There was a wall air conditioner (it looked like the compressor was fitted outside, Thai-style). There was also a fan, which was seemingly on full, and quite noisy.

4. The teacher gummed up some grammar terms totally. He several times said something about "he wants to get hired" being the "present continuous". No wonder Asians hate English grammar.

5. The class was small, I guess 4 girls and 3 boys. The girls were all at the front, smiling and talkative. The boys were at the back, reticent and frowning.

6. The teacher's book had the answers. Have you seen the Simpsons episode where Bart steals all the teachers' copies of the textbooks? In watching the teacher, I was very conscious that if you're actually teaching, you have to give a lot of attention to directing the class and analyzing and responding to the students. So taking time to analyze questions yourself is unfair to the class.

7. The students were uniformly attentive and polite. (With that and the small class size, I wondered if I was being shown a Potemkin village.)

8. The book used seemed really too difficult for the students. Apparently it was written for Americans, whose mother tongue is English! For instance, it used words like "cabinetmaker" when the class were scratching their heads over a word like "approximately". This problem of a poorly-designed course seems to be part of the problem for English learners in Asia.

9. Another issue is the fact that the students can "vote with their feet". In principle I'm in favor of this, but the students may opt for sliding through an unchallenging routine for many years before slamming into the TOEFL. I think it would be far better if the students were *repeatedly* subjected to objective tests. Then teachers might be selected for their ability to get their students to actually learn, rather than entertainment value.

10. Of course, another problem for the system is that the English teachers are each in the system for only a short time. More than a year is unusual and a single term is common. So the individual teacher has little incentive to invest effort in generating rational lesson plans, and in fact is subject to considerable pressure to play along.

11. I've been in contact with admin people at several schools now, and their range of English competence has been from fair to appalling. On the other hand, you would think that a moto driver would know his way around, right? Today I asked a moto driver where the University of Cambodia was, and showed him a business card with the address in Khmer. After he used my phone to call the person, it transpired that it was 200 m away. But even though I gave him 500R for his trouble, he sent me off in the wrong direction. Sheesh. (And the address was on Norodom, and we were on Prince Sihanouk St...). To be fair to moto drivers, the security men on Norodom, also 200 m from the university, had no idea either.

2004 Feb 02 [ Mon ]

Tones and stress in English

"Deutsche Welle" broadcasts half their shows in English, and the quality of the English is usually almost perfect. Still I occasionally find something to grumble about (chorus of "Nooo!" from the crowd) and last night I heard the announcer mispronounce the phrase "paper manufacturer".

The mistake he made was interesting, and relevant to a discussion of tonality in Asian languages.

What he did was stress "manufacturer" more than "paper". It's probably hard for you to see the problem (it took me a few seconds having just heard him).

The problem is that the stress he used is the same as that which is correct for the phrase "paper doll". I don't know why – and I never thought about it till now – but for some reason we do not stress a noun used as an adjective signifying what something is made of, whereas we *do* stress it when it represents what the following noun is acting on (and I guess a couple of other cases).

Umm... I don't think I put that very well. So here are a bunch of examples:

1. Tin cup; tin magnate

2. Sugar candy; sugar bowl

3. Silver service; silver futures

4. Glass houses; glass breaker

Can you hear the difference?

As I've said before, stress in English is a *combination* of pitch and level. What's perhaps *worse* is that you need to apply multiple levels of stress, lexical and syntactic – and syntactic can refer to small phrases like the ones above *and* larger clauses simultaneously!

It occurs to me that this sort of "hinting" is handled in French by explicit prepositional phrases. Maybe this is part of the reason why French people sometimes seem to be speaking English in a weird monotone.

Heaven only knows if there's anything like this in Thai.

2004 Jan 04 [ Sun ]

Good page explaining toefl options

www.teflasia.com [http://www.teflasia.com/index.php?area=3]

2003 Dec 27 [ Sat ]

What is a good British accent

Having spent many years away from home I have noticed that my accent is becoming increasingly transatlantic, or even mitteleuropaeisch. The other day I was wondering if I could even tell someone what my accent *is* any more. Then I noticed this guy. He's from India, as is evident from his coloring, but he has a wonderful, perfect accent. British young people are no longer educated to iron out regional accents, so they are incapable of speaking without sounding like an oaf. India and his degree from University College, London seem to have combined to form an utterly flawless speaking voice. This guy could pick up chicks by cold-calling.

www.geocities.com [http://www.geocities.com/sjbaudrey/guys/shihab.html]

2003 Dec 22 [ Mon ]

Another example of consonant/vowel interaction in English

It occurred to me recently that the pronunciation of "soon" is an interesting example of how consonants and vowels interact in English.

We are familiar with the idea that English vowels are not "pure", like those of the Romance languages. In the word "soon", for example, we can quite easily hear a shift in the value of the vowel towards the end.

I now realize *why* this happens. The issue is that the canonical tongue position for pronouncing the "oo" is well forward of the tongue position for "n". The tactic that is standard in English is then to start the vowel in its canonical position, but to continue producing *whatever vowel shows up* while shifting the tongue back towards the canonical position for the "n" (for me, roughly halfway along the roof of the mouth).

One can thus say that English is a "consonant priority" language (my own term). The consonants are maintained at the price of the vowel. (This is similar to the issue of "releasing" consonants at the end of a word, which so bedevils the poor Thais: a totally unwarranted puff of air has to be produced just to ensure a canonical pronunciation of the consonant.)

This should be contrasted with the situation in eg Italian. In the word "Ricardo", for instance, the "R" in the first position has to be pronounced far back in the throat to allow the Italian to produce his canonical "i", whereas the final "r" has to be far forward, almost at the teeth – sacrificed to defend the "a". We can thus say that Italian is a "vowel priority" language. I think my explanation is a great deal more logical than the usual "well the Italians trill their "r's" " blather. (An "r" pronounced at a non-canonical position is much more subject to the trilling effect.)

I am really not sure about German. My impression is that in standard German neither consonant nor vowel has clear priority.

I do not know enough Thai or Cambodian to define them as vowel or consonant-priority. However, I would certainly guess that tones are so important to Thai that they have first priority, and thus they severely limit what words are pronounceable at all.

2003 Dec 21 [ Sun ]

Update on irritating alphabetization

I re-read the Huffman book and it appears they chose a form of alphabetization for their romanized dictionary which corresponds to Cambodian dictionary order (somewhat – it doesn't have to take account of all the diacritics in Cambodian).

This is utterly insane. It's tough enough to handle alphabetization of all the phonetic-alphabet symbols, but Huffman should be taken out and shot for this nonsensical travesty. (Something about being in Cambodia seems to be disposing me towards draconian solutions. Anyway, he's probably dead of old age already.)

I had suspected this before but not confirmed it (as I didn't have any source for info on Cambodian dictionary order).

Incidentally, it occurs to me that dictionaries which insist on treating "dt" (etc) as a separate character which requires its own dictionary section should logically alphabetize it separately *within* the section – and *that* might actually be useful.
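Here is a minimal Python sketch of what consistent treatment might look like, assuming a romanization in which "bp" and "dt" are the only digraphs and each sorts immediately after its first letter. The digraph set, the function names and the sample words are all made up for illustration; they are not taken from any real dictionary.

```python
import string

DIGRAPHS = ("bp", "dt")  # assumed set; real romanizations vary

def build_ranks(digraphs=DIGRAPHS):
    """Rank every symbol, placing each digraph right after its first letter."""
    symbols = []
    for ch in string.ascii_lowercase:
        symbols.append(ch)
        symbols.extend(d for d in digraphs if d[0] == ch)
    return {symbol: rank for rank, symbol in enumerate(symbols)}

RANKS = build_ranks()

def collation_key(word, ranks=RANKS):
    """Split a word into symbols (digraphs take precedence) and rank each one."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in ranks:
            key.append(ranks[pair])
            i += 2
        elif word[i] in ranks:
            key.append(ranks[word[i]])
            i += 1
        else:
            i += 1  # skip hyphens, apostrophes and other marks
    return key

# Made-up strings, chosen only to show the ordering.
words = ["bpaa", "abpaa", "baa", "abzaa", "abaa"]
consistent_order = sorted(words, key=collation_key)
```

With this key an *internal* "bp" sorts after every plain "b" ("abzaa" comes before "abpaa"), just as an initial "bp" gets its own section after the "b" words.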

Another aside: I had been using "transliterate" to mean the process of converting words written in non-roman characters into a (more or less) roman-character version, but now I have been reminded of "romanize" I like it a lot better. "Transliterate" has the other meaning of a word-for-word translation (although people seem to be inconsistent on this.)

2003 Dec 19 [ Fri ]

Irritating alphabetization in romanized dictionaries

I can still barely stumble through reading a few words of Thai or Cambodian so I'm still stuck with romanized dictionaries.

Why on earth is the alphabetization so bad?

I know the romanization usually involves extra non-roman characters, but why do they add unnecessary complications?

For instance, many dictionaries have a separate section for words starting with "bp" or "dt". Exactly how does this help anybody (especially when you can't even be sure which one you heard)? They aren't even consistent – they don't sort *internal* bp's after all b's.

The Huffman alphabetization is simply appalling. The only good thing about it is that the book includes a section explaining it, but as it was incomprehensible that didn't help much. I wind up having to read the section for a whole letter to find anything.

Another irritation is when these maroons obviously do the alphabetization by some sort of ASCII sort in their word processor. That leads to irritating things like "k-jeul" coming before "kaang" (I think – don't have it in front of me right now, but you get the point).
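To show what I mean, here is a small Python sketch using the two words above. The hyphen has a lower character code than any letter, so a plain sort puts "k-jeul" first; a sort key that looks only at the letters fixes it. The `dictionary_key` function is just my illustration of one possible fix, not anything a particular word processor offers.

```python
def dictionary_key(word):
    """Sort on letters only, ignoring case and punctuation such as hyphens."""
    letters = "".join(ch for ch in word.lower() if ch.isalpha())
    # The original word is a tie-breaker for words with identical letters.
    return (letters, word)

words = ["kaang", "k-jeul"]

# Plain character-code sort: "-" (45) comes before "a" (97).
ascii_order = sorted(words)

# Letters-only sort: "kaang" now comes before "kjeul".
dictionary_order = sorted(words, key=dictionary_key)
```

Here `ascii_order` comes out as "k-jeul", "kaang", while `dictionary_order` is "kaang", "k-jeul".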

2003 Dec 16 [ Tue ]

Why can't we say certain combinations of consonants?

One of the most interesting things about the problems Thais have in learning English is their difficulties in uttering certain consonant combinations. That is, they can make an "s" sound perfectly well at the beginning of a word, but only with difficulty at the end, and a word like "first" is practically impossible.

I have a fuzzy idea that this is a particularly glaring example of a general issue: that any particular sound, vowel, consonant or tone has only one canonical mouth position (or gliding sequence of positions) and when a learner is initially tongue-tied he is actually going through a stage in which he has to experiment with the glides between sounds.

I can dimly see that the Thai tone system places very harsh demands on the shape of the mouth during each syllable, and it may be possible to predict from how tones are produced the map of allowed and unallowed combinations in Thai.

Another example occurred to me, this time for English. As is well known, English people have quite a lot of trouble learning to utter words starting with a "ng" sound, although most get used to it fairly quickly. It seems to be just a matter of unfamiliarity. However, I just thought of a much tougher case. Why can't we say "aitchs" (normally pronounced "aitches", of course)?

The interesting thing is that we *can* say something like "that bitch said". So this seems like a close relative of the problems the Thais have. Maybe I'll figure out some more stuff on this later.

I just picked up Huffman's "Cambodian System of Writing". It is forbiddingly dense (and shares the usual problems of Huffman's other works: invented terminology and a slipshod grasp of English grammar), so I have just skimmed it. Maybe it has the following idea: the position of the mouth for a consonant cluster is determined by the second consonant, not the first. For me, this is most clearly exemplified by the word "tngai" (day) which to me always sounds like "kngai", because the "t" is forced so far back in the throat to accommodate the "ng".

2003 Dec 11 [ Thu ]

Good site for utilities for foreign-language keyboards

Although Thai is fully supported by Windows, Cambodian has almost no support. I am currently still trying to make W2000 work properly; although it can install the necessary fonts, it is impossibly tedious to use them without a keyboard mapper.

The following site looks a little old but has a lot of free software: user.dtcc.edu [http://user.dtcc.edu/~berlin/font/utils.htm]

I haven't checked it out yet: it looks as though these utilities may be for W98 only.

PC Magazine has "TradeKeys 2", a Windows scan code remapper. In the blurb they mention that Windows 2000 has a new feature for handling keyboard maps via the registry.

This suggests that if I wasn't so lazy I could write a utility as an .hta file that would run Javascript to edit the registry as desired. Hmmm.
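For the record, here is a rough sketch of the data such a utility would have to produce. I'm assuming the registry feature in question is the "Scancode Map" binary value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout (the blurb doesn't name it), and I've written it in Python rather than as an .hta, purely to show the layout of the value. It only builds the bytes; it doesn't touch the registry. Note that this sort of map only swaps one physical key for another, so on its own it wouldn't give you a Cambodian layout.

```python
import struct


def build_scancode_map(mappings):
    """Build the bytes for a scan code map.

    mappings is a list of (code_to_send, physical_key_code) pairs.
    Assumed layout: two zero DWORDs (version and flags), a DWORD count of
    entries including the terminator, one DWORD per mapping, and a zero
    DWORD terminator.
    """
    header = struct.pack("<III", 0, 0, len(mappings) + 1)
    body = b"".join(struct.pack("<HH", new, old) for new, old in mappings)
    terminator = struct.pack("<I", 0)
    return header + body + terminator


# Example: make the Caps Lock key (scan code 0x3A) send Left Ctrl (0x1D).
CAPS_LOCK = 0x3A
LEFT_CTRL = 0x1D
blob = build_scancode_map([(LEFT_CTRL, CAPS_LOCK)])
```

The resulting 20 bytes would then have to be written to the registry by hand or by a separate tool, and as far as I know a restart is needed before the change takes effect.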

2003 Dec 08 [ Mon ]

Would Thais and Cambodians read more if the words were separated?

In both languages as written, there are normally no gaps between words, though gaps may appear occasionally for various reasons.

It is very striking that Thais and Cambodians read so little. Apparently both countries have universal schooling which more or less teaches the 3 Rs, but the market for books and even magazines is minimal. By comparison Chinese people, for instance, read voraciously, and every rice joint is full of men smoking as they read the paper.

It's just a guess, but surely even for Thais and Cambodians themselves, it has to be a significant extra chore to parse out the words from the solid lines of text that are normal in those languages. I know of no theory which would suggest that there is some magical way for them to perceive the content of the sentence without separating individual words, so to save the writer (and I suppose the newsprint buyer) a tiny amount, every reader has to do a significant amount of extra work.
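To give a feel for the extra work, here is a toy Python sketch of the simplest way a program might split unspaced text: at each position, take the longest word found in a word list. I'm using English with the spaces removed and a tiny made-up word list, so this says nothing about how human readers (or real Thai and Cambodian software) actually do it.

```python
def segment(text, lexicon):
    """Greedy longest-match segmentation: at each position take the longest
    lexicon word that fits, falling back to a single character if none does."""
    longest = max(len(word) for word in lexicon)
    words = []
    pos = 0
    while pos < len(text):
        for size in range(min(longest, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                pos += size
                break
    return words


# Tiny hypothetical word list, just for the demonstration.
LEXICON = {"the", "cat", "cats", "sat", "at", "on", "mat"}

result = segment("thecatsatonthemat", LEXICON)
# result == ['the', 'cats', 'at', 'on', 'the', 'mat']
```

Even on this trivial sentence the greedy approach gets it wrong ("the cats at on the mat" rather than "the cat sat on the mat"), which is the point: without gaps, the words have to be worked out from context, and that costs effort.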

I'd love to see these countries put gaps between words in official documents, street signs and the like. Failing that, I'd like to see some sort of practical experiment where students have to read a text presented with or without gaps.

It occurs to me that Chinese is normally written without gaps also. On the other hand, due to its use of complex single-character pictograms, each character is very often a word anyway.




Copyright © 2003-2009 Alternate Worlds Publishing, Boston MA USA
