Google fail

2017-May-07, Sunday 13:37
mindstalk: Tohsaka Rin (Rin)
I was trying to pick up my Japanese studies again, and turning to Google Translate as a way to get some daily phrases. "A bowl of rice" was given the plausible characters 米のボウル, but transliterated as "Amerika no booru". Rice (kome) and America do share a kanji, but you wouldn't read it that way!

I shared with my friend in Japan, who laughed, then said you wouldn't read it as "Amerika" even when talking about America.

Also that "kome no booru" was like a bowl made out of rice (plant) or something, and not something you'd say; instead you'd use "ichizen", ichi-zen, zen being an oddly specific counter for bowls of rice or pairs of chopsticks.

So, multiple levels of machine translation fail!
mindstalk: (Default)
Ari told me a trick he uses as a GM for naming NPCs: pick a language to be thematic, type words into Google Translate, use. So one NPC is Hungarian or something for "Betrayer", and is intended to backstab the PCs at some point, and the players have no clue.

Neat trick, but one not relevant to most of my PCs these days, where I expy some anime heroine and don't try to hide my work. (Latest: Kyouko the Dungeon Slayer, an adaptation of Sakura Kyouko.)

But someone was recruiting for an all-evil PbP game, which isn't my usual thing, but I got tempted to try to think of something anyway. It's D&D 3.5, so I figured I'd start with a druid, for maximum mechanical cheese. Evil druid? Sure, he wields the power of nature for evil. Or he wields the power of nature, and is a selfish jerk. If you want philosophy, you can talk about predator/prey, nature red in tooth and claw, social darwinism.

Race? (Meaning species). I like elves as an idea, but have been avoiding them, because a 1st level 100 year old character just hurts my head. But for the character I was forming? The guy who can feel superior because he lives 10x longer than you is a perfect fit.

And even better, there's the gray elf subrace, with +2 Int. Literally smarter than you. (D&D swaps gray and high elves from Tolkien: "high elves" are like the Sindar default, gray elves the longer-lived and smarter and more arrogant 'Noldor'.)

Great, a racist gray elf druid! What to name him? I tried thinking of 'elvish' names on my own, but wasn't getting far. (I hate coming up with names.) Time to try the language trick! What are elvish languages? Tolkien was inspired by Welsh and Finnish, Order of the Stick uses pseudo-Latinate names. I started with Welsh. What's a word? Well, 'racist'. So I type that in... and get 'hiliol'.

Hiliol the elf. I dunno about you, but I figured I was done on the first try. Feels vaguely elvish, doesn't have an obvious gender. (I was figuring I'd go for androgyny, a la Vaarsuvius in Order, though he's since become male -- trying to improve the human stock by fathering lots of half-elves. But anyway.)

That was a stroke of luck, really; I've since tried some other words, and they translate to words that are so flagrantly Welsh as to be intrusive. Like 'twyllwr' for "deceiver" or 'celwyddog' for "liar". (Hiliol's Big Crime was fraud, trying to convince foolish humans that he could give them elven lifespan.)

Finnish could have worked, its 'racist' is "rasistinen", though I just realized it sounds like an import of 'racist', so maybe not.

Pronoun compression

2016-Feb-25, Thursday 22:03
mindstalk: (Nanoha)
So, you'd think languages would tend to shorten the length of commonly used words. And in English, all pronouns are short. 'our' is arguably two syllables. 'theirs' is one syllable though a lot of phonemes. But I, you, though, we, your, ... all short.

Spanish too, mostly: yo, tú, él, ella. (But, nosotros.) Even though inflections mean you often don't need them.

In Japanese, not so much. Of the truly excessive list of pronouns, I think all are 2+ syllables. Of the standard ones, watashi (I) and anata (you) are both three. Some informal ones are two (ore, boku). A standard really formal one is four (watakushi, I). And if you want to indicate possession, that needs another syllable, e.g. 'watashi no' for 'my'.

But, apparently, pronouns aren't used as much in Japanese. Raising a chicken and egg question: are they dropped because they're long, or are they long because they're easily dropped? At any rate, all I read says Japanese is good at dropping parts of its syntax in favor of context (more so than other languages?) so you don't need to say 'I' if you're obviously talking about yourself.

Also, simply using 'anata' is often rude, and it's more proper to use people's names. And some girls trying to be cutesy will instead of using watashi, or the cutesy variant atashi, use their own names, like "Mariko-chan is hungry" instead of "I am hungry".

Which seems like a lot of work! Except... the pronouns *are* long, I realize, so the opportunity cost is a lot less. Personal names are typically 2-3 syllables, family names commonly 3-4 syllables, add a standard honorific and we're talking 3-5 syllables. Given that the alternatives are generally three syllables themselves, using the name might not take any more time. Or it might take 5 syllables to 3 -- Nakajima-san vs. anata -- but I don't know how our brains process that. Is it just as bad as using 3 for 1, two extra syllables, or is it "only 67% longer" vs. "three times longer"?
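To make the two ways of counting concrete, here's a minimal sketch. The syllable counts are the ones from the paragraph above (Nakajima-san as 5, anata as 3, and the "3 for 1" case); the function name `overhead` is just something made up for illustration.

```python
def overhead(name_syllables, pronoun_syllables):
    """Return (extra syllables, percent longer) for a name used in place of a pronoun."""
    extra = name_syllables - pronoun_syllables
    percent_longer = 100 * extra / pronoun_syllables
    return extra, percent_longer

# Nakajima-san (5 syllables) vs. anata (3 syllables)
japanese_case = overhead(5, 3)

# The "3 for 1" case: a three-syllable name vs. a one-syllable pronoun
short_pronoun_case = overhead(3, 1)

print(japanese_case)       # same absolute cost: 2 extra syllables, about 67% longer
print(short_pronoun_case)  # same absolute cost: 2 extra syllables, but 200% longer
```

Both cases cost the same two extra syllables; only the relative measure tells them apart. Which one our brains actually care about is still the open question.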

And for the cutesy usage, well, not only are personal names shorter than family names on average, they can be further truncated, especially if you're being cutesy. One manga Mariko I know of is Mari-chan (or -tan, or -chin) to her friends, and presumably if she were the sort of girl to refer to herself in the third person she'd use those forms too. No longer than atashi, then.
mindstalk: (Enki)
So there are various philosophies of translation, how literal or high level you should be, how much to preserve meaning vs. experience. For example, samurai, daimyo, and ninja could be translated as knight, lord, and assassin. This would make them seem more familiar to European cultures, preserving or 'translating' the experience -- after all, samurai isn't exotic to Japanese people -- at the potential cost of shades of meaning, and the exoticness that might be why someone wants to read the translation in the first place.

One specific disagreement I've seen is over Japanese honorifics: -san, -chan, -kun, etc. Pro translators seem to pride themselves on full naturalization, turning -san into Mister and relatives, -chan into endearments if anything, and such. Anime/manga fans generally prefer preserving them, and that has taken over professional manga translations, which now usually have an honorifics guide in the front. I prefer that myself, as I can easily see uses of honorifics that would be hard to translate without contortion[1], and it's not much work to have learned them.

But I realized, part of it may be due to the difference in intended audience. Pro novel translators probably assume that theirs may be the only novel from that language read by many of the readers, and aim to minimize the work expected of the readers. Anime/manga fans generally read or watch many such works, often trying to learn Japanese for real themselves, so for us, the not very large amount of work in learning is amortized among many works.

[1] One example: it seems easy to translate Gingko-san as Mister Gingko, or Hayate-san as Miss Hayate. But what if someone's gender is unknown or non-binary? You've got a choice problem in English that simply doesn't exist in Japanese, where -san can apply to anyone or anything.
mindstalk: (kirin)
So, I have completed the Duolingo Spanish tree. All of it is solid gold, even the optional extras. So, what's it mean, in the end? Have I learned Spanish? Is it a Spanish course? Is it worth it?

No, no, and maybe.

Just walking over tonight, I tried naming things I passed and realized lots of common things I hadn't seen words for (that I recall): bank, store, ice, ice cream, hamburger... store might be in there, but we seem to be shown 'businessman' and 'entrepreneur' and 'business' and 'institute' (and 'utilize') far more. Also I opened up El País's website, and while I could get some meaning out of the articles, it's definitely not total understanding.

And at that, DL hardly ever explains anything; I'm pretty sure I got through the verb tense exercise easily because I *had* studied Spanish grammar, with memories of the shape if not the content of Latin grammar. If doing things the pure DL way... I dunno.

I'll also note they encourage you to do "immersive exercises", i.e. making money for them by translating articles for them, which I haven't done. Mostly because I didn't feel ready, and also if I'm going to do that I've got a dual language book of top Spanish stories to read instead...

Seems to me that it's far from a complete course in itself; if you rely on it solely, you will certainly learn something, but not even enough to be a good tourist. ("Where is the bathroom?" You could form the sentence, but I'm not sure we've seen 'bathroom'.) But as a gamified complement to proper study, sure. Mostly, if you'd be twiddling with computer or phone games anyway, doing DL exercises is going to be infinitely more productive than playing Angry Birds or Freeciv or Boggle or whatnot. Even low-brain mode via doing exercises you already know well is probably more productive.

So my current plans: keep my streak going and tree golden because it's at least something; maybe dip back into German to learn something there; perhaps more likely to go try to read newspapers with dictionaries open, or my story book, than their own immersion. But we'll see.

ETA: one way to get more out of DL is to read the comments, of course. I've learned some stuff there, hopefully contributed a bit as well. Sort of outsourced teaching, though you also have to be smart and mentally filter what you read -- not all the comments are sound. Still better off with a good text...

ETA2: but as I told my friends in January, a flawed system you use is better than a better one you don't, so there's that; DL kept me 'studying' Spanish when pure self-study wasn't.

Dialect heat map

2013-Dec-23, Monday 23:33
mindstalk: (Enki)

Server load or the various plugins in my browser mean I don't see a "Share" link for my maps. My first result was near Richmond VA and a couple of other nearby cities; my second one, with some different questions and a few different answers to old ones, put me in upstate NY, Maine, and Wisconsin, with "nearest cities" of Rochester, Providence, and Springfield MA. My heat maps for individual questions are all over the place, especially the first time I took it (second one overlapped with Chicago more often, but still didn't end up there.)

I grew up a solitary and bookish child to parents from Boston and LA/Berkeley, in gifted/magnet schools, then lived in intellectual California for 10 years, Indiana grad school for 8, and now in Cambridge. I've deliberately adopted "you all/y'all" as useful and thanks to knowing the originator of I frankly have no idea what word I grew up with, though I think I got "soft drink" from my parents. I suspect I just confuse the system.
mindstalk: (kirin)
At a party tonight, people were playing a homegrown version of Pictionary, basically Difficult All Play with made up words. A neutral player picks a word and shows it to the drawer of each team, and they race to make the guesser say the word; no limit on the abstraction of the word. We saw expertise, irrelevant, vulgar, and tact (which was going on when I left.) The winners of the earlier words used "sounds like" techniques, e.g. Vulcan + car = vulgar. This was banned for the 4th round on the grounds of being too powerful. Progress by non-sounds like teams was, uh, amusing.

It occurred to me that "sounds like" is recapitulating the evolution of writing. First, pictures of concrete objects or verbs, then ideograms for the more suitable abstract concepts like 'up'... and then instead of arbitrary graphical symbols for the hard stuff, phonemic techniques to elicit the sounds of the arbitrary spoken word people already know.

This suggests a compromise, based on the vast majority of Chinese characters: people can use a partial 'sounds like' technique, indicating part of the sound but combining it with other symbols that suggest the meaning domain. E.g. 'vulcan' + pictures suggesting politeness or rudeness or the populace.
mindstalk: (12KMap)
Which languages to learn to maximize the number of speakers is a traditional exercise, going something like English, Mandarin, Spanish, Russian, French, Arabic, Hindi, Swahili, Bengali, Portuguese, Japanese, German... But what if you wanted to learn one language per family, while maximizing speakers?

English, Mandarin, Swahili, Arabic, Indonesian, Tamil, Japanese, maybe Turkish, Vietnamese, Thai; possibly Korean; Hungarian. This can be seen as how to learn up to 12 useful languages while minimizing the possibility of any re-use to make your life easier. :)

Of course both lists can look different if weighted by one's personal probability of running into the language or speakers thereof.
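The one-per-family exercise can be sketched as a single pass over a ranked list: walk down it and keep the first language you meet from each family. The ranking below is just the order of the first list in this post, not real speaker counts, and the family labels are my own tagging (in particular, calling Japanese "Japonic" and treating it as its own family is an assumption). The name `one_per_family` is made up for the sketch.

```python
# Ranked list taken from the order given in the post; family labels are assumptions.
ranked = [
    ("English", "Indo-European"),
    ("Mandarin", "Sino-Tibetan"),
    ("Spanish", "Indo-European"),
    ("Russian", "Indo-European"),
    ("French", "Indo-European"),
    ("Arabic", "Afro-Asiatic"),
    ("Hindi", "Indo-European"),
    ("Swahili", "Niger-Congo"),
    ("Bengali", "Indo-European"),
    ("Portuguese", "Indo-European"),
    ("Japanese", "Japonic"),  # assumption: counted as its own family
    ("German", "Indo-European"),
]

def one_per_family(ranked_languages):
    """Keep the highest-ranked language from each family, preserving rank order."""
    seen_families = set()
    picks = []
    for language, family in ranked_languages:
        if family not in seen_families:
            seen_families.add(family)
            picks.append(language)
    return picks

picks = one_per_family(ranked)
print(picks)  # ['English', 'Mandarin', 'Arabic', 'Swahili', 'Japanese']
```

Seven of the twelve entries drop out as Indo-European repeats, which is why the second list has to reach further down the rankings for Indonesian, Tamil, and the rest. Swapping in a personal weighting only means reordering `ranked` before the pass.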
mindstalk: (Enki)
On further thought, what really strikes me about that list is how many major Asian languages are in different families, on the level not of Latin and German but of Latin and Chinese.

Mongolian: probable Altaic
Chinese: Sino-Tibetan
Korean and Japanese: likely isolates, possibly relatives of each other, or in Altaic
Ainu: isolate
Indonesian, Malay, Tagalog/Filipino, and Formosan languages: Austronesian
Vietnamese and Cambodian: Austro-Asiatic
Thai and Lao: Tai-Kadai
Burmese: Sino-Tibetan
Hindi and Bengali: Indo-European
Tamil: Dravidian
and if you can find Asian Muslims who actually know Arabic: Afro-Asiatic

8 language families, not counting Ainu and Arabic, and with a maximal Altaic group; 10 with a smaller one. And of course this isn't counting all the minor families and isolates. Even when there's an ostensible or even real genetic relationship, moving from one country to the next is likely to seem completely different; Thai and Lao are close, as are Indonesian and Malay, but those aren't close to Formosan or Tagalog; Vietnamese and Khmer aren't close; no one can agree if Korean and Japanese are related to anything.

Contrast with Europe, where it's Indo-European almost everywhere you go, with older branches like Celtic seeming indigenous to later ones like Latin/Romance and Germanic, having completely overwhelmed whatever came before Celtic, with only a few survivors like Uralic (Finnish, Hungarian) and the Basque isolate. Two families, plus one isolate. Three families if you push out to Georgia and Caucasian, though at that point you might as well add Turkish:Altaic as well.

Of course, once again, we're talking about a much smaller population; Europe is basically half the population of north India. Then again, population size and language diversity don't have much to do with each other. Geography's probably more relevant, but obviously hasn't done that much in Europe. For whatever reason, Indo-Europeans were really good at invading Europe, in multiple waves, even.
mindstalk: (kirin)
I knew there were many language families, and many of their names indicate where they are, but I thought it'd be useful to associate them with famous languages (and with a large number of speakers) to stand as representatives, as well as targets for learning if you wanted to go look at exotic languages. So I went to and clicked a lot.

Takeaway: you can't do this for all language families, because there are dozens of them -- heck, dozens in each of the Americas, New Guinea, and Australia alone. Not counting isolates, of which there are many. But, going by the big groups (at least 1% of world population, which is nearly 70 million people), we have, in order of native speakers:

Family (example languages, and notes) [% of world native speakers]

* Indo-European (duh) [46%]
* Sino-Tibetan (Chinese; Burmese, Tibetan) [21%]
* Niger-Congo (Yoruba, Zulu, Swahili) [6.4%]
* Afro-Asiatic/Hamito-Semitic (Arabic, Berber, Amharic, Hausa, Egyptian, Hebrew, Akkadian)
* Austronesian (Indonesian, Hawaiian; 9 of 10 branches only on homeland of Taiwan/Formosa; very diverse)
* Dravidian (Tamil; south India) [3.7%]
* Altaic (Turkic, Mongolian, maybe Korean and Japanese; disputed)
* Austro-Asiatic (Vietnamese, Khmer (Cambodia), Munda (indigenes of India))
* Tai-Kadai/Kradai (Thai, Lao; highly tonal) [1.3%]

some others of note:

Uralic (Finnish, Hungarian, Sami, Estonian)
South Caucasian/Kartvelian (Georgian)
Hmong-Mien (Hmong, which has 12 tones)
Iroquoian (Cherokee)
Uto-Aztecan (Nahuatl)
Quechua (Andes, Inca)
Eskimo-Aleut (Yupik, Inuit, Aleut)
Algic (Algonquian; Blackfoot, Cree, Massachusett, Mohican; has a couple in California)
Tupian (Brazil; Tupi, Guarani)
Khoisan (click; Khoi, San; no longer accepted as a single family)
Sumerian (isolate)

But there are many many others. E.g. at least one non-Eskimo family that's in both Siberia and NW America, not to mention other Siberian and American families separately. Seven families indigenous to Mesoamerica, with Maya and Aztec representing only two of them. New Guinea's many, Australia's many...

There's also a concept of
unrelated languages in an area resembling each other through mutual exchange

blend of Romance, Slavic, etc. in Balkans (Albanian, Romanian, south Slavic, Greek, Romani)
tonal and vowel sharing in SE Asia, Sino/Thai/Khmer
possibly the whole Altaic 'family'
clicks from Khoisan into Bantu/Nguni has a map and other discussion

Language poll

2011-Feb-08, Tuesday 12:17
mindstalk: (Default)
What languages other than English do you know, or want to learn, and why? I don't expect answers as long as what I type below.

For me, I used to know enough Latin to barely pass the Catullus/Horace Advanced Placement Exam, plus a bit of Greek, but have forgotten nearly all. This was due to my parents, especially my mother, trying to replicate their own classics training and interests. She never did really teach me how to read and pronounce French words, which might have been more useful.

I've been desultorily self-studying Spanish and Japanese for embarrassingly long. Spanish, starting back in California, as I figured it was an obvious second language in the US, especially with Spanish radio stations. Now, the facts that I have friends in Chile, and Latin America seems to be up-and-coming in social democracy and economic growth, help too. Japanese, because of all that anime, though less because I specifically wanted to understand the anime, and more because I thought "you have to listen to a language a lot to really learn it, and my enthusiasm for foreign (or any) movies is generally low, but I'm voluntarily watching 6 hours of subbed anime a week..." Also, trying a non-Indo-European language was appealing. F*ck grammatical gender.

If I wanted more, Chinese, French, and German would probably be the next tier, in no particular order. Population and economic strength, utility for possible Canadian or French immigration, Germany's economic strength, plus the strong cultural weights of all three. Absent any actual migratory need, Chinese and French would probably make the most sense, leveraging my kanji and Latin/Spanish knowledge.

Sign language has a bit of interest, for being weird in making use of space.

Not a language, but I've taught myself Morse code, largely so I might have a post-stroke communication channel.
mindstalk: (CrashMouse)
A couple of days ago I went to a sporting goods store, EMS, to get boots and such. Yesterday was in the 30s, so the long underwear was useless. But the boots were a timely purchase, as deep puddles of slush were everywhere, and I could more confidently go through them. They're not even ideal boots, ankle high vs. thigh high, hard to tuck into, one European size too large though that might be good for slip-on and wearing extra socks. But better than my shoes.

I also found some cheap (for EMS; $4.50) bandannas I liked the patterns of, pseudo-tie dye, and decided to branch into bandannaland, consulting the web for how to put them on. is an early attempt at a skullcap.

The boots took me to Red Bones, a BBQ place I'd been told about. Tasty. Lots of meat.

I've decided to look for a place to live for real. Craigslist is intimidating... we'll see if the rental agency can tempt me to shell out before I find a place on CL. I now recall that part of the appeal of an agency was "make this happen as quickly as possible", including being driven out to a bunch of places at once, vs. making several separate appointments.

I've also dived into Spanish study more. Following my book was making me restless, so I've been doing straight vocabulary learning instead, also trying to do Japanese in parallel. Not because they're related at all, duh, but to keep my competencies in sync. Though with Japanese you really have triple the work: Japanese word, kanji, Chinese reading (onyomi) of the kanji in compounds (which my pocket dictionary doesn't even volunteer), so Spanish is pulling ahead. Anyway, noting gender patterns is interesting:

fork, knife: stabby things, masculine. spoon: curvy, feminine. Floor, ceiling: masculine. door, window: openings, feminine. Though interior walls are feminine, exterior masculine. So far most body parts are masculine, but face and leg are feminine. Foot is masculine.

I'm being motivated in part by some steampunk attempt to turn February into a language learning challenge month. I sent the url to a friend, then realized 10 seconds later that she's studying Japanese and is probably way past this point, as are most of my friends who are at least somewhat bilingual. It's monoglots like me who need it.

Flickr photos have been organized into more sets. Still not complete. Of course, now there's rumors Yahoo might close Flickr. Good thing I kept all the original photos.
mindstalk: (Default)
* You have to be born in America to be President -- US Senator Richard Shelby, R-Alabama. He notes the US being the world's biggest debtor nation; I'd note we became that under Ronald Reagan.
* A list of the 'progressiveness' of the US Senate. There's a big gap between the Democrats and even the most progressive Republicans... and between those 3 and the rest of the GOP.
* Blogs, newspapers, and journalism
* Sean Hannity and "socialism"
* Obama to use more honest budgeting
* 40% support legalizing pot. This is more than approve of Republicans.
* Krugman wonders if the right-wing job network is in trouble.

* Look at those mean/median spreads!

* Singular they, and in the Bible. How 'they' became 'him'.
