15 July 2004 @ 08:38 pm
New "International" keyboard for Windows users  
Since I wanted access to the entire Latin-1 repertoire of characters, I installed the "United States (International)" keymap. Unfortunately, it's sort of a pain to use, since some regular keys are defined as "dead keys" (you hit the key and then another to produce a character). The most irritating is probably the quote key, which is a dead key to add acute accents when unshifted and to add umlaut/diæresis when shifted—this resulted in a lot of accidental accents whenever I forgot and tried to type a quote mark normally (under US-International, you have to type the quote key and then space to get a quote). So, I created my own keymap which I believe improves on US-International. I've found it quite handy. I'm releasing it to the public in hopes that other people will find it useful too.

This assumes a Windows XP system. Steps 1-3 are probably the same on all Windows systems, but I'm not sure how the Control Panel is arranged on pre-XP versions.
  1. Download from this link
  2. Open the zip file (with WinZip, or whatever you've got)
  3. Double-click on USInt2.exe and hit "OK" (this installs the DLL so Windows can see it)
  4. Open the Control Panel, and go to Date, Time, Language, & Regional Options. There, open the Regional and Language Options control panel.
  5. Click on the Languages tab and then the Details button under "Text services and input languages"
  6. In "Installed services", under English, click on Keyboard and then click "Add"
  7. Make sure "Input language" says "English (United States)". Check the box for "Keyboard layout/IME", select "US Latin-1" from the pull-down menu, and click OK.

If you use it like the standard "US" (ASCII) keymap, it will behave exactly as you expect. The additional characters are all accessed with AltGr (the right Alt key, or Control+Alt).

The characters you get with AltGr:
[image: keyboard layout with AltGr pressed]

The characters you get with AltGr + Shift:
[image: keyboard layout with AltGr and Shift pressed]

Most of these are pretty easy to remember. ð and Ð are AltGr + d and D, þ and Þ are AltGr + t and T (because it means "th"), × (multiplication sign) is AltGr + * because the asterisk is frequently used to mean multiplication, µ (micro sign) is AltGr + m, etc.

Two things aren't really clear on these pictures. AltGr + hyphen produces the "soft hyphen": an invisible character that marks a point where a word may be broken and a hyphen added when laying out text. AltGr + space is the "non-breaking space", a space character where the line is not allowed to split.
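
For anyone who wants to check what those two invisible characters actually are, here is a small Python sketch. It is not part of the keymap; the code points U+00AD and U+00A0 are the standard Latin-1 positions for these characters, and the variable names are just for illustration.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"      # what AltGr + hyphen is described as producing
NO_BREAK_SPACE = "\u00a0"   # what AltGr + space is described as producing

for ch in (SOFT_HYPHEN, NO_BREAK_SPACE):
    # Both characters fit in a single Latin-1 byte, so they belong in this keymap.
    latin1_byte = ch.encode("latin-1")
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> Latin-1 byte {latin1_byte!r}")
```

Pasting typed text into something like this is an easy way to confirm that a "space" really is the non-breaking kind.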

The greyed-out characters are "dead keys". To get an accented character, you press one of these and then the character you want to accent. So to get Á, you press AltGr + ' (apostrophe), then A. Here's how they work:

AltGr + ` (backtick): grave accent
AltGr + ' (apostrophe): acute accent
AltGr + 6: circumflex (looks like the caret you get from shift+6)
AltGr + ; (semicolon): umlaut or diæresis (two dots, like a colon)
AltGr + n: tilde (commonly used with n as ñ, shows nasalization of vowels in Portuguese)
AltGr + , (comma): cedilla (the comma looks like a cedilla), plus additional diacritics and fractions
AltGr + 0 (zero): miscellaneous

To get a spacing (non-diacritic) equivalent to any of them, type the dead key and then a space. For AltGr + comma, that produces a cedilla. For AltGr + 0, that produces a degree sign.

Grave, acute, circumflex, umlaut, and tilde behave exactly as you would expect (although they're limited to the characters in the Latin-1 character set: that means no capital ÿ, and tilde only applies to n, a, and o). AltGr + comma and AltGr + 0 are a little funkier. AltGr + comma adds a cedilla to c and C (ç Ç), a ring above to a and A (å Å), a slash to o and O (ø Ø), and turns 2, 3, and 4 into ½, ¾, and ¼ respectively. AltGr + 0 turns o and a into the masculine and feminine ordinals (º ª), turns c and r into © and ®, and produces a degree sign if used with a space.
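
The dead-key behaviour described above can be modelled in a few lines of Python. This is only a sketch of the logic, not how the keymap DLL is actually built: the function name `compose` and the table names are made up for illustration, and the spacing forms for the five ordinary accents are left out because the exact characters aren't spelled out here.

```python
import unicodedata

# Combining marks for the five ordinary dead keys (key pressed with AltGr).
ACCENT_KEYS = {
    "`": "\u0300",  # grave
    "'": "\u0301",  # acute
    "6": "\u0302",  # circumflex
    ";": "\u0308",  # umlaut / diaeresis
    "n": "\u0303",  # tilde
}

# The two "funkier" dead keys, copied from the description above.
COMMA_KEY = {"c": "ç", "C": "Ç", "a": "å", "A": "Å", "o": "ø", "O": "Ø",
             "2": "½", "3": "¾", "4": "¼", " ": "¸"}
ZERO_KEY = {"o": "º", "a": "ª", "c": "©", "r": "®", " ": "°"}


def compose(dead_key, base):
    """Return the character for AltGr+dead_key followed by base, or None."""
    if dead_key == ",":
        return COMMA_KEY.get(base)
    if dead_key == "0":
        return ZERO_KEY.get(base)
    mark = ACCENT_KEYS[dead_key]
    # Let Unicode normalization find the precomposed letter, if there is one.
    composed = unicodedata.normalize("NFC", base + mark)
    if len(composed) != 1:
        return None
    try:
        composed.encode("latin-1")  # keep only what Latin-1 can represent
    except UnicodeEncodeError:
        return None
    return composed


print(compose("'", "A"), compose("n", "n"), compose(",", "2"), compose("0", " "))
```

The Latin-1 check is what reproduces the limits mentioned above: `compose(";", "Y")` and `compose("n", "e")` both come back as `None`, because the accented letters exist in Unicode but not in Latin-1.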

One problem with this keymap, which I can't figure out how to fix: it interferes with control+shift. I don't know why, because I didn't mess with control+key codes at all, but there you go. It's the only reason I haven't completely dumped the US-ASCII keymap.

EDIT: The file is now kindly hosted by John Cowan.
Current Mood: accomplished
Current Music: the A's game on the radio
ludwigvantx on July 16th, 2004 10:08 pm (UTC)
Cool! Of course you can also enter acute-accent vowels with AltGr + the vowel (doesn't work with y though). So AltGr + a = á, etc. I need to add more letters (like those extra Turkish letters, as well as Š/š and Ž/ž) to US-Int'l layout myself.

I also need to post the layout for my own version of Arabic-QWERTY, which also includes letters found in Farsi that some Arabic speakers use for foreign loans containing 'p', 'g' and '(t)ch'.
gwalla on July 16th, 2004 11:31 pm (UTC)
Yeah, this layout specifically only contains Latin-1 (ISO 8859-1) characters, because that's what most IRC clients accept (well, some of them can do any charset, like xchat, but mIRC can only do Latin-1 and it's in the majority). I've also been working on a "US Unicode" keymap that allows access to most of the unaccented characters in Unicode's Latin repertoire, plus combining diacritics, and also an IPA keymap. Those are in much more of a state of flux though.
ludwigvantx on July 17th, 2004 12:46 am (UTC)
I've always wanted a 'Unicode keyboard'! Of course that'll mean A LOT of deadkeys. I still have to figure out all the intricacies of Microsoft Keyboard Layout Creator.

My wishlist:
Latin-Complete (Latin-1, Latin Extended-A, some Latin Extended B, some Latin Extended Additional)
QWERTY-type phonetic layouts of:
Cyrillic (at least Russian-Ukrainian-Belarusian)
Arabic (I already have that)
Hangul (Korean, that will take FOREVER since a huge syllabic codepage is involved)

I use a Java-based, Unicode-capable IRC client called PJIRC. I used to use mIRC, but got tired of Khaled Mardam-Bey's nagscreen. He kept making me want to slap him around a bit with a large trout.
gwalla on July 17th, 2004 04:38 am (UTC)
Well, my "US Unicode" keyboard doesn't cover the entire Unicode repertoire by any means! It's more like "stuff from the Latin range, some useful diacritics, punctuation, and maybe some other stuff thrown in".

As for IRC, I actually use xchat, which is Unicode-capable. The problem is that mIRC isn't, either on the sending or the receiving end, and it's also the most common, so in most channels everybody pretty much has to use Latin-1 because most people will only see gibberish if Unicode is used. Mardam-Bey seems to be dead set against Unicode support for some reason, which sucks.
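
To illustrate the "gibberish" problem, here is a minimal Python sketch. It assumes the receiving client simply treats every incoming byte as one Latin-1 character; that is a simplification for illustration, not a claim about exactly what mIRC does internally, and the function name is made up.

```python
def as_seen_by_latin1_client(text):
    """Show how UTF-8 text looks if each byte is read as a Latin-1 character."""
    utf8_bytes = text.encode("utf-8")   # what a Unicode-capable client sends
    return utf8_bytes.decode("latin-1")  # how a Latin-1-only reader displays it


for sample in ("abc", "ñ", "þ"):
    print(repr(sample), "->", repr(as_seen_by_latin1_client(sample)))
```

Plain ASCII survives unchanged, but every accented character turns into two unrelated characters, which is why a channel full of Latin-1-only clients effectively forces everyone to send Latin-1.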
(Anonymous) on July 19th, 2004 03:46 am (UTC)
Suggested bug fix
I (John Cowan) told the Unicode list about this keyboard, pointing out that Ctrl+] doesn't work in Telnet (it's the Telnet escape character) and got the following comment from Michael Kaplan of Trigeminal Software:

Ah, that is the first time I ever saw a use for control chars in the CTRL shift state!

But..... if you load the keyboard in MSKLC, add U+001D to the VK_OEM_6 key (which is where that bracket is), rebuild it, and install it, then the Telnet ESC sequence will work properly.

Telnet seems to be depending on some of those control characters being on the keyboard, and in particular positions (which means lots of the built-in keyboards would fail since not all of them have these definitions). I'll put this in as a bug to fix in the next version of MSKLC....
gwalla on July 20th, 2004 03:50 am (UTC)
Re: Suggested bug fix
Ah, thanks! I gotta try that out then.
(Anonymous) on July 20th, 2004 01:48 pm (UTC)
Re: Suggested bug fix
Yes, it appears that many console apps like Telnet (which do not have quite the same support as WM_KEYDOWN messages provide) are relying on the assignment of certain control characters in several of the CTRL+VK_OEM_* keystrokes. The only one I know of at the moment is Telnet (but with Telnet saying right when you open it that Ctrl+] is escape, I can hardly blame anyone for trusting that it would work). :-(

But just ignore the warning about control characters in the CTRL shift state in this situation -- since in this case you know better than MSKLC does....

MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies
Windows International Division

This posting is provided "AS IS" with
no warranties, and confers no rights.
gwalla on July 20th, 2004 04:41 pm (UTC)
Re: Suggested bug fix
Well, it works for Control-] in telnet now. However, Control-Shift-A still doesn't work. The ASCII control codes all seem to correspond to control + unshifted keys (except for Ctrl-^ and Ctrl-_). Any idea what control codes the shifted keys correspond to—the additional codes in the Latin-1 block?
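
For reference, the usual ASCII convention behind values like U+001D can be sketched in Python. This only shows the traditional Ctrl-to-control-code arithmetic (clear the top bits of the key's ASCII code); it says nothing about how Windows or MSKLC store the assignment, and the function name `ctrl` is made up for illustration.

```python
def ctrl(ch):
    """Return the ASCII control code conventionally produced by Ctrl+ch."""
    code = ord(ch.upper())
    # Only the 32 characters from '@' (0x40) to '_' (0x5F) have a control code.
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no ASCII control code for {ch!r}")
    return code & 0x1F


for key in "@a]^_":
    print(f"Ctrl+{key} -> U+{ctrl(key):04X}")
```

On a US layout, `@`, `^` and `_` are the only characters in that range that need Shift, which matches the exceptions noted above. There are only 32 such codes, so Control+Shift+letter has no separate ASCII value of its own under this scheme.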
(Anonymous) on November 4th, 2005 11:36 pm (UTC)
Re: Suggested bug fix
As far as I know, control+shift+letter has never meant anything to anybody. Using the upper-half control characters isn't a totally absurd idea, but it's not very useful either. --John Cowan
(Anonymous) on July 30th, 2004 05:21 pm (UTC)
A response from Alain LaBonté
Reposted here from the Unicode list at his request:

My two cents:

It would have been nice if this keyboard had been based (for its second layout) on the ISO/IEC 9995-3 International Standard. The latter is based on the following philosophy:

- Group 1 is the national (or preferred) layout [in the USA that would be the standard US keyboard; in this case AltGr could be added to show exactly what gwalla documented in his first figure (it is obviously what he prefers)]. Group 1 normally corresponds to unshifted, shifted and AltGr layouts (3 levels, called level 1, 2, and 3).

- Group 2 is a supplementary group whose purpose is to supplement national usage for the Latin script, based on the ISO/IEC 6937 repertoire (roughly 330 Latin characters), for European languages using the Latin script.

Subsets can be implemented [I would, in a friendly way, recommend that gwalla slightly modify his figure 2 layout to fit with this international standard]. Group 2 needs a group select mechanism, which is so far left to implementation (it could be AltGr and AltGr+Shift to access the two levels described in this group in ISO/IEC 9995-3 -- however in this case that would not be sufficient for some keys of the Canadian Standard keyboard -- in at least one case we have 5 characters on the same key, see below how we do that).

Canada included ISO/IEC 9995-3's group 2 in its Canadian Standard CAN/CSA Z243.200 (implemented as "Canadian multilingual keyboard" in several versions of Windows -- and Win XP fully implements all the characters of the ISO/IEC 6937 repertoire, with Unicode encoding [keyboard layout standards are based on abstract characters, not on coding]; all Macs sold in Canada with French language support provide this layout as their standard layout). Group 1 is of course our national standard layer. Most Canadian implementations on PCs instead dedicate the scan code used on US keyboards for RightCtrl to a Group Select key that gives access to Group 2 (which can itself be shifted to get access to Group 2 Level 2 characters [so up to 3 levels in group 1, if you have followed, and up to 2 levels in group 2]).

Here is an example of a commercial keyboard implementing the Canadian Standard keyboard with Group 2 limited to Latin 1 access (level conformance B -- full set is level conformance C [330 characters]): http://pages.videotron.com/alb/Z243200.jpg . See also another older commercial implementation (with blue color to distinguish group 2 [levels 1 and 2] and red to distinguish group 1 level 3): http://pages.videotron.com/alb/Z243200c.jpg

There is a joint Canada/Sweden project to present a new work item proposal at ISO to standardize (or offer guidelines for) a group selection mechanism (this was tried in 1991, but failed). With UNICODE/UCS now of age, it would in our opinion be highly desirable to go beyond the current international standardization of Latin script support, which is limited to some languages. If others are interested, please let me know; I convene ISO/IEC JTC1/SC35/WG1, which is responsible for keyboard international standardization. Our next meeting will be in November, most likely in Stockholm (fallback: Paris). In the meantime someone can also implement ISO/IEC 14755 (a poor man's input method to enter UCS characters with the help of any keyboard), a standard made in the mid 1990s (it is not a keyboard standard but could be useful for limited usage of "special" characters).

Alain LaBonté
(Anonymous) on July 30th, 2004 05:59 pm (UTC)
Re: A response from Alain LaBonté
Michael Everson responded by pointing to several keyboard images, of which I think the one most interesting to you is the standard Apple US Latin Unicode keyboard (http://www.evertype.com/celtscript/keyboards/us-expert-keys-x.gif).

The seven keyboards shown are the unshifted, Shift, Alt, Alt+Shift, CapsLock, Alt+CapsLock, and Alt+Shift+CapsLock varieties.
gwalla on August 7th, 2004 05:02 am (UTC)
Re: A response from Alain LaBonté
Hm. Well, I wasn't aware of ISO/IEC 9995-3 when I made this keymap, and, well, I don't know what it actually says. :? I can't find a free reference anywhere.

The US Latin-1 keyboard was mainly just so I could access the Latin-1 repertoire for use on IRC without having to relearn key positions, so it's based on simple mnemonics. The Canadian International and US International keymaps put a lot of things in strange, counterintuitive positions (AltGr-z for a-diaeresis?). And, while I'm still a two-fingered hunt-and-peck typist (working on fixing that) I don't have the benefit of one of those snazzy keyboards with the special characters on the keys.

I can say right off that my "US Unicode" keyboard (should be called "US Random Junk", really) is unlikely to adhere to any sort of standard. I doubt many people really need direct access to hwair and eng (I certainly don't, I just like it).