The Houston Berean Page!!
I would like to thank the sources below for providing the valuable information on using Japanese with a PC. I would like to put in my own words information on this topic for others to read. I am not an expert in this area, and some may find mistakes with what I have written. This is not meant to be a technically correct presentation of this subject, but is the way I understand what is needed for using Japanese on a PC. I hope it proves helpful to someone else.
What does it take to display Japanese with your English PC? And what does it take to write in Japanese or send E-mail in Japanese?
* Japanese has several non-electronic character set standards. A character set provides a common set of characters and is used to ensure that a minimum number of characters are learned in order to communicate with others in society.
- The English alphabet is an example of a character set standard
* ASCII is a western electronic character set. It is made up of 94 printable characters (a,A,1,2,@,#...) and 34 non-printable characters (space, tab, escape,...) for a total of 128 characters.
* JIS X 0208-1990 is a common Japanese electronic character set and includes 6,879 characters, among which are the hiragana and katakana syllabaries, 6,355 kanji, the Roman, Greek and Cyrillic alphabets, the numerals, and a number of typographic symbols. As you can see, Japanese has many more characters to encode than English. This is important to remember for later.
* Since a computer only uses binary (0 or 1 state), we have to encode the character set information into combinations of 0's and 1's for the computer to recognize.
Think of a bit as a light bulb (1 bit = 1 light bulb). The light bulb (bit) can either be "ON" (represented by a "1") or "OFF" (represented by a "0").
* If you have only one light bulb (bit) to use to encode information with, how many characters could you represent? To say it another way, how many combinations of "OFF" and "ON" could you have with one light bulb?
ANSWER: TWO ("ON" and "OFF"), or "1" and "0"
So the "1" could represent the letter "A" and the "0" the letter "B" for example
* If you have only two light bulbs (bits) to use to encode information with, how many characters could you represent? To say it another way, how many combinations of "OFF" and "ON" could you have with two light bulbs?
ANSWER: FOUR
00 = "A"
01 = "B"
10 = "C"
11 = "D"
* If you have only three light bulbs (bits) to use to encode information with, how many characters could you represent? To say it another way, how many combinations of "OFF" and "ON" could you have with three light bulbs?
ANSWER: EIGHT
000 = "A"
100 = "B"
010 = "C"
001 = "D"
110 = "E"
101 = "F"
011 = "G"
111 = "H"
* The historic ride of Paul Revere was based on binary!! The British would be coming by one of two means:
1 if by land
2 if by sea
(The number of lit lanterns was a code used to send information. This is the same thing your computer is doing.)
* In the binary system, by using the formula 2 to the x power, you can calculate how many characters you can represent with "x" bits.
EX. With 2 bits = 4 characters (2 to the 2nd power)
EX. With 3 bits = 8 characters (2 to the 3rd power)
EX. With 4 bits = 16 characters (2 to the 4th power)
EX. With 5 bits = 32 characters (2 to the 5th power)
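To check the light bulb counts above for yourself, here is a minimal Python sketch. The function name `bit_patterns` is my own invention for illustration; it simply lists every ON/OFF combination for a given number of bits so you can count them.

```python
from itertools import product

def bit_patterns(n_bits):
    """Return every ON/OFF pattern for n_bits light bulbs as strings of 0s and 1s."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

for n in range(1, 6):
    patterns = bit_patterns(n)
    # The number of patterns always equals 2 to the n-th power.
    print(n, "bits:", len(patterns), "patterns, starting with", patterns[:4])
```

The order in which the patterns are listed does not matter. What matters is that everyone agrees on which pattern stands for which character.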
* U.S. PC systems are designed to handle the ASCII character set mentioned previously, which is made up of 128 characters. This is sufficient for English language computing. To encode all of the 128 characters in binary, we need 7 bits. This allows a total of 128 characters to be encoded (2 to the 7th power equals 128).
* ASCII encoding examples: (It's all about combinations of "1's" and "0's")
Letter "A" = 1000001 Letter "B" = 1000010 Letter "C" = 1000011
Letter "a" = 1100001 Letter "b" = 1100010 Letter "c" = 1100011
Number "1" = 0110001 Number "2" = 0110010 Number "3" = 0110011
Character "@" = 1000000 Character "%" = 0100101 Character "!" = 0100001
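The patterns in the examples above can be reproduced with a few lines of Python. This is only a sketch: `ascii_bits` is a helper name I made up, and it relies on Python's built-in `ord()` to look up the ASCII number of a character.

```python
def ascii_bits(ch):
    """Return the 7-bit ASCII pattern for a single character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character: %r" % ch)
    # "07b" means: binary digits, padded with leading zeros to 7 places.
    return format(code, "07b")

for ch in "ABCabc123@%!":
    print(repr(ch), "=", ascii_bits(ch))
```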
* A PC stores each character in a group of bits called a BYTE. A byte actually holds 8 bits; ASCII uses 7 of them and the extra bit is left at 0. For a simplified explanation, when you press the letter "A" on your keyboard, the processor in the keyboard senses that the "A" key has been pressed and sends the "1000001" byte to the main processor. This is then acted upon in the PC and displayed on your screen as an "A".
* This is fine for U.S. English language PCs, but you can only represent 128 characters by using 7 bits. HOW DO WE ENCODE THE 6,879 CHARACTERS IN THE JAPANESE "JIS X 0208-1990" CHARACTER SET???? This is the problem with trying to read Japanese encoded information on a computer designed for English. You need special software to help your English PC decode the Japanese information.
* To represent the 6,879 characters in the Japanese character set, you would need a system that uses a minimum of 13 bits (2 to the 13th power equals 8,192).
* To solve this problem with U.S. PCs, encoding systems have been developed that use two bytes (14 bits, counting the 7 ASCII bits of each byte) to represent each character in the Japanese character set. S/W is needed on a U.S. PC to recognize this encoding.
* Since ASCII uses 7 bits of a byte at a time to represent information, if you can make the computer read two bytes (14 bits) of information together, then you could encode up to 16,384 pieces of information (2 to the 14th power equals 16,384). This encoding would allow a U.S. PC to handle the 6,879 characters in the Japanese character set.
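The bit counts above can be worked out rather than guessed. Here is a small sketch, assuming a helper I have named `bits_needed`, that finds the smallest number of bits able to hold a given number of characters.

```python
def bits_needed(n_characters):
    """Smallest number of bits x such that 2**x is at least n_characters."""
    # (n - 1).bit_length() counts the binary digits needed for the largest index.
    return max(1, (n_characters - 1).bit_length())

print("ASCII, 128 characters:", bits_needed(128), "bits")
print("JIS X 0208, 6,879 characters:", bits_needed(6879), "bits")
print("Two groups of 7 bits can hold", 2 ** 14, "characters")
```

Thirteen bits would be enough in theory, but using two groups of 7 bits (14 bits) lets each half look like an ordinary ASCII character.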
* Three commonly used Japanese encoding methods are JIS, SJIS and EUC. WWW pages are normally encoded in SJIS, but not always. E-mail and Usenet Newsgroups are normally encoded in JIS. If you have a Japanese wordprocessor for use on an English language PC, you should see these encodings as options for pasting into the clipboard or for saving a file.
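To see that the three methods really produce different bytes for the same text, here is a rough Python sketch. The codec names `iso2022_jp`, `shift_jis` and `euc_jp` come from Python's standard library; matching them to the labels JIS, SJIS and EUC is my own assumption, and `encode_all` is just a name I picked.

```python
# Codec names are Python's; pairing them with these labels is an assumption.
CODECS = {"JIS": "iso2022_jp", "SJIS": "shift_jis", "EUC": "euc_jp"}

def encode_all(text):
    """Encode the same text with each of the three methods."""
    return {label: text.encode(codec) for label, codec in CODECS.items()}

sample = "これは日本語です。"  # KORE WA NIHONGO DESU.
for label, data in encode_all(sample).items():
    # Each method must decode back to exactly the same characters.
    assert data.decode(CODECS[label]) == sample
    print(label, len(data), "bytes:", data.hex())
```

All three spend two bytes on each Japanese character here. The JIS result comes out a little longer because of the escape characters placed before and after the Japanese text.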
* Escape characters are used before and after the Japanese characters to denote two byte (14 bit) encoding in JIS. EUC and SJIS use other encoding schemes. Special S/W is needed to decode this two byte encoding of Japanese information on a PC.
EX) "KORE WA NIHONGO DESU" would be encoded as follows in JIS:
KO RE HA NI HON GO DE SU .
<ESC>$B $3 $l $O F| K\ 8l $G $9 !# <ESC>(B
As you can see above, the <ESC>$B tells the S/W that what follows is encoded as 2 bytes (14 bits), and the <ESC>(B switches back to ASCII. The syllable "KO" is encoded with the ASCII characters "$" and "3". The "$" in ASCII is represented by 0100100 and the "3" as 0110011, for a total of 14 bits. When put together, they look like 01001000110011.
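The JIS example can be taken apart in Python as a check. This is a sketch under a couple of assumptions: I am treating Python's `iso2022_jp` codec as the JIS encoding described here, and `jis_pairs` and `fourteen_bits` are helper names of my own. It only handles text made up entirely of Japanese characters.

```python
ESC = b"\x1b"

def jis_pairs(text):
    """Encode text as ISO-2022-JP and return the two-character pairs between the escapes."""
    data = text.encode("iso2022_jp")
    start, end = ESC + b"$B", ESC + b"(B"
    # Sketch only: expects a single run of two-byte characters.
    if not (data.startswith(start) and data.endswith(end)):
        raise ValueError("expected one run of two-byte characters")
    body = data[len(start):-len(end)]
    return [body[i:i + 2].decode("ascii") for i in range(0, len(body), 2)]

def fourteen_bits(pair):
    """Join the 7-bit ASCII patterns of the two characters in a pair."""
    return "".join(format(ord(c), "07b") for c in pair)

pairs = jis_pairs("これは日本語です。")
print(" ".join(pairs))
print("KO =", pairs[0], "=", fourteen_bits(pairs[0]))
```

Each pair looks like two ordinary ASCII characters, which is why Japanese E-mail shows up as a jumble of symbols when no Japanese S/W is installed.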
JAPANESE INPUT
* For English users, the keyboard can hold all the keys to represent a sufficient number of characters. Japanese characters number in the thousands, so how do you enter Japanese characters using a U.S. PC keyboard?
* A software solution uses a Front End Processor (FEP) and conversion dictionary. A Japanese wordprocessor that runs on your English system will normally have a FEP built in.
* The user normally enters information from the keyboard by entering Japanese as romaji or hiragana. The FEP grabs the user's keyboard input before any other software can use it. The FEP then takes the input information and parses it (breaks it up into smaller parts), and runs the information through a conversion process using the conversion dictionary.
* The FEP runs as a separate process in its own window.
* So to enter the sentence "KORE WA NIHONGO DESU", you would press the "k", "o", "r", "e", "w", "a", "n", "i",...etc. keys on your keyboard. The FEP grabs this information before any other software, and breaks up the information to determine what data goes together.
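Here is a toy Python sketch of the two FEP steps described above: breaking the keystrokes into pieces, then looking the result up in a conversion dictionary. Everything in it is hypothetical. The two tables are tiny samples I made up, and a real FEP uses far larger dictionaries and lets the user choose among candidates.

```python
# Hypothetical romaji table: only a handful of syllables for illustration.
ROMAJI_TO_KANA = {
    "ko": "こ", "re": "れ", "ni": "に", "ho": "ほ",
    "n": "ん", "go": "ご", "de": "で", "su": "す",
}

# Hypothetical conversion dictionary: hiragana reading -> kanji candidates.
CONVERSION_DICTIONARY = {"にほんご": ["日本語"]}

def romaji_to_hiragana(keys):
    """Parse keystrokes into hiragana, trying the longest piece first."""
    out = []
    i = 0
    while i < len(keys):
        for length in (2, 1):
            chunk = keys[i:i + length]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += length
                break
        else:
            raise ValueError("cannot parse keystrokes at %r" % keys[i:])
    return "".join(out)

def candidates(kana):
    """Look up kanji candidates; fall back to the kana itself if none are known."""
    return CONVERSION_DICTIONARY.get(kana, [kana])

kana = romaji_to_hiragana("nihongo")
print(kana, "->", candidates(kana))
```

Trying two-letter pieces before one-letter pieces is what lets "ni" be read as one syllable while the "n" in "hon" is read on its own.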
SOFTWARE FOR YOUR PC
(some of the software below can be obtained from the sites listed above)
* Stand Alone Japanese Word Processors
- JWP (Japanese Word Processor)
- Best freeware Japanese word processor!
- Price $0
- NJStar Japanese WP V4.0j(b5) by Hongbo Ni
- Released May 1996
- Shareware version Basic $99 (Bitmap fonts), Pro $199 (2 True Type Fonts), Pro Plus $299 (4 True Type Fonts)
- KanjiWord
- Japanese word processor for US Windows
- Built in front-end-processor
- $278
* Japanese Display-only Programs (E-mail, WWW, Newsgroups)
- Mview
- Shareware
- Displays Japanese, Chinese & Korean characters
- Windows95 and Windows 3.1
- Automatic detection of Shift JIS, JIS and EUC
- $5
- Simple to use!!! I personally use this S/W.
- NJWIN MSS v1.0 (Multilingual Support System)
- Displays Japanese, Chinese & Korean
- Shareware: $49+S/H
- Supports Shift JIS, JIS and EUC
- MICROSOFT INTERNET EXPLORER
- Need the Japanese extension to display Japanese
- Free from Microsoft
* Japanese Windows Software
- WIN 3.1-J & WIN 95-J
- Microsoft Windows 3.1-J & Windows 95 Japanese versions
- Japanese based operating systems
- Run Japanese and English applications
- Includes Japanese fonts and front-end-processor
- $300
- DOS/V
- Japanese version of MS-DOS for IBM compatibles
- $200
* English Windows Add-on S/W to simulate Japanese Environment
- WIN/V
- Similar to WIN 3.1-J
- Run native Japanese applications
- $75.90 or $129
- Japanese Windows Kit
- Converts regular MS Windows into a bilingual Japanese/English Windows environment
- Runs Japanese Windows applications
- $300