Roman-to-Bengali Conversion Rules

Follow these steps to produce correct input for the converter:

Step 1 : Transliterate into English
Bangla ekti sundar bhasa.
Write your Bengali sentences in Roman (English) letters so that a reader who knows that the text is Bengali can decode it correctly. For human readers this is usually sufficient. Unfortunately, Bengali Writer is not intelligent. More precisely, it does not "know" Bengali. The text has to be given to it in the correct format, so you have to go through the remaining steps.

Step 2 : Remove unnecessary upper-case letters
bangla ekti sundar bhasa.
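Do not start a sentence with an upper-case letter. Bengali Writer is case-sensitive, so an upper-case letter is a different symbol from its lower-case form (for example, "T" and "t" are different consonants).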

Step 3 : Use proper symbols for swarabarno's (vowels)
baanglaa ekti sundar bhaasaa.
Bengali Writer distinguishes between "a" and "aa". You should write the proper symbol for the swarabarno that conforms to the spelling. For details, see below.

Step 4 : Use proper symbols for banjonbarno's (consonants)
baanglaa ekTi ssundar bhaasaa.
Again, use the proper symbol for the desired consonant. In our example, "t" is to be replaced by "T" to distinguish between the hard t ("T") and the soft t ("t"). Similarly, the "s" in "sundar" is to be replaced by "ss" so that the output uses the correct letter, namely "danto-s". Details follow.

Step 5 : Use separators
baang_laa ek_Ti ssundar bhaasaa.
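Put an underscore ("_") between two successive swar's that have no banjon in between, and between two successive banjon's that have no swar in between. Here "ekTi" should be replaced by "ek_Ti", and "baanglaa" by "baang_laa". The "nd" in "ssundar" is left alone at this stage because it is a juktakkhar, which is handled in Step 6.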

Step 6 : Mark the juktakkhar's (compound consonants)
baang_laa ek_Ti ssu<nd>ar bhaasaa.
Enclose every juktakkhar that should appear in your output in a pair of angle brackets. You may also use angle brackets to delimit a swarabarno (the vowel itself, not the symbol used when it is compounded with a banjonbarno) that is NOT at the beginning of a word. For example, "ba_i" and "ba<i>" give the same output, namely the word for a book.
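
As a small illustration of Step 6, the following Python sketch lists the groups that a line of input places in angle brackets. The function name bracket_groups is made up for this example and is not part of Bengali Writer. The sketch assumes that the text contains no <ENG> sections, since their tags and any HTML inside them would also match.

```python
import re

# Anything between "<" and ">" that contains no space and no further bracket.
BRACKET_GROUP = re.compile(r"<([^<>\s]+)>")


def bracket_groups(text):
    """Return the contents of every <...> group, in order of appearance."""
    return BRACKET_GROUP.findall(text)


if __name__ == "__main__":
    # The example sentence from Step 6 contains a single juktakkhar.
    print(bracket_groups("baang_laa ek_Ti ssu<nd>ar bhaasaa."))
```

Note that a bracketed swarabarno such as the one in "ba<i>" is reported in the same way as a juktakkhar, because both use the same brackets.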

And that's it. You are done! Isn't it simple? I hope that you are now quite sure of what to do and how to do it. There are, however, some topics that need further elaboration.

Swarabarno's :

Swarabarno's (vowels) and their symbols (which Bengalees call kaar's, e.g., a-kaar, aa-kaar, i-kaar etc.) are written in the same way. If a word begins with a swarabarno, the output is the vowel itself and not its symbol (as with the e in ek_Ti). If a vowel sound does not come at the beginning of a word, it is treated as its symbol (kaar), unless you specify otherwise as illustrated in Step 6 above. The vowels are encoded as
a   aa   i           ee
u   oo   R (or Ri)   L (or Li)
e   oi   o           ou
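
The rule about vowels and their kaar's can be sketched in Python as follows. This is only an illustration under two assumptions that this page does not state: that the table above lists the vowels in the usual alphabet order, and that the output is Unicode text (Bengali Writer itself may produce something else). The names VOWEL_FORMS and vowel_form are invented for this example.

```python
# symbol -> (independent vowel, kaar), written as Unicode escapes.
# Assumption: the table follows the usual Bengali vowel order.
VOWEL_FORMS = {
    "a":  ("\u0985", ""),        # inherent vowel, no visible sign in Unicode
    "aa": ("\u0986", "\u09be"),
    "i":  ("\u0987", "\u09bf"),
    "ee": ("\u0988", "\u09c0"),
    "u":  ("\u0989", "\u09c1"),
    "oo": ("\u098a", "\u09c2"),
    "R":  ("\u098b", "\u09c3"),
    "L":  ("\u098c", "\u09e2"),
    "e":  ("\u098f", "\u09c7"),
    "oi": ("\u0990", "\u09c8"),
    "o":  ("\u0993", "\u09cb"),
    "ou": ("\u0994", "\u09cc"),
}
ALIASES = {"Ri": "R", "Li": "L"}  # alternative spellings from the table


def vowel_form(symbol, at_word_start, standalone=False):
    """Pick the vowel itself or its kaar.

    standalone=True models a vowel set apart with "_" or "<...>" (Step 6),
    which is output as the vowel itself even inside a word.
    """
    symbol = ALIASES.get(symbol, symbol)
    independent, kaar = VOWEL_FORMS[symbol]
    return independent if (at_word_start or standalone) else kaar
```

For instance, vowel_form("e", at_word_start=True) corresponds to the e in "ek_Ti", and vowel_form("i", at_word_start=False, standalone=True) to the i in "ba<i>".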

Banjonbarno's :

The encodings of consonants are as follows:
k   kh          g   gh          ng
ch  chh         j   jh          NN
T   Th          D   Dh          N
t   th          d   dh          n
p   ph (or f)   b   bh (or v)   m
z   r           l   b           sh
s   ss          h   rr          rh
y   TT          Ng  H           ^
Note how the different s's, different n's and different j's are encoded. Also see the difference between the T-bargo (T, Th, D, Dh, N) and the t-bargo (t, th, d, dh, n).
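
Because several symbols share their first letters ("k" and "kh", "n" and "ng", "s" and "ss"), a program reading this encoding has to decide where one symbol ends. The Python sketch below uses a longest-match rule for that. It is my own illustration and not necessarily how convert.c works. The symbol lists are copied from the two tables above, and the names tokenize, KIND and BY_LENGTH are invented for this example.

```python
# Symbols copied from the vowel and consonant tables above.
VOWELS = ("a", "aa", "i", "ee", "u", "oo", "R", "Ri", "L", "Li",
          "e", "oi", "o", "ou")
CONSONANTS = (
    "k", "kh", "g", "gh", "ng",
    "ch", "chh", "j", "jh", "NN",
    "T", "Th", "D", "Dh", "N",
    "t", "th", "d", "dh", "n",
    "p", "ph", "f", "b", "bh", "v", "m",
    "z", "r", "l", "sh",
    "s", "ss", "h", "rr", "rh",
    "y", "TT", "Ng", "H", "^",
)

# Map every symbol to its kind and try the longer symbols first.
KIND = {sym: "vowel" for sym in VOWELS}
KIND.update({sym: "consonant" for sym in CONSONANTS})
BY_LENGTH = sorted(KIND, key=len, reverse=True)


def tokenize(word):
    """Split a word into (kind, symbol) pairs using longest match."""
    tokens = []
    pos = 0
    while pos < len(word):
        for sym in BY_LENGTH:
            if word.startswith(sym, pos):
                tokens.append((KIND[sym], sym))
                pos += len(sym)
                break
        else:
            # Separators, brackets, punctuation and anything unknown.
            tokens.append(("other", word[pos]))
            pos += 1
    return tokens


if __name__ == "__main__":
    print(tokenize("ek_Ti"))
    print(tokenize("bhaasaa"))
```

With this rule "kh" is read as one consonant, while "k_h" is read as "k" followed by "h", which shows why the separator of Step 5 matters.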

Special banjonbarno's :

The following banjonbarno's are treated differently :
bisarga : This is written as H. It does not take any vowel after it (as expected).
chandrabindu : This is written as ^. To put a chandrabindu on a banjonbarno, put a ^ immediately after the symbol that stands for the banjonbarno; for example, the moon is "ch^aad". To put a chandrabindu on a juktakkhar, make ^ the last character before the closing angle bracket, as in <JKTR^>, where JKTR is a juktakkhar. (Such cases do not occur very frequently in Bengali.)
hasanta : This is written as |. Like bisarga, it does not take any vowel after it.

Juktakkhar's :

A juktakkhar is a sequence of consonants without vowels in between, written as a single symbol. A juktakkhar that does not include a ref or a ja-fala can be obtained by concatenating the encodings of the component consonants. (There are exceptions.) Ref's and ja-fala's are written in a similar way, with "r" for the ref and "y" for the ja-fala. For example, a sentence is a "baa<ky>a", its meaning is "a<rth>a", and a gift is an "a<rghy>a".

Numbers :

Use the English digits to denote numbers in Bengali. For example, if you want to mention the price of your new colorful T-shirt in your document, write "175 Taakaa".
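
To show what the digit rule amounts to, here is a small Python sketch that replaces English digits by the Bengali digits of Unicode. That the output is Unicode is an assumption made for this example only, and the name to_bengali_digits is invented. Only the digits are touched here; the rest of the text is left for the other rules.

```python
# English digits 0-9 mapped to the Bengali digits U+09E6 .. U+09EF.
BENGALI_DIGITS = str.maketrans(
    "0123456789",
    "\u09e6\u09e7\u09e8\u09e9\u09ea\u09eb\u09ec\u09ed\u09ee\u09ef",
)


def to_bengali_digits(text):
    """Replace every English digit in text by its Bengali counterpart."""
    return text.translate(BENGALI_DIGITS)


if __name__ == "__main__":
    print(to_bengali_digits("175 Taakaa"))
```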

Punctuation symbols and formatting :

Use the same punctuation symbols as in English ( . , " ? - ( ) ! : / ). Do not use "|" to mark the end of a sentence, since "|" stands for the hasanta. Formatting is done by your web browser. You can specify a line break by a backslash (\) and a space by a tilde (~).

Inserting English texts :

You can insert English text in your document with the <ENG> command. To revert to Bengali fonts, type </ENG>. Whatever you write between these two tags goes directly to the output without any modification. Hence this text should be in HTML format. All HTML tags in this text are interpreted by your web browser in the usual way. Thus you can create links, import images and do the other things you do when you prepare your homepage.
Note :
1. The tags must be written exactly as shown above. Do not change the case of any letter between the angle brackets. Thus, for example, <Eng> is not a valid tag for starting the English mode.
2. The tags <ENG> and </ENG> must each be both preceded and followed by a whitespace character (a carriage return, a space or a tab). I recommend writing each tag on a line containing no other text.
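
The two notes above can be sketched in Python as follows. The function split_modes is invented for this example and only mimics the rules stated here: the tags are matched exactly, case included, and only when they stand alone between white characters. As a simplification, the start and the end of the input are treated like white characters, and surrounding white space is trimmed from each piece.

```python
import re


def split_modes(text):
    """Return (mode, chunk) pairs, where mode is "bengali" or "english"."""
    segments = []
    mode = "bengali"
    current = []
    # The capturing group makes re.split keep the white-space runs, so a
    # tag is recognised only when it stands alone between white characters.
    pieces = re.split(r"(\s+)", text) + [None]  # None marks the end of input
    for piece in pieces:
        opens = piece == "<ENG>" and mode == "bengali"
        closes = piece == "</ENG>" and mode == "english"
        if piece is not None and not (opens or closes):
            current.append(piece)
            continue
        # A tag or the end of input: close the chunk collected so far.
        chunk = "".join(current).strip()
        if chunk:
            segments.append((mode, chunk))
        current = []
        if opens:
            mode = "english"
        elif closes:
            mode = "bengali"
    return segments


if __name__ == "__main__":
    sample = "aami\n<ENG>\n<B>hello</B>\n</ENG>\nbhaalo"
    print(split_modes(sample))
```

A wrongly cased tag such as <Eng>, or a tag glued to a neighbouring word, is not recognised by this sketch and simply stays in the Bengali text.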

Form for BWtester

The code convert.c can be used to convert Roman text to Bengali offline. The CGI application is named BWtester. Use the following form to call BWtester:

<FORM METHOD="POST" ACTION="http://your.machine/cgi-bin/BWtester">
Enter Text here :<BR>
<TEXTAREA NAME="text" ROWS=20 COLS=80>
</TEXTAREA>
<P><INPUT TYPE="RESET" VALUE="Clear">
<INPUT TYPE="SUBMIT" VALUE="When you are done, click here">
</FORM>
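
If you would rather call BWtester from a program than from the form, the same POST request can be built with Python's standard library. This is a sketch only: the address is the placeholder used in the form above and must be replaced by your own machine, the function names build_request and convert are invented, and what BWtester sends back is not described on this page, so the reply is returned as raw bytes.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder address taken from the form; replace it with your own server.
BWTESTER_URL = "http://your.machine/cgi-bin/BWtester"


def build_request(roman_text, url=BWTESTER_URL):
    """Build the POST request that the form would send for roman_text."""
    # The form has a single field, named "text".
    body = urlencode({"text": roman_text}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers)


def convert(roman_text, url=BWTESTER_URL):
    """Send the text to BWtester and return the raw reply as bytes."""
    with urlopen(build_request(roman_text, url)) as response:
        return response.read()
```

Calling convert("baang_laa ek_Ti ssu<nd>ar bhaasaa.") would then submit the example sentence, provided BWtester is installed at the given address.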



Mail your comments and suggestions to abhij@csa.iisc.ernet.in

