Roman-to-Bengali Conversion Rules
Go through the following steps to produce correct input for the converter:
- Step 1 : Transliterate into English
Bangla ekti sundar bhasa.
Write your Bengali sentences in Roman (English) letters so that a reader
who knows that the text is Bengali can decode it correctly. For human
readers this is sufficient in most cases. Unfortunately, Bengali Writer
is not intelligent. More precisely, it does not "know" Bengali: the text
has to be given to it in the correct format. So you have to go through
the remaining steps.
- Step 2 : Remove unnecessary upper-case letters
bangla ekti sundar bhasa.
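Do not start a sentence with an upper-case letter. Bengali Writer is
case sensitive, so an upper-case letter may be read as a different symbol
(for example, "T" and "t" stand for different consonants).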
- Step 3 : Use proper symbols for swarabarno's (vowels)
baanglaa ekti sundar bhaasaa.
Bengali Writer distinguishes between "a" and "aa". Write the symbol of
the swarabarno that the Bengali spelling requires.
For details, see below.
- Step 4 : Use proper symbols for banjonbarno's (consonants)
baanglaa ekTi ssundar bhaasaa.
Again, use the proper symbol for the desired consonant. In our example,
"t" is replaced by "T" to distinguish the hard t ("T") from the soft t
("t"). Similarly, the "s" in "sundar" is replaced by "ss" so that the
output has the correct spelling, namely "danto-s". Details follow.
- Step 5 : Use separators
baang_laa ek_Ti ssundar bhaasaa.
Put an underscore ("_") between two successive swar's that have no
banjon between them, and between two successive banjon's that have no
swar between them. Here "baanglaa" becomes "baang_laa" and "ekTi"
becomes "ek_Ti".
- Step 6 : Mark the juktakkhar's (compound consonants)
baang_laa ek_Ti ssu<nd>ar bhaasaa.
Enclose every juktakkhar that should appear in your output in a pair of
angle brackets. You may also use angle brackets to delimit a swarabarno
(the vowel itself, not the symbol used when it is compounded with a
banjonbarno) that is NOT at the beginning of a word. For example, "ba_i"
and "ba<i>" give the same output, namely the word for a book.
And that's it. You are done! Isn't it simple? I hope that you are now
quite sure of what to do and how to do it. There are, however, some
topics that need further elaboration.
Swarabarno's :
Swarabarno's (vowels) and their symbols (which Bengalees call kaar's, e.g.,
a-kaar, aa-kaar, i-kaar etc.) are written in the same way. If a word begins
with a swarabarno, it is output as the vowel itself and not as its symbol
(like the e in ek_Ti). A vowel that does not come at the beginning of a word
is treated as its symbol (kaar), unless you specify otherwise as illustrated
in Step 6 above. The vowels are encoded as
a aa i ee
u oo R (or Ri) L (or Li)
e oi o ou
Banjonbarno's :
The encodings of consonants are as follows:
k kh g gh ng
ch chh j jh NN
T Th D Dh N
t th d dh n
p ph (or f) b bh (or v) m
z r l b sh
s ss h rr rh
y TT Ng H ^
Note how the different s's, different n's and different j's are encoded.
Also see the difference between the T-bargo (T, Th, D, Dh, N) and the
t-bargo (t, th, d, dh, n).
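With the two encoding tables written down, the separator rule of Step 5 and the bracket rule of Step 6 can be checked mechanically. The Python sketch below is only an illustration and is not taken from convert.c. It assumes that symbols are read greedily (longest symbol first) and that H, ^, |, the underscore, a bracketed group and any other character all break a run of vowels or consonants. The names tokenize and missing_separators are hypothetical.

```python
# Symbol sets copied from the two tables above.
VOWELS = {"a", "aa", "i", "ee", "u", "oo", "R", "Ri", "L", "Li",
          "e", "oi", "o", "ou"}
CONSONANTS = {"k", "kh", "g", "gh", "ng", "ch", "chh", "j", "jh", "NN",
              "T", "Th", "D", "Dh", "N", "t", "th", "d", "dh", "n",
              "p", "ph", "f", "b", "bh", "v", "m", "z", "r", "l", "sh",
              "s", "ss", "h", "rr", "rh", "y", "TT", "Ng"}
MARKS = {"H", "^", "|"}  # assumption: treated as marks, not as plain consonants

# Assumption: the longest matching symbol wins ("ss" before "s", "aa" before "a").
SYMBOLS = sorted(VOWELS | CONSONANTS | MARKS, key=len, reverse=True)


def tokenize(text):
    """Split text into (kind, symbol) pairs."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i] == "<":
            # Everything up to the closing bracket is kept as one unit.
            end = text.find(">", i)
            if end == -1:
                raise ValueError("angle bracket opened but never closed")
            tokens.append(("bracket", text[i:end + 1]))
            i = end + 1
            continue
        for sym in SYMBOLS:
            if text.startswith(sym, i):
                if sym in VOWELS:
                    kind = "vowel"
                elif sym in CONSONANTS:
                    kind = "consonant"
                else:
                    kind = "mark"
                tokens.append((kind, sym))
                i += len(sym)
                break
        else:
            # Underscore, space, digit, punctuation and so on.
            tokens.append(("other", text[i]))
            i += 1
    return tokens


def missing_separators(text):
    """List adjacent vowel/vowel or consonant/consonant pairs that have
    neither an underscore between them nor angle brackets around them."""
    problems = []
    prev = None
    for kind, sym in tokenize(text):
        if kind in ("vowel", "consonant"):
            if prev is not None and prev[0] == kind:
                problems.append((prev[1], sym))
            prev = (kind, sym)
        else:
            prev = None  # anything else breaks the run
    return problems
```

Run on the Step 4 text "baanglaa ekTi ssundar bhaasaa.", missing_separators reports the pairs (ng, l), (k, T) and (n, d). Steps 5 and 6 deal with exactly these three places, and the final text "baang_laa ek_Ti ssu<nd>ar bhaasaa." gives an empty list. The check cannot tell whether a pair should get an underscore or become a juktakkhar; that still depends on the Bengali spelling.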
Special banjonbarno's :
The following banjonbarno's are treated differently :
- bisarga
This is written as H. It does not take any vowel after it (as expected).
- chandrabindu
This is written as ^. To put a chandrabindu on a banjonbarno, put a ^
right after the symbol that stands for the banjonbarno; for example, the
moon is "ch^aad". To put a chandrabindu on a juktakkhar, make ^ the last
character before the closing angle bracket, as in <JKTR^> where JKTR is a
juktakkhar (such cases do not occur very frequently in Bengali).
- hasanta
The hasanta is written as |. Like chandrabindu, this does not take any
vowels after it.
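The two placements of the chandrabindu described above can be written as one small helper. This is a sketch in Python; the function name with_chandrabindu is hypothetical and is not part of Bengali Writer.

```python
def with_chandrabindu(symbol):
    """Attach ^ to a consonant symbol or to a bracketed juktakkhar."""
    if symbol.startswith("<") and symbol.endswith(">"):
        # Juktakkhar: ^ becomes the last character inside the brackets.
        return symbol[:-1] + "^>"
    # Plain consonant: ^ goes right after the consonant symbol.
    return symbol + "^"
```

For example, with_chandrabindu("ch") + "aad" gives "ch^aad", and with_chandrabindu("<JKTR>") gives "<JKTR^>".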
Juktakkhar's :
A juktakkhar is a sequence of consonants with no vowels in between,
written as a single symbol. A juktakkhar that does not include
a ref or a ja-fala can be obtained by concatenating the encodings of
the component consonants. (There are exceptions.)
Ref's and ja-fala's are written in a similar way. For example,
a sentence is a "baa<ky>a", its meaning is "a<rth>a", and a
gift is an "a<rghy>a".
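The concatenation rule can be sketched as a short Python helper. The name juktakkhar is hypothetical, and the exceptions mentioned above are not listed in this document, so the sketch does not handle them.

```python
def juktakkhar(*consonants):
    """Join consonant encodings and enclose them in angle brackets."""
    if len(consonants) < 2:
        raise ValueError("a juktakkhar needs at least two consonants")
    return "<" + "".join(consonants) + ">"


# The example words from the text, built piece by piece.
sentence = "baa" + juktakkhar("k", "y") + "a"     # "baa<ky>a"
meaning = "a" + juktakkhar("r", "th") + "a"       # "a<rth>a"
gift = "a" + juktakkhar("r", "gh", "y") + "a"     # "a<rghy>a"
```

The same helper gives the "<nd>" of Step 6 as juktakkhar("n", "d").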
Numbers :
Use the English digits to denote numbers in Bengali. For example, if you
would like to mention the price of your new colorful T-shirt in your
document, write "175 Taakaa".
Punctuation symbols and formatting :
Use the same punctuation symbols as in English ( . , " ? - ( ) ! : / ). Do not
use a "|" to mark the end of a sentence; that character is the hasanta.
Formatting is done by your web browser. You can specify a line break with a
backslash (\) and a space with a tilde (~).
Inserting English texts :
You can insert English text in your document with the <ENG> command. To
revert to Bengali fonts, type </ENG>. Whatever you write between these two
tags goes directly to the output without any modification, so this text
should be in HTML format. All HTML tags in this text are interpreted by your
web browser in the usual way. Thus you can create links, import images and
do the other things you do when you prepare your homepage.
Note :
1. The tags must be written exactly as shown above. Do not change the case of
any of the letters between the angle brackets. For example, <Eng> is not a
valid tag for starting the English mode.
2. The tags <ENG> and </ENG> must each be both preceded and followed by a
whitespace character (a carriage return, a space or a tab). I recommend
writing each tag on a line containing no other text.
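The two notes can be turned into a small check on the input text. The Python sketch below is only an illustration; wrap_english and check_eng_tags are hypothetical names. It applies Note 2 literally, so a tag at the very start or very end of the text (with no character on one side) is also reported.

```python
import re

WHITESPACE = {" ", "\t", "\n", "\r"}
# Matches the two tags in any letter case, so wrong-case variants are caught.
ANY_CASE_TAG = re.compile(r"</?eng>", re.IGNORECASE)


def wrap_english(html):
    """Put each tag on a line of its own, as recommended in Note 2."""
    return "\n<ENG>\n" + html + "\n</ENG>\n"


def check_eng_tags(text):
    """Return (position, message) pairs for tags that break Note 1 or Note 2."""
    problems = []
    for match in ANY_CASE_TAG.finditer(text):
        tag = match.group(0)
        if tag not in ("<ENG>", "</ENG>"):
            problems.append((match.start(), "wrong case: " + tag))
            continue
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        if before not in WHITESPACE or after not in WHITESPACE:
            problems.append(
                (match.start(), "needs whitespace on both sides: " + tag))
    return problems
```

Text produced with wrap_english passes the check, while check_eng_tags("aami <Eng> hi </ENG> ") reports the wrong-case tag at position 5.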
Form for BWtester
The code convert.c can be used to convert Roman text to Bengali off-line.
The CGI application is named BWtester. You should use the following form
to call BWtester.
Mail your comments and suggestions to abhij@csa.iisc.ernet.in