Outputting Arabic text
by Richard Russell, April 2012
BBC BASIC for Windows supports outputting Unicode text to the main output window or to the printer, and it also supports right-to-left printing, so in principle it ought to be able to output Arabic text. However, there is a complication: in Arabic the shape of a character (its glyph) can depend on its position within a word, that is, on whether it is at the start of a word, in the middle, at the end, or on its own (isolated).
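The four positional forms can be seen directly, because Unicode also encodes them as separate characters in the Arabic Presentation Forms-B block. As a minimal sketch (FNutf8 is a hypothetical helper introduced here for illustration, and the program assumes the selected font contains these glyphs), the following prints the isolated, final, initial and medial forms of the letter beh, U+FE8F to U+FE92:

```bbcbasic
VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 output
*FONT Times New Roman, 28
REM U+FE8F..U+FE92: isolated, final, initial and medial forms of BEH
FOR C% = &FE8F TO &FE92
  PRINT FNutf8(C%) + "  ";
NEXT
PRINT
END

REM Hypothetical helper: three-byte UTF-8 encoding, valid for U+0800 to U+FFFF
DEF FNutf8(C%)
= CHR$(&E0 + (C% >> 12)) + CHR$(&80 + ((C% >> 6) AND &3F)) + CHR$(&80 + (C% AND &3F))
```

Each glyph is output on its own here, so no joining logic is involved; the difficulty with real text is choosing the right form for every letter from its neighbours.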
Because BBC BASIC outputs text as a VDU stream, each character is treated in isolation. When outputting Arabic, therefore, the 'isolated' forms of the characters are used and the text is not rendered correctly. For example, try running this program (copy and paste it into the BB4W editor):
    VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 output
    *FONT Times New Roman, 28
    arabic1$ = "هذا هو مثال على النص العربي"
    arabic2$ = "هو مكتوب من اليمين إلى اليسار"
    VDU 23,16,2;0;0;0;13 : REM Select right-to-left printing
    PRINT arabic1$ ' arabic2$
    VDU 23,16,0;0;0;0;13 : REM Select left-to-right printing
    END
In the displayed output the isolated forms of the characters have been used, so the letters are not joined and the result is not correct.
Fortunately there is a solution to this problem. By pre-processing the strings before output the correct glyphs can be generated. Here is a revised version of the program (the function FNarabic is listed later):
    VDU 23,22,640;512;8,16,16,128+8 : REM Select UTF-8 output
    *FONT Times New Roman, 28
    arabic1$ = "هذا هو مثال على النص العربي"
    arabic2$ = "هو مكتوب من اليمين إلى اليسار"
    VDU 23,16,2;0;0;0;13 : REM Select right-to-left printing
    PRINT FNarabic(arabic1$) ' FNarabic(arabic2$)
    VDU 23,16,0;0;0;0;13 : REM Select left-to-right printing
    END
This time the correct cursive script is produced, with the letters joined as they should be.
Here is the FNarabic function:
    DEF FNarabic(A$)
    LOCAL A%, B%, O%, P%, U%, B$
    A$ += CHR$0
    FOR A% = !^A$ TO !^A$+LENA$-1
      IF ?A%<&80 OR ?A%>=&C0 THEN
        O% = P% : P% = U%
        U% = ((?A% AND &3F) << 6) + (A%?1 AND &3F)
        IF ?A%<&80 U% = 0
        CASE TRUE OF
          WHEN U%=&622: U% = &81
          WHEN U%=&627: U% = &8D
          WHEN U%<&628:
          WHEN U%<=&629: U% = &8F+4*(U%-&628)
          WHEN U%<=&62E: U% = &95+4*(U%-&62A)
          WHEN U%<=&632: U% = &A9+2*(U%-&62F)
          WHEN U%<=&63A: U% = &B1+4*(U%-&633)
          WHEN U%<&641:
          WHEN U%<=&648: U% = &D1+4*(U%-&641)
          WHEN U%=&649: U% = &EF
          WHEN U%=&64A: U% = &F1
        ENDCASE
        IF P% IF P%<&600 THEN
          B% = P%
          IF U% IF P%>&8D IF P%<>&93 IF P%<&A9 OR P%>&AF IF P%<>&ED IF P%<>&EF B% += 2
          IF O% IF O%>&8D IF O%<>&93 IF O%<&A9 OR O%>&AF IF O%<>&ED IF O%<>&EF B% += 1
          B$ = LEFT$(LEFT$(B$))+CHR$&EF+CHR$(&B8+(B%>>6))+CHR$(&80+(B%AND&3F))
        ENDIF
      ENDIF
      B$ += CHR$?A%
    NEXT
    = LEFT$(B$)
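To see what the function does to a string, it can help to compare the UTF-8 bytes before and after shaping. The sketch below is illustrative only. FNhex is a hypothetical helper, not part of the original listing. The two-letter test word is built from explicit bytes (heh, U+0647, followed by waw, U+0648) so that it does not depend on the editor's encoding. The FNarabic listing above must be appended to the program for it to run.

```bbcbasic
REM Build a two-letter word (heh, waw) from its UTF-8 bytes
word$ = CHR$&D9 + CHR$&87 + CHR$&D9 + CHR$&88
PRINT "Before: "; FNhex(word$)
PRINT "After:  "; FNhex(FNarabic(word$))
END

REM Hypothetical helper: return the bytes of S$ as two-digit hex values
DEF FNhex(S$)
LOCAL I%, H$
FOR I% = 1 TO LEN(S$)
  H$ += RIGHT$("0" + STR$~(ASC(MID$(S$, I%, 1))), 2) + " "
NEXT
= H$

REM Append the FNarabic listing here
```

If the function behaves as intended, the second line should show two three-byte sequences in place of the original two-byte codes: EF BB AB (U+FEEB, heh initial form) and EF BB AE (U+FEEE, waw final form). Each letter has been replaced by the Presentation Forms-B glyph that matches its position in the word.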