How to properly display Hebrew in text widget?

I'm using Manjaro Linux KDE and the most recent versions of Tcl and Tk, and am attempting to display Hebrew in a text widget. In testing, the Hebrew text was pasted into the Tcl script in the Kate text editor and appears in the correct order, right to left with compound characters.

Without setting a specific font in Tcl/Tk, the text prints from left to right and the components of compound characters are separated, so that the vowel points and cantillation marks appear as separate characters. After switching to the SBL Hebrew font, the words look better, but the vowel points are not positioned properly and the text is still written from left to right. I tried the \u200f and \u200e marks, simply prefixing and suffixing them to the Hebrew word, but it made no difference; I really don't know what I'm doing there. Reversing the string helps, but the vowel points are not combined with the consonants.
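For reference, here is a minimal sketch of what string reverse does to this word at the code-point level. The \u escapes are an assumed spelling of the word (the exact order of the marks in the pasted text may differ), and codepoints is a hypothetical helper name, not a built-in command.

```tcl
# Assumed spelling of the word from the question, written with \u escapes
# so the script does not depend on the source file's encoding.
set h "\u05D1\u05B0\u05BC\u05E8\u05B5\u05D0\u05E9\u05B4\u05C1\u0596\u05D9\u05EA"

# Hypothetical helper: return the code points of a string as U+XXXX labels.
proc codepoints {s} {
    set out {}
    foreach ch [split $s ""] {
        lappend out [format U+%04X [scan $ch %c]]
    }
    return $out
}

# Logical order: each consonant is followed by its marks.
puts [codepoints $h]

# string reverse works per code point, so each mark now precedes its consonant.
puts [codepoints [string reverse $h]]
```

In the second list each mark (for example U+05B0) comes before its consonant (U+05D1) instead of after it, which may be why the reversed string shows the consonants in the expected visual order while the points still do not attach.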

I'm not using Tkinter but this older SO post seems to indicate that it is a Linux issue with Tcl.

If I extract Hebrew from SQLite using Tcl and write it to the command line using puts, it displays correctly. Also, if I copy the reversed text from the Tk text widget and paste it in this SO question, it is displayed in the correct order. To clarify, by reversed here, I don't mean using string reverse but simply that it appears reversed in Tk but when pasted in this SO box, it displays correctly.

Would you please tell me what I'm doing wrong and how to get it to display properly?

I tried to follow this document on internationalization in Tcl and encoding, but don't follow how it affects displaying Hebrew in a text widget. I also came across a web site that has code for a Unicode editor that displays several languages including Hebrew, but I can't follow that code either. I tried running the code and, if I select the Hebrew language, it writes right to left, but I don't see vowel points or cantillation marks; then again, I don't know much about typing Hebrew.

Thank you.

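# Assumed setup (not shown in the original post): create and pack the text widget.
text .tw
pack .tw
.tw tag configure heb -font {"SBL Hebrew" 18 normal}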
.tw insert end "בְּרֵאשִׁ֖ית" "heb"
# Also tried "בְּרֵאשִׁ֖ית\u200f" and "\u200fבְּרֵאשִׁ֖ית".
# and "בְּרֵאשִׁ֖ית\u200e" and "\u200eבְּרֵאשִׁ֖ית".
# Tried .tw insert end [string reverse $h] "heb", which orders the
# consonants, but the vowel points and cantillation marks are not correct.


This is the correct rendering.

enter image description here

This is from Tk. The first is in normal order and the second uses string reverse. It can be observed that the vowel points are not "on" the consonants and the cantillation marks are not correct. I know little about Hebrew, but I can tell they don't match and appear to be printed as separate characters instead of combined. I think what looks like a "t" under the Hebrew letter that looks similar to a "W" is two characters on top of each other: a dot and the symbol that looks somewhat like a left parenthesis in the correct rendering.

enter image description here

I don't know why but after rebooting and installing the next batch of updates, not that they have anything to do with Tk, the rendering is different when a font is not set. However, once the SBL Hebrew font is set, then the characters are separated as displayed above.

enter image description here

Answers

I can tell you now that the text renders very close to correctly with Tk on macOS (I'm not sure how much is just font differences, and there's a bit of clipping of the descender decorations that I don't like, but I don't think that's Tk itself doing the wrong thing).

Hebrew text on macOS with Tk

That means that *it's definitely a rendering bug* that you're seeing. I suspect it might relate to the size of chunks of characters fed into the renderer; if the low levels of the renderer are only being given a character at a time, then they've got no chance to get the overall placement correct or to apply any character combining. I'm guessing that the real issue is that TkpDrawCharsInContext() just calls Tk_DrawChars(), if my reading of the comments is right. (By contrast, the macOS renderer does something different here.)
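To illustrate the point about chunk size, here is a rough sketch in plain Tcl that groups a string into units of one base character plus the marks that follow it. It is not Tk's rendering code and it is not a workaround; isHebrewMark and clusters are hypothetical helper names, the mark test is an assumption that covers only the Hebrew block (U+0591 to U+05C7, minus four punctuation characters), and the \u escapes are an assumed spelling of the word from the question.

```tcl
# Hypothetical helper: 1 if ch is a Hebrew point or accent, else 0.
# Assumption: marks are U+0591..U+05C7 except the punctuation characters
# U+05BE, U+05C0, U+05C3 and U+05C6.
proc isHebrewMark {ch} {
    set cp [scan $ch %c]
    if {$cp < 0x0591 || $cp > 0x05C7} { return 0 }
    foreach p {0x05BE 0x05C0 0x05C3 0x05C6} {
        if {$cp == $p} { return 0 }
    }
    return 1
}

# Hypothetical helper: split a string into clusters, each one a base
# character followed by any marks that belong to it.
proc clusters {s} {
    set out {}
    set cur ""
    foreach ch [split $s ""] {
        # A non-mark character starts a new cluster.
        if {$cur ne "" && ![isHebrewMark $ch]} {
            lappend out $cur
            set cur ""
        }
        append cur $ch
    }
    if {$cur ne ""} { lappend out $cur }
    return $out
}

# Assumed spelling of the word from the question.
set h "\u05D1\u05B0\u05BC\u05E8\u05B5\u05D0\u05E9\u05B4\u05C1\u0596\u05D9\u05EA"

foreach c [clusters $h] {
    puts "[string length $c] code point(s) in this cluster"
}
```

A renderer that measures or draws one code point at a time never sees a whole cluster, so it cannot place the marks relative to their consonant; that is consistent with the separated points in the screenshots, though I have not confirmed that this is what happens on Linux.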

I don't have a workaround.

Posted by Donal Fellows