+++
title = "Fun With Fonts in Emacs"
date = 2018-06-24
slug = "fun-with-fonts-in-emacs"
draft = false
+++

I finally took some time to look at my font configuration in Emacs and clean it up as much as possible. This dive into the rabbit hole has been tiring yet fruitful, revealing caveats of typesetting that I didn't know about before, especially for CJK characters.

I primarily use Emacs by running a daemon and connecting to it via a graphical
`emacsclient` frame, and I am attempting to tackle three major problems: I don't
have granular control over font mapping, glyph widths are sometimes inconsistent
with character widths, and emoji show up as weird blocks. Terminal Emacs doesn't
suffer as much from these problems, yet I don't want to give up nice perks
like system clipboard access and greater key binding options, so here goes
nothing.


## Font Fallback Using Fontsets {#font-fallback-using-fontsets}

Ideally, I want to specify two sets of fonts, a default monospace font and a
CJK-specific font. Here's how I originally specified the font in Emacs:

```emacs-lisp
(setq default-frame-alist '((font . "Iosevka-13")))
```

The method above obviously leaves no room for fallback fonts. However, it
turns out I can specify the `font` to be a fontset instead of an individual
font. According to the [Emacs manual](https://www.gnu.org/software/emacs/manual/html%5Fnode/emacs/Fontsets.html), a fontset is essentially a mapping from Unicode
ranges to a font or hierarchy of fonts, and I can [modify](https://www.gnu.org/software/emacs/manual/html%5Fnode/emacs/Modifying-Fontsets.html) one with relative ease.

Sounds like an easy job now? Not so fast. I don't really know which fontset to
modify: fontset behavior is quirky in that the fontset Emacs ends up using seems
to differ between `emacsclient` and normal `emacs`, between terminal and
graphical frames, and even between different locales. While there is a way to
get the currently active fontset (`(frame-parameter nil 'font)`), this method is
unreliable and may cause errors like [this one](https://lists.gnu.org/archive/html/emacs-devel/2006-12/msg00285.html).
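To get a feel for this quirkiness, the fontsets Emacs knows about and the value the selected frame reports can be printed side by side. This is only a sketch for poking around with `M-x`; the helper name `user/report-fontsets` is made up for illustration, and what it prints will vary with the frame type and locale.

```emacs-lisp
;; Hypothetical helper, not part of the original configuration.
(defun user/report-fontsets ()
  "Echo the known fontsets and the `font' parameter of the selected frame."
  (interactive)
  (message "Known fontsets: %S\nFrame font parameter: %S"
           ;; `fontset-list' is only defined in builds with GUI support.
           (and (fboundp 'fontset-list) (fontset-list))
           ;; Whatever the selected frame reports for its `font' parameter.
           (frame-parameter nil 'font)))
```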

After all kinds of attempts and DuckDuckGoing (that really rolled right off the
tongue, and no, I am [not the first one](https://www.reddit.com/r/duckduckgo/comments/8cm51u/what%5Fing%5Fverb%5Fdo%5Fyou%5Fuse%5Ffor%5Fduckduckgo/)), I finally found the [answer](https://stackoverflow.com/questions/17102692/using-a-list-of-fonts-with-a-daemonized-emacs): just define
a new fontset instead of modifying existing ones.

```emacs-lisp
(defvar user/standard-fontset
  (create-fontset-from-fontset-spec standard-fontset-spec)
  "Standard fontset for user.")

;; Ensure user/standard-fontset gets used for new frames.
(add-to-list 'default-frame-alist (cons 'font user/standard-fontset))
(add-to-list 'initial-frame-alist (cons 'font user/standard-fontset))
```

I won't bore you with the exact logic just yet, as I also made other changes to
the fontset.


### Displaying Emoji {#displaying-emoji}

The solution to emoji display is similar (just specify a fallback font with emoji
support), or so I thought. I tried to use Noto Color Emoji as my emoji font,
only to find that Emacs does not yet support colored emoji fonts. Emacs used to
support colored emoji on macOS, but this functionality was later [removed](https://github.com/emacs-mirror/emacs/blob/emacs-25.1/etc/NEWS#L1723).

I ended up using [Symbola](http://users.teilar.gr/~g1951d/) as my emoji fallback font (actually, I used it as a
fallback for all Unicode characters), which provides comprehensive coverage of
[all the emoji](https://unicode.org/Public/emoji/11.0/emoji-test.txt) and special characters. Also note that since Emacs 25,
customizing the `symbols` [charset](https://www.gnu.org/software/emacs/manual/html%5Fnode/emacs/Charsets.html), which contains punctuation marks, emoji,
etc., requires [some extra work](https://github.com/emacs-mirror/emacs/blob/emacs-25/etc/NEWS#L58):

```emacs-lisp
(setq use-default-font-for-symbols nil)
```
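Since a fallback only helps if the font is actually installed, a quick availability check can save some head-scratching. The following is a rough sketch: `user/font-installed-p` is a hypothetical helper, and only the family name "Symbola" comes from the text above. It needs a graphical frame to give a meaningful answer, so evaluated from a daemon's init file before any such frame exists it will most likely report the font as missing.

```emacs-lisp
;; Hypothetical helper; only the font name "Symbola" comes from the post.
(defun user/font-installed-p (family)
  "Return non-nil if a font of FAMILY can be found on the selected frame."
  (and (display-graphic-p)
       (find-font (font-spec :family family))
       t))

;; Warn when the emoji fallback font cannot be found.
(unless (user/font-installed-p "Symbola")
  (message "Symbola not found; emoji may still show up as blocks"))
```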

There does exist a workaround for colored emoji though, not with fancy fonts,
but by replacing Unicode characters with images. [`emacs-emojify`](https://github.com/iqbalansari/emacs-emojify) is a package
that provides this functionality. I ultimately decided against it as it does
slow down Emacs quite noticeably and the colored emoji image library is not as
comprehensive.


### Quotation Marks {#quotation-marks}

I've always used full-width directional curly quotation marks ("“”" and
"‘’") when typing in Chinese, and ASCII style ambidextrous straight quotation
marks (""" and "'") when typing in English. Little did I know there really is no
such thing as full-width curly quotation marks: there is only one set of curly
quotation mark codepoints in Unicode (U+2018, U+2019, U+201C, and U+201D) and
the difference between alleged full-width and half-width curly quotation marks
is caused solely by fonts. There have been [proposals](https://www.unicode.org/L2/L2014/14006-sv-western-vs-cjk.pdf) to standardize the two
distinct representations, but for now I'm stuck with this ambiguous mess.

It came as no surprise that these curly quotation marks are listed under the
`symbols` charset instead of a CJK one, and thus use the normal monospace font
despite the fact that I want them to show up as full-width characters. I don't
have a true solution for this. Being consistent is the only thing I can do, so
I forced curly quotation marks to display as full-width characters by overriding
these exact Unicode codepoints in my fontset. I'm not really sure how I felt
when I then realized that ASCII style quotation marks also suffer from
[confusion](https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html). Maybe we are just really bad at this.
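How Emacs itself classifies these four codepoints can be checked from `*scratch*`. The sketch below prints, for each curly quote, the entry in the built-in `char-script-table` and the number of columns reported by `char-width`. No particular output is assumed, since the column count can depend on the language environment.

```emacs-lisp
;; Print the script Emacs assigns to each curly quote and the number of
;; columns it is expected to occupy.
(dolist (ch '(#x2018 #x2019 #x201c #x201d))
  (message "U+%04X %c  script: %S  columns: %d"
           ch ch
           (aref char-script-table ch)
           (char-width ch)))
```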

My fallback font configurations can be found on both [GitHub](https://github.com/shimmy1996/.emacs.d#fontset-with-cjk-and-unicode-fallback) and [Trantor Holocron](https://git.shimmy1996.com/emacs.d/file/README.org.html#l158)
and I'll list them here for the sake of completeness:

```emacs-lisp
(defvar user/cjk-font "Noto Sans CJK SC"
  "Default font for CJK characters.")

(defvar user/latin-font "Iosevka Term"
  "Default font for Latin characters.")

(defvar user/unicode-font "Symbola"
  "Default font for Unicode characters, including emojis.")

(defvar user/font-size 17
  "Default font size in px.")

(defun user/set-font ()
  "Set Unicode, Latin and CJK font for user/standard-fontset."
  ;; Unicode font.
  (set-fontset-font user/standard-fontset 'unicode
                    (font-spec :family user/unicode-font)
                    nil 'prepend)
  ;; Latin font.
  ;; Only specify size here to allow text-scale-adjust to work on other fonts.
  (set-fontset-font user/standard-fontset 'latin
                    (font-spec :family user/latin-font :size user/font-size)
                    nil 'prepend)
  ;; CJK font.
  (dolist (charset '(kana han cjk-misc hangul kanbun bopomofo))
    (set-fontset-font user/standard-fontset charset
                      (font-spec :family user/cjk-font)
                      nil 'prepend))
  ;; Special settings for certain CJK punctuation marks.
  ;; These are full-width characters but by default use half-width glyphs.
  (dolist (charset '((#x2018 . #x2019)    ;; Curly single quotes "‘’"
                     (#x201c . #x201d)))  ;; Curly double quotes "“”"
    (set-fontset-font user/standard-fontset charset
                      (font-spec :family user/cjk-font)
                      nil 'prepend)))

;; Apply changes.
(user/set-font)
;; For emacsclient.
(add-hook 'before-make-frame-hook #'user/set-font)
```
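One way to sanity-check a configuration like this is to ask the fontset which font pattern it holds for a few sample characters. This is a minimal sketch: `user/describe-fontset-fonts` is a hypothetical helper, the sample characters are arbitrary picks, and the call is guarded so it does nothing if `user/standard-fontset` has not been defined.

```emacs-lisp
;; Hypothetical helper for checking the result; not in the original config.
(defun user/describe-fontset-fonts (fontset chars)
  "Echo the font pattern FONTSET holds for each character in CHARS."
  (dolist (ch chars)
    (message "%c (U+%04X) -> %S" ch ch (fontset-font fontset ch))))

;; Assumes `user/standard-fontset' has been defined beforehand.
(when (boundp 'user/standard-fontset)
  (user/describe-fontset-fonts user/standard-fontset
                               '(?a ?云 ?あ ?가 #x201c #x1f600)))
```

Note that `fontset-font` reports what is recorded in the fontset, not necessarily the font that ends up being opened. For the latter, placing point on a character and running `describe-char` (<kbd>C-u C-x =</kbd>) shows the font actually used for display.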


## CJK Font Scaling {#cjk-font-scaling}

My other gripe is that the width of CJK fonts does not always match up with that of
the monospace font. Theoretically, full-width CJK characters should be exactly twice
as wide as half-width characters, but this is not the case, at least not at all
font sizes. It seems that CJK fonts provide less granularity in size, i.e. the 16px
and 17px versions of CJK characters in Noto Sans CJK SC are exactly the same,
and their size does not increase until it is bumped up to 18px, while Latin characters
always display the expected size increase. This discrepancy means the widths
match every couple of sizes, but differ in between, with CJK fonts being a
bit too small.

One solution is to specify a slightly larger default size for CJK fonts in the
fontset. However, this method would render `text-scale-adjust` (normally bound
to <kbd>C-x C-=</kbd> and <kbd>C-x C--</kbd>) ineffective against CJK fonts for some reason. A
better way that preserves this functionality is to scale the CJK fonts up by
customizing `face-font-rescale-alist`:

```emacs-lisp
(defvar user/cjk-font "Noto Sans CJK SC"
  "Default font for CJK characters.")

(defvar user/font-size 17
  "Default font size in px.")

(defvar user/cjk-font-scale
  '((16 . 1.0)
    (17 . 1.1)
    (18 . 1.0))
  "Scaling factor to use for cjk font of given size.")

;; Specify scaling factor for CJK font.
(setq face-font-rescale-alist
      (list (cons user/cjk-font
                  (cdr (assoc user/font-size user/cjk-font-scale)))))
```
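One small gap in the lookup above is that a size missing from the table leaves `nil` in place of a scaling factor. A possible variant, sketched here with the same variable names and values repeated so the snippet stands alone, uses `alist-get` with an explicit default of 1.0 (the default value is an assumption, chosen to mean "no scaling"):

```emacs-lisp
;; Same names and values as in the post, repeated so this stands alone.
(defvar user/cjk-font "Noto Sans CJK SC")
(defvar user/font-size 17)
(defvar user/cjk-font-scale '((16 . 1.0) (17 . 1.1) (18 . 1.0)))

;; `alist-get' (Emacs 25.1+) accepts a default, so an unlisted size such
;; as 20 yields 1.0 instead of nil.
(setq face-font-rescale-alist
      (list (cons user/cjk-font
                  (alist-get user/font-size user/cjk-font-scale 1.0))))
```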

While the font sizes might still go out of sync after `text-scale-adjust`, I am
not too bothered. The exact scaling factor took some trial and error to find.
I just kept adjusting the factor until these lines lined up (I found [this table](https://websemantics.uk/articles/font-size-conversion/)
really useful):

```nil
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云云
雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲雲
ㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞㄞ
ああああああああああああああああああああああああああああああああああああああああ
가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가가
```
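Rather than pasting these rows by hand after every tweak, a test buffer like this can be generated on demand. The command below is a hypothetical convenience, not part of the original configuration. It reuses the same sample characters and relies on 80 half-width characters spanning the same width as 40 full-width ones when the fonts are in sync.

```emacs-lisp
;; Hypothetical helper; the sample characters are the ones used above.
(defun user/font-alignment-test ()
  "Pop up a buffer with rows of half-width and full-width characters.
If the fonts are in sync, all rows end at the same position on screen."
  (interactive)
  (with-current-buffer (get-buffer-create "*font-alignment*")
    (erase-buffer)
    ;; 80 half-width characters should be as wide as 40 full-width ones.
    (insert (make-string 80 ?a) "\n")
    (dolist (ch '(?云 ?雲 ?ㄞ ?あ ?가))
      (insert (make-string 40 ch) "\n"))
    (pop-to-buffer (current-buffer))))
```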

Unfortunately, the CJK font I used has narrower Hangul than other full-width CJK
characters, so this is still not perfect (the solution would be to specify a
Hangul-specific font and scaling factor), but it is good enough for me.
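For reference, a Hangul-specific font and scaling factor might be wired up roughly as follows. This is an untested sketch under loud assumptions: the family "Noto Sans CJK KR" and the factor 1.2 are placeholders that would need the same trial and error, and the `user/hangul-*` names are invented for illustration.

```emacs-lisp
;; Assumptions: "Noto Sans CJK KR" is installed, and 1.2 is a placeholder
;; factor, not a measured value.
(defvar user/hangul-font "Noto Sans CJK KR"
  "Hypothetical font used only for Hangul.")

(defvar user/hangul-font-scale 1.2
  "Hypothetical scaling factor for `user/hangul-font'.")

;; Route Hangul to its own font; fall back to the default fontset (t)
;; when `user/standard-fontset' is not defined.
(set-fontset-font (if (boundp 'user/standard-fontset)
                      user/standard-fontset
                    t)
                  'hangul
                  (font-spec :family user/hangul-font)
                  nil 'prepend)

;; Scale that font independently of the other CJK fonts.
(add-to-list 'face-font-rescale-alist
             (cons user/hangul-font user/hangul-font-scale))
```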

It took me quite some effort to fix what may seem like a minor annoyance, but at
least Emacs did offer the appropriate tools. By the way, I certainly wish I had
found [this article](https://www.emacswiki.org/emacs/FontSets) on Emacs Wiki sooner, as it also provides a neat write-up of
similar workarounds.