4 GNU Unifont is an official GNU package. It is a dual-width
5 (8x16/16x16) bitmap font, designed to provide coverage for
6 all of Unicode Plane 0, the Basic Multilingual Plane (BMP).
7 This version has a glyph for each visible code point in the
8 Unicode 6.3 Basic Multilingual Plane (Plane 0).
10 Unifont only provides a single glyph for each character, making it
11 impossible to properly handle any language that needs context-dependent
12 character shaping. It is supplied in the form of a hex file, with
13 a converter to convert it to BDF. See http://czyborra.com/unifont/
14 or http://unifoundry.com/unifont.html for more information. The
15 BDF font is converted to PCF, and the hex file is converted to a
18 This is the unifoundry.com collection of utilities for GNU Unifont,
19 assembled by Paul Hardy with the encouragement of the font's creator,
20 Roman Czyborra. This archive contains the following directories
23 ChangeLog Log of changes made to each GNU release
24 COPYING Full text of GPL version 2
25 doc Documentation in Texinfo format
26 font Everything you need to build the font from scratch
27 hangul Standalone font sources to build hangul-syllables.hex
28 INSTALL Instructions for font and software installation
29 Makefile The "make" file
31 NEWS Summary of what's new with each GNU release
33 src Source programs, in Perl and C
35 The "font/precompiled" directory contains prebuilt font-related files:
37 coverage.txt Percentage coverage of each Plane 0 script
39 unifont-<version>.hex Hex string source of glyphs to build Unifont
40 unifont-<version>.bdf.gz BDF version of Unifont
41 unifont-<version>.pcf.gz PCF version of Unifont
42 unifont-<version>.ttf TrueType version of Unifont
44 unifont_sample-<version>.hex Hex string source of all Plane 0 glyphs,
45 including nonprinting and PUA glyphs, with
47 unifont_sample-<version>.bdf.gz BDF font version of the above .hex file
48 unifont_sample-<version>.ttf SBIT font version of the above .hex file
50 unifont-<version>.bmp The entire Plane 0 font with combining circles,
51 actually built from unifont_sample-*.hex to
52 show combining circles
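
As a rough illustration of what the ".hex" files above hold, the sketch
below decodes one record into rows of pixels. It assumes the commonly
described layout of one "CODEPOINT:BITMAP" record per line, with 32 hex
digits for an 8x16 glyph and 64 hex digits for a 16x16 glyph; the function
name and the sample record are made up for this example and are not part
of the package.

```python
def parse_hex_line(line):
    """Split one 'CODEPOINT:BITMAP' record into (code point, list of row strings)."""
    code_str, bitmap = line.strip().split(":")
    code_point = int(code_str, 16)
    # Assumption: every glyph is 16 rows tall, so 32 hex digits give an
    # 8-pixel-wide glyph and 64 hex digits give a 16-pixel-wide glyph.
    width = len(bitmap) * 4 // 16
    digits_per_row = width // 4
    rows = []
    for i in range(0, len(bitmap), digits_per_row):
        value = int(bitmap[i:i + digits_per_row], 16)
        bits = format(value, "0{}b".format(width))
        rows.append(bits.replace("0", ".").replace("1", "#"))
    return code_point, rows


# Illustrative record for U+0041, written for this example.
sample = "0041:0000000018242442427E424242420000"

code_point, rows = parse_hex_line(sample)
print("U+{:04X}".format(code_point))
for row in rows:
    print(row)
```

Each "#" is a set pixel and each "." is a clear pixel, so the output is a
quick way to eyeball a glyph without a graphics editor.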
54 This release incorporates all glyph errata issued by The Unicode Consortium
55 from Unicode 1.0 errata to the latest.
60 See the "INSTALL" file in this directory for building instructions.
65 Roman Czyborra wrote all the Perl files in the src directory except
66 "hex2sfd", "unifontchojung", "unifontksx", "unihex2png", and "unipng2hex".
67 In the case of "johab2ucs2", Jungshik Shin wrote the original version;
68 he then gave it to Roman. Paul Hardy made further changes to "johab2ucs2".
70 Roman originally named the "src/hexbraille" script as simply "braille".
71 Paul Hardy thought there was too great a chance of a name conflict with
72 other utilities, and so renamed it.
74 Luis Alejandro Gonzalez Miranda wrote the original "hex2sfd" Perl
75 script, as well as a "howto-build.sh" shell script that Paul Hardy
76 converted into "./font/ttfsrc/Makefile".
78 Paul Hardy wrote "unifontchojung" and "unifontksx" for extracting subsets
79 of Hangul glyphs, as an aid in creating a new Hangul Syllables block.
81 Andrew Miller wrote "unihex2png" and "unipng2hex" based upon Paul
82 Hardy's "unihex2bmp" and "unibmp2hex" programs.
84 Paul Hardy wrote all the C programs.
89 Roman Czyborra created the original GNU Unifont, including the
90 .hex format. For greater detail, see the HISTORY section below.
92 David Starner aggregated many glyphs contributed by others and
93 built these into pre-2004 Unifont releases.
95 Qianqian Fang began his Wen Quan Yi font in 2004, by which
96 time work on Unifont had stopped. Most of the almost 30,000
97 CJK ideographs in Unifont versions 5.1 and later were taken
98 from Wen Quan Yi with permission of Qianqian Fang. The glyphs
99 in "./font/hexsrc/wqy-cjk.hex" are for the most part Qianqian
100 Fang's Unibit and Wen Quan Yi glyphs.
102 Paul Hardy drew most of the newly-drawn glyphs added to the BMP
103 from the Unifont 5.1 release to the present release. This includes
104 the 11,172 glyphs in the Hangul Syllables block, plus approximately
105 10,000 additional glyphs scattered throughout the BMP.
107 Andrew Miller drew the glyphs added to Unicode 6.3.0.
112 The source code for everything except the compiled fonts in this current
113 release is licensed as follows:
115 License for this current distribution of program source
116 files (i.e., everything except the fonts) is released under
117 the terms of the GNU General Public License version 2,
118 or (at your option) a later version.
120 GPL version 2 is contained in the "COPYING" file in the main source
121 directory for this package. If you received this source without
122 a copy of GPL version 2, you can download a copy from GNU's website
123 at http://www.gnu.org/licenses/gpl-2.0.html.
125 The license for the compiled fonts is covered by the above GPL terms
126 with the GNU font embedding exception, as follows:
128 As a special exception, if you create a document which uses this font,
129 and embed this font or unaltered portions of this font into the document,
130 this font does not by itself cause the resulting document to be covered
131 by the GNU General Public License. This exception does not however
132 invalidate any other reasons why the document might be covered by the
133 GNU General Public License. If you modify this font, you may extend
134 this exception to your version of the font, but you are not obligated
135 to do so. If you do not wish to do so, delete this exception statement
138 See "http://www.gnu.org/licenses/gpl-faq.html#FontException" for more details.
141 CHANGES IN VERSION 6.3
142 ----------------------
143 Version 6.3 reflects all glyph changes and errata published in Unicode
144 6.3.0. In preparation for releasing this version, Paul Hardy obtained
145 a hard copy of the errata published in Unicode Version 1.1, not yet
146 available on Unicode's website. All previously published errata have
147 been incorporated. This is a complete replacement for all previous
150 The following code points in previously published errata were examined
151 and found to be correct:
153 Unicode 1.1: U+717F, U+773E, U+809C, U+8480, U+908E
155 Andrew Miller drew the 5 new additions to the Unicode 6.3.0 Basic
156 Multilingual Plane in the initial Unifont 6.3 release.
158 The latest Unifont 6.3 release includes these glyph changes by Paul Hardy:
160 - Armenian -- several glyphs were redrawn based upon feedback from
161 native speakers (U+0530..U+058F).
163 - CJK Radicals Supplement -- several glyphs were redrawn to better match
164 their representations in The Unicode Standard code charts:
165 U+2E9F, U+2EA9, U+2EAC, U+2EAE, U+2EC0, U+2EDE, U+2EE7, and U+2EED.
167 - Capricorn sign (U+2651) -- this was redrawn to an alternate form that
168 better fit in an 8 by 16 pixel grid.
170 - Dashes -- changed to distinguish better between different dash types
171 (a difference of two horizontal pixels is the minimum needed to tell
172 two glyphs apart easily):
173 * Hyphen (U+002D) and Soft Hyphen (U+00AD) are now 4 pixels wide
174 * En Dash (U+2012) is now 6 pixels wide
175 * Em Dash (U+2013) is now 8 pixels wide
178 * Centered text for C1 Controls U+0089 ("HTJ"), U+0095 ("MW"),
180 * Copied glyphs from U+0000..U+001F to U+2400..U+241F and erased
181 surrounding borders; earlier, some glyphs in U+0000..U+001F had
182 their text re-centered so this carries that change forward
184 - Arrows -- General Re-alignment
185 * Aligned most single vertical arrow strokes with the 5th column,
186 counting from the left, to align with the center of the "w" glyph
188 * Aligned most single horizontal arrow strokes with the 7th row,
189 counting from the bottom, to align with the horizontal stroke in
190 the "e" glyph (U+0065); this follows the convention of Donald Knuth's
191 fonts in TeX, as illustrated in The TeXbook
192 * Modified the following ranges per the above two re-alignments:
193 o U+2190..U+21FF Arrows
194 o U+27F0..U+27FF Supplemental Arrows -- A
195 o U+2900..U+297F Supplemental Arrows -- B
196 o U+2B00..U+2BFF Miscellaneous Symbols and Arrows
198 - Modified the following additional Miscellaneous Technical glyphs
199 * Scan lines for old 9-line character terminals:
200 o U+23BA Line 1, horizontal line across row 1 (counting from the top)
201 o U+23BB Line 3, horizontal line across row 5 (counting from the top)
202 o U+23BC Line 7, horizontal line across row 12 (counting from the top)
203 o U+23BD Line 9, horizontal line across row 16 (counting from the top)
204 * U+23CE Return Symbol: shortened to match Latin capital height
205 * U+23AF Horizontal Line Extension: aligned on 7th row, counting
207 * U+23D0 Vertical Line Extension: aligned on 5th column, counting
209 * U+23DA Ground Symbol: aligned with Vertical Line Extension (U+23D0)
210 * U+23DB Fuse Symbol: aligned with Horizontal Line Extension (U+23AF)
211 * U+23EC Black Down-pointing Double Triangle: moved down one row to
212 match Latin capital height
214 - Swapped U+FE17 and U+FE18, which had been reversed
216 - hangul/ directory -- updated "hangul-generation.html" to match the
217 latest version at http://unifoundry.com/hangul/hangul-generation.html
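
The scan-line entries in the list above (U+23BA..U+23BD) are simple enough
to sketch in code: each glyph is one horizontal line on a given row,
counting from the top. The sketch below builds such bitmaps as hex strings.
It assumes 8-pixel-wide, 16-row glyphs stored two hex digits per row; the
helper name is hypothetical, and the real glyphs were drawn by hand rather
than generated this way.

```python
def horizontal_line_glyph(row_from_top, width=8, height=16):
    """Return a hex bitmap with one full-width line on the given row (1 = top)."""
    if not 1 <= row_from_top <= height:
        raise ValueError("row out of range")
    digits_per_row = width // 4          # 4 pixels per hex digit
    blank = "0" * digits_per_row
    full = "F" * digits_per_row
    rows = [full if r == row_from_top else blank for r in range(1, height + 1)]
    return "".join(rows)


# Row numbers taken from the list above; the 8-pixel width is an assumption.
SCAN_LINE_ROWS = {0x23BA: 1, 0x23BB: 5, 0x23BC: 12, 0x23BD: 16}

for cp, row in sorted(SCAN_LINE_ROWS.items()):
    print("{:04X}:{}".format(cp, horizontal_line_glyph(row)))
```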
219 Five new utility programs have also been added:
221 - unifontpic - creates a bitmapped graphics (.bmp) file of the entire
222 Basic Multilingual Plane (Plane 0), by default in a 256-by-256
223 glyph grid for ease of printing, and optionally in a 16-by-4096 glyph
224 grid for easier scrolling on a screen, for software that can handle
225 a .bmp file with over 64k pixel rows (not all software can). The
226 256-by-256 glyph grid can be scaled to print on a piece of paper
227 approximately 3 feet by 3 feet (or one meter by one meter). Written
230 - unigencircles - adds dashed combining circles to unifont.hex glyphs
231 for code points that are in "font/ttfsrc/combining.txt" but not in
232 "font/hexsrc/nonprinting.hex". Written by Paul Hardy.
234 - unigenwidth - creates an implementation of the POSIX functions
235 wcwidth() and wcswidth() as specified in IEEE 1003.1-2008, Vol. 2:
236 System Interfaces, Issue 7, pages 2251 and 2241, respectively.
237 Plane 0 widths are determined by reading the current Unifont glyphs.
238 All higher planes, 0x01 through 0x10, are calculated without regard
239 to Unifont glyphs. This can be modified in the future if Unifont
240 glyphs extend beyond Plane 0. Written by Paul Hardy.
242 - unihex2png - converts a unifont.hex-format file into a Portable
243 Network Graphics (PNG) file for editing with a wider range of graphics
244 editors than the original unihex2bmp allowed. Written by Andrew
245 Miller, based upon the unihex2bmp source code. Introduced in
246 Version 6.3.20131215.
248 - unipng2hex - converts a PNG graphics file created by unihex2png
249 back into a unifont.hex-format file. Written by Andrew Miller,
250 based upon the unibmp2hex source code. Introduced in Version
253 The last two program additions, unihex2png and unipng2hex, also support
254 glyph heights of 24 and 32 pixels in addition to Unifont's original
255 height of 16 pixels. hex2bdf and hexdraw have also been modified to
256 support these alternate glyph heights. This capability has not been
257 tested extensively, and for now is considered experimental.
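
To make the unigenwidth idea concrete, here is a minimal sketch of deriving
column widths from glyph data. It is not the unigenwidth source: it assumes
16-row glyphs in "CODEPOINT:BITMAP" records, treats 8-pixel glyphs as one
column and 16-pixel glyphs as two, and ignores the special handling that
real wcwidth() implementations give to combining and control characters.
All names and sample records are hypothetical.

```python
def glyph_columns(bitmap_hex):
    """Return 1 for an 8-pixel-wide glyph, 2 for a 16-pixel-wide glyph."""
    pixels_wide = len(bitmap_hex) * 4 // 16   # assumes 16 rows per glyph
    if pixels_wide not in (8, 16):
        raise ValueError("unexpected glyph width: {}".format(pixels_wide))
    return pixels_wide // 8


def build_width_table(hex_lines):
    """Map each code point found in the records to its column width."""
    table = {}
    for line in hex_lines:
        line = line.strip()
        if not line:
            continue
        code, bitmap = line.split(":")
        table[int(code, 16)] = glyph_columns(bitmap)
    return table


def simple_wcwidth(table, code_point):
    """Look up a width; -1 for code points without a glyph (sketch-only rule)."""
    return table.get(code_point, -1)


# Illustrative records written for this example.
sample_lines = [
    "0041:0000000018242442427E424242420000",
    "4E00:" + "0000" * 7 + "7FFE" + "0000" * 8,
]
table = build_width_table(sample_lines)
print(simple_wcwidth(table, 0x0041), simple_wcwidth(table, 0x4E00))
```

A wcswidth()-style function would then sum these values over a string and
return -1 as soon as any character reports -1.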
260 CHANGES IN VERSION 6.2
261 ----------------------
262 After release of version 5.1 of Unifont, it was learned that the
263 replacement glyphs used in Hangul Syllables, although free to use,
264 could never be licensed under any version of GPL. For that reason,
265 Paul Hardy created a set of Hangul Syllables from scratch with the
266 oversight of some native Koreans. This was done using the files that
267 appear in the "hangul/" directory. For a detailed discussion of the
270 http://unifoundry.com/hangul/hangul-generation.html
272 The new font was released as Unifont 6.2, with representation of
273 all glyphs in the Unicode 6.2 BMP. As a result of replacing the
274 Hangul Syllables block, this was the first release that provided
275 GPLv2+ coverage (with a font embedding exception) for the entire
278 The Unicode Consortium released Unicode Version 6.2.0 on 26 September 2012.
280 This version of Unifont includes all additions to the BMP since Unicode
281 Version 5.1, and adds 1,328 more glyphs to the Basic Multilingual Plane.
283 It also incorporates all errata that the Unicode Consortium published
284 that apply to the BMP from Unicode 3.0 errata through Unicode 6.1 errata
285 (listed with the Unicode 6.2.0 release). Only one erratum was left
286 unmodified: the Ogham Space glyph, U+1680, which was left as a line stroke
287 because of the rendering limitations of the bitmapped Unifont. The errata
288 for the following glyphs were examined and if necessary corrected:
290 Unicode 3.1: U+066B, U+224C, U+1780..U+17E9
292 Unicode 4.0: U+06DD, U+0B66
293 Unicode 4.1: U+01B3, U+031A
294 Unicode 5.0: U+0485, U+0486, U+06E1
295 Unicode 5.1: U+047C, U+047D, U+075E, U+075F,
296 U+1031, U+1E9A, U+1460, U+147E,
298 Unicode 5.2: U+04A8, U+04A9, U+04BE, U+04BF,
299 U+135F, U+19D1, U+19D2, U+19D4
304 Note that some glyphs were assigned in earlier versions of Unicode and
305 later withdrawn, but their glyphs still appear in the code charts.
306 Therefore, they have been left in place. The Unicode Consortium now
307 holds the position that once a glyph is assigned, it is not replaced.
309 Andrew Miller noted that one glyph (U+2047) was incorrect and the glyph
310 CYRILLIC CAPITAL LETTER A (U+0410) did not match LATIN CAPITAL LETTER A
311 (U+0041). He submitted corrections and they have been incorporated.
313 The biggest change was a totally redrawn set of Hangul Syllables,
314 U+AC00..U+D7A3, comprising 11,172 glyphs in all. This allowed the
315 entire font to be licensed under the GNU GPL.
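
The figure of 11,172 glyphs follows from how the Hangul Syllables block is
laid out: 19 initial consonants times 21 vowels times 28 final-consonant
slots (including "none"). The sketch below uses that arithmetic, as given
in The Unicode Standard, to split a syllable code point into its three
indexes. It only illustrates the block structure; it is not how the
"hangul/" sources or the unifontchojung utility are implemented.

```python
S_BASE = 0xAC00                            # first code point of the block
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28     # initial, vowel, final counts
S_COUNT = L_COUNT * V_COUNT * T_COUNT      # 11,172 syllables


def decompose(code_point):
    """Return (initial, vowel, final) indexes for a Hangul syllable."""
    index = code_point - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a Hangul syllable: U+{:04X}".format(code_point))
    initial = index // (V_COUNT * T_COUNT)
    vowel = (index % (V_COUNT * T_COUNT)) // T_COUNT
    final = index % T_COUNT                # 0 means no final consonant
    return initial, vowel, final


print(S_COUNT)                             # size of the block
print("U+{:04X}".format(S_BASE + S_COUNT - 1))
print(decompose(0xD55C))
```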
317 Unicode 6.2 (and hence Unifont) now only has 2,330 unassigned code points
318 in the BMP for possible future assignments, and the rate at which new
319 code points are being assigned in the BMP is decreasing greatly.
321 The unihex2bmp program has reversed the meaning of its "-f" (flip,
322 or transpose) flag compared to Unifont Version 5.1 unihex2bmp.
323 Now the default behavior is to produce 16x16 glyph charts with
324 the same arrangement as The Unicode Standard.
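
For readers unsure what the two arrangements look like, this sketch
computes where a code point lands in a 16x16 chart of one 256-glyph page.
In the code charts of The Unicode Standard, the second-lowest hex digit
selects the column and the lowest hex digit selects the row; the transposed
layout swaps the two. The function is hypothetical and is not taken from
unihex2bmp.

```python
def chart_cell(code_point, transpose=False):
    """Return (row, column) of a code point within its 256-glyph page chart."""
    high = (code_point >> 4) & 0xF    # second-lowest hex digit
    low = code_point & 0xF            # lowest hex digit
    # Code-chart arrangement: column = high digit, row = low digit.
    return (high, low) if transpose else (low, high)


print(chart_cell(0x0041))                   # code-chart arrangement
print(chart_cell(0x0041, transpose=True))   # transposed arrangement
```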
326 The unibmp2hex program now hard-codes several scripts and code points
327 to be double-width. This was necessary after removing the combining
328 circles from many glyphs that only occupied the left-hand side of the
329 16x16 grid, but combine with double-width glyphs from the rest of a
332 The "blanks.hex" file has been renamed to "unassigned.hex" as a more
333 accurate description of its contents. The "substitutes.hex" file has
334 been renamed to "spaces.hex", as all it contained were single- and
335 double-width space glyphs (strings of 0s).
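
Since a space glyph is just a string of 0s, spotting one is a short check.
The sketch below assumes 32 hex digits for a single-width glyph and 64 for
a double-width glyph (16 rows each); the function name and sample records
are made up for illustration.

```python
def is_space_glyph(bitmap_hex):
    """True if the bitmap is a single- or double-width glyph with no set pixels."""
    return len(bitmap_hex) in (32, 64) and set(bitmap_hex) == {"0"}


# Illustrative records: U+0020 as single-width, U+3000 as double-width.
records = [
    "0020:" + "0" * 32,
    "3000:" + "0" * 64,
    "0041:0000000018242442427E424242420000",
]

for record in records:
    code, bitmap = record.split(":")
    print(code, is_space_glyph(bitmap))
```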
338 Roman Czyborra and Paul Hardy wanted to license this entire collection
339 under GPL to simplify its adoption by the GNU Project. In the end, there
340 was just one catch: the Hangul Syllables block that appeared in Unifont 5.1,
341 although licensed for free use, could not be licensed under the GPL.
343 There was no suitable alternative that was covered under the GPL, so Paul
344 Hardy created a new block of Hangul Syllables. This took a few years of
345 spare time to complete. Native Koreans reviewed and critiqued the glyphs.
346 If anyone who is Korean would like to improve this block (U+AC00..U+D7A3),
347 please feel free to do so and submit the changes so they can be incorporated.
349 The font has also gone through a couple of simplifications since the
350 release of version 5.1:
352 - There is only one source file for CJK ideographs now, "wqy.hex",
353 acknowledging that most of these glyphs were taken from the Wen
354 Quan Yi distribution.
356 - There are no more combining circles; these were all removed.
358 As a result, there is now just one variation of output font rather than
359 four. That one is used to generate the TrueType "unifont.ttf" font.
361 The directory "font/hexsrc" contains the .hex input files for building
362 Unifont, and contains these files:
364 hangul-syllables.hex Unicode Hangul Syllables, U+AC00..U+D7A3
365 nonprinting.hex Format and other assigned but invisible glyphs
366 pua.hex Private Use Area glyphs
367 README The README file
368 spaces.hex Code points that are space glyphs
369 unassigned.hex Unassigned code points in the BMP
370 unifont-base.hex Source file with almost all BMP scripts
371 wqy.hex Source file with Wen Quan Yi CJK ideographs
373 The file previously named "blanks.hex" is now named "unassigned.hex".
374 These "blank" glyphs are no longer included in the compiled font.
375 Although the Unicode Standard specifically allows a visual rendering
376 of unassigned code points, doing so would prevent a display engine
377 from finding a glyph in another font. In fact, the original "blanks.hex"
378 pattern was modeled after the proposed representation of unassigned
379 code points depicted in The Unicode Standard, Version 5.0, Section 5.3,
380 Unknown and Missing Characters (p. 155).
382 Incorporating "blanks.hex" (now "unassigned.hex") was invaluable in
383 spotting assigned code points with glyphs that had not yet been drawn.
384 However, now there is complete coverage of the entire BMP, with only
385 about 2,300 BMP code points remaining out of 65,536 that could potentially
386 be given assignments in the future, so the great bulk of work on the
389 The "pua.hex" file contains a four-digit hexadecimal representation of
390 each code point, rendered as white on black. The new program "hexgen.c"
391 generated these glyphs. A four-digit hexadecimal code point is suggested
392 as one possible rendering of PUA glyphs in The Unicode Standard, Version
393 5.0, Section 5.3. Another possible rendering suggested in that same
394 section is a pencil glyph. A pencil glyph was used originally in Unifont
397 The glyphs in "pua.hex" are not compiled into the final font. To do
398 so, modify font/Makefile by adding "pua.hex" to the list of hex source
399 files. Alternatively, someone could use their own pua.hex file for
400 various Private Use Area assignments.
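
Conceptually, adding "pua.hex" to the list of hex source files means its
records are combined with the others before the font is built. The sketch
below shows one way such a merge could work, with records sorted by code
point. It is not the logic of font/Makefile, which is not reproduced in
this file; the rule that later sources override earlier ones is an
assumption made for this example, as are the names and sample records.

```python
def merge_hex_sources(*sources):
    """Merge iterables of 'CODE:BITMAP' lines, sorted by code point.

    Assumption for this sketch: if two sources define the same code
    point, the one listed later wins.
    """
    glyphs = {}
    for lines in sources:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            code, bitmap = line.split(":")
            glyphs[int(code, 16)] = bitmap.upper()
    return ["{:04X}:{}".format(cp, glyphs[cp]) for cp in sorted(glyphs)]


# Illustrative records; a custom PUA file would be passed the same way.
base = ["0041:0000000018242442427E424242420000", "0020:" + "0" * 32]
pua = ["E000:" + "F" * 64]

merged = merge_hex_sources(base, pua)
for record in merged:
    print(record[:4])
```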
405 Paul Hardy's first release of Unifont and associated graphics utilities
406 was Version 5.1. This corresponded to Unicode Version 5.1 (the current
407 version at the time), with a glyph for every visible character in the
408 Unicode 5.1 Basic Multilingual Plane.
410 For the Unifont 5.1 release, Paul Hardy replaced the 11,172
411 thick-stroke Hangul Syllables glyphs with thin-stroke glyphs
412 (a desire expressed by Roman Czyborra for years), merged Qianqian
413 Fang's unibit and Wen Quan Yi glyphs into GNU Unifont (with lots
414 of help and enthusiasm from Qianqian Fang), drew about 8,500 more
415 glyphs to provide complete coverage of the BMP, and replaced the
416 existing Tibetan glyphs with new ones contributed by Rich Felker.
418 There was a bug in the johab2ucs2 Perl Script that formed one range
419 of Hangul Syllables incorrectly in previous releases. Paul Hardy
420 noticed and fixed the bug for the Unifont 5.1 release. All previous
421 releases of Unifont have an incorrectly formed Hangul Syllables block.
423 Earlier releases also had an incorrectly formed Braille glyph block.
424 There was a bug in the Perl script that drew the Braille glyphs in
425 earlier releases. Roman Czyborra made a fix to that Perl script,
426 named "braille", at his website (http://czyborra.com). The revised
427 script ("hexbraille") was included in the Unifont 5.1 release, and
428 used to generate the Unifont 5.1 Braille glyphs.
433 Roman Czyborra <roman@czybora.com> began GNU Unifont in 1998 as a low
434 quality font to provide a glyph for every Unicode character in the
435 Basic Multilingual Plane. He realized that no one font at the time
436 had complete coverage of the Unicode BMP. http://czyborra.com still
437 has several cool tools for Unifont not included here.
439 Since Roman Czyborra was unable to maintain Unifont for a while,
440 and many patches existed on gnu-unifont@groups.yahoo.com
441 (http://groups.yahoo.com/group/gnu-unifont), David Starner
442 <dstarner98@aasaa.ofe.org> decided to make a new release extending
443 Unifont with many characters in 1999. That was the foundation of earlier
444 GNU Unifont compilations from 1999 to 2004.
446 By 2004, work on Unifont had stopped. Qianqian Fang wanted to create
447 a high-quality Chinese Unicode font in 2004. He began by copying the
448 GNU Unifont glyphs. He replaced its Latin glyphs with those of another
449 X11 font. He replaced the existing main CJK ideographs with a higher
450 quality font that the People's Republic of China had placed in the public
451 domain. Qianqian named this new font "unibit", and released it under
452 the terms of the GNU General Public License (GPL) version 2, with the
453 exception that embedding his font in a document did not by itself bind
454 that document to the terms of the GNU GPL.
456 See http://wqy.sourceforge.net/cgi-bin/enindex.cgi (English) or
457 http://wenq.org (Chinese) for more information on Wen Quan Yi.
459 In late 2007, Paul Hardy became interested in adding to GNU Unifont.
460 He wrote a couple of programs to convert GNU Unifont .hex files to and
461 from bitmap images for easy editing with any graphics software. He began
462 by combining the latest glyphs available for GNU Unifont. This starting
463 point was posted at http://czyborra.com as the 2007-12-31 version of
464 unifont.hex. Shortly after that, Roman Czyborra's website went down.
465 Paul Hardy then started posting complete copies of GNU Unifont on his
466 website, at "http://unifoundry.com/unifont.html".
468 Roman Czyborra encouraged Paul Hardy to continue this work on GNU Unifont.
470 In early 2008, Paul Hardy learned of Qianqian Fang's work. Qianqian
471 encouraged a combining of effort, and Paul Hardy at that point created
472 two versions of GNU Unifont: one with the original Chinese ideographs
473 (which Roman Czyborra copied from a Japanese font in the public domain),
474 and one with Qianqian Fang's Wen Quan Yi (Spring of Letters) ideographs.
475 The Wen Quan Yi font provides far more coverage of CJK ideographs than
476 the original Japanese font did, and is of higher quality.
478 Paul Hardy created a version of each font (one with the original CJK
479 ideographs from Japan, one with CJK ideographs from Wen Quan Yi) that
480 contained combining circles. He then wrote a post-processing program
481 to remove the combining circles from the final font.
483 In 2005, Luis Alejandro Gonzalez Miranda (http://www.lgm.cl) created
484 a set of Fontforge scripts and Perl programs to build a TrueType font
485 from unifont.hex. Paul Hardy modified Luis' software in 2008 to cover
486 the full Unicode 5.1 Basic Multilingual Plane range. Luis gave Paul
487 Hardy permission to release this modified version under the terms of
488 "the GNU General Public License, version 2 or (at your option) a later
491 On 4 July 2008, Paul Hardy was looking through all of Roman Czyborra's
492 Perl scripts. One of these, "braille", contained a comment from 2003
493 that the original GNU Unifont did not generate its Braille patterns
494 (U+2800..U+28FF) correctly. The modified script fixed that bug. Paul
495 Hardy incorporated the corrected Braille glyphs into the 6 July 2008
496 release of GNU Unifont.
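
For context on why the Braille block lends itself to generation by script:
in Unicode, each code point in U+2800..U+28FF encodes its raised dots as a
bit mask, with dot N stored in bit N-1 of the offset from U+2800. The
sketch below decodes a code point into a 4-row by 2-column dot grid using
the standard dot numbering (1-2-3 down the left, 4-5-6 down the right, 7
and 8 on the bottom row). It illustrates the encoding only and is not a
reconstruction of the "braille" or "hexbraille" scripts.

```python
# (row, column) of each Braille dot in a 4x2 cell, standard dot numbering.
DOT_POSITIONS = {
    1: (0, 0), 2: (1, 0), 3: (2, 0),
    4: (0, 1), 5: (1, 1), 6: (2, 1),
    7: (3, 0), 8: (3, 1),
}


def braille_cells(code_point):
    """Return a 4x2 grid of booleans marking the raised dots."""
    if not 0x2800 <= code_point <= 0x28FF:
        raise ValueError("not a Braille pattern: U+{:04X}".format(code_point))
    mask = code_point - 0x2800
    grid = [[False, False] for _ in range(4)]
    for dot, (row, col) in DOT_POSITIONS.items():
        if mask & (1 << (dot - 1)):      # dot N lives in bit N-1
            grid[row][col] = True
    return grid


def show(code_point):
    """Render the dot grid as text, 'o' for raised and '.' for flat."""
    return "\n".join(
        "".join("o" if on else "." for on in row)
        for row in braille_cells(code_point)
    )


print(show(0x2813))   # dots 1, 2 and 5
```

Scaling each dot to a small pixel block would turn this grid into an 8x16
bitmap, which is the kind of step a glyph-generating script has to get
right for all 256 patterns.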
498 All previous versions probably contain this Braille bug and should be
501 Other notable additions include:
503 - Incorporation of CJK glyphs from Qianqian Fang's fonts
505 - Incorporation of Rich Felker's Tibetan glyphs
507 - Replacement of the Hangul Syllables block with a thin stroke font
508 (Roman had mentioned wanting to do this someday on his website),
509 the current version being created from scratch by Paul Hardy
511 - Addition of circled pencil glyphs for the Private Use Area
512 (suggested as an acceptable rendering in the Unicode 5.0 Standard),
513 now replaced with optional four-digit hexadecimal code point glyphs;
514 though not built into the final font by default, they are available
515 in "font/hexsrc/pua.hex"
517 - Replacement of the Unifont 5.1 gray box glyphs for unassigned
518 code points with four-digit hexadecimal glyphs; these are built
519 into the final font by default
521 - Proper handling of combining characters in the TrueType version
523 - Proper handling of space glyphs in the TrueType version
525 The hex2bdf script in this release is Roman's original script, not the
526 modified version that produced two BDF files (one for 8 pixel wide glyphs
527 and another for 16 pixel wide glyphs). The TrueType font should be used
528 in preference to the BDF font, so this is probably a moot point.
530 For the Unifont 6.2 release, Qianqian Fang gave Paul Hardy permission
531 to release the subset of Wen Quan Yi glyphs included in Unifont under
532 GPLv2+, with a font embedding exception. With the newly-drawn Hangul
533 Syllables block, this allowed the entire font to be released under
534 GPLv2+ with a font embedding exception.
539 * Some CJK ideographs use an entire 16x16 pixel grid. This leaves
540 insufficient space between lines. However, changing to a non-square
541 grid would distort the block drawing glyphs. The best solution is
542 probably to use GNU Unifont for mostly non-CJK glyph rendering, and
543 to use Qianqian Fang's Wen Quan Yi fonts (http://wenq.org) for
544 predominately CJK glyph rendering. The Wen Quan Yi fonts use extra
545 leading (blank space) between lines.
547 * There are still some Control and Format glyphs in "unifont-base.hex";
548 these might be more appropriate for "nonprinting.hex".