Chapter 7 Characters and Fonts
Multimedia Systems

Key Points
• Character sets map abstract characters to bit-patterns.
• The widely used ASCII character set can only accommodate 95 printable characters.
• Eight-bit extensions to ASCII have been developed, culminating in the ISO 8859 standards.
• Unicode, which is identical to the 16-bit subset of ISO 10646, can, with the use of UTF-16 encoding, represent over a million characters, which is enough for all known languages.
• A font is a collection of glyphs, which are small images of character shapes.
• Fonts may be monospaced or proportionally spaced; serifed or sans serif; upright, italic or other shapes; they may vary in width from condensed to extended and in weight from ultra-light to ultra-bold.

Key Points (continued)
• Fonts may be classified as text or display.
• The choice of fonts for multimedia productions must take account of the special requirements of display on a computer monitor.
• Fonts may use bitmaps or outlines of glyphs.
• Bitmapped fonts cannot be scaled accurately; outline fonts need special software to render the glyphs on screen.
• Common outline font formats are Adobe Type 1 (PostScript) and TrueType.
• Multiple Master fonts can be used to generate arbitrary instances of a typeface.

Character Sets
• Character sets map abstract characters to bit-patterns.
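As a small illustration of this mapping, the sketch below prints the bit-patterns that a few character sets and encodings assign to the same abstract characters. The helper name utf16_code_units is hypothetical; it follows the surrogate-pair arithmetic listed under Unicode Encoding in these notes and is only a minimal sketch, not a complete encoder.

```python
def utf16_code_units(x):
    """Return the 16-bit code units UTF-16 uses for code point x.

    Hypothetical helper: it applies the y/z formulas from the
    Unicode Encoding notes and does no other validation.
    """
    if x < 0x10000:
        return [x]                      # BMP values are stored unchanged
    if x > 0x10FFFF:
        raise ValueError("value cannot be mapped to UTF-16")
    offset = x - 0x10000
    high = 0xD800 + offset // 0x400     # y in the notes
    low = 0xDC00 + offset % 0x400       # z in the notes
    return [high, low]


# One abstract character, different bit-patterns per character set.
print(ord("A"))                              # code point 65 in ASCII and Unicode
print("\u00e9".encode("latin-1").hex())      # e-acute in ISO 8859-1: one byte
print("\u00e9".encode("utf-8").hex())        # the same character in UTF-8: two bytes
print([hex(u) for u in utf16_code_units(0x1D11E)])  # a character outside the BMP
```

The last line uses U+1D11E, a code point beyond the 16-bit range, so it comes out as a pair of 16-bit values rather than one.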
• Abstract character and its graphic representation
• Character repertoire
• Code points (code values)
• Each abstract character maps to a code point

ASCII
• ASCII (American Standard Code for Information Interchange)
• 128 code points
• 0-31, 127: control characters
• ISO 646, 1972

8-bit Extensions
• ISO 8859-1: ISO Latin 1
• ISO 8859-2: Latin 2, Eastern European
• ISO 8859-5 (Cyrillic), ISO 8859-7 (modern Greek)
• Shortcomings
  • Did not achieve universal adoption
  • 256 code points are not enough

ISO 10646
• Universal Multiple-Octet Coded Character Set (UCS)
• 32-bit character set
• Hypercube (4-D cube)
• Each character is addressed as (g, p, r, c)
  • 256 groups
  • A group = 256 planes of 256 rows
  • A row = 256 characters
  • Group g, plane p, row r, and column c
• ISO Latin 1 = (0, 0, 0, *)
• Fig. 7.1

Unicode
• 16-bit character set
• CJK consolidation (merging), because there would otherwise be too many characters
• Contemporary major languages and classical forms
• Punctuation marks, technical and mathematical symbols, arrows, dingbats (miscellaneous symbols), …
• Dingbats: pointing hands, stars

Unicode (2)
• 39,000 symbols
• Code points reserved for UTF-16 expansion
• 6400 code points for private use
• Not for music notation or other symbolic writing systems
• Basic Multilingual Plane (BMP): (0, 0, *, *)

Encoding
• Quoted-Printable (QP)
  • 8-bit character codes => 7-bit ASCII
  • Code values 128-255 are sent as three bytes
  • First byte: ASCII code for "="
  • Remaining two: hexadecimal digits
• MIME content type
  • text/html; charset=iso-8859-1

ISO 10646 Encoding
• UCS: Universal Character Set
• UCS-4 (4 bytes)
  • BMP (0, 0, *, *): top two bytes set to zero
• UCS-2 (2 bytes)
  • Drop the top two bytes
  • UCS-2 = Unicode

Unicode Encoding
• Three UCS Transformation Formats (UTFs): UTF-7, UTF-8, UTF-16
• UTF-8 (8-bit bytes)
  • If the high-order byte is zero and the low-order byte is < 128 => a single byte
  • Otherwise up to 6 bytes, each with its highest bit = 1
• UTF-7
  • Comparable to QP; the result is pure ASCII text
• UTF-16
  • Transforms a subset of the UCS-4 repertoire into pairs of UCS-2 values from a reserved range
  • Gives access to 16 additional planes of ISO 10646

UTF-16 (values in hexadecimal; "/" is integer division, "%" is remainder)

  UCS-4 value x                UTF-16
  x < 0001 0000                x
  0001 0000 .. 0010 FFFF       y; z
                               y = ((x - 0001 0000) / 400) + D800
                               z = ((x - 0001 0000) % 400) + DC00
  x >= 0011 0000               unmapped

Fonts
• A glyph is a specific representation of a character (for example, the letter A drawn in several different typefaces)
• A font is a collection of glyphs used for the visual depiction of characters
• A font is often associated with a set of parameters (size, posture, weight, …) set to certain values

Classification and Choice of Fonts
• Monospaced & Proportional
• Serif & Sans serif
• Upright shape & Italic shape
• Condensed & Extended
• Weight

Monospaced & Proportional
• Monospaced (or fixed-width), e.g. Lucida Console
  • Each letter occupies the same amount of horizontal space, so that the text looks as if it was typed on a typewriter.
• Proportional, e.g. Times New Roman
  • Each letter occupies an amount of horizontal space proportional to the width of the glyph, so that the text looks as if it was printed in a book.

Serif & Sans serif
• Serifs: little strokes
  • Example font: MS Reference Serif (sample glyph: C)
• Sans serif font (sans: without)
  • Example font: MS Reference Sans Serif (sample glyph: C)
• Serifs
  • Difficult to render accurately at low resolutions
  • Hard to read on a computer screen
• Sans serif fonts are widely used for window titles and menu entries.

Upright shape & Italic shape
• Upright: vertical strokes
• Italic: slanted to the right (Fig. 7.7)
• Slanted fonts (Fig. 7.8)
  • Share the rightward slope of italic fonts but lack their calligraphic quality
  • Made by applying a shear transformation to an upright font
• Some italic fonts imitate handwriting: calligraphic fonts

Shapes
• Outline fonts
• Hollow fonts
• Fonts with drop shadows
• Condensed fonts
• Extended fonts

Weight
• Boldface (bold)
• Ultra-bold, semi-bold, light, ultra-light
• Reserved for headings
• Never use boldface for emphasis, always italics
• Italic text renders badly at low resolutions => on screen, bold text is used instead

Families
• Italic and bold versions of an upright font
• Grouped into a family
• Lucida Bright family = 20 fonts
• When fonts from different families are combined, their differences can be very noticeable (Fig. 7.12) => to be carefully avoided

Text & Display
• Text fonts: for continuous text
  • Unobtrusive
  • Problematical: low resolution of monitors
  • Text for display: 60% larger
• Display fonts: headings, signs or advertising slogans on posters
  • Short messages
  • Eye-catching
• Fig. 7.13

• Desktop publishing (DTP): printing on paper
• No control over the fonts that will be used when text is finally displayed
• Software used for display may let users override the original fonts with those of their own choosing.

• Most fonts' repertoires consist of the letters from some alphabet.
• Some do not include lower case letters
• Mathematical symbols are usually grouped into their own fonts, known as symbol fonts or pi fonts.

Font Measurement
• Absolute length units
  • points (pt): 1 point = 1/72 inch = 0.3528 mm
  • picas (pc): 1 pica = 12 points = 4.2333 mm
• Relative length units
  • ex units (ex): the x-height
  • em units (em): the width of a capital letter M
  • one ex is approximately one-half em

Font Terminology
• Horizontal layout
  • Bounding box
  • Left side bearing (bearingX)
  • Top side bearing (bearingY)
• Vertical layout

Kerning
• Kerning is the art of character fitting so that the space between characters is visually correct rather than proportionally set by the machine.
• Most often recommended in headlines and larger settings of type, it is the art of carefully moving characters together so that the word looks and reads better, without holes within the word.
• Good cases are Ta, To, Wo, Po, or other situations where a hole is formed by a wide portion of a letter.

Ligature
• A ligature is a set of two or more characters that have been designed into a harmonious "set".
• Kerning and ligatures require high-quality text layout software
• Word processors and web browsers cannot do this.

Bitmap Fonts
• Bitmap fonts are by nature pre-rasterized, so they render very quickly, making them a good choice where speed is important.
• Cannot be scaled gracefully
• Each platform has its own native bitmapped font format.
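To see why a bitmap font cannot be scaled gracefully, consider resizing a glyph by a non-integer factor. The sketch below is an illustration under stated assumptions: the 5x5 letter H and the function scale_nearest are hypothetical, and nearest-neighbour sampling stands in for whatever resampling a real system would use.

```python
# Hypothetical 5x5 bitmap of a capital H: '#' is ink, '.' is background.
GLYPH_H = [
    "#...#",
    "#...#",
    "#####",
    "#...#",
    "#...#",
]


def scale_nearest(bitmap, new_width, new_height):
    """Resize a bitmap by nearest-neighbour sampling (integer arithmetic only)."""
    old_height = len(bitmap)
    old_width = len(bitmap[0])
    scaled = []
    for y in range(new_height):
        # Pick the source row that this output row falls on.
        source_row = bitmap[y * old_height // new_height]
        scaled.append("".join(source_row[x * old_width // new_width]
                              for x in range(new_width)))
    return scaled


# Scaling 5x5 to 8x8 (a factor of 1.6) gives stems of unequal width.
for row in scale_nearest(GLYPH_H, 8, 8):
    print(row)
```

At 8x8 the left stem comes out two pixels wide and the right stem one pixel wide, although both were one pixel in the original. At exactly double size (10x10) the stems stay equal, which is why bitmap fonts are normally supplied at a fixed set of sizes.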
Outline Fonts
• Outline fonts describe the character outlines with a combination of control points and curves.
• Cross-platform
  • Adobe Type 1 (PostScript fonts), TrueType
• Can be scaled arbitrarily
• The same font is used for display and printing.
• Adobe Type Manager (ATM): Adobe Type 1

TrueType & PostScript
• TrueType: quadratic curves
• PostScript: cubic Bezier curves
• A TrueType font is stored as a series of points which define the lines and curves making up its shape.
• OpenType unifies Type 1 and TrueType

Hints and Instructions
• Extra information for low resolutions
• Type 1: hints
• TrueType: instructions

ClearType
• Windows XP
• ClearType delivers improved font display resolution over traditional anti-aliasing.
• It improves readability on color LCD monitors with a digital interface.
• Readability on CRT screens can also be somewhat improved.

ClearType
• [Figure: ClearType under extreme magnification, with the sub-pixels of an LCD explicitly rendered to show the structure of the ClearType letterforms.]

Anti-Aliasing
• Fig. 7.18
• Anti-aliasing should be applied to large fonts.

Multiple Master Fonts
• A new development
• A medium-weight font might lie half-way between ultra-bold and ultra-light glyphs
• 4 design axes: weight, width, optical size, serif style
• Fig. 7.19: 3 design axes
• A partial answer to the font substitution problem
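The idea that a medium weight lies half-way between two extremes can be sketched as linear interpolation between matching control points of two master outlines. This is only an illustration: the function interpolate_outline and the two stem outlines are hypothetical, and the notes do not say how Multiple Master software actually blends its masters.

```python
def interpolate_outline(light, bold, t):
    """Blend two outlines point by point; t=0 gives light, t=1 gives bold.

    Hypothetical helper: both outlines must list the same number of
    control points in the same order.
    """
    if len(light) != len(bold):
        raise ValueError("master outlines must have matching control points")
    return [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(light, bold)]


# Hypothetical vertical stem drawn as a rectangle in font units.
ULTRA_LIGHT = [(40, 0), (60, 0), (60, 700), (40, 700)]   # 20 units wide
ULTRA_BOLD = [(10, 0), (90, 0), (90, 700), (10, 700)]    # 80 units wide

medium = interpolate_outline(ULTRA_LIGHT, ULTRA_BOLD, 0.5)
print(medium)  # a stem 50 units wide, half-way between the two masters
```

With more than one design axis, the same blending would be repeated along each axis (weight, width, and so on), using one parameter per axis.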