
Beginning HTML, XHTML, CSS, and JavaScript, Part 10


Document information

Pages: 90
File size: 1.6 MB

Appendix E: Character Encodings

In Appendix D, I discussed how computers store information, and how a character-encoding scheme is a table that translates between characters and the way they are stored in the computer.

The most common character set (or character encoding) in use on computers is ASCII (the American Standard Code for Information Interchange), which is probably the most widely used character set for encoding text electronically. You can expect all computers browsing the Web to understand ASCII.

Character Set  Description
ASCII          American Standard Code for Information Interchange, which is used on most computers

The problem with ASCII is that it supports only the upper- and lowercase Latin alphabet, the numbers 0–9, and some extra characters: a total of 128 characters in all. Here are the printable characters of ASCII (the other characters are things such as line feeds and carriage-return characters).

! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~

However, many languages use either accented Latin characters or completely different alphabets. ASCII does not address these characters, so you need to learn about character encodings if you want to use any non-ASCII characters.

Character encodings are also particularly important if you want to use symbols (from some dashes to some quotation mark characters), as these cannot be guaranteed to transfer properly between different encodings. If you do not indicate the character encoding the document is written in, some of the special characters might not display correctly.

The International Organization for Standardization (ISO) created a range of character sets to deal with different national characters.
ISO-8859-1 is commonly used in Western versions of authoring tools such as Macromedia Dreamweaver, as well as applications such as Windows Notepad.

Character Set  Description
ISO-8859-1     Latin alphabet part 1, covering North America, Western Europe, Latin America, the Caribbean, Canada, Africa
ISO-8859-2     Latin alphabet part 2, covering Eastern Europe including Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbocroatian, Slovak, Slovenian, Upper Sorbian, and Lower Sorbian
ISO-8859-3     Latin alphabet part 3, covering SE Europe, Esperanto, Maltese, Turkish, and miscellaneous others
ISO-8859-4     Latin alphabet part 4, covering Scandinavia/Baltics (and others not in ISO-8859-1)
ISO-8859-5     Latin/Cyrillic alphabet part 5
ISO-8859-6     Latin/Arabic alphabet part 6
ISO-8859-7     Latin/Greek alphabet part 7
ISO-8859-8     Latin/Hebrew alphabet part 8
ISO-8859-9     Latin 5 alphabet part 9 (same as ISO-8859-1 except Turkish characters replace Icelandic ones)
ISO-8859-10    Latin 6, covering Lappish, Nordic, and Eskimo
ISO-8859-15    Similar to ISO-8859-1, but with a few symbols replaced by other characters, such as the euro sign
ISO-8859-16    Latin 10, covering SE Europe (Albanian, Croatian, Hungarian, Polish, Romanian, and Slovenian); can also be used for French, German, Italian, and Irish Gaelic
ISO-2022-JP    Latin/Japanese alphabet part 1
ISO-2022-JP-2  Latin/Japanese alphabet part 2
ISO-2022-KR    Latin/Korean alphabet part 1

It is helpful to note that the first 128 characters of ISO-8859-1 match those of ASCII, so you can safely use those characters as you would in ASCII.

The Unicode Consortium was then set up to devise a way to show all characters of different languages, rather than have these different incompatible character codes for different languages.
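The overlap with ASCII, and the incompatibility between the different ISO-8859 parts, can be checked with a short Python sketch. The sample text and the byte value 0xE9 are illustrative choices of mine, not values from this appendix; the codec names are the ones Python's standard library accepts.

```python
# Bytes in the ASCII range (0-127) decode to the same text under
# ASCII and under ISO-8859-1.
ascii_bytes = b"Hello, Web!"
same_text = ascii_bytes.decode("ascii") == ascii_bytes.decode("iso-8859-1")

# A byte above 127 is not valid ASCII at all...
high_byte = bytes([0xE9])
try:
    high_byte.decode("ascii")
    valid_ascii = True
except UnicodeDecodeError:
    valid_ascii = False

# ...and each ISO-8859 part assigns it a different character.
decoded = {
    codec: high_byte.decode(codec)
    for codec in ("iso-8859-1", "iso-8859-5", "iso-8859-7")
}

print("ASCII range matches:", same_text)
print("0xE9 is valid ASCII:", valid_ascii)
for codec, char in decoded.items():
    # Print the code point rather than the character itself, so the
    # output does not depend on the terminal's own encoding.
    print(codec, f"U+{ord(char):04X}")
```

The same byte yields three different code points, which is why a document has to state which encoding it was written in.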
With Unicode, if you want to create documents that use characters from multiple character sets, you can do so using a single Unicode character encoding. Furthermore, users should be able to view documents written in different character sets, provided their processor (and fonts) support the Unicode standards, no matter what platform they are on or which country they are in. Having a single character encoding can also reduce software development costs, because programs do not need to be designed to support multiple character encodings.

One problem with Unicode is that a lot of older programs were written to support only 8-bit character sets (limiting them to 256 characters), which is nowhere near the number required for all languages. Unicode therefore specifies encodings that can deal with a string in special ways so as to make enough space for the huge character set it encompasses. These are known as UTF-8, UTF-16, and UTF-32.

Character Set  Description
UTF-8          A Unicode Transformation Format that comes in 8-bit units; that is, it comes in bytes. A character in UTF-8 can be from 1 to 4 bytes long, making UTF-8 variable width.
UTF-16         A Unicode Transformation Format that comes in 16-bit units; that is, it comes in shorts. A character can be 1 or 2 shorts long, making UTF-16 variable width.
UTF-32         A Unicode Transformation Format that comes in 32-bit units; that is, it comes in longs. It is a fixed-width format, and a character is always 1 "long" in length.

The first 256 characters of Unicode character sets correspond to the 256 characters of ISO-8859-1. By default, HTML 4 processors should support UTF-8, and XML processors are supposed to support UTF-8 and UTF-16; therefore, all XHTML-compliant processors should also support UTF-16 (as XHTML is an application of XML).

For more information on internationalization and different character sets and encodings, see www.i18nguy.com/.
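To make the variable-width and fixed-width descriptions concrete, here is a minimal Python sketch that counts the bytes each encoding needs for a few sample characters. The helper name `encoded_sizes` and the sample characters are my own illustrative choices; the "-le" codec variants are used only so that no byte-order mark is added to the counts.

```python
def encoded_sizes(char):
    """Return how many bytes one character needs in each UTF encoding."""
    return {
        "utf-8": len(char.encode("utf-8")),
        # The "-le" variants fix the byte order, so no byte-order mark
        # is prepended and the count reflects the character alone.
        "utf-16": len(char.encode("utf-16-le")),
        "utf-32": len(char.encode("utf-32-le")),
    }

# A (ASCII), e-acute (ISO-8859-1 range), euro sign, and an emoji
# outside the 16-bit range.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for char in samples:
    print(f"U+{ord(char):04X}", encoded_sizes(char))

# The first 256 Unicode code points line up with ISO-8859-1 byte values.
latin1_matches = all(
    bytes([n]).decode("iso-8859-1") == chr(n) for n in range(256)
)
print("ISO-8859-1 matches first 256 code points:", latin1_matches)
```

UTF-8 grows from 1 to 4 bytes across the samples, UTF-16 uses 2 bytes (one short) until the last sample needs 4 (two shorts), and UTF-32 always uses 4.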
Appendix F: Special Characters

Some characters are reserved in XHTML; for example, you cannot use the greater-than and less-than signs (angle brackets) within your text because the browser could mistake them for markup. XHTML processors must support the five special characters listed in the table that follows.

Symbol  Entity Name  Number Code  Description
&       &amp;        &#38;        Ampersand
<       &lt;         &#60;        Less than
>       &gt;         &#62;        Greater than
"       &quot;       &#34;        Double quote
        &nbsp;       &#160;       Non-breaking space

To write an element and attribute into your page so that the code is shown to the user rather than being processed by the browser (for example, as <div id="character">), you would write:

&lt;div id=&quot;character&quot;&gt;

There is also a long list of special characters that HTML 4.0-aware processors should support. In order for these to appear in your document, you can use either the numerical code or the entity name. For example, to insert a copyright symbol you can use either of the following:

&copy; 2008
&#169; 2008

The special characters have been split into the following sections:

❑ Character Entity References for ISO 8859-1 Characters
❑ Character Entity References for Symbols, Mathematical Symbols, and Greek Letters
❑ Character Entity References for Markup-Significant and Internationalization Characters

They are taken from the W3C website at www.w3.org/TR/REC-html40/sgml/entities.html.
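If you generate pages from a script, the escaping described above does not have to be done by hand. As a rough sketch, assuming Python and its standard `html` module (neither is used elsewhere in this book), the `div` example and the two copyright forms behave like this:

```python
import html

# Escape markup so that it is displayed instead of being processed.
# html.escape() also escapes double quotes by default.
markup = '<div id="character">'
escaped = html.escape(markup)
print(escaped)

# The entity name and the number code stand for the same character.
from_name = html.unescape("&copy; 2008")
from_number = html.unescape("&#169; 2008")
print("Same character:", from_name == from_number)
```

The escaped string is exactly the text you would type into the page to show the `div` tag to the reader.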
Character Entity References for ISO 8859-1 Characters

Symbol  Entity Name  Number Code  Description
        &nbsp;       &#160;       No-break space = non-breaking space
¡       &iexcl;      &#161;       Inverted exclamation mark
¢       &cent;       &#162;       Cent sign
£       &pound;      &#163;       Pound sign
¤       &curren;     &#164;       Currency sign
¥       &yen;        &#165;       Yen sign = yuan sign
¦       &brvbar;     &#166;       Broken bar = broken vertical bar
§       &sect;       &#167;       Section sign
¨       &uml;        &#168;       Diaeresis = spacing diaeresis
©       &copy;       &#169;       Copyright sign
ª       &ordf;       &#170;       Feminine ordinal indicator
«       &laquo;      &#171;       Left-pointing double angle quotation mark = left-pointing guillemet
¬       &not;        &#172;       Not sign
        &shy;        &#173;       Soft hyphen = discretionary hyphen
®       &reg;        &#174;       Registered sign = registered trademark sign
¯       &macr;       &#175;       Macron = spacing macron = overline = APL overbar
°       &deg;        &#176;       Degree sign
±       &plusmn;     &#177;       Plus-minus sign = plus-or-minus sign
²       &sup2;       &#178;       Superscript two = superscript digit two = squared
³       &sup3;       &#179;       Superscript three = superscript digit three = cubed
´       &acute;      &#180;       Acute accent = spacing acute
µ       &micro;      &#181;       Micro sign
¶       &para;       &#182;       Pilcrow sign = paragraph sign
·       &middot;     &#183;       Middle dot = Georgian comma = Greek middle dot
¸       &cedil;      &#184;       Cedilla = spacing cedilla
¹       &sup1;       &#185;       Superscript one = superscript digit one
º       &ordm;       &#186;       Masculine ordinal indicator
»       &raquo;      &#187;       Right-pointing double angle quotation mark = right-pointing guillemet
¼       &frac14;     &#188;       Vulgar fraction one-quarter = fraction one-quarter
½       &frac12;     &#189;       Vulgar fraction one-half = fraction one-half
¾       &frac34;     &#190;       Vulgar fraction three-quarters = fraction three-quarters
¿       &iquest;     &#191;       Inverted question mark = turned question mark
À       &Agrave;     &#192;       Latin capital letter A with grave = Latin capital letter A grave
Á       &Aacute;     &#193;       Latin capital letter A with acute
Â       &Acirc;      &#194;       Latin capital letter A with circumflex
Ã       &Atilde;     &#195;       Latin capital letter A with tilde
Ä       &Auml;       &#196;       Latin capital letter A with diaeresis
Å       &Aring;      &#197;       Latin capital letter A with ring above = Latin capital letter A ring
Æ       &AElig;      &#198;       Latin capital letter AE = Latin capital ligature AE
Ç       &Ccedil;     &#199;       Latin capital letter C with cedilla
È       &Egrave;     &#200;       Latin capital letter E with grave
É       &Eacute;     &#201;       Latin capital letter E with acute
Ê       &Ecirc;      &#202;       Latin capital letter E with circumflex
Ë       &Euml;       &#203;       Latin capital letter E with diaeresis
Ì       &Igrave;     &#204;       Latin capital letter I with grave
Í       &Iacute;     &#205;       Latin capital letter I with acute
Î       &Icirc;      &#206;       Latin capital letter I with circumflex
Ï       &Iuml;       &#207;       Latin capital letter I with diaeresis
Ð       &ETH;        &#208;       Latin capital letter ETH
Ñ       &Ntilde;     &#209;       Latin capital letter N with tilde
Ò       &Ograve;     &#210;       Latin capital letter O with grave
Ó       &Oacute;     &#211;       Latin capital letter O with acute
Ô       &Ocirc;      &#212;       Latin capital letter O with circumflex
Õ       &Otilde;     &#213;       Latin capital letter O with tilde
Ö       &Ouml;       &#214;       Latin capital letter O with diaeresis
×       &times;      &#215;       Multiplication sign
Ø       &Oslash;     &#216;       Latin capital letter O with stroke = Latin capital letter O slash
Ù       &Ugrave;     &#217;       Latin capital letter U with grave
Ú       &Uacute;     &#218;       Latin capital letter U with acute
Û       &Ucirc;      &#219;       Latin capital letter U with circumflex
Ü       &Uuml;       &#220;       Latin capital letter U with diaeresis
Ý       &Yacute;     &#221;       Latin capital letter Y with acute
Þ       &THORN;      &#222;       Latin capital letter THORN
ß       &szlig;      &#223;       Latin small letter sharp s = ess-zed
à       &agrave;     &#224;       Latin small letter a with grave = Latin small letter a grave
á       &aacute;     &#225;       Latin small letter a with acute
â       &acirc;      &#226;       Latin small letter a with circumflex
ã       &atilde;     &#227;       Latin small letter a with tilde
ä       &auml;       &#228;       Latin small letter a with diaeresis
å       &aring;      &#229;       Latin small letter a with ring above = Latin small letter a ring
æ       &aelig;      &#230;       Latin small letter ae = Latin small ligature ae
ç       &ccedil;     &#231;       Latin small letter c with cedilla
è       &egrave;     &#232;       Latin small letter e with grave
é       &eacute;     &#233;       Latin small letter e with acute
ê       &ecirc;      &#234;       Latin small letter e with circumflex
ë       &euml;       &#235;       Latin small letter e with diaeresis
ì       &igrave;     &#236;       Latin small letter i with grave
í       &iacute;     &#237;       Latin small letter i with acute
î       &icirc;      &#238;       Latin small letter i with circumflex
ï       &iuml;       &#239;       Latin small letter i with diaeresis
ð       &eth;        &#240;       Latin small letter eth
ñ       &ntilde;     &#241;       Latin small letter n with tilde
ò       &ograve;     &#242;       Latin small letter o with grave
ó       &oacute;     &#243;       Latin small letter o with acute
ô       &ocirc;      &#244;       Latin small letter o with circumflex
õ       &otilde;     &#245;       Latin small letter o with tilde
ö       &ouml;       &#246;       Latin small letter o with diaeresis
÷       &divide;     &#247;       Division sign
ø       &oslash;     &#248;       Latin small letter o with stroke = Latin small letter o slash
ù       &ugrave;     &#249;       Latin small letter u with grave
ú       &uacute;     &#250;       Latin small letter u with acute
û       &ucirc;      &#251;       Latin small letter u with circumflex

[...]
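The number codes for the ISO 8859-1 entities are the same values as the characters' ISO 8859-1 byte values and Unicode code points. The following Python sketch checks this for a few entities; `check_entity` is a hypothetical helper of mine, while `html.unescape` and `html.entities.name2codepoint` come from Python's standard library.

```python
import html
from html.entities import name2codepoint


def check_entity(name):
    """Return the number code for an entity name, and whether the name,
    the number code, and the ISO 8859-1 byte all give the same character."""
    code = name2codepoint[name]
    from_name = html.unescape(f"&{name};")
    from_number = html.unescape(f"&#{code};")
    from_latin1 = bytes([code]).decode("iso-8859-1")
    return code, from_name == from_number == from_latin1


for entity_name in ("nbsp", "copy", "eacute", "ucirc"):
    print(entity_name, check_entity(entity_name))
```

Each entity reports its number code from the table along with `True`, so either form can be used interchangeably in a page.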
browser-specific elements and attributes never made it into the HTML recommendations, and are therefore referred to as browser-specific markup. This appendix covers the following:

❑ Elements and attributes that have been deprecated in recent versions of HTML and XHTML
❑ Specification of font appearances without using CSS
❑ Control of backgrounds without using CSS

...

Appendix I: Deprecated and Browser-Specific Markup

As the versions of HTML and XHTML have developed, quite a lot of markup has been deprecated, which is the W3C's way of alerting web developers that it is likely to be removed from future versions of HTML and XHTML and that web-page authors should stop using it (although there is an acknowledgment that some people may...

... alternative way to achieve the same goal (in many cases using CSS). You can still use quite a lot of the deprecated markup that you meet in this chapter when using the Transitional XHTML DOCTYPE, but Strict XHTML has already removed most of the elements and attributes that affect presentation of elements. I have included the details of these elements and attributes in this book, despite the fact that the markup...

...

x-java                      vnd.wqd
x-javascript                vnd.wrq-hp3000-labelled
x-msaccess                  vnd.wt.stf
x-msexcel                   vnd.wv.csp+wbxml
x-mspowerpoint              vnd.wv.csp+xml
x-rpm                       vnd.wv.ssp+xml
x-zip                       vnd.xara
x400-bp                     vnd.xfdl
xhtml+xml                   vnd.yamaha.hv-dic
xml                         vnd.yamaha.hv-script
xml-dtd                     vnd.yamaha.hv-voice
xml-external-parsed-entity  vnd.yamaha.smaf-audio
zip                         vnd.yamaha.smaf-phrase

...
Appendix H: MIME Media Types

... forward slash character, for example, text/html for HTML. This appendix is organized by the main types:

❑ text
❑ image
❑ multipart
❑ audio
❑ video
❑ message
❑ model
❑ application

For example, the text main type contains types of plain-text files, such as:

❑ text/plain for plain text files
❑ text/html for HTML files
❑ text/rtf for text files...

...

Language     Code  Language          Code
Faroese      FO    Kazakh            KK
Fiji         FJ    Kinyarwanda       RW
Finnish      FI    Kirghiz           KY
French       FR    Korean            KO
Frisian      FY    Kurdish           KU
Galician     GL    Kurundi           RN
Georgian     KA    Laothian          LO
German       DE    Latin             LA
Greek        EL    Latvian; Lettish  LV
Greenlandic  KL    Lingala           LN
Guarani      GN    Lithuanian        LT
Gujarati     GU    Macedonian        MK
Hausa        HA    Malagasy          MG
Hebrew       HE    Malay             MS
Hindi        HI    Malayalam         ML
Hungarian    HU    Maltese           MT
Icelandic    IS    Maori             MI
Indonesian   ID    Marathi...

...

Symbol  Entity Name  Number Code  Description
‹       &lsaquo;     &#8249;      ... but not yet standardized)
›       &rsaquo;     &#8250;      Single right-pointing angle quotation mark (proposed, but not yet standardized)
€       &euro;       &#8364;      Euro sign

Appendix G: Language Codes

The following table shows the two-letter ISO 639 language codes that are used to declare the language of a document in the lang and xml:lang...

... text formatting. MIME types are officially supposed to be assigned and listed by the Internet Assigned Numbers Authority (IANA). Many of the popular MIME types in this list (all those that begin with "x-") are not assigned by the IANA and do not have official status. (Having said that, I should mention that some of these are very popular and browsers support them, such as audio/x-mp3.) You can see the list...
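The main-type/subtype structure and the "x-" convention described above are easy to test for in a script. Here is a minimal Python sketch; both function names are hypothetical helpers of mine, and treating an "x-" subtype as unofficial follows the rule of thumb given in this appendix rather than a lookup against the IANA list.

```python
def split_mime_type(mime_type):
    """Split 'main/sub' into its two parts, ignoring case and stray spaces."""
    main_type, _, subtype = mime_type.strip().lower().partition("/")
    return main_type.strip(), subtype.strip()


def is_unofficial(mime_type):
    """True when the subtype starts with the unofficial 'x-' prefix."""
    _, subtype = split_mime_type(mime_type)
    return subtype.startswith("x-")


for example in ("text/html", "text/plain", "audio/x-mp3"):
    print(example, split_mime_type(example), is_unofficial(example))
```

Only audio/x-mp3 is flagged, which matches the note that it is popular and widely supported but has no official status.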
set, the default is US-ASCII.

calendar              plain
css                   prs.fallenstein.rst
directory             prs.lines.tag
enriched              rfc822-headers
html                  richtext
parityfec             rtf
sgml                  vnd.IPTC.NITF
t140                  vnd.latex-z
tab-separated-values  vnd.motorola.reflex
uri-list              vnd.ms-mediapackage
vnd.abc               vnd.net2phone.commcenter.command
vnd.curl              vnd.sun.j2me.app-descriptor
vnd.DMClientScript...

...

Symbol  Entity Name  Number Code  Description
♥       &hearts;     &#9829;      ...
♦       &diams;      &#9830;      Black diamond suit

Table group labels: Miscellaneous Technical; Geometric Shape (◊); Miscellaneous Symbols

Markup-Significant and Internationalization Characters

Symbol  Entity Name  Number Code  Description
"       &quot;       &#34;        Quotation mark = APL quote
&       &amp;        &#38;        Ampersand
<       &lt;         &#60;        Less-than sign
>       &gt;         &#62;        Greater-than sign

...

Date posted: 14/08/2014, 10:22
