HTML 4.01 Specification: Text (1)

9 Text

The following sections discuss issues surrounding the structuring of text. Elements that present text (alignment elements, font elements, style sheets, etc.) are discussed elsewhere in the specification. For information about characters, please consult the section on the document character set.

9.1 White space

The document character set includes a wide variety of white space characters. Many of these are typographic elements used in some applications to produce particular visual spacing effects. In HTML, only the following characters are defined as white space characters:

  • ASCII space (&#x0020;)
  • ASCII tab (&#x0009;)
  • ASCII form feed (&#x000C;)
  • Zero-width space (&#x200B;)

Line breaks are also white space characters. Note that although &#x2028; and &#x2029; are defined in [ISO10646] to unambiguously separate lines and paragraphs, respectively, these do not constitute line breaks in HTML, nor does this specification include them in the more general category of white space characters.

This specification does not indicate the behavior, rendering or otherwise, of space characters other than those explicitly identified here as white space characters. For this reason, authors should use appropriate elements and styles to achieve visual formatting effects that involve white space, rather than space characters.

For all HTML elements except PRE, sequences of white space separate "words" (we use the term "word" here to mean "sequences of non-white space characters"). When formatting text, user agents should identify these words and lay them out according to the conventions of the particular written language (script) and target medium.

This layout may involve putting space between words (called inter-word space), but conventions for inter-word space vary from script to script. For example, in Latin scripts, inter-word space is typically rendered as an ASCII space (&#x0020;), while in Thai it is a zero-width word separator (&#x200B;). In Japanese and Chinese, inter-word space is not typically rendered at all.

Note that a sequence of white spaces between words in the source document may result in an entirely different rendered inter-word spacing (except in the case of the PRE element). In particular, user agents should collapse input white space sequences when producing output inter-word space. This can and should be done even in the absence of language information (from the lang attribute, the HTTP "Content-Language" header field (see [RFC2616], section 14.12), user agent settings, etc.).
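The collapsing behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular user agent works: the function name `collapse_white_space` is hypothetical, and emitting a single ASCII space as the inter-word space is a Latin-script assumption (other scripts may render no space at all).

```python
import re

# White space as defined in section 9.1: ASCII space, tab, form feed,
# zero-width space, plus line breaks (CR and LF).
# U+2028 and U+2029 are deliberately NOT included.
_WHITE_SPACE_RUN = re.compile(r"[ \t\f\u200b\r\n]+")


def collapse_white_space(text: str) -> str:
    """Replace every run of HTML white space with one ASCII space.

    Assumption: a Latin-script target, where inter-word space is
    typically rendered as a single ASCII space.
    """
    return _WHITE_SPACE_RUN.sub(" ", text)


if __name__ == "__main__":
    print(collapse_white_space("We  offer\t\tfree\n technical support"))
```

Content of a PRE element would have to bypass such a function, since white space is significant there.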

The PRE element is used for preformatted text, where white space is significant.

In order to avoid problems with SGML line break rules and inconsistencies among extant implementations, authors should not rely on user agents to render white space immediately after a start tag or immediately before an end tag. Thus, authors, and in particular authoring tools, should write:

  <P>We offer free <A>technical support</A> for subscribers.</P>

and not:

  <P>We offer free<A> technical support </A>for subscribers.</P>
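An authoring tool could flag the discouraged pattern with a simple check. The sketch below is a rough heuristic under stated assumptions: the function name `has_edge_white_space` is hypothetical, and the regular expressions do not understand comments, PRE content, or attribute values that contain ">".

```python
import re

# HTML white space per section 9.1, including line breaks.
_WS = r"[ \t\f\u200b\r\n]"

# White space immediately after a start tag, e.g. "<A> technical".
_AFTER_START_TAG = re.compile(r"<[A-Za-z][^<>]*>" + _WS)
# White space immediately before an end tag, e.g. "support </A>".
_BEFORE_END_TAG = re.compile(_WS + r"</[A-Za-z][^<>]*>")


def has_edge_white_space(markup: str) -> bool:
    """Return True if white space directly follows a start tag
    or directly precedes an end tag anywhere in the markup."""
    return bool(
        _AFTER_START_TAG.search(markup) or _BEFORE_END_TAG.search(markup)
    )


if __name__ == "__main__":
    good = "<P>We offer free <A>technical support</A> for subscribers.</P>"
    bad = "<P>We offer free<A> technical support </A>for subscribers.</P>"
    print(has_edge_white_space(good), has_edge_white_space(bad))
```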

9.2 Structured text

9.2.1 Phrase elements: EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR, and ACRONYM

<!ENTITY % phrase "EM | STRONG | DFN | CODE |
                   SAMP | KBD | VAR | CITE | ABBR | ACRONYM" >
<!ELEMENT (%fontstyle;|%phrase;) - - (%inline;)*>
<!ATTLIST (%fontstyle;|%phrase;)
  %attrs;                              -- %coreattrs, %i18n, %events --
  >

Start tag: required, End tag: required

Attributes defined elsewhere

Phrase elements add structural information to text fragments. The usual meanings of phrase elements are as follows:

EM:
Indicates emphasis.
STRONG:
Indicates stronger emphasis.
CITE:
Contains a citation or a reference to other sources.
DFN:
Indicates that this is the defining instance of the enclosed term.
CODE:
Designates a fragment of computer code.
SAMP:
Designates sample output from programs, scripts, etc.
KBD:
Indicates text to be entered by the user.
VAR:
Indicates an instance of a variable or program argument.
ABBR:
Indicates an abbreviated form (e.g., WWW, HTTP, URI, Mass., etc.).
ACRONYM:
Indicates an acronym (e.g., WAC, radar, etc.).

EM and STRONG are used to indicate emphasis. The other phrase elements have particular significance in technical documents. These examples illustrate some of the phrase elements:

  As <CITE>Harry S. Truman</CITE> said,
  <Q lang="en-us">The buck stops here.</Q>

  More information can be found in <CITE>[ISO-0000]</CITE>.

  Please refer to the following reference number in future
  correspondence: <STRONG>1-234-55</STRONG>

The presentation of phrase elements depends on the user agent. Generally, visual user agents present EM text in italics and STRONG text in bold font. Speech synthesizer user agents may change the synthesis parameters, such as volume, pitch and rate accordingly.

The ABBR and ACRONYM elements allow authors to clearly indicate occurrences of abbreviations and acronyms. Western languages make extensive use of acronyms such as "GmbH", "NATO", and "F.B.I.", as well as abbreviations like "M.", "Inc.", "et al.", "etc.". Both Chinese and Japanese use analogous abbreviation mechanisms, wherein a long name is referred to subsequently with a subset of the Han characters from the original occurrence. Marking up these constructs provides useful information to user agents and tools such as spell checkers, speech synthesizers, translation systems and search-engine indexers.

The content of the ABBR and ACRONYM elements specifies the abbreviated expression itself, as it would normally appear in running text. The title attribute of these elements may be used to provide the full or expanded form of the expression.

Here are some sample uses of ABBR:

  <P>
  <ABBR title="World Wide Web">WWW</ABBR>
  <ABBR lang="fr"
        title="Soci&eacute;t&eacute; Nationale des Chemins de Fer">
     SNCF
  </ABBR>
  <ABBR lang="es" title="Do&ntilde;a">Do&ntilde;a</ABBR>
  <ABBR title="Abbreviation">abbr.</ABBR>
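To illustrate how a tool such as a search-engine indexer might use this markup, here is a minimal sketch that collects the expanded forms from title attributes using Python's standard `html.parser` module. The class name `AbbrCollector` and the function `collect_expansions` are hypothetical, and the sketch assumes ABBR and ACRONYM elements are not nested.

```python
from html.parser import HTMLParser


class AbbrCollector(HTMLParser):
    """Collect {abbreviated form: expanded form} from ABBR/ACRONYM titles."""

    def __init__(self):
        super().__init__()
        self.expansions = {}
        self._title = None   # title of the element currently open, if any
        self._text = []      # text content gathered for that element

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag in ("abbr", "acronym"):
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._title is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("abbr", "acronym") and self._title is not None:
            # Normalize surrounding and internal white space.
            short_form = " ".join("".join(self._text).split())
            self.expansions[short_form] = self._title
            self._title = None


def collect_expansions(markup: str) -> dict:
    parser = AbbrCollector()
    parser.feed(markup)
    parser.close()
    return parser.expansions


if __name__ == "__main__":
    sample = (
        '<P><ABBR title="World Wide Web">WWW</ABBR> '
        '<ABBR lang="fr" title="Soci&eacute;t&eacute; Nationale '
        'des Chemins de Fer"> SNCF </ABBR>'
    )
    print(collect_expansions(sample))
```

Elements without a title attribute are skipped, since they carry no expansion to record.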

Note that abbreviations and acronyms often have idiosyncratic pronunciations. For example, while "IRS" and "BBC" are typically pronounced letter by letter, "NATO" and "UNESCO" are pronounced phonetically. Still other abbreviated forms (e.g., "URI" and "SQL") are spelled out by some people and pronounced as words by other people. When necessary, authors should use style sheets to specify the pronunciation of an abbreviated form.

9.2.2 Quotations: The BLOCKQUOTE and Q elements

<!ELEMENT BLOCKQUOTE - - (%block;|SCRIPT)+ -- long quotation -->
<!ATTLIST BLOCKQUOTE
  %attrs;                              -- %coreattrs, %i18n, %events --
  cite        %URI;          #IMPLIED  -- URI for source document or msg --
  >

<!ELEMENT Q - - (%inline;)*            -- short inline quotation -->
<!ATTLIST Q
  %attrs;                              -- %coreattrs, %i18n, %events --
  cite        %URI;          #IMPLIED  -- URI for source document or msg --
  >

Start tag: required, End tag: required

Attribute definitions

cite = uri [CT]
The value of this attribute is a URI that designates a source document or message. This attribute is intended to give information about the source from which the quotation was borrowed.

These two elements designate quoted text. BLOCKQUOTE is for long quotations (block-level content) and Q is intended for short quotations (inline content) that don't require paragraph breaks.

This example formats an excerpt from "The Two Towers", by J.R.R. Tolkien, as a blockquote.

<BLOCKQUOTE cite="http://www.mycom.com/tolkien/twotowers.html">

<P>They went in single file, running like hounds on a strong scent,
and an eager light was in their eyes. Nearly due west the broad
swath of the marching Orcs tramped its ugly slot; the sweet grass
of Rohan had been bruised and blackened as they passed.</P>

</BLOCKQUOTE>

Rendering quotations

Visual user agents generally render BLOCKQUOTE as an indented block.

Visual user agents must ensure that the content of the Q element is rendered with delimiting quotation marks. Authors should not put quotation marks at the beginning and end of the content of a Q element.

User agents should render quotation marks in a language-sensitive manner (see the lang attribute). Many languages adopt different quotation styles for outer and inner (nested) quotations, which should be respected by user-agents.

The following example illustrates nested quotations with the Q element.

John said, <Q lang="en-us">I saw Lucy at lunch, she told me
<Q lang="en-us">Mary wants you
to get some ice cream on your way home.</Q> I think I will get
some at Ben and Jerry's, on Gloucester Road.</Q>

Since the language of both quotations is American English, user agents should render them appropriately, for example with single quote marks around the inner quotation and double quote marks around the outer quotation:

  John said, "I saw Lucy at lunch, she told me 'Mary wants you
  to get some ice cream on your way home.' I think I will get some
  at Ben and Jerry's, on Gloucester Road."
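The alternation between outer and inner quote marks can be sketched with a depth counter. This is a simplified illustration, not a description of any real user agent: the names `QuoteRenderer` and `render_quotes` are hypothetical, the quote table covers only the American English convention shown above (the lang attribute is ignored), and white space is collapsed with Python's `str.split`, which is broader than HTML's definition of white space.

```python
from html.parser import HTMLParser

# Assumed en-us convention: double marks outside, single marks inside,
# alternating again for deeper nesting.
EN_US_QUOTES = [('"', '"'), ("'", "'")]


class QuoteRenderer(HTMLParser):
    """Render Q elements as text with depth-dependent quotation marks."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.depth = 0  # number of Q elements currently open

    def handle_starttag(self, tag, attrs):
        if tag == "q":
            opening, _ = EN_US_QUOTES[self.depth % len(EN_US_QUOTES)]
            self.out.append(opening)
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "q" and self.depth > 0:
            self.depth -= 1
            _, closing = EN_US_QUOTES[self.depth % len(EN_US_QUOTES)]
            self.out.append(closing)

    def handle_data(self, data):
        self.out.append(data)


def render_quotes(markup: str) -> str:
    renderer = QuoteRenderer()
    renderer.feed(markup)
    renderer.close()
    # Collapse runs of white space, including source line breaks.
    return " ".join("".join(renderer.out).split())


if __name__ == "__main__":
    source = (
        'John said, <Q lang="en-us">I saw Lucy at lunch, she told me\n'
        '<Q lang="en-us">Mary wants you\n'
        "to get some ice cream on your way home.</Q> I think I will get\n"
        "some at Ben and Jerry's, on Gloucester Road.</Q>"
    )
    print(render_quotes(source))
```

A fuller version would select the quote table from the lang attribute of each Q element instead of hard-coding one convention.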