http://www.infinity-loop.de/namespace/2006/upcast-internal


	Elements uci:annotation uci:annotations uci:binaries uci:body uci:box uci:boxes uci:cell uci:choice uci:choicelist uci:choicelists uci:columnbreak uci:content uci:contentref uci:deleted uci:document uci:documentinfo uci:endnote uci:endnotes uci:endtarget uci:extflow uci:extobject uci:footnote uci:footnotes uci:formcheckbox uci:formdropdown uci:formtext uci:gentext uci:head uci:highlight uci:image uci:indexentry uci:indextarget uci:indextargets uci:inline uci:inserted uci:item uci:linebreak uci:linestart uci:link uci:list uci:object uci:ole uci:pagebreak uci:pagefooter uci:pagefooters uci:pageheader uci:pageheaders uci:pagestart uci:par uci:part uci:property uci:reference uci:row uci:see uci:style uci:table uci:target uci:toc uci:tocs


	Complex Types uci:note

Resource hierarchy:

uci.xsd
- xml.xsd
- xlink.xsd
  - xml.xsd

Main schema uci.xsd

Annotations

This schema documents the internal document structure that the RTF Importer module creates as its result.

`uci` schema design

Physical separation of out-of-flow from in-flow content

During development of the RTF Importer module, the following fundamental design decision was made: out-of-flow elements are located in a separate, out-of-flow container. This differs from most common schemas, such as DocBook or HTML. Why did we do this?

One focus of upCast RT is offering operators that bring legacy, often unstructured word-processing content into a useful structure. For this, we often need to work on the raw text and its formatting, for example to apply regular expressions or to determine the styling of a run of text. If we kept out-of-flow elements like footnotes, textboxes or annotations inline in the text flow, those operations would become much more complicated to use (or they would need to be equipped with some auto-magic). Let's have a look at an example:

<par>Java<footnote>A programming language formerly called Oak.</footnote>-based applications</par>

Suppose the user wanted "Java-based applications" to be a heading, but did not use a dedicated style for this and used local formatting overrides instead. Now we need to detect this, but how? A straightforward approach would be to define the following rule for headings: if a paragraph consists only of at most 35 characters that are 16pt bold, consider it a heading. If you applied that rule to the paragraph above, you would find that string-length( string(self::par) ) is greater than 35, because it also counts the characters in descendants. Additionally, the characters in the footnote would not be 16pt and bold, so the second condition would not match either.
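The length problem described above can be reproduced outside upCast RT. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree module as a stand-in for the XPath expression; the variable names are illustrative only, and the "own text" computation assumes the out-of-flow elements are direct children of the paragraph.

```python
import xml.etree.ElementTree as ET

# The paragraph from the example above, with the footnote kept inline.
par = ET.fromstring(
    "<par>Java<footnote>A programming language formerly called Oak."
    "</footnote>-based applications</par>"
)

# Counterpart of string-length(string(self::par)): text of all descendants counts.
full_text = "".join(par.itertext())
full_length = len(full_text)

# Text that belongs to the paragraph's own flow: its leading text plus the
# tail text that follows each direct child (the children's content is skipped).
own_text = (par.text or "") + "".join(child.tail or "" for child in par)
own_length = len(own_text)

print(full_length, full_length <= 35)
print(own_length, own_length <= 35)
```

With the footnote inline, the string value of the paragraph has 66 characters and fails the 35-character limit, while the visible flow text "Java-based applications" has only 23.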

If you have ever had to deal with this kind of problem (excluding certain sub-trees in operations on elements), you know that it can become a nightmare. You must define logic to exclude those items at any level, keep the list of elements to exclude up to date, and so on. Since these are fundamental, common operations in legacy document conversion, we decided to move logical out-of-flow content physically out of the flow as well. This means that the above conceptually turns into a structure like

<body><par>Java<contentref idref="id1"/>-based applications</par></body>

<extflow><footnote id="id1">A programming language formerly called Oak.</footnote></extflow>
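To illustrate why the separated structure helps, here is a small sketch in Python (standard library only). It uses the unprefixed element names of the conceptual example, wraps the two fragments in a hypothetical document root so that they parse as one tree, and introduces a helper named resolve that is not part of upCast RT.

```python
import xml.etree.ElementTree as ET

# Conceptual structure from above, wrapped in a hypothetical <document> root.
doc = ET.fromstring(
    "<document>"
    "<body><par>Java<contentref idref='id1'/>-based applications</par></body>"
    "<extflow><footnote id='id1'>A programming language formerly called Oak."
    "</footnote></extflow>"
    "</document>"
)

# The footnote text is no longer a descendant of the paragraph, so the plain
# string value of the paragraph is exactly the in-flow text.
par = doc.find("body/par")
flow_text = "".join(par.itertext())
is_short_enough = len(flow_text) <= 35

# Index the out-of-flow definitions by id so that references can be followed.
targets = {el.get("id"): el for el in doc.find("extflow") if el.get("id")}


def resolve(contentref):
    """Return the out-of-flow element a contentref points to, or None."""
    return targets.get(contentref.get("idref"))


footnote = resolve(par.find("contentref"))
print(flow_text, is_short_enough)
print(footnote.tag, footnote.text)
```

The length check now works on the paragraph without any exclusion logic, and the footnote content is still reachable through the idref when it is needed.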

Layout properties exposed as namespaced attributes

You will often want to access certain layout properties on elements. These CSS properties are exposed in the tree as synthesized attributes. That they are synthesized dynamically at query time is a technical detail you can neglect for most operations, except that you cannot set them. They only become real element attributes when you serialize the internal tree with the XML Exporter module (and they are not filtered by its attribute filter settings). The properties are exposed on elements in the three semantic namespaces css, csso and cssc. See the upCast RT manual for details on the semantics of these namespaces.
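As a rough illustration of reading such properties from serialized output, the sketch below uses Python's standard library. Several things in it are assumptions and not taken from this schema: the namespace URI bound to the css prefix is a placeholder (the real URIs are documented in the upCast RT manual), the properties are placed on the paragraph element for brevity, and css_property is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Placeholder only: NOT the real namespace URI of the css namespace.
CSS_NS = "urn:example:css"

# Hypothetical serialized fragment carrying two CSS properties as attributes.
serialized = (
    f'<par xmlns:css="{CSS_NS}" css:font-size="16pt" css:font-weight="bold">'
    "Java-based applications</par>"
)
par = ET.fromstring(serialized)


def css_property(element, name, namespace=CSS_NS):
    """Read a namespaced layout attribute; returns None if it is absent."""
    return element.get(f"{{{namespace}}}{name}")


# The heading rule discussed earlier: short text that is 16pt and bold.
looks_like_heading = (
    len("".join(par.itertext())) <= 35
    and css_property(par, "font-size") == "16pt"
    and css_property(par, "font-weight") == "bold"
)
print(looks_like_heading)
```

Because the attributes are read-only inside upCast RT, a script like this only makes sense on the exported XML, not on the internal tree.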

Document structure

Basic structure

The basic document structure looks as follows:

<uci:document>
  <uci:head>...</uci:head>
  <uci:body>...</uci:body>
  <uci:extflow>...</uci:extflow>
</uci:document>

The uci:body element contains all content that is within the regular document flow.

The uci:extflow element contains all out-of-flow element definitions like footnotes, endnotes, annotations, page headers and footers, binary objects, textboxes etc.
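The following sketch shows one way to locate the three top-level sections in a serialized document with Python's standard library. The namespace URI is the one given for this schema; the sample content is invented for illustration, and placing uci:footnote directly under uci:extflow follows the conceptual example earlier in this text and may be simpler than what the schema actually requires.

```python
import xml.etree.ElementTree as ET

UCI = "http://www.infinity-loop.de/namespace/2006/upcast-internal"
NS = {"uci": UCI}

# Invented sample document that follows the basic structure shown above.
sample = f"""<uci:document xmlns:uci="{UCI}">
  <uci:head/>
  <uci:body><uci:par>In-flow text</uci:par></uci:body>
  <uci:extflow><uci:footnote>Out-of-flow text</uci:footnote></uci:extflow>
</uci:document>"""

root = ET.fromstring(sample)

# Look up each top-level section by its qualified name.
sections = {
    name: root.find(f"uci:{name}", NS) for name in ("head", "body", "extflow")
}


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.split("}", 1)[-1]


in_flow = [local_name(el) for el in sections["body"]]
out_of_flow = [local_name(el) for el in sections["extflow"]]
print(in_flow, out_of_flow)
```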

Tables

Tables are represented as a rectangular, regular grid of uci:cell objects, organized in uci:row objects, which are contained in a uci:table object. There are no elements to physically represent table headers, table footers or row groups. Instead, this information is attached to uci:cell and uci:row objects as attributes. This simple, generic grid structure allows us to:

- change row-grouping structures programmatically by simply setting attributes, instead of having to physically juggle table elements (creation, moving, etc.)
- be table-model independent: the XML Exporter takes care of serializing the generic internal table structure in one of (currently) three target models, based solely on the attribute settings: HTML, CALS and native.
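The attribute-driven grouping described above can be sketched as follows in Python. The attribute name rowgroup and its values header and body are purely hypothetical, since the actual attribute names are defined elsewhere in the schema, and the cell contents are dummy values.

```python
import xml.etree.ElementTree as ET

UCI = "http://www.infinity-loop.de/namespace/2006/upcast-internal"
NS = {"uci": UCI}
ROW_GROUP_ATTR = "rowgroup"  # hypothetical attribute name, for illustration

# A flat 2x2 grid: rows of cells, no header, footer or group elements.
table = ET.fromstring(
    f"""<uci:table xmlns:uci="{UCI}">
  <uci:row><uci:cell>A1</uci:cell><uci:cell>B1</uci:cell></uci:row>
  <uci:row><uci:cell>A2</uci:cell><uci:cell>B2</uci:cell></uci:row>
</uci:table>"""
)
rows = table.findall("uci:row", NS)

# Turning the first row into a header row only sets an attribute;
# no element is created, moved or re-parented.
rows[0].set(ROW_GROUP_ATTR, "header")

# A serializer could derive the row groups from the attributes alone.
groups = {}
for row in rows:
    cells = [cell.text for cell in row.findall("uci:cell", NS)]
    groups.setdefault(row.get(ROW_GROUP_ATTR, "body"), []).append(cells)

# The grid stays regular: every row has the same number of cells.
is_regular = len({len(row.findall("uci:cell", NS)) for row in rows}) == 1
print(groups, is_regular)
```

An exporter for HTML or CALS would then emit its own header and body wrappers from these groups, while the internal tree keeps its flat row and cell structure.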