upCast 7.0 Reference Manual

Christian Roth

Revision History
Revision 1    Sun, 17 Jan 2010 13:36:00 CET

1. upCast Overview
1. What is it?
2. System requirements
2. upCast Architecture
1. The pipeline component
2. Pipeline views
3. The module component
4. Parameter Sets
5. Conversion phases
3. upCast UI
1. The pipeline document window
2. File menu
2.1. New
2.2. New from Template
2.3. Open…
2.4. Open Recent
2.5. Close
2.6. Save
2.7. Save as…
2.8. Save to Parameter Set…
2.9. Export to Ant
2.10. Export to Java Source…
2.11. Export to XML…
2.12. Generate documentation…
3. Edit menu
3.1. Cut / Copy / Paste
3.2. Copy as UPL
3.3. Duplicate
4. Pipeline menu
4.1. Run
4.2. Simple View
5. Help menu
5.1. Information…
5.2. License Agreement
5.3. Logfile
5.4. upCast Documentation…
5.5. UPL Documentation
5.6. Javadoc API Documentation…
5.7. Send Feedback…
4. The variable system
1. Variable reference syntax
2. The global realm
3. The application realm
4. The environment realm
5. The pipeline realm
6. The module realm
7. The javaproperty realm
8. The include realm
9. Parameter and Variable Types
5. Application-level Settings
1. Application Preferences
1.1. Application Settings
1.2. Catalogs
1.3. Font Configuration
1.4. Encodings
1.5. License
6. Pipeline-level Settings
1. Typographical conventions
2. Special pre-defined pipeline variables
2.1. PipelineBase, base
2.2. PipelineURI
2.3. ParamBase
2.4. ParamURI
2.5. PipelineInstanceId
3. Pipeline Settings
3.1. Pipeline Parameters
3.2. Settings
3.3. Catalogs
3.4. Font Configuration
3.5. Encodings
3.6. Export
3.7. License
7. Module-level settings
1. Typographical conventions
2. General module parameters
3. Module-specific parameters
8. Modules
1. Pipeline Variables [pipelinevars]
2. RTF Importer (upCast) [rtfimport]
3. UPL Processor [uplcode]
4. UPL Tree-Processor [upl]
5. Sectioner [sectioner]
5.1. Handling of uci:part elements
5.2. Handling of other elements
6. [DEPRECATED] Grouper [grouper]
7. XML Importer [xmlimport]
8. XML Exporter [xmlexport]
9. Commandline Processor [commandline]
10. XSLT Processor [xslt]
11. Unicode Translation Processor [unicodetranslator]
12. XML Validator [validator]
13. CSS Exporter [css]
14. RTF Exporter (“downCast”) [rtfexport]
15. External Pipeline Processor [extpipeline]
9. Parameter Sets
1. What parameter sets are
2. What parameter sets contain
3. How parameter sets work
4. Creating parameter sets
5. Variable: ${pipeline:ParamBase}
6. What happens when…
6.1. …the pipeline implementation’s number or type of parameters changes?
6.2. …I change a pipeline implementation while a depending parameter set is open?
6.3. …the Pipeline UID changes and parameter sets using the old id already exist?
6.4. …there is no mapping in the catalog system for a certain parameter set UID?
10. Grouping using Painters
1. The Painter concept
1.1. Tagging nodes and placing the painters
1.2. Painting the nodes
2. Node Tags
3. Painter Types
3.1. start-end
3.2. start*end
3.3. start-here
3.4. start*here
3.5. here-end
3.6. here*end
3.7. start-start
3.8. start*start
3.9. end-end
3.10. end*end
3.11. this
4. Grouping algorithm
5. Examples
5.1. Grouping by paragraph class
5.2. Grouping with a known start element
11. Commandline Interface
1. How it works
2. Synopsis
12. XML Namespaces in upCast
1. The upcast-internal namespace (uci)
2. The upcast-css namespace (css)
3. The upcast-cssoverride namespace (csso)
4. The upcast-cssclass namespace (cssc)
5. The upcast-cals namespace (cals)
6. The HTML namespace (html)
7. The XLink namespace (xlink)
8. The XML namespace (xml)
9. The Variable Realm namespaces
10. UPL Utility functions library namespace
13. Recognized Java system properties
14. Java API
1. Concepts
2. Using the API
2.1. General programming steps
2.2. Setup
2.3. Building and running a pipeline
3. Connecting to WordLink (Windows only)
3.1. Accessing Word as COM object in a restricted environment
4. API Error handling
4.1. Coding pattern
4.2. Error handling tidbits
15. upcast-runner Ant task
1. Defining the upcast task
2. Structure of the upcast-runner task
2.1. upcast-runner
2.2. source
2.3. catalogs
2.4. catalog
2.5. param
16. upcast Ant task
1. Defining the upcast task
2. Structure of the upcast task
2.1. upcast
2.2. source
2.3. settings
2.4. licensefile
2.5. logging
2.6. catalogs
2.7. catalog
2.8. encodings
2.9. encoding
2.10. fontconfig
2.11. parameters
2.12. pipeline
2.13. module
2.14. param
17. Logging Architecture
1. Log event processing
2. Processing exception
3. Filter specification syntax
18. Pipeline Templates
19. Standard Folders and Locations
20. Unicode translation map
1. Syntax
2. Options
2.1. @option-default-mapping
21. CSS property unit table
1. Syntax
2. Options
2.1. @option-default-length-unit
2.2. @option-default-length-precision
22. Fonts and Encodings
1. Font Configuration
1.1. Properties and Values
1.2. Options
1.3. File structure
1.4. Matching Algorithm
2. Custom Encodings
2.1. How it works
2.2. Associating a Font with an Encoding
2.3. File format
23. Troubleshooting
1. Finding out basic version info of an upcast.jar
2. Finding out extended environment info
3. Extended log info
24. Copyright, Licenses, Legal, Acknowledgements
1. Copyright, Licenses, Legal
1.1. upCast
1.2. Steadystate CSS2 parser
1.3. Xerces, Xalan
1.4. W3C
1.5. XML- and OASIS Catalog Support
1.6. Saxon 6.x
1.7. Saxon-B 9.x
1.8. MRJAdapter
1.9. Jaxen
1.10. Jing
2. Acknowledgements and Thanks

Important Note

This document is intended as a technical reference manual to upCast RT (in the following called just upCast).

It is not intended as a tutorial on how to use upCast efficiently, how best to create pipelines, or similar topics. These will be covered in a separate tutorial-style document, a How-To section on our website, and a Frequently Asked Questions document. Please refer to these documents as they are published on our website http://www.upcast.de/ in the near future.

This reference document describes upCast RT 7.1.2.

Chapter 1. upCast Overview

1. What is it?

upCast is a module-based document processing pipeline tool, specializing in legacy, “flat” and layout-driven content. It comes with pre-defined, configurable, task-oriented modules (performing operations such as importing data, XSLT processing, serialization and validation) that you can arrange in any order to create a pipeline. Pipelines can be saved and parameterized as a whole and then run within upCast’s UI, from the commandline, or directly from Java.

Pipelines can be set up to be fully relative in their file addressing and can therefore be shared without modifications between computers, even across different platforms.

2. System requirements

To run upCast, you must meet the following requirements:

  • Java Runtime Environment 5.0 or later (“Java 5”)

  • Xerces 2.9.x or later (upCast includes Xerces 2.9.1 and does not work in systems that have earlier versions than Xerces 2.9 in their classpath)

  • 512 MB of RAM available to upCast (depending on document size and pipeline configuration)

  • Display resolution of at least 1024 x 768 (when running the graphical development environment)

Chapter 2. upCast Architecture

1. The pipeline component

The highest-level component type in upCast usage is a document processing pipeline, or pipeline for short. Pipelines can be saved into documents (file extension .ucdoc) and recalled at any time. Complete pipelines can be exported into several formats, such as a Java source file or source code for an Ant target.

2. Pipeline views

To the user, upCast presents its functionality in two layers: the so-called “Simple View” and the “Edit View”. Think of the Simple View as a simplifying, user-oriented layer over the Edit View, which is developer-oriented and shows the actual, fine-grained and possibly complex implementation of the conversion pipeline.

Simple View as a user-oriented layer over the detailed Edit View on a pipeline’s implementation

3. The module component

Pipelines are made up of modules. Each module performs a specific, specialized task. Based on the tasks they perform, modules can be divided into three categories: importers, processors and exporters.

Importers import documents into the internal document format. upCast currently includes a high-quality RTF/Word importer.

Processors come in two variants for internal and external processing. Internal processors modify the current, internal document representation. This is carried out in-place. External processors can be used to perform general tasks which are not dependent on the internal document, like running a shell command.

Exporters are used to serialize the internal document or part thereof in one of several formats.

Within a pipeline, at any time during execution there’s exactly one internal document representation the tasks are performed on. This means that modifications are in most cases performed in-place, so changes made to the internal document tree by one module are visible to subsequent ones.

While a run of an importer always replaces the internal document, you can have several exporters that serialize the same internal document in different ways. You can also serialize the document at any point in the pipeline and apply additional modifications using processors afterwards.
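The life cycle of the internal document described above can be sketched in a few lines of Python. This is an illustrative model only, not upCast code: the `run_pipeline` function, the `(kind, action)` module list and the list-of-strings "document" are all hypothetical stand-ins chosen to show how an importer replaces the document, a processor changes it in place, and exporters serialize it at different points.

```python
def run_pipeline(modules):
    """Run (kind, action) pairs in order against one internal document."""
    document = None  # the single internal document representation
    outputs = []     # serializations produced by exporters
    for kind, action in modules:
        if kind == "importer":
            # An importer run always replaces the internal document.
            document = action()
        elif kind == "processor":
            # An internal processor modifies the document in place.
            action(document)
        elif kind == "exporter":
            # An exporter serializes the document without replacing it.
            outputs.append(action(document))
        else:
            raise ValueError(f"unknown module kind: {kind!r}")
    return document, outputs


def import_sample():
    # Hypothetical importer: produces a fresh internal document.
    return ["first paragraph", "second paragraph"]


def uppercase_in_place(document):
    # Hypothetical processor: changes are visible to later modules.
    document[:] = [text.upper() for text in document]


def export_joined(document):
    # Hypothetical exporter: serializes the current document state.
    return " | ".join(document)


# Export once, modify the document, then export the modified state.
document, outputs = run_pipeline([
    ("importer", import_sample),
    ("exporter", export_joined),
    ("processor", uppercase_in_place),
    ("exporter", export_joined),
])
```

The two exporter runs see different states of the same internal document, which mirrors the statement that you can serialize at any point and keep processing afterwards.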

4. Parameter Sets

It is often useful to be able to save, quickly recall and share with other users different parameter settings for running a certain parameterized pipeline. Such a parameter set can be saved into a document (file extension .ucpar) and recalled at any time.

A parameter set document only contains the pipeline parameter values as they are set in the Simple View at the time of saving. Only parameters that have their persistent property set to true are saved in that document. It also contains the Pipeline UID of the pipeline document it is based on so that it can load its implementation for execution.
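As a rough illustration of what such a document holds, the following Python sketch collects the Pipeline UID and the values of persistent parameters only. The class, function and field names are hypothetical, and the dictionary is not the actual .ucpar file format, which this section does not describe.

```python
from dataclasses import dataclass


@dataclass
class PipelineParameter:
    """Hypothetical stand-in for one pipeline parameter in the Simple View."""
    name: str
    value: str
    persistent: bool


def build_parameter_set(pipeline_uid, parameters):
    """Collect the information a parameter set holds, per the text above."""
    return {
        # Reference to the pipeline document that supplies the implementation.
        "pipeline_uid": pipeline_uid,
        # Only parameters whose persistent property is true are saved.
        "values": {p.name: p.value for p in parameters if p.persistent},
    }


parameter_set = build_parameter_set(
    "example-pipeline-uid",  # hypothetical UID value
    [
        PipelineParameter("source", "input.rtf", persistent=True),
        PipelineParameter("scratch", "tmp-123", persistent=False),
    ],
)
```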

For details, see the section on Parameter Sets.

5. Conversion phases

Usually, a conversion is a three-phase process: You import the source data into the application, process the data, and export the result. Sometimes, a fourth, external post-processing step is added. upCast offers various modules, which can be divided into three different classes: Importers, Processors and Exporters.

Here’s a diagram of a typical upCast pipeline (with the internal document indicated over time):

upCast sample pipeline with corresponding internal document life-span; list of modules

Chapter 3. upCast UI

1. The pipeline document window

The upCast UI is designed to be simple and effective. An upCast document is a complete pipeline setup and can be saved in a file with the default extension .ucdoc. Each document is shown in its own window, and you can have several pipelines open at the same time.

A document window in edit mode is divided into three parts.

The left pane shows the sequence of modules that make up the pipeline. The position of a selected module in the pipeline can be changed by using the nudge-up/nudge-down controls at the bottom of the list. A module can be deleted from the pipeline with the “–” control, a module can be added by clicking the “+” control and choosing the desired class from the popup. There can be multiple instances of the same module type in a pipeline as required, e.g. two or more XSLT processors.

The right pane shows the parameters for the currently selected module. Only one module can be selected at any time. Changes to a module’s parameters are effective immediately.

At the bottom of the window are the pipeline execution controls for running a pipeline, stopping it while it is running, and checking its progress.

An upCast document window showing a pipeline setup with the XML (Raw) exporter selected

Note

This display is replaced by a dynamically generated, forms-like interface when the Simple View option is engaged.

2. File menu

2.1. New

This command creates a new, empty pipeline document.

2.2. New from Template

This command lets you create either a parameter set based on a factory-supplied template or a new, independent, self-contained pipeline configuration from one of the available templates.

create parameter set… Creates a parameter set from the respective template’s main pipeline document.

Note

The advantage of just creating a parameter set is that if you do not need to tweak the implementation, but just use the pipeline template’s functionality as-is only with variable parameter values, you will benefit from updates and bugfixes to the template automatically without any further manual intervention required. This comes from the fact that the parameter set only holds a reference to the actual template implementation and therefore is automatically updated when the implementation is.

create independent copy… Creates a full, physical copy of all the pipeline documents and resources the template is made up of. You are asked for the location (folder) and a base name for the new pipeline. Within the selected folder, a new folder by the specified name is created and any resources of the template, including the pipeline document, are copied into that folder.

Note

A pipeline created this way is completely independent of the template it is based on. This means two things:

  • you get a complete, independent copy of the original template definition and resources

  • any updates to the template are not propagated forward to any pipelines you already have created based on an older version of the template

You can create your own, specific templates. For details on what makes a pipeline an upCast template and where to put those templates for upCast to recognize them, see the chapter on Pipeline Templates.

2.3. Open…

This shows a file chooser where you can open an already existing pipeline or parameter set document.

2.4. Open Recent

Shows a sub-menu listing the pipeline and parameter set documents you opened most recently. The number of items displayed in the sub-menu can be set in upCast’s preferences.

Pipeline or parameter set documents you had open recently, but which are no longer available (for example because they have been deleted or the disk they reside on is currently not mounted) are shown in disabled state.

2.5. Close

Closes the top-most document. When changes to this document have not yet been saved, you are prompted to save them.

2.6. Save

Saves the top-most document, which can be a pipeline or parameter set document, the log window or the system information window.

2.7. Save as…

This allows you to save the top-most window under a new name.

Note

If a pipeline document refers to required resources by relative paths, saving it to a different location will usually break those links, and the pipeline will not run as expected. upCast cannot reliably track such resource links and copy the resources along automatically.

2.8. Save to Parameter Set…

This lets you save the persistent parameters of the top-most pipeline document and their values to a separate file, a parameter set document. This file internally links back to the pipeline document it was created from. This allows you to separately store configurations of parameter values that look like separate pipelines, but share one single pipeline implementation. When the latter gets updated, so do all parameter sets originating from it.

See the section on parameter sets for more information on how the linking to the respective pipeline document works and what the restrictions of parameter sets are.

2.9. Export to Ant

Saves the current pipeline document in the form of an Ant task. Additional parameters can be set for the export operation in the Pipeline Settings dialog under the Export tab.

using ‘upcast-runner’ task

Exports an Ant task that makes use of an upCast runner object, which reads the specified pipeline and executes it. This is the recommended export option, since you need to generate the task only once and it automatically picks up any changes in the referenced pipeline document.

as self-contained Task

Creates a fully self-contained Ant task from the current pipeline’s configuration. This means that the task can be run without access to the original pipeline document it was generated from. This may be useful when you used the original pipeline document only for prototyping and testing and want to apply changes directly to the Ant task’s definition thereafter, or when you can recreate the task automatically after changing the pipeline document (e.g. in an automated build using upCast’s Tools class).

2.10. Export to Java Source…

Saves the current pipeline document as Java Source code. Additional parameters can be set for the export operation in the Pipeline Settings dialog under the Export tab.

using RunPipeline class

Exports Java source code that makes use of upCast’s RunPipeline class, which reads the specified pipeline and executes it. This is the recommended export option, since you need to re-generate the source code for that class only when the pipeline parameter configuration changes (i.e., parameters are added or removed), and it automatically picks up any further changes in the referenced pipeline document.

as self-contained source

Creates a fully self-contained Java class from the current pipeline’s configuration, using the methods of the upCast Java API’s UpcastEngine class. This means that the code can be run without access to the original pipeline document it was generated from. This may be useful when you need fine-grained control over error handling for each individual module’s execution step and/or need to dynamically execute additional code that cannot be integrated into a standard pipeline execution.

2.11. Export to XML…

Exports the current pipeline document as a human-readable XML source file.

This file is also used internally as the basis for the Ant task and Java Source export options, which are generated by appropriately configured XSLT transformations. With this export, you can create your own formats of export (e.g. customized Java code export or extended documentation generation).

2.12. Generate documentation…

Generates a self-contained HTML page with automatically generated documentation of the top-most pipeline document including things like commandline call syntax and Java call syntax, parameter descriptions and more.

3. Edit menu

3.1. Cut / Copy / Paste

The operations Cut, Copy and Paste are context-sensitive: their behavior depends on where the keyboard focus currently is:

text field

When the focus is on a text field, these operations work as usual.

pipeline modules list

When the focus is on a module in the pipeline modules list, Copy places that module’s complete definition on the clipboard in the form of an XML snippet. When you use Paste while the focus is on a module entry, the module description on the clipboard is read and a new module is inserted above the currently selected module, with all parameters set as in the module you copied. This also lets you copy modules conveniently across open pipelines.

3.2. Copy as UPL

When the focus is on a module in the pipeline modules list, this command will create UPL source code for running the selected module from UPL using the run-module() function and put it as text onto the clipboard. You can then insert it into a UPL code field within upCast or your favorite external editor where you are writing your UPL code.

3.3. Duplicate

This function only works when the focus is on a module in the pipeline modules list. In this case, it inserts a copy of the currently selected module directly below it.

4. Pipeline menu

4.1. Run

Runs the pipeline in the top-most pipeline window.

4.2. Simple View

With this toggle, you switch between the Simple View and Edit View of a pipeline configuration.

When checked, upCast shows its pipeline window in Simple View mode, hiding the actual pipeline implementation and showing only entry fields for the pipeline parameters that a typical user must supply.

When you want to edit the details of a pipeline, uncheck this item.

The state of this setting is saved to the pipeline document and automatically restored when the document is opened. For final distribution to your customers, you should therefore check this item and save the document again before packaging it into your distribution.

5. Help menu

5.1. Information…

Shows a window with detailed information on the execution environment of the topmost pipeline document and the upCast application, including version information on available XSLT processors, Java, loaded modules, license info etc. You may be asked by infinity-loop support for this info when tracking down problems you may have with upCast.

5.2. License Agreement

Shows the upCast License Agreement in a window.

5.3. Logfile

Shows a window with the external log file or a live view of log events as they are generated from within upCast.

The Source popup menu lets you choose between these two modes:

Logfile

shows the current contents of the log file on disk

Live Events

shows the log events in the system as they are generated from log sources within upCast

When showing Live Events, you can set a filter describing which log events generated by upCast should be displayed. This is done using the Filter text field. This setting is completely independent of the log level setting in upCast’s preferences. Several pre-defined settings are available from the associated popup menu, but you are free to specify any log event filtering expression you wish. The filter expression syntax is described in the section “Filter specification syntax” of the chapter on Logging Architecture and is the same as used in other places within upCast.

All log events are held indefinitely while the window is open or until you click on Clear Window, so you should not leave the window open unattended; otherwise you will eventually run out of heap space. When the window is in Live Events mode, pipeline execution slows down to a degree that depends on the number of log events to display. There’s no performance penalty when the window is closed, as it then detaches itself from all log sources automatically.

With Save as Text…, you can save the current contents of the window to a text file. You may be asked by infinity-loop support for this info when tracking down problems you may have with upCast.

5.4. upCast Documentation…

Shows this upCast reference documentation manual in the host system’s default web browser.

5.5. UPL Documentation

Shows the UPL reference documentation in the host system’s default web browser.

5.6. Javadoc API Documentation…

Shows the upCast API documentation (in javadoc format) in the host system’s default web browser.

5.7. Send Feedback…

Opens a pre-configured email in your default email application, ready for you to add your problem report or question to the infinity-loop support department. The email includes system information, which you can preview in the generated email and, if desired for privacy reasons, trim to your liking.

You should use this function whenever you want to report a bug or problem to infinity-loop.

Chapter 4. The variable system

upCast offers several variable realms. Realms are distinct, non-overlapping value storage spaces. Think of them as different buckets placed next to each other, labelled with the realm name.

Some of the realms are read-only, and some of them calculate the actual value of a variable at the time of retrieval access.

Here’s an overview of the different realms and their names (monospace bold grey print) available in upCast:

Variable realms and names in upCast

To get or set a variable, two components must be specified:

  • the realm

  • the variable name

A variable reference is resolved in an upCast parameter field by simply replacing the variable reference by the textual value of the variable referenced.

Note

It is important to always keep in mind that the variable resolution process is an utterly dumb textual replacement process (much like a #define works in the C programming language). Specifically, no quoting or unquoting is performed.

The result of a variable reference to a variable that does not exist or cannot be resolved is the variable reference itself.

A piece of text containing variable references is processed repeatedly for as long as the result keeps changing. This allows, for example, references to the include realm to be resolved in already included content as well. Consequently, you must properly quote content that looks like a variable reference but must not be resolved (e.g. by doubling the $ sign). To avoid potential infinite recursion, this repeated resolution of a source string is terminated when the result still changes after a certain number of iterations. The limit is currently set at 32 iterations by default. It can be changed by setting the Java property de.infinityloop.application.maxvarrecursion.
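The resolution rules above (plain textual replacement, unresolvable references left as they are, repeated passes until the text is stable or the iteration limit is reached) can be modelled with a short Python sketch. It is an approximation under stated assumptions, not upCast’s implementation: the `resolve` function and the nested-dictionary `realms` are hypothetical, modifiers and `$` quoting are not modelled, and returning the last intermediate result when the limit is hit is an assumption, since the manual does not say what is returned in that case.

```python
import re

MAX_PASSES = 32  # default iteration limit stated in the manual

# Matches ${realm:name}; references carrying a #modifier are left untouched here.
_REFERENCE = re.compile(r"\$\{([a-z]+):([^#{}$:]+)\}")


def resolve(text, realms, max_passes=MAX_PASSES):
    """Textually replace variable references until the text stops changing."""

    def substitute(match):
        realm, name = match.group(1), match.group(2)
        try:
            # Pure textual replacement: no quoting or unquoting is applied.
            return str(realms[realm][name])
        except KeyError:
            # An unknown variable resolves to the reference itself.
            return match.group(0)

    for _ in range(max_passes):
        result = _REFERENCE.sub(substitute, text)
        if result == text:
            break  # stable: nothing left to resolve
        text = result
    return text


realms = {
    "pipeline": {
        "dir": "/tmp/out",
        "target": "${pipeline:dir}/result.xml",  # resolved in a second pass
    }
}
resolved = resolve("${pipeline:target}", realms)
```

The second pass is what resolves the reference that only appears after the first replacement; a self-referencing value simply stops after `max_passes` iterations instead of looping forever.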

Naming restriction

All variable names that start with an upper-case letter are reserved for upCast’s own use.

You should therefore name your own variables so that they do not start with an upper-case letter, even if no upCast-defined variable of the same name exists at that time. We might introduce one in a subsequent release, and your pipeline would then no longer work correctly.
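A check for this naming rule is easy to automate, for instance when generating variable names in a build script. The helper below is a hypothetical convenience, not part of upCast.

```python
def is_reserved_name(name):
    """Return True if a variable name starts with an upper-case letter.

    Such names are reserved for upCast's own use according to the
    naming restriction above. An empty name is not treated as reserved.
    """
    return name[:1].isupper()


own_names = ["sourceFile", "outputDir", "ParamBase"]
conflicting = [name for name in own_names if is_reserved_name(name)]
```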

1. Variable reference syntax

The syntax to refer to a variable in a specific realm is similar to that of Ant, albeit with a twist:

${realm:name#modifier}

Note the special #modifier part: it lets you transform the stored value of a variable in specific ways before it is returned. This is most useful for file paths, e.g. to retrieve only the name of a file from an absolute path, its base name, or just the path to the file.

Note

As with Ant, variable references cannot be nested, i.e. you cannot write something like ${module:${pipeline:paramname}} to calculate the name of a module variable dynamically.

The components of a variable reference are:

realm

the realm of the variable; available values: application, environment, pipeline, module, javaproperty, include

name

the name of the variable

modifier

for URL and file path variables only: extract elements of a path and/or convert the resulting variable value between local file system and URL format. The following modifiers are currently supported:

  • local: returns the value of the variable in local file system format

  • url: returns the value of the variable in URL format

  • localpath: returns only the path component (without filename and without trailing file separator) of the value of the variable. If the variable is a folder, the value is returned unchanged.

  • urlpath: same as localpath, but returns the value in URL format

  • localname: returns only the file name component of the variable value in local format

  • urlname: same as localname, but returns the value in URL format

  • localextension: returns only the file extension (excluding the dot) of the variable value in local format; empty when there is no file extension

  • urlextension: same as localextension, but returns the value in URL format

  • localbasename: returns the same value as localname, but with the trailing dot and extension stripped if they exist

  • urlbasename: same as localbasename, but returns the value in URL format

  • localbasenamepath: essentially localpath + localbasename, i.e. the value of the variable minus the extension (including the trailing dot)

  • urlbasenamepath: same as localbasenamepath, but returns the value in URL format
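To make the three components concrete, here is a small Python sketch that splits a single reference into realm, name and modifier. The `parse_reference` function and the regular expression are hypothetical and only approximate the syntax described above. The realm is treated as optional because the modifier examples in this chapter write references such as ${SourceFile#local} without one; what upCast does with an omitted realm is not covered by this sketch.

```python
import re
from typing import NamedTuple, Optional


class VariableReference(NamedTuple):
    realm: Optional[str]
    name: str
    modifier: Optional[str]


# ${realm:name#modifier}, with realm and modifier treated as optional.
_PATTERN = re.compile(r"\$\{(?:([a-z]+):)?([^:#{}$]+)(?:#([a-z]+))?\}")


def parse_reference(text):
    """Split one variable reference into realm, name and modifier."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        # Includes nested references, which the syntax does not allow.
        raise ValueError(f"not a variable reference: {text!r}")
    return VariableReference(*match.groups())


reference = parse_reference("${pipeline:ParamBase#localpath}")
```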

Example 4.1. Modifier sample results

With SourceFile having a value of “C:\Documents and Settings\upCast\The file.xml”, the following variable references with modifiers will evaluate to:

${SourceFile#local}

C:\Documents and Settings\upCast\The file.xml

${SourceFile#url}

file:///C:/Documents%20and%20Settings/upCast/The%20file.xml

${SourceFile#localpath}

C:\Documents and Settings\upCast

${SourceFile#urlpath}

file:///C:/Documents%20and%20Settings/upCast

${SourceFile#localname}

The file.xml

${SourceFile#urlname}

The%20file.xml

${SourceFile#localextension}

xml

${SourceFile#urlextension}

xml

${SourceFile#localbasename}

The file

${SourceFile#urlbasename}

The%20file

${SourceFile#localbasenamepath}

C:\Documents and Settings\upCast\The file

${SourceFile#urlbasenamepath}

file:///C:/Documents%20and%20Settings/upCast/The%20file
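The modifier results listed in Example 4.1 can be reproduced with a short Python sketch built on the standard `ntpath` and `urllib.parse` modules. This is a re-implementation for illustration, not upCast’s code: the `apply_modifier` function is hypothetical, it handles Windows-style local paths only, and it cannot tell files from folders, so the rule that folder values are returned unchanged by localpath is not modelled.

```python
import ntpath
from urllib.parse import quote


def apply_modifier(local_path, modifier):
    """Return the part of a Windows-style path selected by a modifier name."""
    if modifier.startswith("url"):
        as_url, part = True, modifier[len("url"):]
    elif modifier.startswith("local"):
        as_url, part = False, modifier[len("local"):]
    else:
        raise ValueError(f"unknown modifier: {modifier!r}")

    file_name = ntpath.basename(local_path)
    selected = {
        "": local_path,                                  # local / url
        "path": ntpath.dirname(local_path),              # folder, no trailing separator
        "name": file_name,                               # file name with extension
        "extension": ntpath.splitext(file_name)[1][1:],  # extension without the dot
        "basename": ntpath.splitext(file_name)[0],       # file name without extension
        "basenamepath": ntpath.splitext(local_path)[0],  # full path without extension
    }
    if part not in selected:
        raise ValueError(f"unknown modifier: {modifier!r}")
    value = selected[part]

    if not as_url:
        return value
    if part in ("", "path", "basenamepath"):
        # Full paths become file: URLs with forward slashes and escaped spaces.
        return "file:///" + quote(value.replace("\\", "/"), safe="/:")
    # Name-like parts are only percent-escaped.
    return quote(value)


source_file = r"C:\Documents and Settings\upCast\The file.xml"
url_form = apply_modifier(source_file, "url")
```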


Let’s have a look at the various realms in more detail:

2. The global realm

This realm is not yet available and will be implemented in a later release of upCast.

3. The application realm

This realm is read-only.

This realm includes upCast application-global values.

The following variables are currently defined:

variable name       description
SupportFolder       path to the (OS/system-specific) support files folder
BundledResources    path to the resources folder bundled with the application distribution (when it was installed using one of the system-specific distribution packages)
Logfile             the path to the external logfile as calculated by the application and/or set in the Java system property de.infinityloop.application.logfile

By default, all file path values returned are in URL format. You can use all available modifiers on them, of course, to change format or extract parts from them.

Example 4.2. 

To retrieve the location of the application’s support folder on the system it is running on, use:

${application:SupportFolder}

4. The environment realm

This realm is read-only.

This realm includes upCast pipeline-global environment values. Most of them are virtual in the sense that they reflect the current state of the execution environment at the time they are retrieved and are not actually stored.

The following variables are currently defined:

variable name

Java type

description

version

Integer

the application version (0x0Mmr format)

version-string

String

the application version in “M.m.r” format

build

Integer

the build number

build-timestamp

String

the timestamp string of the build in the format “dd-mm-yy hh:mm:ss [+|-]zzzz

license-features

List

list of Strings of features included in the current license; features not active are enclosed by parantheses ‘(‘ and ‘)’

license-features-valid

List

list of Strings that only includes features in the current license that are valid at the time of the query

license-info

String

a string describing the current license

license-feature-expdays-featurename

Integer

number of days until the license feature featurename expires

dir-installation

String

application installation folder

xml-catalogs

List

list of Strings identifying the locations of currently active XML catalog files in the pipeline

xml-xerces-version

String

version information of the included Xerces parser

xslt-xalan-version

String

version information of the included Xalan XSLT processor

xslt-saxon-version

String

version information of the included Saxon 9.x XSLT 2 processor

xslt-saxon6-version

String

version information of the included Saxon 6.x XSLT 1 processor

wordlink-version

Integer

version of the active WordLink component; returns null when WordLink is not installed or active

wordlink-wordversion

Integer

version of Microsoft Word that WordLink is currently linking to; returns null when Word is not installed on this machine or WordLink is not functional

wordlink-binarypath

String

absolute path to the application used for implementing the WordLink functionality; returns null when WordLink is not installed or active

mathlink-version

Integer

version of the active MathLink component (implementing the link to MathType 5); returns null when MathLink is not installed or active

mathlink-dllversion

Integer

version of the MathType DLL used for implementing MathLink; returns null when MathLink is not installed or active

progress-sublabel

String

the text currently displayed in the progress bar’s sub-label

progress-label

String

the text currently displayed in the progress bar’s label

progress-task-current

Long

the ordinal number (1-based) of the currently executed module task in the pipeline

progress-task-count

Long

the total number of tasks defined in the current pipeline

progress-task-current-max

Long

the maximum value for completion indication of the current task

progress-task-current-value

Long

the current value of completion for the currently running task; the task is completed when this value is equal to progress-task-current-max

dir-support

String

the folder searched for application support files

dir-licenses

String

the folder searched for license files

logfile

String

the absolute path of the log file written to

pipeline-gui

Boolean

true when this pipeline's GUI is shown

This means that the pipeline must be a top-level pipeline (see pipeline-toplevel) and that upCast must be running in GUI mode (as opposed to being run as a commandline tool or being controlled via its Java API).

pipeline-toplevel

Boolean

true when this pipeline is the top-level pipeline executing

This means that the pipeline is not one that is executed within an External Pipeline Processor as a sub-pipeline

version-latest

Integer

the build number of the latest available version of this application

This information is retrieved from infinity-loop’s servers by fetching the URL http://versioncheck.upcast.de/upcast7.plist.

When there is no newer version available, this returns 0.

When the information could not be retrieved (e.g. due to a server error or if there is no active connection to the internet), null is returned.
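The three possible outcomes of version-latest can be told apart as in the following Java sketch. It assumes the value has already been obtained as a java.lang.Integer; the class and method names are hypothetical and not part of the upCast API.

```java
// Hypothetical helper; not part of the upCast API.
final class VersionCheck {

    /** Describes a version-latest value according to the rules given above. */
    static String describe(Integer versionLatest) {
        if (versionLatest == null) {
            // e.g. server error or no active connection to the internet
            return "version information could not be retrieved";
        }
        if (versionLatest == 0) {
            return "no newer version available";
        }
        return "newer build available: " + versionLatest;
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe(0));
        System.out.println(describe(1234)); // 1234 is a made-up build number
    }
}
```

Checking for null first matters here, since comparing a null Integer with 0 would fail at runtime.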

By default, all file path values returned are in URL format. You can use all available modifiers on them, of course, to change format or extract parts from them.

Example 4.3. 

To get information on the version of Xalan currently in use by upCast RT, write:

${environment:xslt-xalan-version}

which might return the value “Xalan Java 2.7.1”.


To access these environment values from UPL, use the environment namespace and reference them like ordinary UPL variables. Java types as listed in the table above are coerced to the respective UPL types.

With a namespace definition of

#namespace environment “http://www.infinity-loop.de/namespace/upcast-realm/environment”;

the code

println( $environment:dir-licenses );

might print the following on the console:

/Users/demo/Library/Application Support/infinity-loop/upCast RT/Licenses

and the code

println( $environment:license-features );

might print the following to the console:

{”rtfimportGUI”,”rtfimportAPI”,”rtfexportGUI”,”rtfexportAPI”,”uplGUI”,”uplAPI”}
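Because inactive features are enclosed in parentheses in the license-features list, a consumer of that list may want to filter them out. The following is a minimal Java sketch, assuming the list has already been obtained as a java.util.List of Strings; the class name, the method name and the sample list are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the upCast API.
final class LicenseFeatures {

    /** Returns the entries that are not enclosed in parentheses, i.e. the active features. */
    static List<String> activeOnly(List<String> features) {
        List<String> active = new ArrayList<>();
        for (String feature : features) {
            boolean inactive = feature.startsWith("(") && feature.endsWith(")");
            if (!inactive) {
                active.add(feature);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        // Made-up sample value; "(uplAPI)" stands for a feature that is not active.
        List<String> features = Arrays.asList("rtfimportGUI", "uplGUI", "(uplAPI)");
        System.out.println(activeOnly(features));
    }
}
```

When only currently valid features are of interest, license-features-valid already provides such a list without further processing.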

5. The pipeline realm

It is often useful to store values that several modules will need as pipeline variables. Examples are the source document to process, the destination folder, the folder where images will be stored or the folder where temporary files should be created if needed by the pipeline.

The pipeline realm contains variables that are available to all modules in a specific pipeline. Each pipeline has its own set of pipeline variables. Modules can only access pipeline variables of the pipeline they are a member of.

Important

The set of pipeline variables is cleared before each execution of a pipeline with the exception of the following special, pre-defined variables:

  • base

  • PipelineURI

  • PipelineBase

  • ParamURI

  • ParamBase

  • PipelineInstanceId

6. The module realm

This realm includes all parameters of a single module in a pipeline. This realm can only be accessed from within that module, and only the parameters of the currently executed module at the time of reference resolution can be accessed.

Important

Referencing module variables is generally not recommended since upCast has no defined order of variable resolution and will not determine a suitable one by itself. Referring to module variables can therefore lead to infinite loops or to references that are still unresolved.

7. The javaproperty realm

This realm is read-only, with the exception of the UPL execution context, where you can also set variables in that realm.

The javaproperty realm contains all currently defined Java system properties, either pre-defined ones by the Java Virtual Machine (like user.dir or user.home) or properties explicitly set on launch of the VM running the application.

Example 4.4. 

${javaproperty:user.home}

retrieves the path to the current user’s home directory.

${javaproperty:user.dir#url}

retrieves the path to the current directory in URL format by use of the #url modifier.
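In plain Java terms, these references correspond to lookups of Java system properties. The following sketch shows the equivalent standard Java calls. The conversion to URL form via java.nio.file.Path.toUri() is only an assumption about what the #url modifier produces, not a statement about upCast's implementation; the class and method names are hypothetical.

```java
import java.nio.file.Paths;

// Hypothetical illustration; not part of the upCast API.
final class JavaPropertyDemo {

    /** Looks up a Java system property, comparable to ${javaproperty:name}. */
    static String property(String name) {
        return System.getProperty(name);
    }

    /** Converts a local file path into a file: URL string (assumed analogue of #url). */
    static String toUrl(String localPath) {
        return Paths.get(localPath).toUri().toString();
    }

    public static void main(String[] args) {
        System.out.println(property("user.home"));
        System.out.println(toUrl(property("user.dir")));
    }
}
```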


8. The include realm

This realm is read-only.

The include realm returns the contents of the file specified as the name of the variable. The syntax is as follows:

${include:/absolute/filepath/to/file.ext}
${include:relative/path/to/file.ext}

A relatively specified path is always considered to be relative to the value of ${pipeline:base}, i.e. the base URL of the pipeline.

An include reference can take parameters, for example to specify the encoding to be used for reading the file. The variable reference syntax can therefore take the following, extended form:

${include( paramname: “value” [, paramname: “value” ]* ):filepath}

The following table lists the possible parameters that can be specified for an include reference:

parameter name

value

encoding

the (Java-) name of the encoding to be used for reading the file

When this parameter is not specified, the platform’s default encoding is used.

source

lets you choose where the data to be included originates from

file

the filepath component specifies a physical file (this is the default when the parameter is not specified)

variable

the filepath component specifies a variable identifier from which to get the included data. This option is useful when you need to pass literal code or code fragments (still to be parsed by upCast later) from a Simple View component into e.g. the External Pipeline Processor's Sub-pipeline Parameters field.

Example 4.5. 

The value of the variable reference

${include( encoding: "UTF-8" ):Resources/entity.map}

is the text contents of the file pipeline-basedir/Resources/entity.map, read with UTF-8 encoding.

The value of the variable reference

${include( source: "variable" ):pipeline:DestinationFolder}

is the text contents of the variable DestinationFolder in the pipeline realm.
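The file-based form of an include reference can be pictured as the following Java sketch: a relative path is resolved against the pipeline base URL, and the file is read with the given encoding or, if none is given, with the platform default. This is an illustration under stated assumptions, not upCast's implementation. The class and method names are hypothetical, absolute paths are detected by a leading ‘/’ only, and the file path is assumed to contain no characters that need URL escaping.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; not part of the upCast API.
final class IncludeSketch {

    /**
     * Reads the file named by filepath. A relative filepath is resolved
     * against baseUrl; a null encoding means the platform default encoding.
     */
    static String include(URI baseUrl, String filepath, String encoding) throws IOException {
        Path file = filepath.startsWith("/")
                ? Paths.get(filepath)                    // absolute path, used as-is
                : Paths.get(baseUrl.resolve(filepath));  // relative to the base URL
        Charset charset = (encoding == null)
                ? Charset.defaultCharset()
                : Charset.forName(encoding);
        return new String(Files.readAllBytes(file), charset);
    }

    public static void main(String[] args) throws IOException {
        // Set up a temporary stand-in for the pipeline base folder.
        Path base = Files.createTempDirectory("pipeline-basedir");
        Path map = base.resolve("Resources").resolve("entity.map");
        Files.createDirectories(map.getParent());
        Files.write(map, "sample contents".getBytes(StandardCharsets.UTF_8));

        System.out.println(include(base.toUri(), "Resources/entity.map", "UTF-8"));
    }
}
```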


9. Parameter and Variable Types

Parameters and variables are internally stored using standard, appropriate Java or UPL object types. Some parameters can take several different types, which, however, can only be set using native Java code using the upCast API or UPL functions. Parameters that can accept more than one of the following basic types will have this mentioned explicitly and in detail in their respective description section.

The basic parameter types are:

type name in this document

corresponding Java type (class or interface)

corresponding UPL type

Bool

java.lang.Boolean

Bool

Integer

java.lang.Integer

Numeric

Double

java.lang.Double

Numeric

String

java.lang.String

String

List

java.util.List

List

Object

java.lang.Object

Chapter 5. Application-level Settings

Some settings are global to the upCast application and (some of them optionally) affect all pipeline documents loaded.

1. Application Preferences

These can be set in the upCast Preferences dialog, available under the application menu (Mac OS X) or the File menu (other platforms).

To make the settings active, click Apply or the window’s close button.

The parameters are grouped into tabs.

1.1. Application Settings

Create new document on launch when no others are open

When selected, a new default pipeline document will be created on upCast launch when no other windows (e.g. from a previous, saved session) are open.

Re-open documents that were open when last quitting

When checked, all documents that were open when upCast was last quit will be re-opened in their previous locations.

Remember the most recent ___ pipeline documents

Here, you can enter the number of recently opened documents that should be listed in the File > Open Recent menu. Decreasing the value will forget any document listings beyond that new number.

Tip

To clear the File > Open Recent menu, set the number to 0, close the application preferences window by clicking Apply, then re-open and enter the number of documents you want to be remembered. Setting the value to 0 temporarily will clear the entire internal list of documents, effectively clearing the menu.

Log filter

Here, you can specify a log event filter expression. Only log events passing the filter expression are actually written to the external log file. Several often-used filter expressions are available from the associated popup menu; however, you are free to define your own customized filter expressions. The filter expression syntax is described here.

Check for updates on launch

When checked, upCast will contact the infinity-loop version server to check whether a newer version of upCast is available for download. If there is, you will be notified in an info dialog.

Check for updates now…

Clicking this button will let you manually check for updates. This is particularly useful when Check for updates on launch is not checked.

Pipeline Template Paths

In this text field, you can specify paths where upCast will look for pipeline templates, one path per line. Use this if you store personal or company templates at a central place on your network and want to make those templates available automatically within upCast.

The default path for templates, which points to the templates copied to disk during installation, is

${application:BundledResources}/templates

You must include this path in this field if you want to have access to the application-included templates. On the other hand, if you want some users to not have access to the default templates but want them to be restricted to your specific, customized templates only, make sure that in those users’ installations, the default path is not included in the path list.

To add a path to the list, click Add Path… and navigate to the folder containing the pipeline template definition folders.

Empty lines or lines starting with // are considered comments and are discarded during parsing.
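The comment rule for this field can be illustrated with a short Java sketch that extracts the effective path entries from the field text. It is an illustration only: the class and method names are hypothetical, the network path in the sample is made up, and whether upCast trims surrounding whitespace is not specified here, so the sketch does not trim.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the upCast API.
final class TemplatePathList {

    /** Returns the path entries, skipping empty lines and lines starting with "//". */
    static List<String> parse(String fieldText) {
        List<String> paths = new ArrayList<>();
        for (String line : fieldText.split("\\r?\\n")) {
            if (line.isEmpty() || line.startsWith("//")) {
                continue; // treated as a comment and discarded
            }
            paths.add(line);
        }
        return paths;
    }

    public static void main(String[] args) {
        String fieldText = String.join("\n",
                "// company templates (made-up network location)",
                "/Volumes/Shared/upCast/templates",
                "",
                "${application:BundledResources}/templates");
        for (String path : parse(fieldText)) {
            System.out.println(path);
        }
    }
}
```

Variable references such as ${application:BundledResources} are left untouched by this sketch; resolving them is a separate step.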

Note

You can use variable references from the include, application and javaproperty realms, but you cannot use the pipeline realm since the setting is application-global.

1.2. Catalogs

upCast supports the use of catalog files. In its simplest form, a catalog file is a mapping definition between PUBLIC DTD identifiers and the location of a physical copy of that specific DTD (or, more generally, entity). The upCast application supports the catalog file format as defined in http://www.oasis-open.org/specs/tr9401.html as well as XML Catalogs.

To add a catalog file, choose Add Catalog… and select the catalog file to add from the file system. The new catalog will be available to all modules immediately after closing the preferences window.

To remove a catalog, just delete its entry line.

Catalogs are considered in the order displayed.

OASIS catalog files are read with platform encoding, XML catalog files with the encoding specified in their XML declaration.

By clicking Insert upCast defaults, code is added to pick up any upCast default catalog possibly delivered with the application. You should have that entry in place for best performance.

Note

You can override the global Catalog setting individually for each pipeline.

1.3. Font Configuration

Font configuration

Specify the source code for the stdfonts.config override that should be used for this pipeline.

1.4. Encodings

Custom Encodings

To add a custom encoding file, choose Add Encoding… and select the custom encoding file (*.encoding) to add from the file system. Each line in the field specifies a custom encoding location.

To add a folder where upCast should pick all contained custom encoding files, choose Add Encodings Folder… and select it.

To remove a custom encoding entry, delete the text line containing its location specification.

1.5. License

This panel is for importing an upCast license file and reviewing current licensing status.

Certain module types require specific license features. The features available in the currently active license are listed in the license features table at the bottom of this panel. Please refer to the individual module’s documentation to check which license feature it requires to be fully (or: at all) functional.

Import new license

Clicking this button brings up a file chooser where you can find and select the license file you were sent by infinity-loop’s licensing department in response to your license request or purchase. You are given the option to store this license in upCast’s Licenses special folder, so it will be available to you automatically at launch.

Pick from available licenses

Clicking this button shows you all licenses from upCast’s Licenses folder and any licenses packaged into the application itself that can be used for this version of upCast. This allows you, for example, to switch between evaluation and full licenses or licenses with different features.

Chapter 6. Pipeline-level Settings

1. Typographical conventions

Parameters will be described using the following typography:

Name

DeleteEmpties

Java symbol

kDeleteEmptiesParamName

Type

Boolean

Value

false, true

Name gives the internal java.lang.String name of the parameter, which is used for storing in preferences files, and can be used in the Java API as String. However, in Java, the use of the Java symbol is highly recommended instead.

Java symbol names the Java constant definition for the parameter’s Name. All constants are defined in the class de.infinityloop.common.Params.

Type specifies the recommended Java type to use when programming against the API. In the GUI version, that is taken care of automatically. Also, when using alternative interfaces like an Ant task, which allows only passing arguments as character strings, upCast tries to perform an appropriate cast. So you should make sure that the data you provide in these cases will be cast-able to the specified type, as otherwise the conversion will fail or produce incorrect results at runtime.

Value describes possible value ranges, supported keywords or other specifics about that parameter’s range.
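To illustrate what “cast-able” means for string-only interfaces, the following Java sketch converts a character string to one of the basic parameter types. It is not upCast's actual casting code. The class and method names are hypothetical, and the accepted Boolean spellings are an assumption based on the Value entry shown above.

```java
// Hypothetical illustration; not part of the upCast API.
final class ParamCast {

    /** Converts raw to the requested basic type or throws IllegalArgumentException. */
    static Object cast(String raw, Class<?> type) {
        if (type == Boolean.class) {
            // Assumption: only the literals "true" and "false" are accepted.
            if (raw.equals("true")) return Boolean.TRUE;
            if (raw.equals("false")) return Boolean.FALSE;
            throw new IllegalArgumentException("not a Boolean: " + raw);
        }
        if (type == Integer.class) {
            return Integer.valueOf(raw); // NumberFormatException if not an integer
        }
        if (type == Double.class) {
            return Double.valueOf(raw);  // NumberFormatException if not a number
        }
        if (type == String.class) {
            return raw;
        }
        throw new IllegalArgumentException("unsupported type: " + type.getName());
    }

    public static void main(String[] args) {
        System.out.println(cast("true", Boolean.class));
        System.out.println(cast("300", Integer.class));
        try {
            cast("yes", Boolean.class);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing early with an exception, as done here, is one way to avoid the silent incorrect results mentioned above.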

2. Special pre-defined pipeline variables

2.1. PipelineBase, base

The pre-defined pipeline variables PipelineBase and base (deprecated) are automatically made available in the GUI version of upCast (read-only) and contain the path to the current pipeline document (*.ucdoc), excluding the actual name, in URL format (including trailing slash ‘/’). It is essential to have the pipeline document saved to a file on disk so upCast can determine this property. If the path cannot be determined, the current directory is returned instead (Java property user.dir).

In API mode, this value must be explicitly set before working with pipelines that contain any references to values dependent on ${pipeline:PipelineBase}. Only use the setPipelineBaseURI() API method (class UpcastEngine) for setting the value for this pipeline variable.

You can use this to make the configuration independent of its actual location in the file system by specifying paths relative to the base variable and storing any resources needed for the pipeline in subdirectories of this base URI.

For distributing a configuration, we recommend putting it at the root of a folder, with required resources in sub-folders, according to the following layout:
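The relationship between a pipeline document's location and its base URL can be sketched with the standard java.net.URI class. The sketch only mirrors the description above (document name removed, trailing slash kept); the class name, method name and sample path are hypothetical, and it does not show how upCast computes the value.

```java
import java.net.URI;

// Hypothetical illustration; not part of the upCast API.
final class PipelineBaseSketch {

    /** Strips the document name from a pipeline document URI, keeping the trailing slash. */
    static URI baseOf(URI pipelineDocument) {
        return pipelineDocument.resolve(".");
    }

    public static void main(String[] args) {
        // Made-up document location.
        URI document = URI.create("file:/Users/demo/project/convert.ucdoc");
        URI base = baseOf(document);
        System.out.println(base);
        // A resource addressed relative to the base moves along with the folder.
        System.out.println(base.resolve("Resources/entity.map"));
    }
}
```

When driving upCast through its Java API, a base URL derived this way would still have to be handed to the engine explicitly, as noted above.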

example folder layout for a distribution

Name

PipelineBase

Java symbol

kPipelineBaseParamName

Type

String

Value

absolute path to folder in which the current configuration file is located (if loaded via the GUI; automatically set) or the folder of a configuration distribution root in URL format

2.2. PipelineURI

This variable holds the full URI (as a file:-URL) of the pipeline document (*.ucdoc) implementing the current pipeline.

Name

PipelineURI

Java symbol

kPipelineURIParamName

Type

String

Value

full URI of the current pipeline document (*.ucdoc) as a file: URL

2.3. ParamBase

This variable holds the path to the current parameter set document (*.ucpar), excluding the actual name, in URL format (including trailing slash ‘/’). For regular pipeline documents, the contents of this variable is the same as that of PipelineBase.

Name

ParamBase

Java symbol

kParamBaseParamName

Type

String

Value

absolute path to folder in which the current parameter set file is located (if loaded via the GUI; automatically set) in URL format

2.4. ParamURI

This variable holds the full URI (as a file:-URL) of the current parameter set document (*.ucpar). For regular pipeline documents, the contents of this variable is the same as that of PipelineURI.

Determining if the current document is a pipeline or parameter set

This definition of the contents of the ParamURI pipeline variable can be used in UPL to determine whether the currently running application is run directly from a pipeline document (ucdoc) or via a parameter set (ucpar) with code like the following:

#namespace pipeline “”;
  ...
if( ends-with( $pipeline:ParamURI, “ucpar” ) ) {
  /* we’re running from a parameter set document */
} else {
  /* we’re running from a regular pipeline document */
}

Name

ParamURI

Java symbol

kParamURIParamName

Type

String

Value

full URI of the current parameter set document (*.ucpar) as a file: URL; for regular pipeline documents, the same value as PipelineURI

2.5. PipelineInstanceId

This variable holds a UUID string identifying this particular running instance of a pipeline.

Identifying a certain pipeline object instance is necessary in some upCast XSLT extension functions which need to retrieve information from the pipeline object that is running the transformation. The value in this variable is used for these identification purposes and must be passed as a stylesheet parameter when needed there.

Name

PipelineInstanceId

Java symbol

kPipelineInstanceIdParamName

Type

String

Value

3. Pipeline Settings

Click the Pipeline Settings… button in the pipeline window to access the window for setting pipeline-level settings.

To make the settings active, click Close or the window’s close button.

Many of these settings allow you to override the settings made in the upCast preferences.

Tip

When you are using the upCast GUI as the prototype and testing environment for your pipeline development, but intend to later export it in form of Java source code or an Ant task, we recommend overriding the global settings by pipeline-specific settings to get consistency in your output for a specific pipeline document instead of being dependent on the current global application preferences at time of export.

Access the various settings by choosing the respective tab:

3.1. Pipeline Parameters

Here, you can set up a description of parameters you want your pipeline to be dependent on. The information provided here is used in three ways:

  1. to create a simplified view and data entry UI objects for the user of a pipeline, where you want to hide the details of the implementation (i.e. the kind and order of modules used, calculations etc.),

  2. to define the parameters a pipeline accepts and requires to be able to run from the commandline or via the Java API functions, including the ability to check those parameters for legal values, and

  3. to provide documentation for the semantics of a parameter, which is shown in form of help tags in the UI, as text in the commandline, and formatted as HTML document when generating the pipeline documentation.

This is a convenient feature to distribute complete, parameterized pipeline solutions to your customers in an easy-to-use, packaged way. All they need to do is open the pipeline, supply the requested parameters, and click the Run button. They are therefore completely shielded from the (possibly many) modules building up the pipeline and their complexity.

Interface element and parameter definitions

The description code you provide here serves two purposes:

  1. It is the basis for determining the number and name of pipeline parameters.

  2. It specifies the kind of form display element for each of these parameters.

Basically, you specify the name of the pipeline variable you wish to have set to the specified pipeline parameter’s value. This value is supplied as initial, pre-set pipeline variable to your pipeline definition.

Important

The pipeline parameters are only set when the GUI is in Simple View mode. When in full editing mode, the pipeline is executed with a completely clean set of pipeline variables (except for the base variable) – unless you check the Set specified parameter defaults when running a pipeline in edit view option (see above). In the latter case, the default values for those parameters that have a default specified are set.

Before the defining code is interpreted, upCast resolves any contained variable references for the following realms and in that order:

  • include

  • javaproperty

  • application

You cannot (for obvious reasons) access variables in the pipeline or module realm.

Tip

You can use the include variable reference to your advantage in projects where you have to create similar pipelines that essentially should have the same Simple View definitions. To keep those in-sync, you can use an external file holding the parameter definition code, then include it in all pipelines that should show the same UI and have the same parameters. You then only need to update that single external file, and the UI definitions are updated automatically in all pipelines that include it.

upCast offers several types of UI elements for parameter entry: a decorating label, a text field or box, a filechooser, a popup menu, a checkbox and a list, each one with its own set of dedicated properties.

You must assign one of these entry types to each pipeline parameter you need. The syntax for describing the properties is based on a CSS rule set: The selector part takes the form of an element selector and supplies the name of the pipeline variable to set. The declaration block part specifies the specific display and behavioural properties for that UI element.

Here are the properties which you can set for each of the following available types (option values are case-sensitive!):

label

type

label

text

the text to display in the label

font-family

the name of the font to use; when not specified, the system’s default label font

font-weight

normal | bold

font-style

normal | italic

font-size

size of the font; when not specified, the system’s default label font size

color

the text color; must be a CSS 2.1 color value

background-color

the background color; must be a CSS 2.1 color value

initialize-when

never | unset | always | <string>

Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is unset

hidden

when true, this control is not displayed. The default value is false.

locked

when true, this control's value cannot be edited as long as the pipeline has an edit-lock password defined and the edit lock is in effect. The default value is false.

Example 6.1. 

myLabel {
    type: label;
    text: “Simple View Sample”;
    font-size: 20pt;
    font-weight: bold;
    color: olive;
}

creates a label with 20pt font size, bold text and olive text color.


text

type

text

label

text to display as label at the very left

persistent

when true, current values are saved to the pipeline document and re-loaded when the document is next opened

when false, the value is not stored in the document and is assigned the default value (if specified) when the document is next opened

The default value is false.

default

the default value for the parameter, if newly created or persistent is false

description

the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode

required

(only used in upCast commandline mode) when true, this means that specifying a value for this parameter is required to run the pipeline

when false, that parameter is not required to be specified but instead, its default value should be used (if specified) or not be set at all if there is no default value

The default value is true.

lines

the number of lines of the text field, the default is 1

postfix

the text to display after the text entry field; use this e.g. for displaying a value unit like “dpi” to let the user know the semantics of the number entered in the text field

initialize-when

never | unset | always | <string>

Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is unset

hidden

when true, this control is not displayed. The default value is false.

locked

when true, this control's value cannot be edited as long as the pipeline has an edit-lock password defined and the edit lock is in effect. The default value is false.

Example 6.2. 

headerText {
    type: text;
    label: “Header Text:”;
    persistent: true;
    default: “My Publication”;
    lines: 2;
}

creates a field for entering header text used in the pipeline. The pipeline variable created will be named headerText, and values the user inputs will be stored across document openings. The input field will show two lines of text, and will be pre-populated with the text “My Publication” on initial creation.
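The interplay of the required and default properties in commandline mode, as described in the property table above, can be pictured as the following Java sketch. It is one reading of that description, not upCast's implementation. The class and method names are hypothetical, and the case of a required parameter that also has a default is treated strictly as an error, which is an assumption.

```java
import java.util.Optional;

// Hypothetical illustration; not part of the upCast API.
final class RequiredParam {

    /**
     * Determines the value a parameter would get. An empty result means
     * the pipeline variable is not set at all.
     */
    static Optional<String> effectiveValue(String supplied, boolean required, String defaultValue) {
        if (supplied != null) {
            return Optional.of(supplied);
        }
        if (required) {
            throw new IllegalArgumentException("a value for this parameter is required");
        }
        return Optional.ofNullable(defaultValue);
    }

    public static void main(String[] args) {
        System.out.println(effectiveValue("Annual Report", true, "My Publication"));
        System.out.println(effectiveValue(null, false, "My Publication"));
        System.out.println(effectiveValue(null, false, null));
    }
}
```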


filechooser

type

filechooser

label

text to display as label at the very left

persistent

when true, current values are saved to the pipeline document and re-loaded when the document is next opened

when false, the value is not stored in the document and is assigned the default value (if specified) when the document is next opened

The default value is false.

default

the default value for the parameter, if newly created or persistent is false

description

the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode

required

(only used in upCast commandline mode) when true, this means that specifying a value for this parameter is required to run the pipeline

when false, that parameter is not required to be specified but instead, its default value should be used (if specified) or not be set at all if there is no default value

The default value is true.

mode

open displays a chooser for opening a file

save displays a chooser for saving a file

folder displays a chooser for picking a folder

file-or-folder displays a chooser for picking a file or a folder

format

local converts the chosen filepath to the local file naming convention

url converts the chosen filepath to a URL

initialize-when

never | unset | always | <string>

Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is unset

hidden

when true, this control is not displayed. The default value is false.

locked

when true, this control's value cannot be edited as long as the pipeline has an edit-lock password defined and the edit lock is in effect. The default value is false.

Example 6.3. 

SourceFile {
    type: filechooser;
    label: “Source file:”;
    persistent: true;
    mode: open;
    format: url;
}

creates a field with a button to call a file chooser. The pipeline variable created will be named SourceFile, and values the user inputs will be stored across document openings. The file chooser will allow the user to pick files only, and the result will be stored in URL format in the editable input field.


popup

type

popup

label

text to display as label at the very left

persistent

when true, current values are saved to the pipeline document and re-loaded when the document is next opened

when false, the value is not stored in the document and is assigned the default value (if specified) when the document is next opened

The default value is false.

default

the default internal value for the parameter, if newly created or persistent is false. The value here must be exactly one of the values specified in the internal-values property.

description

the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode

required

(only used in upCast commandline mode) when true, this means that specifying a value for this parameter is required to run the pipeline

when false, that parameter is not required to be specified but instead, its default value should be used (if specified) or not be set at all if there is no default value

The default value is true.

values

a space- or comma-separated list of values to display in the popup and to pass into the pipeline variables

internal-values

a space- or comma-separated list of internal values. The value set on the pipeline variable is the one from this list whose index matches the selected option from the values property’s list of displayed values. Use this to use descriptive values in the displayed popup, while still getting short enum-type values in your variable. It also allows for easy localization of displayed values without having to change internal processing.

initialize-when

never | unset | always | <string>

Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is unset

hidden

when true, this control is not displayed. The default value is false.

locked

when true, this control's value cannot be edited as long as the pipeline has an edit-lock password defined and the edit lock is in effect. The default value is false.

Example 6.4. 

targetType {
    type: popup;
    label: “Target type:”;
    persistent: true;
    default: “db4”;
    values: “DocBook 4”, “DocBook 5”, “DITA”;
    internal-values: “db4”, “db5”, “dita”;
}

creates a popup with three entries, “DocBook 4”, “DocBook 5” and “DITA”. The pipeline variable created will be named targetType, and its value will be one of the values “db4”, “db5” or “dita”, and the value selection will be stored across document openings. The default value of the variable will be “db4” upon field creation.
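The index-based pairing of values and internal-values can be sketched in Java as follows, using the lists from the example above. The class and method names are hypothetical, and the sketch says nothing about how upCast handles lists of unequal length; here that case is simply rejected.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the upCast API.
final class PopupMapping {

    /** Returns the internal value at the same index as the selected display value. */
    static String internalValueFor(String selected, List<String> values, List<String> internalValues) {
        int index = values.indexOf(selected);
        if (index < 0 || index >= internalValues.size()) {
            throw new IllegalArgumentException("no internal value for: " + selected);
        }
        return internalValues.get(index);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("DocBook 4", "DocBook 5", "DITA");
        List<String> internalValues = Arrays.asList("db4", "db5", "dita");
        System.out.println(internalValueFor("DocBook 5", values, internalValues));
    }
}
```

Because only the index links the two lists, the displayed values can be translated or reworded freely as long as their order is kept.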


checkbox

type

checkbox

label

text to display as label at the very left

persistent

when true, current values are saved to the pipeline document and re-loaded when the document is next opened

when false, the value is not stored in the document and is assigned the default value (if specified) when the document is next opened

The default value is false.

default

the default value for the parameter, if newly created or persistent is false, either true or false

description

the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode

required

(only used in upCast commandline mode) when true, this means that specifying a value for this parameter is required to run the pipeline

when false, that parameter is not required to be specified but instead, its default value should be used (if specified) or not be set at all if there is no default value

The default value is true.

text

the checkbox’s label text next to the actual checkbox graphic

initialize-when

never | unset | always | <string>

Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is unset

hidden

when true, this control is not displayed. The default value is false.

locked

when true, this control's value cannot be edited as long as the pipeline has an edit-lock password defined and the edit lock is in effect. The default value is false.

Example 6.5. 

includeStyle {
    type: checkbox;
    label: "Option:";
    persistent: false;
    default: true;
    text: "Include style information";
}

creates a checkbox with text “Include style information”. The pipeline variable created will be named includeStyle and will have the Boolean value true when the box is checked, false otherwise. The checkbox state will not be remembered across document openings. By default, the option is checked (=on).


list

type

list

label

text to display as label at the very left

persistent

when true, current values are saved to the pipeline document and re-loaded when the document is next opened

when false, the value is not stored in the document and is assigned the default value (if specified) when the document is next opened

The default value is false.

default

the default value for the parameter, if newly created or persistent is false

description

the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode

required

(only used in upCast commandline mode) when true, this means that specifying a value for this parameter is required to run the pipeline

when false, that parameter is not required to be specified but instead, its default value should be used (if specified) or not be set at all if there is no default value

The default value is true.

lines

the number of lines of the text field; the default is 4

initialize-when

never | unset | always | <string>

Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is unset

hidden

when true, this control is not displayed. The default value is false.

locked

when true, this control's value cannot be edited as long as the pipeline has an edit-lock password defined and the edit lock is in effect. The default value is false.

Example 6.6. 

files {
    type: list;
    label: "List of files:";
    description: "list of files to process; enter one file path per line";
    persistent: true;
}

creates an entry field for a list of values where each single line corresponds to one list item. An empty line creates a list item consisting of the empty string.
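
The line-to-item rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not upCast code: the function name parse_list_field is hypothetical, and how upCast treats a completely empty field or a trailing line break is not stated here, so the sketch simply splits on line breaks.

```python
# Illustrative model only; not part of upCast.
def parse_list_field(text):
    """Split the entry field text into list items, one item per line.

    An empty line yields an item that is the empty string.
    """
    return text.split("\n")


if __name__ == "__main__":
    for item in parse_list_field("chapter1.rtf\n\nchapter2.rtf"):
        print(repr(item))
```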


The order in which the parameters are defined determines the display order and the order of parameters in created Java code functions.

Example 6.7. 

Concatenating all individual parameter definition examples from above into one, the following Simple View for the pipeline would be created:


Name

ParameterDefinitions

Java symbol

kParameterDefinitionsParamName

Type

String

Value

Reset persistent values

This clears any currently stored persistent values in the pipeline document.

Tip

You should clear the values when you make significant changes to the parameter definition and before saving the pipeline configuration for distribution to your customers, so that they do not see the last, private settings you made during development for parameters that have persistence turned on.

3.2. Settings

Finalization

This parameter lets you specify the condition under which the pipeline signals an error to its parent, which is the application when it is a top-level pipeline, or the executing component, when it is run as a sub-pipeline (e.g. by the External Pipeline Processor).

In the case of a top-level pipeline, signalling an execution error will result in an error dialog being shown (if run in the GUI) or an exception being thrown (when run via the Java API).

You can specify the cases in which a pipeline execution failure should be signalled by using several pre-defined, often used conditions, or you can specify a custom condition in UPL:

continue: pipeline execution is always reported as successful

signal on FATAL: signal a pipeline execution failure when a FATAL log message has been received during execution

signal on ERROR: signal a pipeline execution failure when a FATAL or ERROR log message has been received during execution. This is the default for new pipelines.

signal on WARN: signal a pipeline execution failure when a FATAL, ERROR or WARN log message has been received during execution

Log message forwarding behaviour

In all of the four pre-programmed finalization modes above, collected log messages from level WARN and up are forwarded to the parent (usually a pipeline object). See also the section on logging for more details.

custom finalization: this option lets you specify custom UPL function code which, by returning one of the two Id values TERMINATE or CONTINUE, signals the failure state of the pipeline

To edit the UPL code for the custom finalize() function, click the Edit finalization code… button. By returning the Id TERMINATE, you indicate that the execution of the pipeline has failed, and by returning CONTINUE you indicate that the pipeline execution succeeded.

The custom function receives an Id parameter which is TERMINATE when one of its child modules requested explicit, premature pipeline termination, CONTINUE otherwise.

Example 6.8. Finalization function template

function finalize( $childFinalizationResult as Id ) as Id {
  variable $result as Id := $childFinalizationResult; // default: CONTINUE
  /* Return the Id TERMINATE when you want to terminate the pipeline, CONTINUE otherwise. */
  return $result;
}
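
The four pre-defined finalization modes amount to a severity threshold on the log messages received during execution. The following Python sketch models that rule using the UI labels from the list above; it is not upCast code, and the names SEVERITY, THRESHOLD and signals_failure are hypothetical.

```python
# Illustrative model only; not part of upCast.
SEVERITY = {"WARN": 1, "ERROR": 2, "FATAL": 3}

# Lowest log level that makes each pre-defined mode signal a failure.
THRESHOLD = {
    "continue": None,            # never signals a failure
    "signal on FATAL": "FATAL",
    "signal on ERROR": "ERROR",  # default for new pipelines
    "signal on WARN": "WARN",
}


def signals_failure(mode, received_levels):
    """Return True if the given mode reports a failure for these log levels."""
    threshold = THRESHOLD[mode]
    if threshold is None:
        return False
    limit = SEVERITY[threshold]
    # Levels below WARN (e.g. INFO) have no entry and never count.
    return any(SEVERITY.get(level, 0) >= limit for level in received_levels)


if __name__ == "__main__":
    print(signals_failure("signal on ERROR", ["INFO", "WARN"]))
    print(signals_failure("signal on ERROR", ["INFO", "FATAL"]))
```

A custom finalize() function replaces this fixed threshold with whatever condition you code in UPL.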

Generating a custom error message

Additionally, in the custom finalization code field, you can optionally specify a second UPL function, message-text(). When finalize() returns TERMINATE and this function is defined and returns a non-empty string, that string is shown to the user instead of the default message generated by upCast. This allows you to generate error messages that are tailored specifically to your application and its user base.

Example 6.9. Custom message text function template

function message-text() as String {
  variable $result as String := "";
  /* Return a non-empty message string to display an error dialog or write the error text to the log. */
  return $result;
}

Name

FinalizationMode

Java symbol

kFinalizationModeParamName

Type

String

Value

module: continue | terminate-fatal | terminate-error | terminate-warning | custom; pipeline: standard | custom

Name

FinalizationCode

Java symbol

kFinalizationCodeParamName

Type

String

Value

UPL source code

Log filter

Sets the logging threshold for messages that the module accepts from children and produces itself (see the logging architecture description for details).

Some default filter expressions are available from the associated popup menu; however, you are free to define your own customized filter expressions. The filter expression syntax is described here.

If set to inherit, the logging filter settings are governed by the application's filter settings for the external logger as set in the application's preferences.

Name

LogFilterSpec

Java symbol

kLogFilterSpecParamName

Type

String

Value

inherit | OFF | FATAL | ERROR | WARN | INFO | DEBUG | VERBOSE | DETAIL | TRACE | ALL | logfilterspec

Edit-Lock password

Here, you can specify a password that prevents switching off the Simple View and hence editing the pipeline from within the GUI. It does not encrypt the pipeline document itself!

To remove the lock, clear the password field. The password “__________” (10 underscores) must not be used.

The password is stored in the pipeline document as a base64-encoded MD5 hash.
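
To show what a base64-encoded MD5 hash looks like, here is a minimal Python sketch. It is not upCast code, and the value upCast actually stores may differ: the text encoding of the password (UTF-8 here) and the absence of a salt are assumptions, and the function name edit_lock_hash is hypothetical.

```python
# Illustrative sketch only; the encoding and the lack of a salt are assumptions.
import base64
import hashlib


def edit_lock_hash(password, encoding="utf-8"):
    """Return the base64 text form of the MD5 digest of the password."""
    digest = hashlib.md5(password.encode(encoding)).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")           # 24 characters


if __name__ == "__main__":
    print(edit_lock_hash("secret"))
```

A plain MD5 hash gives only light protection against guessing, which is consistent with the edit lock being a UI restriction rather than encryption.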

Name

EditLockPassword

Java symbol

kEditLockPasswordParamName

Type

String

Value

Pipeline UID

To implement file-location independent linking from Parameter Set documents to their implementation pipeline document, each pipeline document must have a unique ID. This need not be a standard UUID if you can guarantee that the IDs will only be used in a controlled environment. In that case, we suggest using speaking (descriptive) IDs, which make it easier for users to manually find the respective pipeline for a given UID value; this may be necessary when a link is broken due to a misconfiguration of the ID resolver.

By default, when a pipeline document is opened that does not yet have a non-empty pipeline ID setting, upCast automatically generates a UID and sets it for that pipeline document.

Name

PipelineUUID

Java symbol

kPipelineUUIDParamName

Type

String

Value

Required upCast build number

Enter the minimum build number of the upCast application that this pipeline requires in order to run. When a user tries to run the pipeline with an application version whose build number is lower than the one specified here, a dialog is shown that lets the user abort the execution of the pipeline (the default), execute it nevertheless (at their own risk), or abort the execution and automatically check for a newer version of the application at the vendor site.

When you leave the field empty, no minimum requirement check is performed.

When no UI is available (e.g. when running from the commandline or via the Java API), execution is aborted and a FATAL log message with details is generated.
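
The three paragraphs above describe a small decision rule, which the following Python sketch summarizes. It is a model for illustration and not upCast code; the function name build_check and the returned outcome strings are hypothetical.

```python
# Illustrative model only; not part of upCast.
def build_check(required_build, running_build, has_ui):
    """Return the outcome of the minimum build number check.

    required_build is None when the field is left empty.
    """
    if required_build is None:
        return "run"                      # no minimum requirement check
    if running_build >= required_build:
        return "run"                      # requirement satisfied
    if has_ui:
        return "ask-user"                 # dialog: abort, run anyway, or check for update
    return "abort-with-fatal-log"         # commandline or Java API


if __name__ == "__main__":
    print(build_check(3500, 3400, has_ui=False))
```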

Name

RequiredBuildNumber

Java symbol

kRequiredBuildNumberParamName

Type

Integer

Value

3.3. Catalogs

Inherit from parent

When checked, the catalogs set in the nearest pipeline ancestor in the call chain or, if this is the top-level pipeline, in the upCast application preferences will be used. When unchecked, the Catalog files setting as specified will be used.

Name

UseGlobalCatalogs

Java symbol

kUseGlobalCatalogsParamName

Type

Bool

Value

Catalog files

To add a catalog file, choose Add Catalog… and select the catalog file to add from the file system. Each line in the field specifies a catalog location. The new catalog will be available to all modules immediately after closing the preferences window.

When the catalog resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.

When you hold down the Alt key while clicking Add Catalog…, upCast tries to generate a pipeline base URI-relative path, even when the location is outside the directory subtree under the pipeline base URI.

To remove a catalog, delete the text line containing its location specification.

Catalogs are considered in the order displayed.
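
How a catalog location ends up in the field can be sketched as follows. This Python model is not upCast code and rests on several assumptions: that the resolved value of ${pipeline:base} ends with a slash, that locations outside the base are written with “..” segments when the Alt key is held, and that plain POSIX-style paths are used. The function name catalog_reference is hypothetical.

```python
# Illustrative model only; the exact text upCast generates is an assumption.
import posixpath


def catalog_reference(catalog_path, pipeline_base, force_relative=False):
    """Return the text entered in the Catalog files field for a chosen file.

    force_relative models holding down the Alt key while clicking Add Catalog...
    """
    base = pipeline_base.rstrip("/")
    inside = catalog_path.startswith(base + "/")
    if inside or force_relative:
        return "${pipeline:base}" + posixpath.relpath(catalog_path, base)
    return catalog_path  # outside the base subtree: kept as chosen


if __name__ == "__main__":
    print(catalog_reference("/proj/pipe/catalogs/cat.xml", "/proj/pipe/"))
    print(catalog_reference("/proj/shared/cat.xml", "/proj/pipe/", force_relative=True))
```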

Name

Catalogs

Java symbol

kCatalogsParamName

Type

String

Value

one path to a catalog per line as string

3.4. Font Configuration

Inherit from parent

When checked, the font configuration set in the nearest pipeline ancestor in the call chain or, if this is the top-level pipeline, in the upCast application preferences will be used. When unchecked, the Font configuration setting as specified will be used.

Name

UseGlobalFontConfig

Java symbol

kUseGlobalFontConfigParamName

Type

Bool

Value

Font configuration

Specify the source code for the stdfonts.config override that should be used for this pipeline.

Name

FontConfiguration

Java symbol

kFontConfigurationParamName

Type

String

Value

font configuration code

3.5. Encodings

Inherit from parent

When checked, the custom encodings set in the nearest pipeline ancestor in the call chain or, if this is the top-level pipeline, in the upCast application preferences will be used. When unchecked, the Custom Encodings setting as specified will be used.

Name

UseGlobalEncodings

Java symbol

kUseGlobalEncodingsParamName

Type

Bool

Value

Custom Encodings

To add a custom encoding file, choose Add Encoding… and select the custom encoding file (*.encoding) to add from the file system. Each line in the field specifies a custom encoding location. When the custom encoding resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.

To add a folder where upCast should pick all contained custom encoding files, choose Add Encodings Folder… and select it. When the folder resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.

When you hold down the Alt key while clicking Add Encoding… or Add Encodings Folder…, upCast tries to generate a pipeline base URI-relative path, even when the location is outside the directory subtree under the pipeline base URI.

To remove a custom encoding entry, delete the text line containing its location specification.

Name

CustomEncodings

Java symbol

kCustomEncodingsParamName

Type

String

Value

paths to custom encodings (either to an individual custom encoding file or to a folder containing *.encoding files), with one entry per line

3.6. Export

Java source export

Fully qualified class name

Specify the fully qualified class name where the Java source code export option should place the generated code. Any required nested package folders will be created automatically by the File > Export to Java Source… function.

Name

ExportJavaClass

Java symbol

kExportJavaClassParamName

Type

String

Value

Source root folder

Specify the absolute path to the source root folder, i.e. the root of the Java package hierarchy subdirectories. You can use the ${pipeline:base} variable as the first component of the path to specify the source root relative to the pipeline base URI.

When this field is left empty, upCast will ask for the source root folder every time you call the File > Export to Java Source… function. When this field is non-empty, that value will be used silently when calling the Java source export function.

With the Choose… button, you can request a file chooser to pick the Java source root folder. When this is a subdirectory of the pipeline base URI, the path is automatically made relative to it.

Tip

When you press the Alt key while clicking Choose…, upCast tries to always make the path relative, even if it is not in the subtree under the pipeline base URI.

Name

ExportJavaSourceRoot

Java symbol

kExportJavaSourceRootParamName

Type

String

Value

Ant build module export

‘upcast.jar’ location (or Ant expression)

Here you specify the path or expression to insert into the upcast Ant task definition to the upcast.jar file containing the actual Java code for the task. If you leave that field empty, “upcast.jar” will be used in the created Ant file module when using File > Export as Ant Task….

Note

The text you enter here is first processed by the usual upCast variable resolution mechanism. This has the advantage that you can use upCast variables for calculating the path. You must, however, take care to quote the ‘$’ character (dollar sign) when you want it to appear verbatim, e.g. to reference Ant variables.

So you could use something like

$${basedir}/tasks/upcast.jar

to keep the generated Ant file portable by referring to upcast.jar relatively from the Ant build file’s base directory.
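
The effect of quoting the dollar sign can be modelled with a small resolver. The Python sketch below is not upCast's resolution mechanism: it assumes that “$$” yields a literal “$”, as the example above implies, and that variable references have the form ${realm:name}. The function name resolve and the sample variable values are hypothetical.

```python
# Illustrative model only; not upCast's actual variable resolver.
import re

# Either a quoted dollar sign ("$$") or a variable reference ("${realm:name}").
_TOKEN = re.compile(r"\$\$|\$\{([^}]+)\}")


def resolve(text, variables):
    """Expand ${...} references and turn each "$$" into a literal "$"."""
    def replace(match):
        if match.group(0) == "$$":
            return "$"
        return variables[match.group(1)]
    return _TOKEN.sub(replace, text)


if __name__ == "__main__":
    # The quoted reference survives as an Ant property reference.
    print(resolve("$${basedir}/tasks/upcast.jar", {}))
    # An upCast variable and a quoted Ant property in the same text.
    print(resolve("${pipeline:base}lib/$${ant.home}", {"pipeline:base": "file:/proj/"}))
```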

Name

ExportAntJarLocation

Java symbol

kExportAntJarLocationParamName

Type

String

Value

literal Ant value for task’s ‘basedir’ attribute

Here you specify the literal code to be used for the upcast task’s basedir attribute in the generated target. This is useful to calculate the pipeline base URI relative to some Ant property and thus make the generated build module position independent. upCast variables are resolved as usual before writing the resulting text to the build file.

Example 6.10. 

To calculate the pipeline base URI to be used by the task relative to the position of the build file, you may want to use a setting like

$${basedir}/MyPipelineRoot/

Note how you must quote the ‘$’ character (dollar sign) to avoid upCast trying to treat it as an upCast variable and expand it.


Name

ExportAntBasedir

Java symbol

kExportAntBasedirParamName

Type

String

Value

literal Ant code for <source> selection

Here you specify the literal Ant source XML code to be inserted into the generated target code for selecting the source file(s) to be used.

With the Add source… button, you can generate code for a single source file.

When holding down the Meta (Mac OS X: Command) key while clicking the Add source… button, you can generate code for all files in the selected folder. A commented-out line for filtering based on extension is automatically generated, which you can uncomment and fill in as desired.

For both cases, when additionally holding down the Alt key, the reference generated will be relative to the literal value specified in the literal Ant value for task’s ‘basedir’ attribute field. For this, a special local variable ${taskbase} is used, which gets replaced by the resolved contents of the literal Ant value for task’s ‘basedir’ attribute parameter.

For the syntax used for source specification, see the description of the upCast Ant task.

upCast variables are resolved as usual before writing the resulting text to the build file, including the resolution of the special ${taskbase} variable as the last resolution step.

Name

ExportAntSourceCode

Java symbol

kExportAntSourceCodeParamName

Type

String

Value

3.7. License

Inherit from parent

When checked, the license set in the nearest pipeline ancestor in the call chain or, if this is the top-level pipeline, in the upCast application preferences will be used. When unchecked, the License location setting as specified will be used.

Name

UseGlobalLicense

Java symbol

kUseGlobalLicenseParamName

Type

Bool

Value

License location

To set the license to be used for running this pipeline, click Choose license file… and select the license file (*.uclicense) to be used.

When the license file resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:PipelineBase} variable.

The license details of the active license are displayed in the fields below for your reference.

Name

LicenseFile

Java symbol

kLicenseFileParamName

Type

String

Value

path to license file

Chapter 7. Module-level settings

Each module type has its own, dedicated set of parameters to control its behavior. A few parameters are shared by all modules, both in name and semantics. These are listed explicitly below. However, all other parameter names are to be interpreted with the context of the module’s functionality in mind to infer their meaning.

Internally, parameters of modules are dynamically, weakly typed, though each parameter has a recommended or even required (by definition) type.

1. Typographical conventions

Parameters will be described using the following typography:

Name

DeleteEmpties

Java symbol

kDeleteEmptiesParamName

Type

Boolean

Value

false, true

Name gives the internal java.lang.String name of the parameter, which is used for storing in preferences files, and can be used in the Java API as String. However, in Java, the use of the Java symbol is highly recommended instead.

Java symbol names the Java constant definition for the parameter’s Name. All constants are defined in the class de.infinityloop.common.Params.

Type specifies the recommended Java type to use when programming against the API. In the GUI version, that is taken care of automatically. Also, when using alternative interfaces like an Ant task, which only allows passing arguments as character strings, upCast tries to perform an appropriate cast. You should therefore make sure that the data you provide in these cases can be cast to the specified type, as otherwise the conversion will fail or produce incorrect results at runtime.

Value describes possible value ranges, supported keywords or other specifics about that parameter’s range.

2. General module parameters

The following parameters are available on all modules:

Active checkbox

When the “active” checkbox in the upper left corner of the module parameter pane is checked, the module is active in the pipeline.

During pipeline development, it is often useful to have several differently configured modules to switch between, or to have modules inserted in the pipeline that generate some sort of debug output. With this parameter, you can quickly activate or temporarily deactivate a module without having to delete it from the pipeline and insert it again.

Deactivated modules are completely skipped during a pipeline run and impose only minimal overhead – actually, it’s just writing a line to the log file.

Name

ModuleEnabled

Java symbol

kModuleEnabledParamName

Type

Bool

Value

true | false

Name

Here, you can assign a meaningful name to a module instance. By default, modules’ names are their type, like “XSLT Processor” or “RTF Importer”. However, when you have e.g. several XSLT processors in your pipeline, it is desirable to use more meaningful names, like “strip namespaces XSLT” or “TEI conversion transformation”.

Name

InstanceNameUser

Java symbol

kModuleInstanceNameUserParamName

Type

String

Value

an arbitrary string

Export

When checked, this module is handled (exported) in a File > Export… function.

You can use this to set up a single upCast pipeline document in such a way that for export to Java code or an Ant task, only certain modules will be exported. This lets you use some module instances for debugging in the UI, which then won’t be part of an exported pipeline representation.

For the Export as XML… function, module elements will have an additional attribute export with value true or false, respectively. This allows you to decide in any custom post-processing of that pipeline export format whether you want to handle that module in a special way (for example, discarding it completely, as the built-in export options Ant and Java source do).

Name

ModuleExported

Java symbol

kModuleExportedParamName

Type

Bool

Value

true | false

Initialization

This parameter lets you programmatically set module parameters as well as (dynamically) prevent running the module even when its active checkbox is checked. For this, you can write a custom UPL function initialize() by clicking on the Edit initialization code… button.

Note

The text on the Edit initialization code… button will be bold when a custom initialization function has been defined (and therefore the code field is not empty). This lets you quickly see if a module defines a custom initialization function without having to open the code entry dialog.

If the button text is plain, the code field is empty and the module will always be executed.

Tip

If you always want to run the module unconditionally, make sure the code field is empty. This lets you see at a glance in the UI whether a custom function is defined, and it protects you against possible future signature changes, and therefore code incompatibilities, in the initialize() function when you don’t actually use its features.

If the initialize() function returns EXECUTE (which is the default), execution of the module proceeds.

If the initialize() function returns SKIP, the module’s action is not performed and the subsequent module in the pipeline (if there is one) is run.

If the initialize() function returns TERMINATE, the module’s action is not performed and additionally, further pipeline execution is aborted.

In the initialize() function’s body, you can run arbitrary UPL code. This code is run just before actually performing the module’s functionality. This function hook’s main intent is to give you the possibility to programmatically and dynamically set module parameters’ values based on e.g. pipeline variable values (which in turn may have been set through the Simple View or by an external parameter passed to the pipeline). This way, you can set a parameter that does not allow you to have variable references expanded, like popups or check boxes. Additionally, this function serves as a dynamically evaluated condition specifying whether to run the module or not (in contrast to the module’s static Active checkbox).
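
The interplay of the Active checkbox and the three initialize() return values can be modelled as a simple run loop. The Python sketch below is an illustration, not upCast code: modules are represented as plain dictionaries, the function name run_pipeline is hypothetical, and the finalize step is reduced to the TERMINATE or CONTINUE decision described under Finalization.

```python
# Illustrative model only; not part of upCast.
EXECUTE, SKIP, TERMINATE, CONTINUE = "EXECUTE", "SKIP", "TERMINATE", "CONTINUE"


def run_pipeline(modules):
    """Run modules in order and return the names of those whose action ran."""
    performed = []
    for module in modules:
        if not module.get("active", True):
            continue                      # static switch: Active checkbox unchecked
        initialize = module.get("initialize")
        decision = initialize() if initialize else EXECUTE  # empty code field: always run
        if decision == TERMINATE:
            break                         # action not performed, pipeline aborted
        if decision == SKIP:
            continue                      # action not performed, next module runs
        performed.append(module["name"])  # stands in for the module's action
        finalize = module.get("finalize")
        # finalize is only consulted when the action was actually performed
        if finalize and finalize(CONTINUE) == TERMINATE:
            break
    return performed


if __name__ == "__main__":
    demo = [
        {"name": "import"},
        {"name": "debug dump", "active": False},
        {"name": "optional step", "initialize": lambda: SKIP},
        {"name": "export"},
    ]
    print(run_pipeline(demo))
```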

Example 7.1. 

Assuming you are offering your users the choice between the HTML and CALS table model by way of a pipeline parameter tableType (e.g. in the Simple View), the following code sets the corresponding module parameter TableModel dynamically in the XML Export module. This would not be otherwise possible via that module’s UI since for the selection, a popup is used which has no way to calculate its value based on pipeline variables.

The code assumes that the pipeline parameter tableType can have one of two values: html or cals.

#namespace module "http://www.infinity-loop.de/namespace/upcast-realm/module";
#namespace pipeline "http://www.infinity-loop.de/namespace/upcast-realm/pipeline";
function initialize() as Id {
  $module:TableModel := $pipeline:tableType;
  return EXECUTE; /* run the module */
}

Name

InitializationCode

Java symbol

kInitializationCodeParamName

Type

String

Value

UPL source code

Finalization

This parameter lets you specify the condition under which further pipeline execution should be cancelled after running this module.

This parameter will only be evaluated (and therefore have any effect) if the module action was actually performed, or in other words: if initialize() did not prevent the execution of the module’s action by returning TERMINATE or SKIP.

Normally, pipeline execution continues with the following defined modules even if in the current one there was a warning or error. These messages are collected and then displayed in the final pipeline execution error dialog. However, sometimes this is not a desired behaviour. Specifically, when subsequent modules rely on the proper execution of their predecessors to produce usable or correct results or – even more importantly – to not cause harm to data integrity, it may be necessary to immediately stop further execution of the pipeline when some module produces an error.

You can specify the termination behaviour by using several pre-defined, often used conditions, or you even can specify a custom condition in UPL:

continue: continue pipeline execution no matter what, i.e. even when an ERROR or FATAL error has occurred

signal on FATAL: terminate pipeline execution when a FATAL error message has been generated during execution of this module

signal on ERROR: terminate pipeline execution when a FATAL or ERROR error message has been generated during execution of this module. This is the default value for new module instances.

signal on WARN: terminate pipeline execution when a FATAL, ERROR or WARN message has been generated during execution of this module

custom finalization: this option lets you specify custom UPL function code which, by returning one of the two Id values TERMINATE or CONTINUE, can request pipeline termination or continuation after this module.

To edit the UPL code for the custom finalize() function, click the Edit finalization code… button. By returning the Id TERMINATE, you can request pipeline termination, and by returning CONTINUE as result you can request pipeline continuation.

The custom function receives as its Id parameter the termination status of the module’s child component if there is one, CONTINUE otherwise.

Example 7.2. Finalization function template

function finalize( $childFinalizationResult as Id ) as Id {
  variable $result as Id := $childFinalizationResult; // default: CONTINUE
  /* Return the Id TERMINATE when you want to terminate the pipeline, CONTINUE otherwise. */
  return $result;
}

Name

FinalizationMode

Java symbol

kFinalizationModeParamName

Type

String

Value

module: continue | terminate-fatal | terminate-error | terminate-warning | custom; pipeline: standard | custom

Name

FinalizationCode

Java symbol

kFinalizationCodeParamName

Type

String

Value

UPL source code

Log filter

Sets the logging threshold for messages that the module accepts from children and produces itself (see the logging architecture description for details).

Some default filter expressions are available from the associated popup menu; however, you are free to define your own customized filter expressions. The filter expression syntax is described here.

If set to inherit, the logging filter settings are governed by the module’s execution parent’s settings (that is usually the pipeline it is contained in).

Name

LogFilterSpec

Java symbol

kLogFilterSpecParamName

Type

String

Value

inherit | OFF | FATAL | ERROR | WARN | INFO | DEBUG | VERBOSE | DETAIL | TRACE | ALL | logfilterspec

3. Module-specific parameters

Parameters belonging to specific modules are described in that module’s detailed description.

Chapter 8. Modules

This section describes the available modules in more detail, listing available parameters. Filter type identifiers are given in square brackets after the UI module name.

1. Pipeline Variables [pipelinevars]

This module allows you to set some commonly used global variables easily for re-use in subsequent modules. It is therefore most useful as the first module in a pipeline.

You can set the global variables pipeline:SourceFile, pipeline:TemporaryItemsFolder, pipeline:DestinationFolder, pipeline:ImageDestinationFolder and pipeline:DebugFolder.

The effect of using this module in API mode is the same as using UpcastEngine.setPipelineVariable().

All parameters have a type of java.lang.String.

Important

When a field of the pre-defined parameters is left empty, that parameter is not set at all. This allows you to place a module of this type somewhere in the middle of a pipeline and have it set or override only certain parameters (either custom parameters or selected pre-defined ones). All parameters with empty values in the list of pre-defined entry fields keep their previously assigned values (or are not created).

This also means that if you want to assign the empty string to some parameter, you can only do so by specifying it in the Custom pipeline variables field.

Custom pipeline variables

Here, you can specify additional global values for use in subsequent modules. The definitions herein are processed after the fixed global parameters described above are evaluated and set, so you can refer to them using the usual ${pipeline:…} variable reference. A parameter definition must follow this syntax:

varname ':=' '"' value '"' ';'

Quotes within the variable value must themselves be quoted using the backslash character ‘\’.

Important

The text in this field and any contained variable references are resolved as follows:

  1. Any references to the include realm are resolved.

  2. The individual assignments are parsed according to the above syntax.

  3. Only now, any further references to realms other than include are resolved separately for each value part of an assignment.

This algorithm covers the usual cases where you might want to include constant assignment code shared by several pipelines using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
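
The three resolution steps can be sketched in Python as follows. This is a model of the algorithm as described above, not upCast code: it assumes references of the form ${realm:name} (including ${include:…} for the include realm), looks values up in a plain dictionary keyed by (realm, name), and uses a hypothetical function name, parse_custom_variables.

```python
# Illustrative model only; not upCast's actual parser or resolver.
import re

_REF = re.compile(r"\$\{([A-Za-z]+):([^}]+)\}")
# name := "value";  with backslash-quoted characters allowed inside the value
_ASSIGN = re.compile(r'([A-Za-z_][\w.-]*)\s*:=\s*"((?:[^"\\]|\\.)*)"\s*;')


def _resolve(text, variables, only_include):
    """Expand either only include references or only all other references."""
    def replace(match):
        realm, name = match.group(1), match.group(2)
        if (realm == "include") != only_include:
            return match.group(0)         # left untouched for the other pass
        return variables[(realm, name)]
    return _REF.sub(replace, text)


def parse_custom_variables(field_text, variables):
    # Step 1: resolve references to the include realm in the whole text.
    expanded = _resolve(field_text, variables, only_include=True)
    result = {}
    # Step 2: parse the individual assignments.
    for name, raw in _ASSIGN.findall(expanded):
        value = re.sub(r"\\(.)", r"\1", raw)  # remove backslash quoting
        # Step 3: resolve the remaining references per value part.
        result[name] = _resolve(value, variables, only_include=False)
    return result


if __name__ == "__main__":
    known = {
        ("include", "shared.txt"): 'outDir := "/tmp/out";',
        ("pipeline", "SourceFile"): 'odd"; name.rtf',
    }
    field = '${include:shared.txt}\nsrc := "${pipeline:SourceFile}";'
    print(parse_custom_variables(field, known))
```

Because the pipeline variable is expanded only after parsing, a value containing a quote or a semicolon does not break the assignment syntax, which is the point made in the paragraph above.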

Name

PipelineVariables

Java symbol

kPipelineVariablesParamName

Type

String

Value

(string in same syntax as in corresponding UI field)

2. RTF Importer (upCast) [rtfimport]

Note

This module requires an appropriate RTF Importer feature included in your license to be fully functional.

This importer module handles conversion from RTF to the internal, unified upCast format, the upCast Internal DTD. With WordLink enabled, the filter also can convert Word binary files (*.doc).

Optional hyphen RTF symbol, Soft Hyphen (U+00AD) character

The RTF importer outputs the RTF Optional Hyphen symbol (\-) as codepoint U+E003 in the Unicode Private Use Area. This allows subsequent pipeline steps to distinguish it from Soft Hyphen (U+00AD) characters entered directly in the RTF as Unicode. The distinction matters because downstream rendering engines may treat the two differently from how Word displays them.

However, the Unicode Translation Map in effect in the XML Exporter module maps U+E003 to U+00AD by default. If you need or want to change the translation of RTF’s Optional Hyphen symbol to something other than the Soft Hyphen character in Unicode, you must change or override the default mapping of the source codepoint U+E003 in the XML Exporter module.
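
As a rough illustration of what such a mapping does, the following Python sketch replaces the private-use codepoint in a string. The function name and the idea of applying the mapping to a plain string are assumptions made for this example; the actual translation is configured in the XML Exporter's Unicode Translation Map.

```python
OPTIONAL_HYPHEN = "\uE003"  # private-use codepoint written by the RTF importer
SOFT_HYPHEN = "\u00AD"      # default target of the translation


def translate_optional_hyphen(text, replacement=SOFT_HYPHEN):
    """Replace every U+E003 with the given replacement (hypothetical helper)."""
    return text.replace(OPTIONAL_HYPHEN, replacement)
```

Passing an empty string as replacement would drop optional hyphens entirely while leaving genuine U+00AD characters untouched.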

Parameters are grouped logically into tabs:

General

Source file

Specify the source file in RTF or, if WordLink is available, in Word binary format that should be imported.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

Hoist common inline properties to parent

If enabled, any inline formatting CSS property that has the same value across all children of a paragraph-level element is hoisted to the parent element as a style override. This makes use of CSS inheritance and optimizes the output by specifying the property once on the parent instead of on each of its child elements.

Name

HoistCommonInlines

Java symbol

kHoistCommonInlinesParamName

Type

Bool

Value

true | false
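
The hoisting idea can be sketched as follows, assuming styles are modelled as plain dictionaries of CSS property/value pairs. The function name and data model are hypothetical and only illustrate the principle; they do not reflect the importer's internal data structures.

```python
def hoist_common_inlines(parent_style, child_styles):
    """Move properties shared by all children (same value) to the parent.

    Styles are plain dicts of CSS property -> value (an assumed model).
    Returns the new parent style and the reduced child styles.
    """
    if not child_styles:
        return dict(parent_style), []
    # Start with the first child's properties and keep only those
    # that every other child carries with an identical value.
    common = dict(child_styles[0])
    for style in child_styles[1:]:
        common = {k: v for k, v in common.items() if style.get(k) == v}
    hoisted_parent = {**parent_style, **common}
    remaining = [{k: v for k, v in s.items() if k not in common}
                 for s in child_styles]
    return hoisted_parent, remaining
```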

Remove empty inlines

If enabled, any inline style specifications that do not contain any #PCDATA or similar, visually rendered content, are discarded from the document.

The default for this parameter is off based on the assumption that you may want to keep e.g. formatting information for empty cells so that a user may later fill in text and has the correct, originally intended formatting information available at that document location.

Name

RemoveEmptyInlines

Java symbol

kRemoveEmptyInlinesParamName

Type

Bool

Value

true | false

Allow ‘class’ and ‘style’ attributes simultaneously on <inline> elements

When on, this option allows both a class and a style attribute to be present on the same element. Otherwise, the two are separated and an anonymous inline element is created for the style attribute instead.

Option checked:

This is <uci:inline uci:class="slang" uci:style="color: blue;">True Blue</uci:inline>.

Option unchecked:

This is <uci:inline uci:class="slang"><uci:inline uci:style="color: blue;">True Blue</uci:inline></uci:inline>.

You might want to use this option to have named Word styles always separated out in a dedicated element so that additional override styles can be recognized quickly by the additional inline element.

Name

CombineWithLogicalStyle

Java symbol

kCombineWithLogicalStyleParamName

Type

Bool

Value

true | false

Apply list structuring heuristics

If checked, special list structure detection algorithms are performed to create the best logically structured XML output. If unchecked, Word’s internal list IDs are used to track where a list starts and ends and where a new one begins, which may (based on the editing history of a particular list) have virtually no resemblance to what you are actually seeing in the layout.

The default value is on.

Name

ApplyListHeuristics

Java symbol

kApplyListHeuristicsParamName

Type

Bool

Value

true | false

Markup revision tracking using <inserted> and <deleted>

When this is checked, document revisions are marked up in the result using the inserted and deleted elements.

If this is off, only the result of the revisions will be exported, i.e. inserted content remains in the document and deleted content is removed.

Name

RevisionTracking

Java symbol

kRevisionTrackingParamName

Type

Bool

Value

true | false

Use CSS for forced pagebreaks (where possible)

When checked, the importer tries to use CSS code for specifying forced pagebreaks wherever possible by using the pagebreak-before: always property/value combination.

If this is off, a pagebreak element will always be used.

Name

UseCSSForPagebreaks

Java symbol

kUseCSSForPagebreaksParamName

Type

Bool

Value

true | false

Default font size

Some RTF documents do not specify a default font size for their text content, but rely on the default of the rendering application (like Microsoft Word). This parameter lets you set the default font size for such documents.

Microsoft Word versions up to and including Word 97 used a default value of 10pt; Word 2000 and later use a default of 12pt. When you set this parameter to * (i.e. automatic), upCast tries to guess from the RTF symbols it finds in the document whether it is a Word 2000 (or later) document and then will use 12pt as default font size, 10pt otherwise.

Name

DefaultFontSize

Java symbol

kDefaultFontSizeParamName

Type

String

Value

'*' | 1..999
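
The selection logic can be summarized in a short Python sketch. How upCast actually detects a Word 2000 (or later) document is not described here, so the detection result is passed in as a boolean; the function name and the integer interpretation of the value are assumptions.

```python
def resolve_default_font_size(setting, looks_like_word_2000_or_later):
    """Return the default font size in pt for a DefaultFontSize setting.

    The second argument stands in for upCast's RTF-symbol heuristic,
    which is not reproduced here (assumption).
    """
    if setting == "*":
        # Automatic: 12pt for Word 2000 and later, 10pt otherwise.
        return 12 if looks_like_word_2000_or_later else 10
    size = int(setting)
    if not 1 <= size <= 999:
        raise ValueError("DefaultFontSize must be '*' or in the range 1..999")
    return size
```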

Literal pass-through styles

If checked, you can specify a set of (Word-) styles, separately for the paragraph style and the character style category, by specifying their exact names which should be treated as literals. This means that all text in the document set using these styles will be written to the output without any interpretation by upCast. This lets you write e.g. XHTML or XML code directly within your document the way it should appear at that location in the output.

Name

LiteralProcessing

Java symbol

kLiteralProcessingParamName

Type

Bool

Value

true | false

Paragraph style names

When Literal pass-through styles is on, specify here the list of paragraph styles that should be treated as literal content indicators. Enclose each style name in double quotes and separate the names with a space character.

Name

LiteralParStyle

Java symbol

kLiteralParagraphStyleParamName

Type

String

Value

style name

Character style names

When Literal pass-through styles is on, specify here the list of character styles that should be treated as literal content indicators. Enclose each style name in double quotes and separate the names with a space character.

Name

LiteralCharStyle

Java symbol

kLiteralCharacterStyleParamName

Type

String

Value

style name
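
A value for either style list can be taken apart as in the following Python sketch. The function name and the example style names are hypothetical; the sketch only shows how a list of double-quoted names separated by spaces can be read.

```python
import re


def parse_style_names(value):
    """Return the style names from a list like '"Name One" "Name Two"'."""
    # Each name is everything between a pair of double quotes.
    return re.findall(r'"([^"]*)"', value)
```

For example, the value "XML Literal" "Code Block" (both names invented for this illustration) yields the two names XML Literal and Code Block.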

Images

Include images

When checked, images contained in the document are processed as configured by the image processing parameters. If unchecked, all images of the source document will be completely discarded from the document.

Name

IncludeImages

Java symbol

kIncludeImagesParamName

Type

Bool

Value

true | false

Temporary images folder

This is the location where images in a read document will be temporarily stored while the pipeline is processed. It is e.g. the responsibility of an exporter to copy images intended to be permanently saved across a pipeline run to a different location.

Note

A pipeline keeps track of temporary images created in the above location. After finishing a pipeline run, all these recorded temporary files are automatically deleted.

Name

TemporaryItemsFolder

Java symbol

kTemporaryItemsFolderParamName

Type

String

Value

path to temporary items folder

Use inline copies instead of references

When this option is checked, for images that have been included in the RTF document using both methods, by reference and by embedding, the module will try to use the embedded substitute representation. This option essentially breaks the link to the original image file, if a substitute representation has been embedded in the RTF file, and instead links to the embedded representation of the original file.

When an image has only been linked and no substitute representation is available in the RTF, however, the original link to the image is preserved and used.

Name

InlineReferencedImages

Java symbol

kPreferEmbeddedImagesParamName

Type

Bool

Value

true | false

Incoming images default resolution

This parameter determines the image resolution in dpi (dots per inch) to use for embedded images that do not specify their resolution explicitly. This is true for all (originally) GIF images and some variants of JPEG and PNG images.

Without any dpi information, the RTF importer (and, as a matter of fact, even Word) cannot determine the absolute size of images, which is necessary to create a fully specified export file. This parameter is then used to establish a default dpi value and corresponds roughly to Word’s Web Options > Image resolution setting.

When setting this to the default ‘*’ value, the RTF importer determines the absolute size of the image from the image properties in the RTF document (if available) and modifies the embedded image data by adding the resolution determined from the (absolute size/number of pixels)-pair to the externalized image. This ensures that subsequent processors can correctly determine absolute sizes and scale any images accordingly.

Tip

If you have control over the original document generation process and especially image creation, make sure that each image you add to a Word or RTF document contains explicit resolution information, as this avoids all sorts of platform incompatibilities.

This rule especially forbids importing GIF images as the GIF format does not include resolution information. However, also several Clip Art images in JPEG and PNG format do not contain this desirable information, with displayed image size in a document becoming dependent on platform, Word version or setting of the Web Options > Image resolution parameter – which is generally undesirable.

Outgoing images rendering resolution

This value affects the WMF to pixmap renderer built into the RTF Importer. This means that WMF (or EMF) images will be rendered into a pixmap with pixel dimensions for width and height that correspond to this value.

The default value is 96 dpi (used e.g. by Microsoft’s Internet Explorer™). You may want to change this when outputting for Netscape Navigator 4.7 on the Mac, which by default displays at 72 dpi and therefore would downscale images written using 96 dpi resolution.

Suppose you have a WMF image in your document that is 2 by 1 inches in size. With 96 dpi output resolution, this will yield a pixmap of size 192 by 96 pixels.

However, if you set the output resolution to only 72 dpi, the resulting pixmap will be 144 by 72 pixels in size.

Name

ImageRenderingResolution

Java symbol

kImageRenderingResolutionParamName

Type

Integer

Value

20..360
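
The arithmetic behind the two examples above is simply size in inches multiplied by the rendering resolution. A minimal Python sketch, with a hypothetical function name and with rounding to whole pixels as an assumption:

```python
def pixmap_size(width_in, height_in, dpi=96):
    """Pixel dimensions of a rendered WMF/EMF image of the given size in inches."""
    if not 20 <= dpi <= 360:
        raise ValueError("ImageRenderingResolution must be in the range 20..360")
    # Rounding to whole pixels is an assumption of this sketch.
    return round(width_in * dpi), round(height_in * dpi)
```

With these definitions, pixmap_size(2, 1) gives 192 by 96 pixels and pixmap_size(2, 1, 72) gives 144 by 72 pixels, matching the figures in the text.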

Export embedded images of type…

While exporting embedded images, you have the option to convert them to a different format.

The RTF Importer includes a custom WMF to pixmap renderer fully programmed in Java. It is neither intended nor recommended for production quality image conversion! To perform high-quality image conversion, we strongly encourage you to consider specialized third-party products. Nevertheless, the built-in renderer is useful and intended for producing draft image renderings for viewing in a web browser or creating documents for editorial review and should perform well enough for most purposes except final publishing.

Embedded images in an RTF document can be of several image format types: WMF, EMF, JPEG, PNG and Macintosh PICT. The RTF importer lets you specify a handling method for each of these formats, so you can e.g. use already pixel based images like JPEG or PNG unchanged while rendering vector formats like WMF into a pixel-based representation.

The following handling methods are available (some of which are not applicable to all source formats):

(no change) Export the embedded image as binary data without any modification applied

external cmd Export the embedded image as binary data without any modification applied, and then run the specified external command on it for further processing. (See below for details.)

*remove* The image will be completely removed from the document

JPEG The image will be converted into JPEG format, using the built-in WMF to pixmap renderer if necessary. Clicking Options… lets you set the JPEG compression quality.

PNG The image will be converted into PNG format, using the built-in WMF to pixmap renderer if necessary. Clicking Options… lets you set the PNG compression algorithm.

BMP The image will be converted into Windows bitmap (BMP) format, using the built-in WMF to pixmap renderer if necessary.

PICT The image will be converted into Macintosh PICT format, using the built-in WMF to pixmap renderer if necessary. Note that only the image map operator is supported. The RTF importer will not translate WMF vector operators into native PICT operators.

When using the option external cmd, two additional parameters can be set:

File extension The field should receive the destination file extension of the image file as it is after the external conversion. For example, if you want to convert a WMF file to TIFF, the extension should be tif or tiff.

Command This is the external command to execute for converting the image source file to the desired target format. You must use placeholders for the source and destination file name using the upCast variable syntax. The variables to use are:

${imgsrc#local}

the image source file in local file name convention

${imgsrc#url}

the image source file in URL format

${imgdest#local}

the destination file name in local file name convention

${imgdest#url}

the destination file name in URL format

This works as follows: The file to be converted is available at the location in imgsrc#local. The RTF importer then constructs a target file name, using the source file name as basis, but setting the extension to the one specified. Since the RTF importer needs to know the final resulting filename for referring to the externally converted image in the internal document tree, but there is no way to return a string from a shell command easily (just an integer return code), it prescribes the target file name itself. This is what the variable imgdest#local is for. You must make sure that the final, processed image file is available at the location contained in that specific variable.

Example 8.1. Example:

To convert a WMF file to JPEG, use settings like:

WMF to [external cmd:]

File extension: [jpg]

Command: [fileconverter -fmt jpeg -outfile ${imgdest#local} ${imgsrc#local}]
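
The following Python sketch mimics how the target file name could be derived and how the placeholders could be filled in before the command is run. It is an illustration under assumptions: the function name, the use of POSIX-style paths, and the construction of the URL form are not taken from upCast, and fileconverter is just the placeholder tool name from the example above.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def expand_command(template, source_file, extension):
    """Fill the image placeholders of an external command (hypothetical helper).

    Returns the expanded command line and the prescribed destination file.
    """
    src = PurePosixPath(source_file)
    # The target name is the source name with the configured extension.
    dest = src.with_suffix("." + extension.lstrip("."))
    values = {
        "${imgsrc#local}": str(src),
        "${imgsrc#url}": "file://" + quote(str(src)),
        "${imgdest#local}": str(dest),
        "${imgdest#url}": "file://" + quote(str(dest)),
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template, str(dest)
```

The external command must leave its result at the returned destination path, since that is the name the importer will reference in the document tree.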


Name

WMFDestFormat

Java symbol

kWMFDestFormatParamName

Type

String

Value

unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT

Name

EMFDestFormat

Java symbol

kEMFDestFormatParamName

Type

String

Value

unchanged | dispose | UseWMFSubstitute | ExternalCommand

Name

JPEGDestFormat

Java symbol

kJPEGDestFormatParamName

Type

String

Value

unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT

Name

PNGDestFormat

Java symbol

kPNGDestFormatParamName

Type

String

Value

unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT

Name

PICTDestFormat

Java symbol

kPICTDestFormatParamName

Type

String

Value

unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT

Name

WMFDest.JPEG.Quality

Java symbol

kWMFDestJPEGQualityParamName

Type

Integer

Value

0..100

Name

WMFDest.PNG.CompressionType

Java symbol

kWMFDestPNGCompressionTypeParamName

Type

String

Value

default | fast | max | none

Name

JPEGDest.JPEG.Quality

Java symbol

kJPEGDestJPEGQualityParamName

Type

Integer

Value

0..100

Name

JPEGDest.PNG.CompressionType

Java symbol

kJPEGDestPNGCompressionTypeParamName

Type

String

Value

default | fast | max | none

Name

PNGDest.JPEG.Quality

Java symbol

kPNGDestJPEGQualityParamName

Type

Integer

Value

0..100

Name

PNGDest.PNG.CompressionType

Java symbol

kPNGDestPNGCompressionTypeParamName

Type

String

Value

default | fast | max | none

Name

PICTDest.JPEG.Quality

Java symbol

kPICTDestJPEGQualityParamName

Type

Integer

Value

0..100

Name

PICTDest.PNG.CompressionType

Java symbol

kPICTDestPNGCompressionTypeParamName

Type

String

Value

default | fast | max | none

Objects

These parameters specify how embedded objects (OLE) should be handled. The RTF importer generates an uci:object element for each embedded object it finds in the RTF. The child elements of this container object are alternative representations of the object’s data. These can be an uci:image element (if available in the source document; it represents the display of that object at the time the document was saved) or an uci:ole element (if available; it contains a base64 representation of the binary data of the OLE object, which makes it possible to reconstruct an editable instance using the RTF exporter).

Include image representation

When checked, an image representation alternative will be added to the object element (if available in the source document).

Include binary data

When checked, an uci:ole binary data representation alternative will be added to the object element. The uci:ole element contains the base64-encoded binary data as character data.

Include MathML representation

When MathLink is available, i.e. you have Design Science’s MathType software (version 5.2) installed on your Windows system and are running upCast on that same machine, you can also embed a MathML representation of the formula of each MathType OLE object in the object element as an m:math element.

Important

Since MathLink is only available on the Windows platform, this option will only be enabled when a functioning MathLink actually is available to the application.

Name

ObjectHandling

Java symbol

kObjectHandlingParamName

Type

String

Value

image || embed || mathml (separated by whitespace if more than one)

WordLink

Set WordLink features.

Important

Since WordLink is only available on the Windows platform, this tab will only be displayed when WordLink actually is available to the application.

Mode

When Process .doc files only is selected, WordLink and all options specified will only be applied to Word binary (*.doc) files.

When Process all files is selected, WordLink and all options specified will be applied to any input document, i.e. even files that are in RTF format already. This lets you automatically update fields or add pagestart and linestart elements.

Name

WordLinkMode

Java symbol

kWordLinkModeParamName

Type

String

Value

doc | all

Run macro named “il_premacro”

When checked, WordLink will first run a Word macro named il_premacro on the source document. This macro must either be defined in the respective document (when it is a Word binary .doc file) or in the global document template file (*.dot).

When this macro is not available, an error will be issued after conversion, though the further conversion process is not affected.

Update fields

When checked, WordLink will update any fields in the source document with current values: date, time, pages, …

Update from linked images

When including an image only by reference (i.e., using Word’s INCLUDEPICTURE field), the RTF importer is not able to determine the actual image size as that information is not part of RTF. By checking this option, the linked image is temporarily included into the document with the effect that image size and possibly applied scaling in the .doc Word binary file can be evaluated by the importer.

This feature is not beneficial for RTF source files, as in these the necessary information is already lost (also for Word).

Mark up layout page breaks using <pagestart />

This inserts an empty <pagestart /> inline element wherever a dynamic page break would occur in the current layout flow when the document is rendered.

Mark up layout line breaks using <linestart />

This inserts an empty <linestart /> inline element wherever a dynamic line break would occur in the current layout flow when the document is rendered.

Important

This is slow for documents bigger than about 100 pages. You may want to increase the Kill timeout value significantly. Also, some document structure constellations may yield wrong line break position results due to limitations in the Word application.

Name

WordLinkCommand

Java symbol

kWordLinkCommandParamName

Type

String

Value

Pages || Update || Premacro || Lines || Includelinkedimages || Updatelinks (concatenate desired options without any whitespace in between)

Kill timeout

When hitting a corrupt document, WordLink may have problems and/or hang the application. Therefore, you can set a kill timeout value after which the WordLink functions will be aborted. The default value is 300 seconds.

Note

Killing WordLink may leave an invisible instance of Word running. After a timeout, check the running processes and kill any zombie Word processes manually using the Process Viewer (Ctrl-Alt-Del on Windows 2000/XP).

Name

WordLinkKillTimeout

Java symbol

kWordLinkKillTimeoutParamName

Type

Integer

Value

timeout duration in milliseconds

Copy temporary .rtf file to debug folder as “basename-tmp.rtf”

This is mainly for debugging purposes. It copies the intermediate RTF file to the specified debug folder with a name of basename-tmp.rtf after having applied all WordLink functions. This is the file that the RTF importer itself takes as source for its actual conversion process.

Name

WordLinkCopyToOutput

Java symbol

kWordLinkCopyToOutputParamName

Type

Bool

Value

true | false

3. UPL Processor [uplcode]

Note

This module requires an appropriate UPL feature included in your license to be fully functional.

This module lets you run a program written in the Upcast Processing Language (UPL).

UPL code

This contains the UPL code you want to execute. The code must define a function main() as follows:

function main() as Value {
   ... your code goes here ...
}

The UPL Processor calls this function main() once when it runs and executes the code defined therein (or in any dependent, user-defined functions). For a detailed description of UPL, see the separate documentation, Upcast Processing Language.

The returned result of the function is stored into the pipeline variable ModuleResult.

Name

UPLCode

Java symbol

kUPLCodeParamName

Type

String

Value

UPL source code

UPL parameters

This contains the assignments for variables to be passed to the UPL program. For a detailed description of how UPL receives parameter values, see the separate documentation, Upcast Processing Language.

A parameter definition must follow this syntax:

paramname ‘:=’ ‘”’ value ‘”’;

Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.

Important

The text in this field and any contained variable references are resolved as follows:

  1. Any references to the include realm are resolved.

  2. The individual assignments are parsed according to the above syntax.

  3. Only now, any further references to realms other than include are resolved separately for the value part of each parameter assignment.

This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several UPL modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.

Name

UPLParameters

Java symbol

kUPLParametersParamName

Type

String

Value

(string in same format as in UI)

4. UPL Tree-Processor [upl]

Note

This module requires an appropriate UPL feature included in your license to be fully functional.

This module differs from the UPL Processor in that it does not call a single function once, but you can define code to be run upon visiting each node of the current internal document in a depth-first traversal, depending on certain conditions you specify.

UPL code

This contains the UPL code you want to execute. For a detailed description of the UPL, see the separate documentation, Upcast Processing Language.

Name

UPLCode

Java symbol

kUPLCodeParamName

Type

String

Value

UPL source code

UPL parameters

This contains the assignments for variables to be passed to the UPL program. For a detailed description of how UPL receives parameter values, see the separate documentation, Upcast Processing Language.

A parameter definition must follow this syntax:

paramname ‘:=’ ‘”’ value ‘”’;

Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.

Important

The text in this field and any contained variable references are resolved as follows:

  1. Any references to the include realm are resolved.

  2. The individual assignments are parsed according to the above syntax.

  3. Only now, any further references to realms other than include are resolved separately for the value part of each parameter assignment.

This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several UPL modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.

Name

UPLParameters

Java symbol

kUPLParametersParamName

Type

String

Value

(string in same format as in UI)

Grouper

When turned on, the grouping algorithm will be run on the internal tree. This will be before the finalize() or finalize-error() UPL method is called.

Name

RunGrouper

Java symbol

kRunGrouperParamName

Type

Bool

Value

true | false

Grouping processing order

This parameter lets you set the order of the colors in which the grouping should be performed.

With all colors in alphabetical order, all colors that have been used for setting painters or tags are grouped in alphabetical order of the color name. The ordering is the same as the one the Java class java.util.TreeSet uses by default on the platform you are running upCast on.

With colors in specified order, then all remaining, you can specify in the text field below a list of colors you want to be grouped in a given order before all others; any remaining colors are grouped in alphabetical order (see all colors in alphabetical order) afterwards.

With only these colors in specified order, you can specify the colors and their order of grouping in the text field below. No other colors will be grouped, even if respective painters or tags have been placed on nodes in the internal document tree.

After a run of the grouper, all painters, tags and paintings of nodes are removed from the internal tree. This means that any further grouper instances running after a grouper has already run will have no effect unless new painters and tags have been placed on nodes of the internal document tree, usually using the UPL processor.

Color order is specified by listing the colors in sequence, separated by whitespace. If a color name includes whitespace (which is deprecated), the full color name must be enclosed in double quotes.

Name

GroupingColorOrder

Java symbol

kGroupingColorOrderParamName

Type

String

Value

alphabetic | only | first

Name

GroupingColors

Java symbol

kGroupingColorsParamName

Type

String

Value

ordered list of color names, separated by whitespace; to use color names that themselves contain whitespace, surround them by double-quotes
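
The three ordering modes can be sketched in Python as follows. The mapping of the values alphabetic, first and only to the three UI choices is inferred from their names, and the function name, the handling of listed colors that were never used, and the use of Python's default string ordering in place of Java's are assumptions of this sketch.

```python
import shlex


def grouping_order(used_colors, mode, color_list=""):
    """Return the order in which colors would be grouped (illustrative only).

    used_colors: colors that have painters or tags in the document tree.
    mode: 'alphabetic' | 'first' | 'only' (GroupingColorOrder).
    color_list: whitespace-separated names, quoted if they contain
    whitespace (GroupingColors).
    """
    used = set(used_colors)
    # shlex.split honours double quotes around names containing whitespace.
    listed = [c for c in shlex.split(color_list) if c in used]
    if mode == "alphabetic":
        return sorted(used)
    if mode == "only":
        return listed
    if mode == "first":
        return listed + sorted(used - set(listed))
    raise ValueError(f"unknown GroupingColorOrder: {mode}")
```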

Element Splitter

When turned on, any split actions attached to nodes using mark-split() in the internal tree will be executed. This will be after running the grouper (if enabled), but before the finalize() or finalize-error() UPL method will be called.

Name

RunSplitter

Java symbol

kRunSplitterParamName

Type

Bool

Value

true | false

5. Sectioner [sectioner]

The Sectioner module is used for creating a nested, deeper structure based specifically on elements that have a heading level set (via set-heading-level() in UPL), and uci:part elements that have the grouping property set (via set-grouping() in UPL).

Sectioning works only on the direct children of the uci:body element in the upCast Internal DTD.

5.1. Handling of uci:part elements

If the algorithm finds a uci:part element, it checks its grouping property. If this uci:part is a grouping part, all uci:body element children between this uci:part element and the next uci:part element that has the grouping property set will be surrounded by this uci:part element.

Example 8.2. Example:

…
<part is-grouping="true"/>
<par>…</par>
<par>…</par>
<part is-grouping="false"/>
<par>…</par>
<part is-grouping="true"/>
<par>…</par>
…

will be transformed by a run of the Sectioner into

…
<part is-grouping="true">
    <par>…</par>
    <par>…</par>
    <part is-grouping="false"/>
    <par>…</par>
</part>
<part is-grouping="true">
    <par>…</par>
…

Note that namespace prefixes/definitions have been omitted in the above for better readability.
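
The grouping rule for uci:part elements can be sketched in Python, modelling the children of uci:body as a flat list of dictionaries. The data model, the function name, and the parameter standing in for the default grouping behaviour are assumptions made for this illustration.

```python
def group_parts(children, part_is_grouping_default=False):
    """Nest body children under the preceding grouping part (illustrative).

    children: flat list of dicts such as {"name": "part", "grouping": True}
    or {"name": "par"}. Parts without an explicit "grouping" key use the
    default, which stands in for a module-level setting (assumption).
    """
    result = []
    current = None  # the grouping part that is currently open, if any
    for node in children:
        is_grouping_part = (node["name"] == "part"
                            and node.get("grouping", part_is_grouping_default))
        if is_grouping_part:
            # A new grouping part closes the previous one and opens itself.
            current = {**node, "children": []}
            result.append(current)
        elif current is not None:
            current["children"].append(node)
        else:
            result.append(node)
    return result
```

Applied to the input of Example 8.2, this yields two grouping parts: the first contains two paragraphs, the non-grouping part and one more paragraph; the second contains the final paragraph.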


<part> is grouping (by default)

When checked, even though you may not have specified this explicitly on each uci:part element (e.g. in UPL), all uci:part elements are treated as if they had set the grouping property by default. This mimics the behavior of pre-6.0 versions of upCast.

Name

PartIsGrouping

Java symbol

kPartIsGroupingParamName

Type

Bool

Value

true | false

5.2. Handling of other elements

Any elements that have a uci:heading-level attribute with a value greater than 0 are considered headings of the respective structure level. The Sectioner creates sections based on the heading level information on those elements by automatically creating a surrounding uci:section element, taking care to match the section nesting to the element’s heading level. This means that if there is a jump in heading level, the Sectioner will automatically generate additional, grouping uci:section elements.

When an element with the same heading level is found as the current section nesting, the current section is closed and a new one is opened at the same level.

When an element with a higher heading level than the current one is encountered, a new, nested section is created within the current section.

When an element with a lower heading level than the current one is encountered, the appropriate number of open, nested sections is closed (including the one with the same nesting level) and a new one is opened.

Here’s an example demonstrating all possible cases (assume all elements and attributes are in the uci namespace):

Example 8.3. Example: section nesting based on paragraph’s heading level

<par>…</par>
<par heading-level="1">…</par>
<par>…</par>
<par heading-level="2">…</par>
<par>…</par>
<par heading-level="4">…</par>
<par>…</par>
<par heading-level="3">…</par>
<par>…</par>
<par heading-level="1">…</par>
<par>…</par>

will result in the following generated structure:

<par>…</par>
<section level="1">
  <par heading-level="1">…</par>
  <par>…</par>
  <section level="2">
    <par heading-level="2">…</par>
    <par>…</par>
    <section level="3">
      <section level="4">
        <par heading-level="4">…</par>
        <par>…</par>
      </section>
    </section>
    <section level="3">
      <par heading-level="3">…</par>
      <par>…</par>
    </section>
  </section>
</section>
<section level="1">
  <par heading-level="1">…</par>
  <par>…</par>
</section>

Note that namespace prefixes/definitions have been omitted in the above for better readability.
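
The nesting rules described above can be expressed with a stack of open sections. The following Python sketch is a simplified model, not the Sectioner's code: the input is a flat list of (label, heading level) pairs, sections are plain dictionaries, and the special treatment of empty headings is left out.

```python
def build_sections(children):
    """Nest a flat list of (label, heading_level) pairs into sections.

    heading_level 0 means the element is not a heading. Sections are
    dicts {"level": n, "content": [...]} (an assumed model).
    """
    root = {"level": 0, "content": []}
    stack = [root]  # currently open sections, outermost first
    for label, level in children:
        if level > 0:
            # Close every open section at the same or a deeper level.
            while stack[-1]["level"] >= level:
                stack.pop()
            # Open missing intermediate sections, then the heading's own one.
            for new_level in range(stack[-1]["level"] + 1, level + 1):
                section = {"level": new_level, "content": []}
                stack[-1]["content"].append(section)
                stack.append(section)
        stack[-1]["content"].append(label)
    return root["content"]
```

Fed with the eleven paragraphs of Example 8.3, the sketch produces the same nesting as shown there, including the additional level 3 section generated around the level 4 heading.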


The sectioning algorithm can be modified by two options:

Create <section> for empty headings

The default sectioning algorithm only creates a new section for the first of consecutive elements having a uci:heading-level attribute of the same value (if it is not empty).

The idea behind this option is that the user may have created a heading in Word, then hit return (not changing the style) to create visual space, and only then started writing the actual content. You certainly would not want to have a section on its own for each of the visual space generating empty heading-styled paragraphs, but only for the first one, so section nesting generation is suppressed for the remaining heading-styled paragraphs.

If, however, you want to create section nesting corresponding to each heading-styled paragraph in a document, even if it’s empty, check this option.

Name

GroupEmptyHeadings

Java symbol

kGroupEmptyHeadingsParamName

Type

Bool

Value

true | false

Create <sectionintro> around leading section content

Sections created using the sectioning algorithm may have leading content before any subsections they may also have. Checking this option groups this leading content, up to the start of the first nested (sub-)section, in an uci:sectionintro element, e.g. for easier post-processing later with XSLT.

You can choose whether you want the uci:sectionintro element to be created in any case (always) or only when the respective uci:section actually has sub-sections (when sub-sections exist).

In this example, assume all elements and attributes being in the uci namespace:

Example 8.4. Grouping the section introduction

<par heading-level="1">…</par>
<par>…</par>
<table>…</table>
<par>…</par>
<par heading-level="2">…</par>
<par>…</par>

will be transformed to the following when Create <sectionintro> around leading section content is checked with the always option:

<section level="1">
  <sectionintro>
    <par heading-level="1">…</par>
    <par>…</par>
    <table>…</table>
    <par>…</par>
  </sectionintro>
  <section level="2">
    <sectionintro>
      <par heading-level="2">…</par>
      <par>…</par>
    </sectionintro>
  </section>
</section>

or it will be transformed to the following when Create <sectionintro> around leading section content is checked with the when sub-sections exist option:

<section level="1">
  <sectionintro>
    <par heading-level="1">…</par>
    <par>…</par>
    <table>…</table>
    <par>…</par>
  </sectionintro>
  <section level="2">
    <par heading-level="2">…</par>
    <par>…</par>
  </section>
</section>

Note that namespace prefixes/definitions have been omitted in the above for better readability.


Name

GroupSectionIntro

Java symbol

kGroupSectionIntroParamName

Type

String

Value

never | always | child
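
The difference between the three GroupSectionIntro values can be sketched as follows. This is a hedged illustration in Python, not the module's code: the function group_intro and the dict-based section representation are assumptions, as is the decision not to create an empty intro when a section has no leading content.

```python
def group_intro(section, mode="child"):
    """Return a copy of `section` with leading content wrapped in an intro.

    `section` is a dict {"level": int, "children": [...]}, where each child
    is either a content label (str) or a nested section (dict).
    `mode` mirrors the GroupSectionIntro values: never | always | child.
    """
    children = section["children"]
    # Index of the first nested section, or len(children) if there is none.
    first_sub = next(
        (i for i, c in enumerate(children) if isinstance(c, dict)),
        len(children),
    )
    leading = children[:first_sub]
    rest = [group_intro(c, mode) if isinstance(c, dict) else c
            for c in children[first_sub:]]
    has_subsections = first_sub < len(children)
    wrap = mode == "always" or (mode == "child" and has_subsections)
    if wrap and leading:  # assumption: no empty intro element is created
        return {"level": section["level"],
                "children": [{"intro": leading}] + rest}
    return {"level": section["level"], "children": leading + rest}
```

Applied to the structure of Example 8.4, the always mode wraps the content of both sections, while the child mode leaves the level-2 section untouched because it has no sub-sections.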

6. [DEPRECATED] Grouper [grouper]

Important

This module is deprecated and must no longer be used in new development of processing pipelines. It will be removed completely in a future version of upCast. Update any of your existing pipeline definitions as soon as possible by transitioning to the use of the functionally equivalent Grouper option of the UPL Tree Processor module.

The Grouper module performs a grouping that was specified earlier during a run of a UPL Tree Processor.

Grouping processing order

This parameter lets you set the order of the colors in which the grouping should be performed.

With all colors in alphabetical order, all colors that have been used for setting painters or tags are grouped in alphabetical order of the color name. The ordering is the same as the one the Java class java.util.TreeSet uses.

With colors in specified order, then all remaining:, you can specify a list of colors you want to be grouped in a given order before all others in the text field below, with any remaining ones being grouped in alphabetical order (see all colors in alphabetical order) afterwards.

With only these colors in specified order, you can specify the colors and their order of grouping in the text field below. No other colors will be grouped, even if respective painters or tags have been placed on nodes in the internal document tree.

After a run of the grouper, all painters, tags and paintings of nodes are removed from the internal tree. This means that any further grouper instances running after a grouper has already run will have no effect unless new painters and tags have been placed on nodes of the internal document tree, usually using the UPL processor.

Color order is specified by listing the colors in sequence, separated by whitespace. If a color name includes whitespace (which is deprecated), it must be enclosed in double quotes.

Name

GroupingColorOrder

Java symbol

kGroupingColorOrderParamName

Type

String

Value

alphabetic | only | first

Name

GroupingColors

Java symbol

kGroupingColorsParamName

Type

String

Value

ordered list of color names, separated by whitespace; to use color names that themselves contain whitespace, surround them with double quotes
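
How the two parameters interact can be sketched in Python. This is an illustration under assumptions: the function name grouping_order is hypothetical, the mapping of first and only to the two "specified order" options is inferred from the value names, and Python's default string sort stands in for the TreeSet ordering.

```python
import shlex


def grouping_order(mode, listed, used):
    """Return the colors to group, in processing order.

    mode   -- GroupingColorOrder value: alphabetic | first | only
    listed -- GroupingColors string (whitespace-separated, quotes allowed)
    used   -- set of color names actually present on painters or tags
    """
    # shlex.split honors double quotes, so '"dark red" blue' yields two
    # names. It also interprets single quotes and backslashes, which the
    # manual does not mention.
    explicit = shlex.split(listed)
    if mode == "alphabetic":
        return sorted(used)
    if mode == "only":
        return explicit
    if mode == "first":
        remaining = sorted(used - set(explicit))
        return explicit + remaining
    raise ValueError(f"unknown mode: {mode}")
```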

7. XML Importer [xmlimport]

This module imports any XML document into the internal tree variable, replacing any existing document. This is useful when you want to apply some of the specific UPL functions on it and need not rely on styling info (which is currently not imported/recognized and cannot be created within upCast).

Source File

This parameter lets you choose the source XML file to import.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

8. XML Exporter [xmlexport]

This module serves for serializing the internal tree to XML. It offers a choice for the table model to write (internal, HTML or CALS), debugging and pretty-printing options. It also offers choices for handling images in the document (separate for referenced/linked images and embedded images) and you can use a Unicode Translation Map.

General

Destination File

Choose the full filename into which the result should be written. You can use upCast’s variables for building the path.

Name

DestinationFile

Java symbol

kDestinationFileParamName

Type

String

Value

absolute path to desired result file

Output resolution

Specify the output resolution in dpi. This value is used for calculating device pixel values, e.g. in HTML tables’ cell widths or images’ sizes.

Name

OutputResolution

Java symbol

kOutputResolutionParamName

Type

Double

Value

1..9999
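
As a worked example of the arithmetic involved, the following Python sketch converts an absolute length to device pixels for a given resolution. The function name and the unit table are assumptions made for illustration; how upCast rounds the result is not specified here.

```python
def length_to_pixels(value, unit, dpi):
    """Convert an absolute length to device pixels at the given resolution."""
    units_per_inch = {"in": 1.0, "pt": 72.0, "cm": 2.54, "mm": 25.4}
    inches = value / units_per_inch[unit]
    return inches * dpi
```

For instance, a cell width of 72pt corresponds to one inch, which is 96 pixels at 96 dpi and 300 pixels at 300 dpi.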

Output file encoding

Lets you specify the encoding in which the XML file will be written. If your further tool chain allows it, we strongly recommend using the default, UTF-8.

Name

OutputEncoding

Java symbol

kOutputEncodingParamName

Type

String

Value

Java encoding name

Include generator info as comment

When checked, adds info about when and by which version of upCast the XML file was produced to that file as an XML comment. This may be useful both for infinity-loop support during troubleshooting and for you, when you need to relate some produced XML files to a certain version (in time) of your pipelines.

Name

IncludeGeneratorInfo

Java symbol

kIncludeGeneratorInfoParamName

Type

Bool

Value

true | false

Mark individual text() nodes using ‘[#[…]’ (for debugging only)

When checked, each single text node in the internal tree will be surrounded by “[#[” and “]” respectively. This allows you to better understand the internal tree and may help in diagnosing problems with any XPath queries or XSLT transformations by showing text node boundaries explicitly.

Name

MarkTextNodes

Java symbol

kMarkTextNodesParamName

Type

Bool

Value

true | false
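
The effect of this option can be imitated outside upCast with a few lines of Python. This is only a sketch using the standard xml.dom.minidom module; the function name is hypothetical, and the marker strings are taken from the description above.

```python
from xml.dom import minidom


def mark_text_nodes(node, opening="[#[", closing="]"):
    """Wrap the data of every text node below `node` in visible markers."""
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            child.data = f"{opening}{child.data}{closing}"
        else:
            mark_text_nodes(child, opening, closing)


doc = minidom.parseString("<par>Hello <b>bold</b> world</par>")
mark_text_nodes(doc)
marked = doc.documentElement.toxml()
```

Each text node boundary becomes visible in the serialized result, which helps when an XPath expression such as text()[1] selects less text than expected.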

Pretty-print output

Turns on pretty-printing the output for elements whose whitespace handling mode is known explicitly to the serializer.

Name

PrettyPrint

Java symbol

kPrettyPrintParamName

Type

Bool

Value

true | false

Table model

This parameter lets you choose which table model to use for tables. You can either choose the native (upCast) table model, which is a very simple table > row > cell model, the HTML 4 table model, or the OASIS-EM (CALS) (OASIS XML Exchange Table Model, a subset of CALS) table model.

The HTML 4 table model uses the HTML namespace http://www.w3.org/HTML/1998/html4, the CALS table model uses the special, proprietary namespace http://www.infinity-loop.de/namespace/2006/upcast-cals.

Name

TableModel

Java symbol

kTableModelParamName

Type

String

Value

HTML | CALS | native

Style information

Lets you specify how general CSS styles for known elements and named styles for paragraphs and inline elements should be exported. Options are:

none No style info is exported at all. This does not affect local styles on elements, which will be written in any case according to the “Explode CSS style info” setting.

internal (<style> element) The style info is written as CSS code in the special element uci:style (in the upCast internal namespace) in the document’s uci:head element.

external (default file) Writes a stylesheet processing instruction to point to a CSS file named basename.css in the same folder as the resulting XML file. This file can e.g. be created using the CSS Exporter module.

custom stylesheet PI This lets you specify a custom stylesheet processing instruction, e.g. to link to a general CSS file you wish to use in all of the converted documents.

Name

StylesheetMode

Java symbol

kStylesheetModeParamName

Type

String

Value

none | internal | external | custom

Name

CustomStylesheetPI

Java symbol

kCustomStylesheetPIParamName

Type

String

Value

custom stylesheet PI string

Images

During import, e.g. using the RTF importer, all references to images are made absolute and stored this way in the internal tree as follows:

Embedded images are written to disk into a temporary location and possibly a format conversion is applied. The internal tree at this point holds the absolute path to these temporary image files.

Linked (or referenced) images are stored with their absolute path to the original image; no matching files for linked images are created in the temporary image files location.

At export time, you can decide how the image location information (and possibly the actual image files) should be handled. The handling mode can be set individually for images that were embedded in the original document and (external) images that were only linked to.

Embedded Images

This parameter governs the handling of images that originally had been embedded in the source document.

remove image from Destination File The uci:image element is completely dropped from the XML output.

copy to Image DestinationFolder (new file) This option copies the temporary image file to the Image Destination Folder. If a file of the desired name already exists at that location, a unique file name is generated by appending -1, -2 etc. to the basename until a name is found that is not already used. A relative reference to this copy is then used in the uci:image element.

copy to Image DestinationFolder (replacing) This option copies the temporary image file to the Image Destination Folder. If a file of the desired name already exists at that location, it is overwritten without prompting. A relative reference to this copy is then used in the uci:image element.

internal tree format (don’t copy) This option writes the absolute path to the temporary file unchanged, as it is currently set in the internal tree. This is useful for checking what the internal tree looks like at a certain point in a module chain.

Important

The temporary image files will be deleted automatically after a pipeline execution for a certain document. This means that when using the internal tree format (don’t copy) option, the referenced image in the generated XML will have been deleted!

Name

EmbeddedImagesHandling

Java symbol

kEmbeddedImagesHandlingParamName

Type

String

Value

discard | copy | copyreplace | internal
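
The unique-name generation used by the copy option can be sketched like this. The Python function below is an assumption-laden illustration: unique_name is a hypothetical name, the set of existing names stands in for a directory listing, and placing the counter before the file extension is inferred from "appending -1, -2 etc. to the basename".

```python
from pathlib import PurePosixPath


def unique_name(desired, existing):
    """Return `desired`, or a -1, -2, ... variant not contained in `existing`."""
    if desired not in existing:
        return desired
    path = PurePosixPath(desired)
    counter = 1
    while True:
        # Insert the counter between the basename and the extension.
        candidate = f"{path.stem}-{counter}{path.suffix}"
        if candidate not in existing:
            return candidate
        counter += 1
```

With the copyreplace value, no such search takes place and the existing file is simply overwritten.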

Referenced images

This parameter governs the handling of linked (referenced) images in the original source file.

remove image from Destination File The uci:image element is completely dropped from the XML output.

copy original to Image DestinationFolder (new file), update link This option copies the original, linked-to image file to the Image Destination Folder. If a file of the desired name already exists at that location, a unique file name is generated by appending -1, -2 etc. to the basename until a name is found that is not already used. A relative reference to this copy is then used in the uci:image element.

If the original file is not accessible from the machine that executes the pipeline (be it due to network failure, the file not existing or some other problem), the option update link for Destination File location is used as fallback instead.

copy original to Image DestinationFolder (replacing), update link This option copies the original, linked-to image file to the Image Destination Folder. If a file of the desired name already exists at that location, it is overwritten without prompting. A relative reference to this copy is then used in the uci:image element.

If the original file is not accessible from the machine that executes the pipeline (be it due to network failure, the file not existing or some other problem), the option update link for Destination File location is used as fallback instead.

keep verbatim link to original This option writes the reference the same way it was found in the original source document. This therefore may be an absolute or relative path.

Note that if the original location specification in the source RTF file was relative, but the XML file is not saved into the same folder as where the source document is located, chances are that the link is broken.

update link for Destination File location This option updates the reference to the original image in a way that it still points to that very image, even when the destination of the XML file is in a different folder (when the original reference was relative).

Name

LinkedImagesHandling

Java symbol

kLinkedImagesHandlingParamName

Type

String

Value

discard | copy | copyreplace | keep | update
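
What the update value has to achieve can be illustrated with a short Python sketch. It assumes POSIX-style paths and a hypothetical function name update_link; it shows the idea of re-expressing the image location relative to the folder of the Destination File, not upCast's actual code.

```python
import posixpath


def update_link(image_path, destination_file):
    """Express an absolute image path relative to the destination file's folder."""
    destination_folder = posixpath.dirname(destination_file)
    return posixpath.relpath(image_path, start=destination_folder)
```

For an image at /docs/src/img/a.png and an XML file written to /docs/out/result.xml, the updated reference is ../src/img/a.png, whereas the keep value would write the reference exactly as found in the source document.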

Image Destination Folder

You can specify a separate folder dedicated for images. By default, this is set to ${module:DestinationFile#urlpath}, which evaluates to the same folder where the XML file is saved. However, if you want to put images into a separate folder, you can do this here. This is the folder where any of the above options that physically copy the image file will place the file. Any relative references to the image from within the XML file will be adjusted accordingly.

Name

ImageDestinationFolder

Java symbol

kImageDestinationFolderParamName

Type

String

Value

absolute path to folder

Filter

The nodes in the internal tree may have a very rich set of attributes attached, many of which have only been useful while processing the tree within upCast, e.g. with UPL. Serializing all those attributes may create huge files, where only a fraction of the info contained will be used down the further processing chain of the document. To reduce unnecessary memory consumption and processing time, the XML Exporter offers a way to set up a filter on the attributes serialized for each internal tree node. This is achieved by using a specially formed UPL program in conjunction with the dedicated filtering function filter-attrs().

Tip

This filter can be effectively used to reduce the set of CSS properties exploded into attributes to a minimal set that you are actually interested in for further processing, e.g. in an XSLT step.

Attribute Filter

This field holds the UPL program to perform the filtering.

As in the UPL Tree-Processor, you can define several UPL rules. The selector part determines for which kind of node (and possibly more complex conditions) the attribute filter applies. This lets you filter attributes differently on different elements.

The action part is applied when the selector matches. Although you can theoretically use the complete range of UPL functionality on such a node, many changes to the node will not be picked up by the serializer (except for changes in the node’s attributes), so we recommend against using this UPL program for anything other than filtering attributes.

It is important to understand what the context node supplied to the UPL program looks like:

  • The context node supplied to the UPL program is a temporary, newly created, artificial, single node. It lives by itself and has neither a parent, nor siblings, nor children. It is neither the node in the context of its later serialization nor the actual node of the internal tree to be serialized, but merely a lookalike of the former. This means that, among other things, you cannot query its context nodes with XPath using eval-xpath().

  • The context node does not hold synthesized style info, nor does it hold attached user values.

The filtering UPL code is not called for nodes of other DOM node types than Element.

Clicking the Insert defaults button inserts the current upCast default filter setup for new XML Exporter instances before any existing code in the Attribute Filter text field.

Name

SerializationFilter

Java symbol

kSerializationFilterParamName

Type

String

Value

Advanced

Unicode translation map

This field lets you enter a Unicode Translation Map. You can enter any mappings directly or include an externally created Unicode Translation Map file using the include realm: ${include(encoding="…"):file}.

Name

UnicodeTranslationMap

Java symbol

kUnicodeTranslationMapParamName

Type

String

Value

Unicode translation map code

CSS property unit map

Here, you can specify a mapping table that associates any CSS <length> property with a pair of {unit, precision}. When the module needs to write length or size information in form of CSS properties, it consults this list to determine which length unit to use at which precision. For a description of the format, see CSS property unit table.

You can enter any mappings directly or include an externally created CSS property unit table file using the include realm: ${include(encoding="…"):file}.

Name

CSSPropertyUnitMap

Java symbol

kCSSPropertyUnitMapParamName

Type

String

Value

CSS property unit map code

9. Commandline Processor [commandline]

This module serves for executing external system commands by way of the standard command-line interpreter available on the respective execution platform.

System command

The command to be executed by the underlying system’s command-line interpreter. You can use upCast variables for building the string.

For platform-independent, common file operations, upCast offers some internal “pseudo” commands:

upcast:delete-file filename*

Deletes all listed files.

upcast:copy-file source dest

Copies the file source to the new file dest.

upcast:move-file from to

Moves the file from to its new destination to. This is equivalent to the sequence of commands upcast:copy-file from to followed by upcast:delete-file from.

upcast:delete-recursively folder-or-file*

Recursively deletes all listed folders and/or files.

This command is potentially dangerous as it can lead to deleting a huge number of files when used carelessly! Please consider using upcast:delete-recursively-restricted instead.

upcast:delete-recursively-restricted deletionboundary folder-or-file*

Recursively deletes all listed folders and/or files that are equal or reside below the specified deletionboundary folder in the file system hierarchy.

This method is fail-fast, i.e. when a folder specified for deletion is not hierarchically at or below the deletion boundary, any further actions on it are skipped. This prevents the case where you specify a folder that is an ancestor of deletionboundary and the complete contents of deletionboundary are deleted along with it. In other words: the specified root path of a recursive deletion operation must itself satisfy the deletion boundary restriction to be considered any further.

Example 8.5. 

upcast:delete-recursively-restricted "/user/iloop/temp/" "/user/iloop/temp/test.txt"

deletes the file /user/iloop/temp/test.txt because it is a descendant of the deletion boundary folder /user/iloop/temp/.

upcast:delete-recursively-restricted "/user/iloop/temp/" "/user/iloop/"

deletes nothing because the folder /user/iloop/ is not a descendant of the deletion boundary folder /user/iloop/temp/.
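
The boundary check described above can be sketched in Python. The function names are hypothetical and the comparison is purely lexical: a real implementation would additionally have to deal with symbolic links and ".." segments, which this sketch ignores.

```python
from pathlib import PurePosixPath


def within_boundary(boundary, target):
    """True if `target` equals `boundary` or lies below it in the hierarchy."""
    boundary_path = PurePosixPath(boundary)
    target_path = PurePosixPath(target)
    return target_path == boundary_path or boundary_path in target_path.parents


def deletable_roots(boundary, targets):
    """Keep only the roots that may be deleted; all others are skipped."""
    return [t for t in targets if within_boundary(boundary, t)]
```

The check is applied to each listed root path before any deletion starts, which is what makes the operation fail-fast for paths outside the boundary.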


Name

Commandline

Java symbol

kCommandlineParamName

Type

String

Value

commandline to execute, either as String or (in UPL or Java API) as List

Wait for completion

When checked, the command is executed synchronously, i.e. upCast waits until the external command has completed before continuing execution.

Important

Checking for errors occurring during external command execution can only be performed when this option is on. upCast considers any return value other than 0 (zero) an error.

Name

WaitForCompletion

Java symbol

kWaitForCompletionParamName

Type

Bool

Value

true | false
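
The difference between synchronous and asynchronous execution can be illustrated with Python's standard subprocess module. This is a sketch of the general technique only; run_command is a hypothetical helper, not part of upCast.

```python
import subprocess
import sys


def run_command(argv, wait_for_completion=True):
    """Run an external command; report failure only when waiting for it."""
    if wait_for_completion:
        completed = subprocess.run(argv)
        # Any exit status other than 0 counts as an error.
        return completed.returncode == 0
    # Fire and forget: the exit status is never inspected.
    subprocess.Popen(argv)
    return None


ok = run_command([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_command([sys.executable, "-c", "raise SystemExit(3)"])
```

In the asynchronous case there is no return value to check, which is why error detection requires waiting for completion.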

Example 8.6. 

To create a new directory images in the folder specified by the pipeline variable DestinationFolder on a Unix system, you would use the following command-line:

mkdir "${pipeline:DestinationFolder#localpath}/images"

Note the quotes around the parameter to accommodate path names that contain e.g. space characters.


10. XSLT Processor [xslt]

This module lets you apply an XSLT transformation to some external file (which might be the result of an earlier exporter module). You can choose between the Xalan XSLT processor from the Apache Software Foundation (ASF; http://xml.apache.org/), Saxon 6.5.5 by Michael Kay, or Saxon-B (version 9) from Saxonica (http://www.saxonica.com).

Source File

Specify the file the transformation should be applied to, most probably an XML file. You can use all upCast variables for dynamically creating the full path to the file.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

XSL Transformation File(s)

Specify the XSLT transformation (“XSLT file”) to apply.

You can specify several transformations by giving the full path to each XSLT file on a line of its own, i.e. the paths must be separated by newline characters. The transformations are chained: the original source file is processed using the first XSLT file specified, the result is processed by the second, and so on. Note, however, that all transformations share the same XSLT parameters.

Name

Stylesheet

Java symbol

kStylesheetParamName

Type

String

Value

path to stylesheet
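
The chaining of several stylesheets amounts to a left fold over the listed paths. The following Python sketch shows the idea; chain_transformations is a hypothetical name, and the transform callable is a stand-in for whatever actually applies a single XSLT file to a document.

```python
from functools import reduce


def chain_transformations(stylesheet_field, source, transform):
    """Apply the stylesheets listed one per line, each to the previous result.

    transform(stylesheet_path, document) must return the transformed document.
    """
    paths = [line.strip() for line in stylesheet_field.splitlines() if line.strip()]
    return reduce(lambda doc, path: transform(path, doc), paths, source)
```

Because every step receives the same transform callable, any parameters bound to it are shared by all stylesheets in the chain.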

XSLT parameters

Lets you specify parameters to be passed to the transformation. A parameter definition must follow this syntax:

paramname ':=' '"' value '"';

Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.

You may use upCast’s variable system for constructing parameter values.

Important

The text in this field and any contained variable references are resolved as follows:

  1. Any references to the include realm are resolved.

  2. The individual assignments are parsed according to the above syntax.

  3. Only now, any further references to realms other than include are resolved, separately for the value part of each parameter assignment.

This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several XSLT Processor modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
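
The three resolution steps can be sketched in Python as follows. This is an illustration under assumptions, not the module's parser: the variable reference syntax is simplified to ${realm:name} (no options or modifiers), lookup is a hypothetical callback returning a variable's value, and the function names are invented for this example.

```python
import re

# name := "value";  with backslash-escaped quotes allowed inside the value
ASSIGNMENT = re.compile(r'\s*([^\s:=]+)\s*:=\s*"((?:\\.|[^"\\])*)"\s*;')
# Simplified variable reference: ${realm:name}
REFERENCE = re.compile(r'\$\{(\w+):([^}]*)\}')


def resolve(text, lookup, only_include):
    """Replace references, restricted to or excluding the include realm."""
    def substitute(match):
        realm, name = match.group(1), match.group(2)
        if (realm == "include") != only_include:
            return match.group(0)  # leave for the other pass
        return lookup(realm, name)
    return REFERENCE.sub(substitute, text)


def parse_parameters(field_text, lookup):
    # Step 1: resolve include references in the raw field text.
    text = resolve(field_text, lookup, only_include=True)
    params = {}
    # Step 2: parse the individual assignments.
    for name, raw in ASSIGNMENT.findall(text):
        value = re.sub(r'\\(.)', r'\1', raw)  # remove the escaping backslashes
        # Step 3: resolve the remaining references separately per value.
        params[name] = resolve(value, lookup, only_include=False)
    return params
```

Because step 3 runs after parsing, a variable whose value contains a double quote cannot break the assignment syntax.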

Name

StylesheetParameters

Java symbol

kStylesheetParametersParamName

Type

String

Value

(string in same format as in UI)

Result file

Specify where the transformation result should be written. You can use all upCast variables for dynamically creating the full path to the file.

Name

DestinationFile

Java symbol

kDestinationFileParamName

Type

String

Value

absolute path to desired result file

XSLT processor

Lets you choose between Xalan and Saxon 6.x or Saxon 9.x as the XSLT processor to use (if available).

Name

XSLTProcessor

Java symbol

kXSLTProcessorParamName

Type

String

Value

xalan | saxon6 | saxon

11. Unicode Translation Processor [unicodetranslator]

This module lets you apply a Unicode Translation Map to an already existing XML document. Additionally, by way of the Output encoding parameter, you can quickly change the character encoding used in an XML file.

Although the implementation tries to preserve the formatting of the original document, there is no guarantee that the result is syntactically identical to the input; structurally, however, it is equivalent.

The Unicode Translation Map rules are only applied to the XML document’s text and attribute nodes. Comments and PIs are left unchanged.

Source File

Specify the file the transformation should be applied to, which must be an XML file. You can use all upCast variables for dynamically creating the full path to the file.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

Unicode Translation Map

This field lets you enter a Unicode Translation Map. You can enter any mappings directly or include an externally created Unicode Translation Map file using the ${include(encoding:="…"):file} variable reference, which is automatically replaced by the contents of the specified file after reading it using the specified encoding.

When you leave this field completely empty, no Unicode translation is performed. You can use this if the only thing you want to do is changing the character encoding the XML file is in by specifying the desired Output encoding.

Name

UnicodeTranslationMap

Java symbol

kUnicodeTranslationMapParamName

Type

String

Value

Unicode translation map code

Destination file

Specify where the translation result should be written to. You can use all upCast variables for dynamically creating the full path to the file.

Name

DestinationFile

Java symbol

kDestinationFileParamName

Type

String

Value

absolute path to desired result file

XML version attribute

Specify the value of the version attribute on the XML declaration at the beginning of the result XML file.

If you leave this empty, no XML declaration will be written. The default value is “1.0”.

Note that this is a textual parameter only; specifying e.g. “1.1” does not modify the file written such that it is a valid XML 1.1 file.

Name

XMLVersion

Java symbol

kXMLVersionParamName

Type

String

Value

value to be written in the 'version' attribute of the XML declaration; when empty, XML declaration is suppressed

Output encoding

Lets you specify a name of a supported output file encoding, e.g. UTF-8 or iso-8859-1. This encoding is also specified in the encoding attribute on the XML declaration (if written, see XML Version parameter above).

Name

OutputEncoding

Java symbol

kOutputEncodingParamName

Type

String

Value

Java encoding name
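
How the XML version and Output encoding parameters combine into the XML declaration can be sketched in a few lines of Python. The function name is hypothetical, and the exact spelling of the declaration that upCast writes is an assumption.

```python
def xml_declaration(version="1.0", encoding="UTF-8"):
    """Return the XML declaration line, or '' when `version` is empty."""
    if not version:
        return ""  # an empty version suppresses the whole declaration
    return f'<?xml version="{version}" encoding="{encoding}"?>'
```

Note that with an empty version the encoding is not announced in the file at all, so consumers fall back to their own encoding detection.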

DOCTYPE declaration

This lets you add, override or remove an existing doctype declaration in the incoming document.

  • When this field is a single asterisk (“*”), the doctype declaration in the source document (if present) is passed through as-is.

  • When this field is empty (“”), any doctype declaration present in the source document is stripped from the output.

  • When this field contains any other data, that data is written verbatim to the output, replacing any possibly existing doctype declaration in the input document.

Name

DOCTYPEDeclaration

Java symbol

kDOCTYPEDeclarationParamName

Type

String

Value

literal value of full DOCTYPE declaration as String; when empty, DOCTYPE declaration is removed, when '*', DOCTYPE declaration is copied as in source
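
The three cases of the DOCTYPE declaration parameter map onto a simple decision function. The Python sketch below uses a hypothetical name and represents an absent declaration as None.

```python
def output_doctype(setting, source_doctype):
    """Decide which DOCTYPE declaration, if any, is written to the output."""
    if setting == "*":
        return source_doctype  # pass through; may be None when absent
    if setting == "":
        return None            # strip any declaration
    return setting             # write the given text verbatim
```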

12. XML Validator [validator]

This module serves for validating arbitrary XML documents. The module supports validation against an XML DTD, XML Schema and Relax NG.

Source File

Specify the XML file that should be validated. You can use all upCast variables for dynamically creating the full path to the file.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

Schema type

Specify the type of Schema you want to validate the file against:

XML DTD validate against an XML DTD; the document to be validated must have a valid DOCTYPE declaration

XML Schema validate against an XML Schema; the document must have the respective schema file location attributes

Relax NG validate against a Relax NG schema; you must specify the location and type of the Relax NG schema file using the specific parameters shown when this type is selected (see below)

Name

SchemaType

Java symbol

kSchemaTypeParamName

Type

String

Value

dtd | xmlschema | relaxng

Relax NG Schema file

(for Relax NG schema type only)

Specify the location of the Relax NG schema file to validate the Source File against.

Name

RelaxSchemaLocation

Java symbol

kRelaxSchemaLocationParamName

Type

String

Value

absolute file path

Relax NG Syntax

(for Relax NG schema type only)

Specify the syntax the Relax NG schema file is written in, either XML syntax or compact syntax.

Name

RelaxSyntax

Java symbol

kRelaxSyntaxParamName

Type

String

Value

xml | compact

13. CSS Exporter [css]

This module writes an external Cascading Style Sheets, level 2 (CSS2) file comprising all styles (paragraph styles and character styles) used in the current internal document, matching their visual appearance as closely as reasonably possible. The output also includes information on the page setup like paper size and margins.

The CSS2 file written may for example be referenced by a file created by the XML Exporter module.

Selector syntax

Lets you choose which CSS selector syntax should be used:

CSS1 (‘class’ shorthand) Writes selectors using the ‘class’ attribute shorthand: .classname { ... }

CSS2 Selectors Writes selectors according to CSS2 selector syntax rules: *[class=classname] { ... }

CSS1+CSS2 Writes both ways of expressing the selector so that tools understanding either can pick the one that they understand. First, the shorthand is written, followed by the full CSS2 selector.

Name

SelectorSyntax

Java symbol

kSelectorSyntaxParamName

Type

String

Value

css1 | css2 | all
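
The selector forms produced for the three SelectorSyntax values can be sketched as follows. The Python function is a hypothetical illustration; how the module combines the two forms in the written file (for example as one grouped rule or as two rules) is not specified here.

```python
def class_selectors(class_name, syntax="all"):
    """Return the selector strings for one named style."""
    shorthand = f".{class_name}"          # CSS1 'class' shorthand
    attribute = f"*[class={class_name}]"  # CSS2 attribute selector
    if syntax == "css1":
        return [shorthand]
    if syntax == "css2":
        return [attribute]
    if syntax == "all":
        return [shorthand, attribute]  # shorthand first, then the CSS2 form
    raise ValueError(f"unknown selector syntax: {syntax}")
```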

upCast DTD elements namespace prefix

Specify the namespace prefix that the upCast DTD elements use in the final XML file that references the CSS file generated by this module.

The default is the empty string, i.e. no namespace prefix used.

Note

Setting this parameter is necessary until widespread support for the CSS Namespaces Module is available. Until then, element names are bound by their qualified name, including namespace prefix plus separating colon (if existent). To generate the qualified element name, the module must be told the namespace prefixes it should use.

Name

UpcastDTDNamespacePrefix

Java symbol

kUpcastDTDNamespacePrefixParamName

Type

String

Value

prefix for elements in upCast DTD

HTML4 DTD elements namespace prefix

Specify the namespace prefix that the HTML4 elements use in the final XML file that references the CSS file generated by this module. HTML elements are e.g. used for tables (if you opted for the HTML table model).

The default is html.

Name

HTML4DTDNamespacePrefix

Java symbol

kHTML4DTDNamespacePrefixParamName

Type

String

Value

the desired namespace prefix

Output file

Specify where the CSS file should be written. You can use all upCast variables for dynamically creating the full path to the file.

Name

DestinationFile

Java symbol

kDestinationFileParamName

Type

String

Value

absolute path to desired result file

Output encoding

Lets you specify a name of a supported output file encoding, e.g. UTF-8 or iso-8859-1. This encoding is also specified in the @charset rule at the very beginning of the CSS file.

Name

OutputEncoding

Java symbol

kOutputEncodingParamName

Type

String

Value

Java encoding name

14. RTF Exporter (“downCast”) [rtfexport]

Note

This module requires an appropriate RTF Exporter feature included in your license to be fully functional.

The RTF Exporter was formerly a separate product called “downCast”. This module is a much improved version of downCast 1.x, especially with respect to performance (up to 300% faster).

This module converts XML documents to Word or, more precisely, RTF documents. For specifying the layout, the module relies on a subset of Cascading Style Sheets, level 2 (CSS2) properties, amended by several proprietary properties where needed. Input XML documents must either be valid against the upCast DTD (note that this is different from the upCast internal DTD!), or they can be any arbitrary XML language for which a transformation into the upCast DTD can (and needs to) be created.

For more details on supported CSS and custom properties and their semantics, see the separate RTF Exporter documentation.

Source File

Specify the XML file that should be converted to RTF. You can use all upCast variables for dynamically creating the full path to the file. This must be an XML file conforming to the upCast DTD or – in experimental status – XSL-FO.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

Destination file

Specify where the RTF result should be written to. You can use all upCast variables for dynamically creating the full path to the file.

When running on Windows and having WordLink installed and functional, by specifying the destination file extension as .doc, you can have the module automatically convert the generated RTF file into a Word binary file.

Name

DestinationFile

Java symbol

kDestinationFileParamName

Type

String

Value

absolute path to desired result file

Source format

Specify the format the source file is in, either upCast DTD or XSL-FO.

Name

SourceFormat

Java symbol

kSourceFormatParamName

Type

String

Value

upcast | xslfo

Output resolution

When the RTF exporter must include images that do not specify their resolution explicitly in the file, the application uses the value that you specify here to calculate image size and resulting scaling factor to apply in the RTF output.

The default value is 96 dpi.
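As a rough illustration of how such a resolution value can turn pixel dimensions into a physical size, here is a minimal Python sketch. It is not the exporter's actual code: the function names are made up for this example, and the only external fact used is that RTF measures lengths in twips (1/1440 inch).

```python
TWIPS_PER_INCH = 1440  # RTF lengths are expressed in twips (1/1440 inch)


def image_size_twips(width_px, height_px, dpi=96.0):
    """Physical size of an image that carries no resolution info of its own.

    dpi plays the role of the OutputResolution parameter (hypothetical mapping).
    """
    if not 1 <= dpi <= 9999:
        raise ValueError("resolution must be in the range 1..9999")
    return (round(width_px / dpi * TWIPS_PER_INCH),
            round(height_px / dpi * TWIPS_PER_INCH))


def scale_percent(requested_twips, natural_twips):
    """Scaling factor in percent needed to reach a requested size."""
    return round(100 * requested_twips / natural_twips)
```

With the default of 96 dpi, a 192 pixel wide image has a natural width of 2 inches; requesting a width of 1 inch then corresponds to a scaling factor of 50 percent.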

Name

OutputResolution

Java symbol

kOutputResolutionParamName

Type

Double

Value

1..9999

Missing/unsupported images

Specifies what the RTF Exporter should do when it encounters images in a format that it cannot handle or that are not supported in RTF, or when an image file to embed into a document is missing.

discard

The image is completely removed from the result.

show filename only as inline text

Error text indicating the file name of the missing image is embedded into the output document, prominently visible to the user.

show full path as inline text

Error text indicating the full, absolute path and file name of the missing image is embedded into the output document, prominently visible to the user.

show detailed error message as inline text

Error text indicating the full, absolute path and file name of the missing or unsupported image, including further error details, is embedded into the output document, prominently visible to the user.

replace with generic image

A generic replacement image is embedded into the final result document. It is scaled to the originally requested image size so that it does not break the layout of the document.

Name

ImageErrorHandling

Java symbol

kImageErrorHandlingParamName

Type

String

Value

discard | filename | filepath | details | image

User stylesheet

Here, you can specify a CSS stylesheet to use for the conversion instead of the stylesheet (possibly) specified in the XML source. You can use all upCast variables for dynamically creating the full path to that file.

Name

UserStylesheet

Java symbol

kUserStylesheetParamName

Type

String

Value

path to user stylesheet

Whitespace handler class

For experts only!

The RTF Exporter makes use of special code to handle whitespace characters in the input stream. This field lets you set a custom whitespace handler if this is required. A whitespace handler must be a Java class that implements the WhitespaceHandler interface. If you think you need to implement your own whitespace handler, please contact us directly at <support@infinity-loop.de> in advance.

The default value is ‘*’ (asterisk) which lets the implementation decide on the most appropriate whitespace handler for the input document and should not be changed for normal use.

The module provides three Whitespace Handlers for different situations. You request their explicit use by specifying their full, qualified class name in the Whitespace Handler class input field.

Important

Except for the NoopWhiteSpaceHandler, all are more or less experimental and we do not guarantee their correctness or usefulness.

de.infinityloop.downcast.rtflib.NoopWhiteSpaceHandler

This is the default handler for input documents valid according to the upCast DTD. All whitespace is significant in mixed-content elements. It is automatically used when you specify ‘*’ and the Source format is upCast DTD.

de.infinityloop.downcast.rtflib.XSLFOWhiteSpaceHandler

This handler minimizes whitespace in mixed-content elements. It is automatically used when you specify ‘*’ and the Source format is XSL-FO. It tries to mimic the behavior required by XSL-FO when minimizing whitespace before and around inline elements. Whitespace is collapsed to the left.

de.infinityloop.downcast.rtflib.CSS3WhiteSpaceHandler

This handler behaves exactly like the XSLFOWhiteSpaceHandler, except that it respects the setting of the white-space CSS3 shorthand property or, more precisely, its all-space-treatment component when resolved to its constituent properties. When this has the value preserve, whitespace is preserved in that element, unless overridden in a child element. When this is collapse (the default), the handler behaves as described above. Note that you should explicitly specify the desired behavior on the immediate parent element of (possibly) mixed content.

ID rendering mode

For elements having an id attribute of type ID, you can specify if and how this information should be translated into RTF bookmarks.

don’t render

The ID information is not used and no bookmarks are created in the resulting RTF based on an id attribute.

before element only

A bookmark with the id’s value is created just before the start of the element’s contents.

after element only

A bookmark with the id’s value is created immediately after the full contents of the element have been written to the RTF file.

surround element

A bookmark with the id’s value is created that starts just before the start of the element’s contents and ends just after the full contents of the element have been written to RTF, i.e. the bookmark spans the contents of the element.
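The following Python sketch illustrates where the bookmark could end up relative to the element’s contents in each mode. It is an assumption-laden illustration, not the exporter’s real output: the function name is hypothetical, the mapping of the UI labels to the values ignore, before, after and surround is assumed from their names, and only the RTF control words \bkmkstart and \bkmkend are taken from the RTF specification.

```python
def render_with_bookmark(element_id, content_rtf, mode="surround"):
    """Place an RTF bookmark named element_id around or next to content_rtf.

    mode is assumed to correspond to the IDRenderMode values.
    """
    start = r"{\*\bkmkstart " + element_id + "}"
    end = r"{\*\bkmkend " + element_id + "}"
    if mode == "ignore":      # "don't render": no bookmark at all
        return content_rtf
    if mode == "before":      # empty bookmark just before the contents
        return start + end + content_rtf
    if mode == "after":       # empty bookmark right after the contents
        return content_rtf + start + end
    if mode == "surround":    # bookmark spans the contents
        return start + content_rtf + end
    raise ValueError("unknown mode: " + mode)
```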

Name

IDRenderMode

Java symbol

kIdRenderModeParamName

Type

String

Value

surround | ignore | before | after

Style name output format

Determines how style names are written to the RTF stylesheet destination. With Unicode, characters such as umlauts are generally expressed as Unicode characters; with normal (use document encoding), the document encoding is used wherever possible.

Name

StyleNameFormat

Java symbol

kStyleNameFormatParamName

Type

String

Value

unicode | normal

Table ‘frame’ attribute overrides cell border definitions

When checked, the frame attribute on table elements overrides any settings of cell borders that border on the outmost surrounding table border.

When not checked, a cell’s border CSS definition takes highest precedence in rendering.

Name

FrameOverridesCells

Java symbol

kFrameOverridesCellsParamName

Type

String

Value

true | false

15. External Pipeline Processor [extpipeline]

This module lets you execute another, external pipeline document as a sub-pipeline within the current pipeline execution. The execution can be conditional, i.e. based on arbitrary UPL code that finally must return either true or false.

Note

It is not possible to provide the external pipeline in the form of a Java Stream object; it must be an external file residing in the file system.

Source File

The path to the external pipeline document (.ucdoc) to include in the current pipeline.

Name

SourceFile

Java symbol

kSourceFileParamName

Type

Object

Value

absolute file path in URL or local file system convention

Pipeline variables

This lets you choose how pipeline variables for the included pipeline should be created:

Use independent variables in sub-pipeline

The included pipeline gets its own, initially empty set of pipeline parameters. Think of this as running that pipeline as a completely independent pipeline.

Copy variables to sub-pipeline

This creates a copy of the current pipeline variables and passes it on to the included pipeline, so all current pipeline variables are available there. When the included pipeline modifies any variables, this only affects itself, but not the calling pipeline. This way, it is possible to provide values (like “parameters”) to the included pipeline. When execution of the included pipeline finishes, the pipeline variables of the calling pipeline will be in exactly the same state as before running the included pipeline. Effectively, the included pipeline cannot have any side effects on the caller’s set of variables.

Share variables with sub-pipeline

In this mode, the included pipeline uses the same instance of pipeline variables as the caller. This means that the included pipeline receives and can modify the pipeline variables of the including pipeline. This way, it is possible to provide values (like “parameters”) to the included pipeline, and have the included pipeline “return values” by setting them in the pipeline variables.

The only exception to this rule is the pipeline:base variable, which is not inherited but set according to the included pipeline’s location on disk so that relative references therein are resolved properly. After the sub-pipeline’s execution, the original value is restored for the pipeline:base variable before continuing in the calling pipeline.
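The three modes can be pictured with plain dictionaries standing in for the pipeline variable pool. This is a minimal Python sketch under that assumption; the function names are invented for the illustration and the key "base" stands in for the pipeline:base variable.

```python
import copy


def make_sub_pool(caller_vars, mode):
    """Create the variable pool the sub-pipeline works on."""
    if mode == "separate":
        return {}                           # own, initially empty pool
    if mode == "copy":
        return copy.deepcopy(caller_vars)   # changes stay in the sub-pipeline
    if mode == "share":
        return caller_vars                  # same instance as the caller
    raise ValueError("unknown mode: " + mode)


def run_sub_pipeline(caller_vars, mode, sub_base, body):
    """Run body(pool) as the sub-pipeline and return the pool it used."""
    pool = make_sub_pool(caller_vars, mode)
    saved_base = caller_vars.get("base")
    pool["base"] = sub_base                 # never inherited from the caller
    try:
        body(pool)
    finally:
        if mode == "share":                 # restore the caller's base value
            if saved_base is None:
                pool.pop("base", None)
            else:
                pool["base"] = saved_base
    return pool
```

In copy mode the caller’s dictionary is untouched after the call; in share mode every assignment made by the sub-pipeline remains visible, except for the base entry, which is put back to its original value.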

For finer, per-parameter control over how a parameter behaves in sub-pipeline execution environments, each parameter supports the initialize-when property, with the values never, always, unset or an arbitrary <string-value>. The default value is unset. Here’s an outline of what happens with respect to pipeline parameters during a sub-pipeline call in all three of the cases above:

  1. The pipeline variables pool of the sub-pipeline to be called is initialized or created according to the above parameter.

  2. The pipeline:base variable is set to the appropriate value depending on the storage location of the sub-pipeline.

  3. Then, for each parameter defined in the sub-pipeline:

    1. If the parameter’s initialize-when value is unset (or the property is not defined) and the pipeline variable pool does not already contain a variable by that name:

      1. If the parameter is a persistent parameter, a new variable is created in the pipeline parameters with that parameter’s current value stored in the sub-pipeline document as value.

      2. Otherwise, if the parameter is not a persistent parameter, but has a default value defined, a new variable is created in the pipeline parameters with that parameter’s default value as its value.

      3. Otherwise, if it’s neither a persistent parameter nor has it a default value, that parameter is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.

    2. If the parameter’s initialize-when value is always:

      1. If the parameter is a persistent parameter, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s current value stored in the sub-pipeline document as value.

      2. Otherwise, if the parameter is not a persistent parameter, but has a default value defined, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s default value as its value.

      3. Otherwise, if it’s neither a persistent parameter nor has it a default value, that parameter is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.

    3. If the parameter’s initialize-when value is never, no further actions are taken.

    4. If the parameter’s initialize-when value is a string value and either the pipeline variable pool does not already contain a variable by that name or any existing variable by that name has the same string value as the string specified for the initialize-when value:

      1. If the parameter is a persistent parameter, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s current value stored in the sub-pipeline document as value.

      2. Otherwise, if the parameter is not a persistent parameter, but has a default value defined, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s default value as its value.

      3. Otherwise, if it’s neither a persistent parameter nor has it a default value, that parameter is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.

Example 8.7. 

By creating a parameter definition

copyright {
  type: text;
  label: “Copyright notice:”;
}

in the main pipeline, which calls a sub-pipeline with the definition

copyright {
  type: text;
  label: “Copyright notice:”;
  default: “(c) 2008 My Company”;
  initialize-when: “”;
}

the sub-pipeline will have set the copyright pipeline variable to the default value “(c) 2008 My Company” if the user did not provide a value in the text field for Copyright notice in the main pipeline. This lets you implement a default or fallback value mechanism for values that are used in a sub-pipeline if the value has not been set (or, to be exact, has been set to the empty string “”) in the calling pipeline.
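The initialization rules listed above, and the fallback behavior of Example 8.7, can be sketched in Python as follows. This is an illustration only: ParamDef and initialize_parameters are hypothetical names, the variable pool is modeled as a dictionary, and any initialize_when string other than the three keywords is treated as the <string-value> case (a simplification, since it cannot tell the keyword unset from a quoted string “unset”).

```python
class ParamDef:
    """Hypothetical stand-in for one parameter definition of a sub-pipeline."""

    def __init__(self, name, initialize_when="unset", persistent=False,
                 stored_value=None, default=None):
        self.name = name
        self.initialize_when = initialize_when
        self.persistent = persistent
        self.stored_value = stored_value    # value saved in the pipeline document
        self.default = default


def initialize_parameters(pool, params):
    """Apply the initialize-when rules of each parameter to the pool."""
    for p in params:
        when = p.initialize_when
        if when == "never":
            continue
        if when == "always":
            applies = True
        elif when == "unset":
            applies = p.name not in pool
        else:                               # <string-value> case
            applies = p.name not in pool or pool[p.name] == when
        if not applies:
            continue
        value = p.stored_value if p.persistent else p.default
        if value is not None:               # otherwise the variable stays undefined
            pool[p.name] = value
    return pool
```

Called with a pool in which copyright is the empty string and a parameter definition whose initialize_when is the empty string, the sketch replaces the value with the default, as in the example above.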


Name

PipelineRealmMode

Java symbol

kPipelineRealmModeParamName

Type

String

Value

separate | copy | share

Only run modules with ‘exported’ status

When checked, only modules that have the status exported set will be executed.

Tip

You can use this feature like this:

Develop your sub-pipeline on its own. For testing and debugging purposes, you will probably want to provide initial values (using an instance of the Pipeline Variables module) and debugging output within the pipeline using additional instances of the XML (Raw) Exporter modules. Now simply remove the exported status on these modules in the sub-pipeline and check the above option in the importing pipeline.

Effectively, this ensures that all debugging and setup code is only run when you run the sub-pipeline on its own (e.g. during development and isolated debugging), but does not run when the pipeline is included in any other pipelines. No further module activation/deactivation orgies to think of, all done automatically once set up as described – pretty neat, isn’t it?

Name

OnlyRunExportedModules

Java symbol

kOnlyRunExportedModulesParamName

Type

Bool

Value

true | false

Sub-pipeline Parameters

Parameters

Lets you specify parameters to be passed to the called sub-pipeline. This is especially useful when calling the sub-pipeline in Use independent variables in sub-pipeline or Copy variables to sub-pipeline mode. The parameters defined here are explicitly set in the pipeline realm of the sub-pipeline’s variables to the values specified here. This happens before any modules of the sub-pipeline run. Using this mechanism, it is possible to pass certain variable values to the sub-pipeline without having to share the pipeline variable pool with the calling pipeline. Note, however, that the resulting variable values cannot be passed from a sub-pipeline back to the calling pipeline this way.

A parameter definition must follow this syntax:

paramname ‘:=’ ‘”’ value ‘”’;

Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.

You may use upCast’s variable system for constructing parameter values.

Important

The text in this field and any contained variable references are resolved as follows:

  1. Any references to the include realm are resolved.

  2. The individual assignments are parsed according to the above syntax.

  3. Only now, any further references to realms other than include are resolved, separately for the value part of each parameter assignment.

This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several External Pipeline Processor modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
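The three resolution steps can be sketched in Python as shown below. Several things here are assumptions made for the illustration: the function names, the use of straight double quotes as value delimiters, the ${realm:name} reference form, and the representation of realms as nested dictionaries. Only the order of the steps and the backslash quoting of quotes are taken from the description above.

```python
import re

REF = re.compile(r"\$\{(\w+):([^}]+)\}")
ASSIGN = re.compile(r'\s*([\w.-]+)\s*:=\s*"((?:[^"\\]|\\.)*)"\s*;')


def resolve(text, realms, only_include):
    """Resolve either only include references or only all other references."""
    def repl(match):
        realm, name = match.group(1), match.group(2)
        if (realm == "include") != only_include:
            return match.group(0)           # leave for the other pass
        return realms[realm][name]
    return REF.sub(repl, text)


def parse_assignments(text, realms):
    text = resolve(text, realms, only_include=True)            # step 1
    result, pos = {}, 0
    while text[pos:].strip():
        match = ASSIGN.match(text, pos)                        # step 2
        if match is None:
            raise ValueError("syntax error near: " + text[pos:pos + 20])
        name = match.group(1)
        raw = match.group(2).replace('\\"', '"')               # unquote \"
        result[name] = resolve(raw, realms, only_include=False)  # step 3
        pos = match.end()
    return result
```

Because references to other realms are resolved only after parsing, a variable value that itself contains quotes or semicolons is inserted verbatim and cannot break the assignment syntax.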

Name

PipelineVariables

Java symbol

kPipelineVariablesParamName

Type

String

Value

(string in same syntax as in corresponding UI field)

Chapter 9. Parameter Sets

1. What parameter sets are

When working with pipelines, especially ones that are parameterized, it is often convenient to have different sets of parameter settings at hand to run the pipeline. For example, when you are converting documents in the DocBook DTD to your own, you may want to set different header info depending on whether it is a technical article or a medical article. The conversion itself, however, is the same for both. In this case, you’d set up the actual conversion pipeline only once (with the benefit that both document types automatically see any improvements in that pipeline), but have different sets of parameters for the text in the document header area. So what you would do is to create two parameter sets for that single pipeline document and store them in a document separate from the actual implementation logic. Depending on the document you need to convert, you’d just load the respective parameter set document and start the conversion, with all the parameters for that particular document type already set up correctly, so that you only have to specify the input file. Well – this is what parameter set documents are for! They separate actual parameter value storage from the pipeline implementation (where they are normally stored as part of the pipeline document).

2. What parameter sets contain

Parameter sets contain essentially only two types of data:

  • the current value of all Simple View parameters that have the persistent flag set to true

  • the Pipeline UID of the pipeline they are based on and refer to

That’s all. Parameter sets, in particular, do not contain any application or pipeline logic.

3. How parameter sets work

A parameter set is always derived from a single, specific pipeline document. The link to its implementing pipeline is established by way of the pipeline’s UID.

When loading a parameter set, what actually happens is that the pipeline it is based on is loaded automatically, and then the parameter values stored in the parameter set document are automatically set on that pipeline. To the user, it looks as if they had just opened a pipeline document in Simple View mode, then set the parameter values as stored in the parameter set. The only functional difference is that the user cannot actually edit the pipeline implementation, or in other words, cannot switch to edit mode. There is also a visual difference: parameter sets open in a window with a blue background, whereas real pipelines display in the default system background color.

When parameter values in a parameter set are edited, they can be saved back to the parameter set using the usual commands in the File menu: Save (to save in the same file, overwriting the old values) or Save As… to save the parameter set in a new file. You can also copy parameter set files on the operating system level and/or rename them.

Now, how do they find their pipeline implementation file when opened? This is done by (re-)purposing the well-known and established Catalog system. The only difference is that you do not resolve the PUBLIC identifier of some DTD or entity to an absolute file path, but the pipeline UID, which you can think of as the PUBLIC identifier of the pipeline document. This mechanism allows you to configure your system easily so that these Pipeline UIDs can be resolved to the actual, single implementation file from literally anywhere on your network: Just set up a single catalog for all your pipeline implementations and have your users add that file to upCast’s catalog system in the upCast Preferences, Catalog tab.

Example 9.1. Pipeline UID catalog example

A catalog might contain the following entries:

PUBLIC “d3546614-fb0e-4739-bfea-1f74280d9761” “file:///upcast/pipelines/docbook.ucdoc”
PUBLIC “ACME-XHTML-conversion-pipelineV1.1” “file:///upcast/pipelines/acme2html.ucdoc”

When you add this file to upCast’s catalog system, you can open a parameter set from anywhere on your local disk or even the LAN and have it automatically load and run the pipeline document it depends on.

In the first line, the Pipeline UID has been auto-generated by upCast and is using a standard UUID.

In the second line, a speaking UID has been chosen by the pipeline author, who of course must ensure that this ID will never be used by any other pipeline a potential user will want to run using a parameter set file.
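To make the lookup concrete, here is a small Python sketch that reads PUBLIC entries of the form shown in the example and maps a Pipeline UID to its URI. It is only an illustration: parse_catalog is a made-up name, straight double quotes are assumed as delimiters, and real catalog files may contain other entry types that this sketch simply ignores.

```python
import re

# One PUBLIC entry per line: PUBLIC "<pipeline UID>" "<URI of the .ucdoc file>"
ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]*)"\s+"([^"]*)"\s*$')


def parse_catalog(text):
    """Return a dict mapping Pipeline UIDs to pipeline document URIs."""
    mapping = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            mapping[match.group(1)] = match.group(2)
    return mapping
```

Looking up a UID in the resulting dictionary then corresponds to resolving a PUBLIC identifier through the catalog.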


4. Creating parameter sets

The first parameter set for a certain pipeline must be created by opening it in upCast, then doing a File > Save to Parameter Set… . You will be prompted for a file name to save the parameter set to. Parameter set files always have the extension ucpar (short for upCast parameter set). The pipeline document will be closed and the new parameter set file will be opened in its place.

From there on, you can create additional instances either by repeating the above, or simply by saving copies of an open parameter set.

Note

Only the values of parameters that have their persistent property set to true in the implementing pipeline document are saved. The decision on this property is up to the pipeline author. When opening a parameter set, you will still see all parameters defined in the original pipeline; the values of parameters that are not persistent will either be empty or filled with the default values the pipeline author has specified for those parameters.

5. Variable: ${pipeline:ParamBase}

Even when loading a parameter set, be aware that the pipeline variable reference ${pipeline:base} will resolve to the folder where the implementing pipeline document is located, not where the parameter set document lives.

If you want to specify e.g. file path parameters relative to the location of the parameter set, you can use the variable ${pipeline:ParamBase}. It is created automatically and holds the absolute path to the folder in which the respective parameter set resides on disk.

Note

Even for pipeline documents, ${pipeline:ParamBase} is always defined. In that case, it has the same value as ${pipeline:base}.

6. What happens when…

6.1. …the pipeline implementation’s number or type of parameters changes?

In this case, only the parameters that still have a counterpart in the pipeline are loaded from the parameter set, and the parameter set is automatically updated to the new parameter configuration. This is done on a best-effort basis. Values of incompatible parameters are discarded.

6.2. …I change a pipeline implementation while a depending parameter set is open?

When the changes do not affect the configuration of parameters, the pipeline implementation will be re-loaded automatically once you click the Run button. This will only work reliably when your file system delivers correct last modified date information for files.

When the changes also affect the configuration (number, type, text, defaults etc.) of pipeline parameters, the parameter set will detect this when re-loading the pipeline implementation and instruct you to close, then re-open the parameter set to have it pick up the changes.

6.3. …the Pipeline UID changes and parameter sets using the old id already exist?

Assuming you updated the respective catalog entry, the parameter set will no longer be able to resolve its id to the required pipeline implementation and therefore cannot be used any longer.

Also, when the pipeline document a catalog UID lookup resolves to does not actually match the requested UID, an error dialog will be shown and the parameter set cannot be used.

6.4. …there is no mapping in the catalog system for a certain parameter set UID?

In this case, the system will try to load the pipeline implementation from the system path additionally stored in the parameter set. This path holds the absolute path to the pipeline document at the time the File > Save to Parameter Set… command was run. When this file still exists and it is a pipeline document that has the requested Pipeline UID, then that pipeline implementation is loaded. Otherwise, an error is issued and the parameter set cannot be opened.
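The lookup order described in this section (catalog first, stored path as fallback, UID check in both cases) can be summarized in a short Python sketch. The function and parameter names are hypothetical; read_uid stands for whatever reads the Pipeline UID out of a pipeline document and is assumed to return None when the file does not exist or is not a pipeline document.

```python
def locate_pipeline(uid, catalog, stored_path, read_uid):
    """Return the location of the pipeline implementation for uid.

    catalog:     dict mapping Pipeline UIDs to locations
    stored_path: path saved in the parameter set at creation time, or None
    read_uid:    callable returning the UID of the document at a location,
                 or None if there is no usable pipeline document there
    """
    candidate = catalog.get(uid)
    if candidate is not None:
        if read_uid(candidate) != uid:
            raise LookupError("catalog entry points to a pipeline with a different UID")
        return candidate
    if stored_path is not None and read_uid(stored_path) == uid:
        return stored_path                  # fallback: path stored in the parameter set
    raise LookupError("no pipeline implementation found for " + uid)
```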

Chapter 10. Grouping using Painters

Grouping is the action of taking consecutive sibling nodes that meet certain conditions and making them children of a newly created surrounding element. These conditions are exposed to you by way of the unique painter concept.

1. The Painter concept

To understand the painter concept, you first of all need to be fully aware of the following, most important fact: Grouping is always performed on a flat, linear list of nodes. Huh? I thought we’re working on a document tree? Though this is of course true, grouping only occurs among sibling nodes, i.e. all direct children nodes of an individual element. Any element’s direct children can be expressed by an ordered, flat list. Of course, we recursively group on a child’s list of children, but this is a completely independent grouping operation. So again, a single, independent grouping operation is always performed on a flat, ordered list of nodes.

Now, for the following let’s think of nodes being white bricks placed in an ordered row on the floor. These bricks can be painted with one (or even several – think: spotty!) colors. The color indicates the element by which the bricks should be grouped.

The grouper does one very simple thing: It wraps all adjacent, likewise colored nodes in a parent element (think of this being some kind of bag) that has the same name as the color of the nodes it wraps.
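As a minimal sketch of this wrapping step, the following Python function groups a flat list of nodes by their color. It simplifies in two ways that are assumptions of this example: each node carries at most one color, and start/end tags play no role in deciding where a group ends. The name group_siblings and the tuple representation are made up for the illustration.

```python
from itertools import groupby


def group_siblings(nodes):
    """Wrap runs of adjacent, likewise colored nodes.

    nodes is a list of (node_name, color) pairs, with color None for
    unpainted nodes. Returns a list in which each painted run is replaced
    by a (color, [node_names]) pair; unpainted nodes stay as they are.
    """
    result = []
    for color, run in groupby(nodes, key=lambda node: node[1]):
        names = [name for name, _ in run]
        if color is None:
            result.extend(names)            # unpainted nodes are not wrapped
        else:
            result.append((color, names))   # the "bag" named after the color
    return result
```

Two green runs separated by an unpainted node end up in two separate groups, because only adjacent nodes are wrapped together.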

So the essential part to be done beforehand is to color the nodes in the desired way. This is a two-step process: First, you need to check the role of each node as far as grouping is concerned and assign it that role by placing a painter on it that knows how to go about painting for this specific role. Second, the painting is actually performed.

1.1. Tagging nodes and placing the painters

In this first step, consider yourself a paint-shop owner, making a work-plan for your painter employees. Equipped with a packet of self-adhesive post-it notes and a pencil, you start figuring out the work to be done at the first node in the list of sibling nodes. For now, you are just interested in determining which nodes should be collected into groups of the color green. You examine the node you are on. For example, you may look at some of its attributes or layout properties, or perform a more complex examination which may include evaluating a boolean XPath expression. After some pondering, you will come to a certain conclusion as to the role of the node you are currently standing on. This can be one of the following:

You know that this node will always start a group of the color you are currently considering (i.e. green). Therefore, you write “start green” on one of your post-its and tack that to the node.

You know that this node will always end (and therefore be the last one in) a group of the color you are currently considering (i.e. green). Therefore, you write “end green” on one of your post-its and tack that to the node.

Now it is time to think of which of your painter employees is best suited for the painting job. For this you have to evaluate the constellations that may happen in your document regarding the nodes that should be grouped.

For example, you may know that if you don’t find a node starting a group and a node ending the group, the grouping should not occur. In other words, the known start and end nodes (i.e. nodes that fulfill the requirements for being tagged as such) are required for a grouping to happen.

Other situations could be as follows: group from a start node to the next start node, group from an end node to the next end node, group adjacent likewise colored nodes, etc. For each of these situations, you have dedicated painters. To have them do their work in the next step, you place them on nodes.

Suppose in our example, we require a start and end node for a grouping to happen, and we have just tagged the current node as a start node. We therefore choose a start-end painter and place it on the current node.

When we have done both, tagged the node (if possible) and placed a painter (if we could determine a suitable one), we move on to the next node in the ordered sequence and start over.

Finally, we’ll reach the last node in the sibling node sequence and will have tagged some nodes and/or placed painters on some of the nodes. Now, all preparation work is done and we can tell the painters to do their work, i.e. start painting.

1.2. Painting the nodes

Now, consider yourself a painter, with a bucket of color of a certain kind (the color-“name” corresponds to the element name that should be the grouping element later). In the previous step, you have been placed on some node in the sequence.

Depending on your kind, you try to paint from your location.

In our example, you are a start-end painter. This means from the place you are at, you look in direction of the start of the sequence and look for the nearest node that has been tagged with a “start green” label. (This may be the node you are standing on.) If you find such a node, you remember it. If you do not find such a node, you cannot fulfill your task (which is “Paint from start node to end node”) and give up, not painting anything.

Next, you look into the direction of the end of the sequence and look for the nearest node tagged with an “end green” label. (This may, again, be the node you are standing on.) If you find that as well, you can fulfill your painting job and start painting all nodes from the start node you found to the end node you found (including both). Then, you are finished.

The above is repeated for all painters that have been placed on nodes in the current node sequence. After this has been finished, the complete sequence got painted in a way that the actual grouping can take place, based on the paint color information on each node and the start and end tagging.
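The walk of a single start-end painter as told in this section can be sketched in Python. The representation is an assumption of the example: tags is a list with one set per node, holding "start" and/or "end" for the color in question, and the function returns the indices of the nodes to paint. Stricter failure conditions of individual painter types are not modeled here.

```python
def paint_start_end(tags, position):
    """Paint from the nearest start tag (looking backwards from position)
    to the nearest end tag (looking forwards), both included.

    Returns the list of painted node indices, or [] if the painter gives up.
    """
    start = next((i for i in range(position, -1, -1) if "start" in tags[i]), None)
    if start is None:
        return []                           # no start node: nothing is painted
    end = next((i for i in range(position, len(tags)) if "end" in tags[i]), None)
    if end is None:
        return []                           # no end node: nothing is painted
    return list(range(start, end + 1))
```

Both searches begin at the painter’s own node, so a painter standing on the start-tagged or end-tagged node itself finds it immediately.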

2. Node Tags

For each color, a node can have either no tag, or it can be tagged as a start node, tagged as an end node, or tagged as both, start and end node for that respective color.

These tags can currently be set using the UPL functions mark-start() and mark-end().

3. Painter Types

The example in the introduction to the painter concept already mentioned the start-end painter type. Painters can be placed on a node using the UPL function set-painter().

Note that you can place an ordered list of painters for a single color on a node. The idea is to have fallback painters when the first one fails to paint because its requirements cannot be fulfilled (like e.g. for a start-end painter, when there’s either no start tag or end tag). In such a case, painting using the second-specified painter is tried. If that cannot paint as well due to unsatisfied requirements, the next painter is tried and so on until either a painter is able to paint, or the end of the list is reached, in which case no painting occurs.
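The fallback mechanism amounts to trying each painter in order until one succeeds. The Python sketch below shows this with two simplified painters; all function names and the list-of-sets representation of tags are assumptions of the example, and the painters ignore the stricter failure conditions of the real painter types.

```python
def start_here(tags, position):
    """Simplified painter: nearest preceding start tag up to the painter's node."""
    for i in range(position, -1, -1):
        if "start" in tags[i]:
            return list(range(i, position + 1))
    return []


def here_end(tags, position):
    """Simplified painter: painter's node up to the next end tag."""
    for i in range(position, len(tags)):
        if "end" in tags[i]:
            return list(range(position, i + 1))
    return []


def paint_with_fallback(painters, tags, position):
    """Try the painters in the given order; the first one able to paint wins."""
    for painter in painters:
        painted = painter(tags, position)
        if painted:
            return painted
    return []                               # end of list reached: no painting
```

For a sequence that contains a start tag but no end tag, the list [here_end, start_here] first fails with here_end and then paints using start_here.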

In the examples below for each painter, we use the following symbols:

Legend for example graphics

A description of all available painter types follows:

3.1. start-end

This painter will paint from the nearest start-tagged node of the node sequence (in direction to the start) to the nearest end-tagged node (in direction to the end), observing its own node.

There must not be an end-tagged node between the painter and the nearest start-tagged node, nor a start-tagged node between the painter and the nearest end-tagged node. In both of these cases, the painter will fail. The “-” in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between start and end.

Results of a start-end painter

3.2. start*end

This is the same as start-end, but it allows differently tagged nodes between the nearest start- and end-tagged nodes (see last two examples). The “*” in the name symbolizes a wildcard sequence of tagged nodes between start and end.

The painter will fail if either there’s no start-tagged node earlier in the node list or no end-tagged node later in the node list.

Results of start*end painter
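The difference between start-end and start*end can be modelled in a few lines of Java. This is only an illustrative sketch, not upCast code: the class name, the bit encoding of tags and the treatment of the painter's own node (it counts as a candidate start or end node, but never as an interrupting tag) are assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative model only; names and tag encoding are assumptions, not upCast API. */
final class PainterSketch {
    // Tag bits for a single color; a node tagged as both carries START | END.
    static final int NONE = 0, START = 1, END = 2;

    /**
     * Returns {from, to} (inclusive indices) of the range to paint,
     * or null if the painter fails.
     * strict == true models start-end, strict == false models start*end.
     */
    static int[] startEnd(int[] tags, int painter, boolean strict) {
        int from = -1;
        for (int i = painter; i >= 0; i--) {
            if ((tags[i] & START) != 0) { from = i; break; }
            // start-end: an end tag before the start tag is reached means failure
            if (strict && i != painter && (tags[i] & END) != 0) return null;
        }
        int to = -1;
        for (int i = painter; i < tags.length; i++) {
            if ((tags[i] & END) != 0) { to = i; break; }
            // start-end: a start tag before the end tag is reached means failure
            if (strict && i != painter && (tags[i] & START) != 0) return null;
        }
        return (from < 0 || to < 0) ? null : new int[] { from, to };
    }

    public static void main(String[] args) {
        int[] tags = { START, NONE, END, NONE, END };
        System.out.println(Arrays.toString(startEnd(tags, 1, true)));
        System.out.println(Arrays.toString(startEnd(tags, 3, true)));
        System.out.println(Arrays.toString(startEnd(tags, 3, false)));
    }
}
```

In this model, a painter placed on the fourth node fails as start-end because an end-tagged node lies between it and the start tag, whereas the same painter as start*end paints the whole sequence.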

3.3. start-here

This painter will paint from the last start-tagged node up to the one it was placed on.

There may be no end-tagged node between the painter and the nearest start-tagged node. If this is the case, the painter will fail. The “-” in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between start and painter position.

The painter will also fail if there is no start-tagged node earlier in the node list.

Results of start-here painter

3.4. start*here

This is the same as start-here, but it allows end-tagged nodes between it and the nearest preceding start-tagged node (see last two examples). The “*” in the name symbolizes a wildcard sequence of end-tagged nodes between start and painter node.

The painter will fail if there is no start-tagged node earlier in the node list.

Results of start*here painter

3.5. here-end

This painter will paint from the node it is placed on up to the next end-tagged node.

There may be no start-tagged node between the painter and the next end-tagged node. If this is the case, the painter will fail. The “-” in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between painter position and end.

The painter will fail if there is no end-tagged node later in the node list.

Results of here-end painter

3.6. here*end

This is the same as here-end, but it allows start-tagged nodes between it and the nearest following end-tagged node (see last example). The “*” in the name symbolizes a wildcard sequence of start-tagged nodes between painter node and next end-tagged node.

The painter will fail if there is no end-tagged node later in the node list.

Results of here*end painter

3.7. start-start

This painter paints from the nearest preceding start-tagged node to the next start-tagged node (not including the latter).

It fails if there is an end-tagged node in-between. It also fails if there is no preceding or following start-tagged node.

Results of start-start painter

3.8. start*start

This is the same as the start-start painter, except that end-tagged nodes between those marked with a start tag are allowed (see examples 7 and 8).

Results of start*start painter

3.9. end-end

This painter paints from the nearest preceding end-tagged node to the next end-tagged node (not including the former).

It fails if there is a start-tagged node in-between. It also fails if there is no preceding or following end-tagged node.

Results of end-end painter

3.10. end*end

This is the same as the end-end painter, except that start-tagged nodes between those marked with an end tag are allowed (see examples 2 and 4).

Results of end*end painter

3.11. this

This painter only colors the node it is placed on.

This painter never fails.

Results of this painter

4. Grouping algorithm

Grouping is performed on the whole document tree in a bottom-up document order. It is performed individually for each element’s children. It is also performed in a defined color order that you can specify, i.e. colors are always processed in a defined order.

Grouping does take into account node start and end tags. This is necessary in order to support directly adjacent groups. If grouping were based only on contiguous coloring, adjacent groups would not be possible, since the grouper would not know where to split contiguously colored nodes into groups. Here, tags keep their original roles: a start tag always starts a new group on its node, and an end tag ends the currently open group after its node.

The following sample graphics shows – for a single color – how grouping takes place in a specific painting/tagging situation:

Final grouping of a given node sequence with markers

Group 1 is delimited by the start tag of node #2.

Group 2 is delimited by the end tag of node #2.

Group 3 is delimited by the end tag of node #4.

Group 4 is delimited by the end tag of node #7.

Group 5 is delimited by the non-painted node #9.

Group 6 is delimited by the end of the node sequence.

When placing tags on nodes it is therefore important to always bear in mind that these tags will also govern the final grouping in situations where painted nodes are adjacent.
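The splitting rules described above can be sketched for a single color as follows. This is a simplified model and not the grouper's actual implementation: the class name and the tag encoding are assumptions, and the node sequence in the usage note is merely one configuration that is consistent with the six groups listed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; names and tag encoding are assumptions, not upCast API. */
final class GroupingSketch {
    // Tag bits for a single color; a node tagged as both carries START | END.
    static final int START = 1, END = 2;

    /** Returns the groups as {first, last} index pairs (inclusive). */
    static List<int[]> group(boolean[] painted, int[] tags) {
        List<int[]> groups = new ArrayList<>();
        int open = -1; // first index of the currently open group, -1 if none
        for (int i = 0; i < painted.length; i++) {
            if (!painted[i]) {
                // a non-painted node closes any open group
                if (open >= 0) { groups.add(new int[] { open, i - 1 }); open = -1; }
                continue;
            }
            if (open >= 0 && (tags[i] & START) != 0) {
                // a start tag closes the open group before this node
                groups.add(new int[] { open, i - 1 });
                open = -1;
            }
            if (open < 0) open = i;
            if ((tags[i] & END) != 0) {
                // an end tag closes the open group after this node
                groups.add(new int[] { open, i });
                open = -1;
            }
        }
        // the end of the node sequence closes the last open group
        if (open >= 0) groups.add(new int[] { open, painted.length - 1 });
        return groups;
    }
}
```

With eleven nodes that are all painted except node #9, a start and end tag on node #2 and end tags on nodes #4 and #7, this model yields six groups: #1, #2, #3 to #4, #5 to #7, #8, and #10 to #11.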

5. Examples

Here are some examples that you may encounter in one form or another in your own grouping requirements:

5.1. Grouping by paragraph class

Suppose you want to group adjacent paragraphs that are of class “Note”, because you want to group them using a note element.

The UPL code you should run before the grouper in the UPL processor should look like:

[element(uci:par) and @uci:class="Note"] {
  set-painter( note, {"this"} );
}

This will set a painter of color “note” and type this on all uci:par elements that are of class “Note”. During painting, those nodes will be painted with the specified color, and during grouping, each run of adjacent nodes painted in that color will be wrapped in a <uci:block uci:type="note">…</uci:block> element.

5.2. Grouping with a known start element

Suppose you want to group nodes where you know exactly which conditions must be met by a node to start a group, but you don’t know the end. What you additionally do know is which kind of nodes are certainly part of the group (if they exist).

Let’s say we have the following XML fragment of sibling nodes:

➊ <p>Some text.</p>
➋ <p class="example-title">Example</p>
➌ <p class="example-text">Fruits are:</p>
➍ <list>
  <item>apples</item>
  <item>bananas</item>
</list>
➎ <p class="example-text">All these can be bought at Miller’s.</p>
➏ <p>As you have seen,…</p>

In this example, you know that paragraphs of class “example-text” always are part of an example, and that an example is always started by a paragraph of class “example-title”. You do not know more, i.e. there may be arbitrary elements in-between like the list element in the example.

A suitable UPL code to group the elements #2 to #5 could be:

[element(p) and @class="example-title"] {
  mark-start( example );
  set-painter( example, {"this"} ); /* optional, see below */
}
[element(p) and @class="example-text"] {
  set-painter( example, {"start-here"} );
}

What does this do?

First, a start tag with color “example” is set on node #2, along with a painter that only colors itself. This is necessary when an example is allowed to consist of only an “example-title” paragraph. If you require an example to have at least one “example-text” paragraph to be valid, omit the line of code marked optional above.

Then, a painter of color “example” is placed on node #3 that paints from the nearest preceding start-tagged node of color “example” up to itself. On the list element (#4), no painter or tag is set. On node #5, we again set a painter of color “example” that paints from the nearest preceding start-tagged node of color “example” up to itself.

This happens during the run of the UPL program in the UPL Tree-Processor module.

Now, it’s the grouper’s turn, and it is about to perform the grouping for the color “example”. As we have seen above, the first thing it does is apply the painting through the painters. The painters execute in document order, one after the other, so you get the following sequence of painting and – finally – grouping:

First, painter P1 does its node painting. It is a this painter and therefore only paints the node it was placed on. Next comes painter P2 of type start-here. Finally, painter P3 starts painting. It is also of type start-here and therefore paints from the nearest preceding start tag up to the node it was placed on. Then the grouping G is created and nodes #2 to #5 are wrapped by a <uci:block uci:type="example">…</uci:block> element.

Note how the list node #4 is painted by painter P3 even though it has neither been tagged nor has a painter been placed on it. Instead of the list node, any number of nodes not known in advance could have been present between node #3 and #5, and they would have been automatically grouped into an “example”. This is a very important fact to both keep in mind and utilize to your advantage, for example in documents that have no strict, dependable structure but where you must work with only few known node constellations.

But what if…? Surely you have asked yourself, “But what if some badly authored document contains an ‘example-text’ paragraph without a preceding ‘example-title’ paragraph?” Here, the precise definition of the painter types comes into play.

Let’s assume node #2 is removed from the above example sequence. In this case, painter P2 would be the first painter to be executed. It is of type start-here, which fails if no suitable start-tagged node is found – which is the case here: there is no start-tagged node at or earlier in the node sequence. P2 fails, and a painter failing means it does not paint anything. The same is true for painter P3, with the effect that no node gets painted at all if node #2 (i.e., a start-tagged node) does not exist. Consequently, no grouping will occur.

Maybe that is not what you want. Maybe you want semantics like, “If a start-tagged node exists, then use that. If, however, it doesn’t, then at least make the individual ‘example-text’-paragraphs groups.” This is where the painter fallback types come in handy. For the above, you’d need to change the UPL code as follows:

[element(p) and @class="example-title"] {
  mark-start( example );
  set-painter( example, {"this"} ); /* optional, see below */
}
[element(p) and @class="example-text"] {
  set-painter( example, {"start-here", "this"} );
}

Note the added painter type this in the second rule. This has the effect that when the first painter type (start-here) fails, the next – this – is tried, which – as already described – only paints the node the painter was placed on. So if node #2 was missing in our example, with the new UPL code we’d make sure that at least the paragraphs of class “example-text” would get painted, and therefore grouped, either on their own as in our example or, if adjacent, as a whole.
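The fallback behaviour discussed in this example can be modelled in Java as follows. This is only a sketch under stated assumptions, not upCast code: the class name and tag encoding are invented for illustration, only the start-here and this painter types are modelled, and any other type is simply treated as failing.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model only; names and tag encoding are assumptions, not upCast API. */
final class FallbackSketch {
    // Tag bits for a single color.
    static final int START = 1, END = 2;

    /** Tries each painter type in list order; returns true as soon as one paints. */
    static boolean paint(boolean[] painted, int[] tags, int node, List<String> painters) {
        for (String type : painters) {
            if (type.equals("this")) {
                painted[node] = true; // never fails
                return true;
            }
            if (type.equals("start-here")) {
                int from = findStart(tags, node);
                if (from >= 0) {
                    Arrays.fill(painted, from, node + 1, true);
                    return true;
                }
            }
            // types not modelled in this sketch are treated as failing
        }
        return false; // end of list reached: no painting occurs
    }

    /** Index of the start-tagged node at or before the painter, or -1 if start-here fails. */
    static int findStart(int[] tags, int node) {
        for (int i = node; i >= 0; i--) {
            if ((tags[i] & START) != 0) return i;
            if (i != node && (tags[i] & END) != 0) return -1; // interrupted by an end tag
        }
        return -1; // no start-tagged node at or before the painter
    }

    public static void main(String[] args) {
        // Sequence without the "example-title" node: nodes #1, #3, #4, #5, #6.
        int[] tags = new int[5];
        boolean[] painted = new boolean[5];
        List<String> fallback = Arrays.asList("start-here", "this");
        paint(painted, tags, 1, fallback); // node #3
        paint(painted, tags, 3, fallback); // node #5
        System.out.println(Arrays.toString(painted));
    }
}
```

In the model, the list node between the two “example-text” paragraphs stays unpainted once the start-tagged node is missing, so each paragraph forms a group of its own.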

More real-world examples will be posted as supplemental material on our website in form of tutorials and how-tos in the following weeks and months.

Chapter 11. Commandline Interface

upCast offers a convenient helper class for running pipeline documents from the commandline. It also allows you to pass parameter values to the pipeline if they have been defined in the Pipeline Settings > Pipeline Parameters tab.

1. How it works

The commandline pipeline document interpreter class reads the specified pipeline document and looks for all defined pipeline parameters in it:

  • If a parameter has the property required set to true, a value for it must be specified on the commandline. If no value is specified, the execution is stopped and an appropriate error message is output to the console.

  • If a parameter does not have the required property set to true, and if no value for it is specified in the commandline call, and if it has its default property specified, that specified value is set.

  • If a parameter does not have the required property set to true, and if its default property has not been specified, and if no value for it is specified in the commandline call, then this parameter will not be set at all. Trying to retrieve the value for such a parameter during pipeline execution will result in an error to the effect that the requested parameter or pipeline variable is undefined.

  • If a parameter is specified on the commandline that is not defined as a parameter in the pipeline, an error is issued to the console and execution is halted.

After these checks, the parameter values that are defined will be set as variables in the pipeline realm (as is the case when running the pipeline in Simple View mode), and then the modules will be executed in the order defined in the pipeline document.
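The four checks above can be summarized in a short Java sketch. It is a model of the described behaviour only: the ParamDef class, the resolve() method and the parameter names used with it are hypothetical and are not part of the upCast API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of the checks described above; not upCast code. */
final class ParamResolutionSketch {

    /** Hypothetical stand-in for one pipeline parameter definition. */
    static final class ParamDef {
        final String name;
        final boolean required;
        final String defaultValue; // null means "no default property specified"

        ParamDef(String name, boolean required, String defaultValue) {
            this.name = name;
            this.required = required;
            this.defaultValue = defaultValue;
        }
    }

    /** Returns the values that would be set as variables in the pipeline realm. */
    static Map<String, String> resolve(List<ParamDef> defs, Map<String, String> supplied) {
        List<String> known = new ArrayList<>();
        for (ParamDef def : defs) known.add(def.name);
        for (String name : supplied.keySet()) {
            if (!known.contains(name)) {
                // parameter given on the commandline but not defined in the pipeline
                throw new IllegalArgumentException("not a pipeline parameter: " + name);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (ParamDef def : defs) {
            if (supplied.containsKey(def.name)) {
                result.put(def.name, supplied.get(def.name));
            } else if (def.required) {
                throw new IllegalArgumentException("missing required parameter: " + def.name);
            } else if (def.defaultValue != null) {
                result.put(def.name, def.defaultValue);
            }
            // otherwise the parameter is not set at all and stays undefined
        }
        return result;
    }
}
```

Both error cases are modelled as exceptions here. The commandline interface itself reports them on the console and stops execution.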

Important

A pipeline document to be run by the commandline interface must be self-contained, i.e. it must explicitly specify

  • its license file

  • catalogs to be used

  • font configuration definitions or overrides

  • any custom encodings to be used

You should make sure that pipeline documents intended to be run via the commandline do not have their Use application settings checkbox checked on their Pipeline Settings > Catalogs, Pipeline Settings > Font configuration, Pipeline Settings > Encodings and Pipeline Settings > License tabs.

Note that by default, upCast’s built-in templates have this checkbox checked!

Note

For parameters of type popup, the internal value (from the internal-values property list) must be passed in as the parameter value, not the displayed value.

2. Synopsis

java -cp upcast.jar de.infinityloop.upcast.RunPipeline parameters...

with parameters being:

[0] absolute path to the pipeline document to be run

[1..n] standard options

Standard options are as follows:

-p name value
    set the pipeline parameter name to the value value

-debug N
    turn on debug output for the conversion with the specified level of verbosity; N is a number between 0 (least verbose) and 10 (annoyingly verbose)

-version
    display upCast version information

-help
    show help on the defined parameters for the specified pipeline document
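When the commandline interface is started from another Java program, the argument list from the synopsis can be assembled programmatically. The following sketch only uses the options documented above. The class and method names, the pipeline document path and the parameter name in the usage note are placeholders chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative helper; class and method names are assumptions, not upCast API. */
final class RunPipelineCommandSketch {

    /**
     * Builds the argument list for the synopsis shown above.
     * A negative debugLevel omits the -debug option.
     */
    static List<String> command(String pipelinePath, Map<String, String> params, int debugLevel) {
        if (debugLevel > 10) {
            throw new IllegalArgumentException("debug level must be between 0 and 10");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add("upcast.jar");
        cmd.add("de.infinityloop.upcast.RunPipeline");
        cmd.add(pipelinePath); // [0] absolute path to the pipeline document
        for (Map.Entry<String, String> p : params.entrySet()) {
            cmd.add("-p");
            cmd.add(p.getKey());
            cmd.add(p.getValue());
        }
        if (debugLevel >= 0) {
            cmd.add("-debug");
            cmd.add(String.valueOf(debugLevel));
        }
        return cmd;
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder, for example new ProcessBuilder(cmd).inheritIO().start(). Passing each value as a separate list element avoids shell quoting issues for values that contain spaces.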

Chapter 12. XML Namespaces in upCast

The use of XML namespaces is a core concept of upCast. Namespaces are essential to the processing pipeline, since they allow the clash-free co-existence of user-defined attributes and elements with upCast’s automatically generated elements and attributes. Clear separation of element and attribute domains allows targeted, semantically clear selection and filtering of the rich information present in the internal tree at serialization time.

1. The upcast-internal namespace (uci)

namespace name                                               recommended prefix
http://www.infinity-loop.de/namespace/2006/upcast-internal   uci

All elements and attributes of the upCast Internal DTD are members of the http://www.infinity-loop.de/namespace/2006/upcast-internal namespace. The suggested namespace prefix is uci.

Besides the goal of avoiding name clashes, attributes are members of the upcast-internal namespace so that they can be put on any element in the internal tree, even if it is a non-upcast-internal element, and still be recognized easily as such.

2. The upcast-css namespace (css)

namespace name                                          recommended prefix
http://www.infinity-loop.de/namespace/2006/upcast-css   css

The upcast-css namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-css has a recommended prefix of css. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-css namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.

The css namespace contains the current value of all properties at the context node that have been defined by either applying a class to an element or a manual style override. It is assumed that all properties are inherited, and that manual overrides take precedence over class application when occurring on the same node.

The upcast-css namespace contains CSS styling properties mapped to an attribute representation. Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:

CSS property name   virtualized attribute name
-ilx-name           css:ilx-name
othername           css:othername
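The translation shown in the table can be expressed as a one-line rule. The Java sketch below assumes that the only adjustment is dropping a single leading hyphen, which is what the two table rows suggest. The class and method names are invented for illustration.

```java
/** Illustrative helper; not part of the upCast API. */
final class CssAttributeNameSketch {

    /**
     * Maps a CSS property name to its virtualized attribute name for the
     * given namespace prefix (for example "css").
     * Assumption: only one leading hyphen needs to be dropped.
     */
    static String virtualName(String prefix, String cssProperty) {
        String local = cssProperty.startsWith("-") ? cssProperty.substring(1) : cssProperty;
        return prefix + ":" + local;
    }

    public static void main(String[] args) {
        System.out.println(virtualName("css", "-ilx-name"));
        System.out.println(virtualName("css", "othername"));
    }
}
```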

The only time the virtual attributes in the upcast-css namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.

Note

To export materialized attributes in the upcast-css namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.

3. The upcast-cssoverride namespace (csso)

namespace name                                                  recommended prefix
http://www.infinity-loop.de/namespace/2006/upcast-cssoverride   csso

The upcast-cssoverride namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cssoverride has a recommended prefix of csso. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-cssoverride namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.

The upcast-cssoverride namespace contains CSS styling properties mapped to an attribute representation. It contains only properties that have been brought into the tree by applying a manual, explicit, anonymous style property override at a certain node, usually by way of a style attribute with local style property settings. The properties available in the csso namespace on a specific context node consist of the union of all such properties having been applied either on the node itself or one of its ancestors in the described fashion, in order from document root to context node, unless they are identical in name and value with a property in the fully calculated cssc namespace on that node, in which case they are not added. (It is assumed that cssc properties are always inherited.)

Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:

property name   virtualized attribute name
-ilx-name       csso:ilx-name
othername       csso:othername

The only time the virtual attributes in the upcast-cssoverride namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.

Note

To export materialized attributes in the upcast-cssoverride namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.

4. The upcast-cssclass namespace (cssc)

namespace name                                               recommended prefix
http://www.infinity-loop.de/namespace/2006/upcast-cssclass   cssc

The upcast-cssclass namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cssclass has a recommended prefix of cssc. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-cssclass namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.

The upcast-cssclass namespace contains CSS styling properties mapped to an attribute representation. The cssc namespace contains only properties that have been brought into the tree by applying a named style class from an external stylesheet onto a node, usually by way of a style reference using the class attribute. The properties available in the cssc namespace on a specific context node consist of the union of all such properties having been applied either on the node itself or one of its ancestors in the described fashion, in order from document root to context node. It is assumed that cssc properties are always inherited.

Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:

property name   virtualized attribute name
-ilx-name       cssc:ilx-name
othername       cssc:othername

The only time the virtual attributes in the upcast-cssclass namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.

Note

To export materialized attributes in the upcast-cssclass namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.

5. The upcast-cals namespace (cals)

namespace name                                           recommended prefix
http://www.infinity-loop.de/namespace/2006/upcast-cals   cals

The upcast-cals namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cals has a recommended prefix of cals. It is used to differentiate attributes on tables from similarly or likewise named attributes and elements in other table models like HTML or the internal table model. You can therefore already decide at the top-level cals:table element that you are dealing with a CALS table without having to infer this from the further descendant element structure.

6. The HTML namespace (html)

namespace name                      recommended prefix
http://www.w3.org/HTML/1998/html4   html

The html namespace with the name http://www.w3.org/HTML/1998/html4 has a recommended prefix of html. It is used to differentiate attributes on tables from similarly or likewise named attributes and elements in other table models like CALS or the internal table model. You can therefore already decide at the top-level html:table element that you are dealing with an HTML table without having to infer this from the further descendant element structure.

7. The XLink namespace (xlink)

namespace name                 recommended prefix
http://www.w3.org/1999/xlink   xlink

The XLink namespace with the name http://www.w3.org/1999/xlink has a recommended prefix of xlink. It is used to identify linking attributes on elements.

8. The XML namespace (xml)

namespace name                         recommended prefix
http://www.w3.org/XML/1998/namespace   xml

The XML namespace with the name http://www.w3.org/XML/1998/namespace has a recommended prefix of xml.

9. The Variable Realm namespaces

In UPL, you can refer to variables and values in a specific realm using that realm’s namespace. For each realm, there is a corresponding namespace.

For details on UPL variable references, see the UPL specification.

For details on upCast variable realms, see the chapter on the variable system.

realm          namespace name                                                    recommended prefix
application    http://www.infinity-loop.de/namespace/upcast-realm/application    application
environment    http://www.infinity-loop.de/namespace/upcast-realm/environment    environment
pipeline       http://www.infinity-loop.de/namespace/upcast-realm/pipeline       pipeline
module         http://www.infinity-loop.de/namespace/upcast-realm/module         module
javaproperty   http://www.infinity-loop.de/namespace/upcast-realm/javaproperty   javaproperty
include        http://www.infinity-loop.de/namespace/upcast-realm/include        include
map            http://www.infinity-loop.de/namespace/upcast-realm/map            map

10. UPL Utility functions library namespace

namespace name                                                recommended prefix
http://www.infinity-loop.de/namespace/upl/utility-functions   util

upCast comes with a library of several UPL utility function definitions. To have those separate from your own function definitions, even if they may share the same local function name, all these functions are located in a specific namespace: http://www.infinity-loop.de/namespace/upl/utility-functions .

For more information, see the UPL specification when it becomes available.

Chapter 13. Recognized Java system properties

Some settings need to be made early in the startup process of upCast, so early in fact that they cannot be read by application-internal means but must already be set and available when upCast starts running. To set those values in cases where their default is not desirable, you can pass them via Java system properties to the JVM running the upCast application.

The following parameters are available, with their defaults, which are sometimes calculated dynamically based on the system/OS the application is running on, given as well:

de.infinityloop.exe.location (Windows only)

Default: ${application:BundledResources}/EXEs

Specifies the folder where upCast will look for supporting .exe files (like il-gw.exe, used for WordLink).

de.infinityloop.application.location

Default: (application installation root)

The folder where the application’s installation root lies.

de.infinityloop.application.preferencesdir

Default: (system dependent)

The folder where upCast will write its preferences file to.

de.infinityloop.application.logfile

Default: (system dependent)

The file where upCast will write its external logfile. Whether this value is actually used is dependent on the log subsystem chosen.

upCast uses SLF4J as the logging bridge and comes with a log4j 1.2 bridging implementation and log4j 1.2.x by default.

de.infinityloop.loglevel

Default: 3

Set to a number greater than or equal to 0 identifying the threshold below which messages will be output to the log subsystem. The currently used range is 0..7, with 7 being the highest debugging level, i.e. “output always”. For verbose logging, set this to a high value. For reduced logging, reduce the number.

The default value 3 corresponds to INFO type messages (and above).

Important

When running a pipeline using the de.infinityloop.upcast.RunPipeline class, use the -debug N option described there instead of the de.infinityloop.loglevel property.

de.infinityloop.application.maxvarrecursion

Default: 32

The maximum number of iterations a field value is variable-resolved in search of a point where it no longer changes between iterations before the process is aborted and a fatal error is issued about a possible infinite recursion.
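The idea behind this limit, resolving a value repeatedly until it no longer changes and giving up after a fixed number of iterations, can be sketched as follows. This is not upCast's resolver: the class name, the simplified ${realm:name} pattern, the variable names and the choice to leave unknown references untouched are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative fixed-point resolver; not upCast's implementation. */
final class VarResolutionSketch {
    // Simplified reference syntax: ${realm:name}
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    static String resolve(String value, Map<String, String> vars, int maxIterations) {
        String current = value;
        for (int i = 0; i < maxIterations; i++) {
            String next = substituteOnce(current, vars);
            if (next.equals(current)) return current; // no change: fully resolved
            current = next;
        }
        throw new IllegalStateException("possible infinite recursion while resolving: " + value);
    }

    /** Replaces each reference once; unknown references are left as they are (assumption). */
    private static String substituteOnce(String s, Map<String, String> vars) {
        Matcher m = REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = vars.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Reads the limit the same way it would be passed with -D; 32 is the documented default.
        int limit = Integer.getInteger("de.infinityloop.application.maxvarrecursion", 32);
        Map<String, String> vars = new HashMap<>();
        vars.put("pipeline:OutFile", "${pipeline:OutFolder}/out.xml"); // hypothetical variables
        vars.put("pipeline:OutFolder", "/test/out");
        System.out.println(resolve("${pipeline:OutFile}", vars, limit));
    }
}
```

Two variables that refer to each other never reach a stable value in this model, so the loop ends with the exception after the configured number of iterations.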

de.infinityloop.logfilterspec

Default: (empty)

This property can be used for specifying the log event filter used at the interface to the external logging system (usually a file or the console). Additionally, it is used in some selected places within upCast’s code base to prevent the time-consuming creation of complex log events already at their originating place.

The filter expression syntax is the same as described here.

Example 13.1. 

-Dde.infinityloop.logfilterspec=+ERROR,+FATAL,+INFO

only passes messages of type ERROR, FATAL and INFO to the external logging subsystem, but not WARN messages.


Note

At this time, the only supported message constant preventing log message generation already at its origin is CurrentRTFToken.

Chapter 14. Java API

Important Note

It is strongly recommended to only use the detailed Java API (described in this chapter) when your requirements do not let you use the static de.infinityloop.upcast.UpcastEngine.runPipeline() method. This is the case e.g. when you must dynamically select and parameterize the modules and their sequence of execution, or must react specifically in Java code on error conditions after each single module execution.

If you do not absolutely need these fine-grained control capabilities, which is usually the case when you can set up and run the pipeline you need using just the upCast GUI, please do not use the low-level API described in the following. Instead, just use the de.infinityloop.upcast.UpcastEngine.runPipeline() method with the pipeline or parameter set file you developed in the GUI. This allows changes to the pipeline without re-compilation and therefore makes maintenance much easier.

Complete sample code (just a few lines of Java) ready for copy and paste into your Java project for each individual pipeline using de.infinityloop.upcast.UpcastEngine.runPipeline() can be obtained from the pipeline documentation in HTML format you get from File > Generate Documentation….

1. Concepts

Accessing upCast functionality is carried out via one instance of a broker object: UpcastEngine. You should create one instance of that object at startup and use it for many subsequent conversions, since creation of this object is rather expensive. There are no problems in reusing that object for subsequent conversions (in contrast to many XML parser implementations, for example). On the contrary, reuse is highly recommended from a performance point of view.

You may create several instances of the UpcastEngine object in order to run multiple conversion threads at the same time in your single application. Please note that the maximum number of parallel threads may be restricted by your license.

2. Using the API

We assume that you are familiar with Java programming and its concepts like objects, interfaces and implementations. You should also be fluent with the Java object notion and with Java Streams.

The javadoc API reference can be found here.

2.1. General programming steps

The general programming steps are as follows:

  1. Instantiate a de.infinityloop.upcast.UpcastEngine object. You can think of this object as the interface to your pipeline.

  2. Set the pipeline base URI using the setPipelineBaseURI() method.

  3. Register that instance with an appropriate license file using its setLicense() method.

  4. Set global pipeline parameters like catalogs to use, overrides to the standard font configuration and custom encodings to use via the appropriate instance methods.

  5. Call the initializeConversion() method.

  6. Set pipeline variables using the setPipelineVariable() method.

  7. Choose a module class via the method setModuleType(), which then internally gets instantiated and becomes the current module.

  8. Set module parameters using (possibly repeated calls to) the setModuleParameter() method.

  9. Start the module execution by calling runModule().

  10. (optional) Repeat from step 7 for subsequent modules in the desired pipeline.

  11. Call the cleanupConversion() method.

  12. (optional) Repeat from step 5 for converting another document.

Expressed in actual Java code, this might look something like this:

String moduleID = null;
UpcastEngine ucInst = new UpcastEngine( "instance one" );
ucInst.setPipelineBaseURI( "file:///path/to/basefolder/" );
ucInst.setLicense( "file:///path/to/upcast.uclicense" );
ucInst.setPipelineVariable( "DestinationFolder", "/test/out/" );
ucInst.setPipelineVariable( "ImageDestinationFolder", "/test/out/" );
ucInst.initializeConversion();
  moduleID = ucInst.setModuleType( UpcastEngine.kRTFImporterType );
  ucInst.setModuleParameter( moduleID, "OrigNumbering", Boolean.TRUE );
  ucInst.setModuleParameter( moduleID, "SourceFile", "/test/in/in.rtf" );
  ucInst.runModule( moduleID );
  moduleID = ucInst.setModuleType( UpcastEngine.kXMLExporterType );
  ucInst.setModuleParameter( moduleID, "DeleteEmpties", Boolean.FALSE );
  ucInst.setModuleParameter( moduleID, "DestinationFile", "${pipeline:DestinationFolder}/out.xml;" );
  ucInst.runModule( moduleID );
ucInst.cleanupConversion();

Tip

To quickly construct a slightly more sophisticated Java source code template for a pipeline you have already built using the GUI, use the File > Export to Java source command. You can then modify this generated code to your liking, preferably by subclassing it and overriding methods where needed.

2.2. Setup

2.2.1. Create an UpcastEngine instance

You gain access to all functionality of upCast by means of objects of a single class: UpcastEngine. An instance of this object is what you will use in your application in order to access the full range of upCast API functionality.

Before you can do anything with upCast, you need to instantiate an UpcastEngine object:

UpcastEngine ucInst = new UpcastEngine( "instance one" );

The UpcastEngine class is to be found in the de.infinityloop.upcast package.

You should keep this object stored in a variable which you can access from all places inside your program where you need to access upCast functionality.

You should strive to have only one instance of the UpcastEngine object per physical CPU at any time for performance reasons. Also make sure you only instantiate this object once during the life of your application process, as instantiating and disposing of this object is a relatively costly operation.

2.2.2. Setting the pipeline:base property

In the GUI version of upCast, this proeprty is set automatically for you, as there is a pipeline document that determines this value. In the API, however, there is no such document, so you must tell the upCast pipeline processor the value of this property. It serves as basis for resolving any ${pipeline:base} references you might have in module parameter values or pipeline setting values.

ucInst.setPipelineBaseURI( "file:///path/to/basefolder/" );

This should be called immediately after creating the UpcastEngine object instance.

2.2.3. Setting the license

To use upCast in API mode, a license file is required that includes either or both of the upcast-api and downcast-api features. If in doubt, contact us at licensing@infinity-loop.de.

ucInst.setLicense( "file:///path/to/upcast.uclicense" );

2.2.4. Setting pipeline properties

You can set pipeline properties directly on the UpcastEngine instance object. This includes amending or overriding the font configuration (setCustomFontConfiguration()), adding catalog files to be used by XML processing (addCatalog(), discardCatalogs()), and adding custom encodings to the set of built-in ones (addCustomEncoding()). These settings remain valid as long as the UpcastEngine instance lives or until you explicitly clear or set them to different values.

(Do not confuse these settings with the setting of pipeline variables; see below.)

2.3. Building and running a pipeline

Whereas in the GUI, you build a static pipeline by choosing a specific sequence of modules, the API handles a pipeline differently. In fact, there is no concept of a pre-built pipeline setup to be run; instead, you run modules one at a time. This has the great advantage that you can dynamically and programmatically build the actual pipeline for each single conversion, e.g. based on results of a preceding module execution on that input source.

2.3.1. Bracketing a pipeline by initializeConversion() and cleanupConversion()

ucInst.initializeConversion();
  /* ... your pipeline code goes here ... */
ucInst.cleanupConversion();

Since upCast has to do some housekeeping for each conceptual pipeline run (independent of the actual number and sequence of modules run within), you need to tell it when you conceptually start a pipeline for a specific input file, and when you are done with it, i.e. when you have run the last module for this specific input file. This is done by the initializeConversion() and cleanupConversion() methods.

For example, initializeConversion() cleans the pipeline variable realm so that subsequent pipeline runs do not see values set by a previous run. And cleanupConversion() makes sure any temporary files created by some module get properly deleted when they are no longer needed.

Important

It is very important that you obey this pipeline bracketing rule at all times, as strange, non-deterministic behaviour may occur otherwise.
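One way to make the bracketing rule hard to violate is to route every pipeline run through a helper that uses try/finally. The sketch below is an illustration under assumptions: ConversionBracket and PipelineBody are hypothetical stand-in interfaces (not upCast classes) so that the example is self-contained, and it assumes cleanupConversion() should also be called when a module fails, which the text implies but does not state.

```java
/** Hypothetical stand-in for the two bracketing calls; not part of upCast. */
interface ConversionBracket {
    void initializeConversion() throws Exception;
    void cleanupConversion() throws Exception;
}

/** Hypothetical callback holding the module calls of one pipeline run. */
interface PipelineBody {
    void run() throws Exception;
}

final class BracketedRun {
    private BracketedRun() {}

    /** Runs body between initialize and cleanup; cleanup runs even if body throws. */
    static void run(ConversionBracket engine, PipelineBody body) throws Exception {
        engine.initializeConversion();
        try {
            body.run();
        } finally {
            engine.cleanupConversion();
        }
    }
}
```

With the real API, the two interface methods would simply delegate to ucInst.initializeConversion() and ucInst.cleanupConversion().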

2.3.2. Setting pipeline variables

As in the GUI (by way of the Pipeline Variables module), you can set variables in the pipeline realm to be used by modules run subsequently. The method to use is setPipelineVariable(), e.g.:

ucInst.setPipelineVariable( "DestinationFolder", "/test/out/" );

Note

The pipeline variable realm is cleared by a call to initializeConversion(). You therefore must explicitly (re-)set them at the beginning of a new conversion pipeline execution for a document.

2.3.3. Setting up and running a module

Each module to be run has to be set up individually. This is done in three general steps:

  1. Choose and set the module class to use.

  2. Set module parameters.

  3. Run the module.

First, you choose from one of the available module classes and set that using the setModuleType() method:

moduleID = ucInst.setModuleType( UpcastEngine.kRTFImporterType );

This will create a new instance of this module type and set it as the current module.

The constants to be used (in the UpcastEngine class) for the available module types are:

Module Type                      Java constant name
Pipeline Variables               kPipelineVariablesProcessorType
RTF Importer (“upCast”)          kRTFImporterType
UPL Processor                    kUPLProcessorType
UPL Tree Processor               kUPLTreeProcessorType
Sectioner                        kSectionProcessorType
XML Exporter                     kXMLExporterType
Commandline Processor            kCommandlineProcessorType
XSLT Processor                   kXSLTProcessorType
Unicode Translation Processor    kUnicodeTranslationProcessorType
XML Validator                    kValidationProcessorType
CSS Exporter                     kCSSExporterType
RTF Exporter (“downCast”)        kRTFExporterType
XML Importer                     kXMLImporterType
External Pipeline Processor      kExternalPipelineProcessorType

Module parameters will be set to defaults. The call will return a handle (a String) to that module which you must pass to subsequent setModuleParameter() calls:

ucInst.setModuleParameter( moduleID, "SourceFile", "/test/in/in.rtf" );

Note

The default parameter settings of modules are not documented. Though usually reasonable, they may change from release to release. We therefore strongly recommend setting all parameters of a module explicitly to the desired values so that your code does not break when upCast is updated.

Tip

Like in the GUI version of upCast, you can use variable references in the parameter values which will be resolved by upCast automatically.

Example 14.1. 

To specify the source file relative to the pipeline base directory (whatever value it is currently set to), use a line like

ucInst.setModuleParameter( moduleID, "SourceFile", "${pipeline:base}/source/in.rtf" );

Parameter names for each module are given in the description of the individual modules earlier in this manual. The parameter value has to be passed as a Java Object. The required object class depends on the specific parameter and is documented for each available parameter.

If you set a parameter more than once, the last value set will be used.

To set several parameters, you need to repeatedly call the method setModuleParameter().

Note

If you try to set a parameter that is not supported by the current module, the parameter simply will have no effect, but no error is reported. To track which parameters you set in your application, you should turn on debug logging.

Important

If you use a different Java Object (sub-)class for the parameter value than specified in the reference section, the behaviour is undefined. Some types may be compatible, but in general you will get a Java exception at some point later in the execution of upCast or the operation will not work the way you intended.

Finally, you’ll want to kick off the module’s execution. This is done by the runModule() method:

ucInst.runModule( moduleID );

After this, you can set up the next module exactly as described in this section so far. You can even base the selection of the next module on the value of a pipeline variable that the module has set, or on some other condition. You can query the values of variables in the pipeline realm using the getPipelineVariable() method.
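The three steps described in this section (set the module type, set its parameters, run it) can be wrapped in a small helper. The following is a sketch under stated assumptions: ModuleApi is a hypothetical stand-in interface that mirrors the three method names used above so that the example compiles on its own, the module type is passed as Object because the type of the k…Type constants is not stated here, and the methods are declared to throw Exception in place of UpcastException.

```java
import java.util.Map;

/** Hypothetical stand-in mirroring the three calls named in this section. */
interface ModuleApi {
    String setModuleType(Object moduleType) throws Exception;
    void setModuleParameter(String moduleID, String name, Object value) throws Exception;
    void runModule(String moduleID) throws Exception;
}

final class ModuleSteps {
    private ModuleSteps() {}

    /** Performs the three steps in order and returns the module handle. */
    static String configureAndRun(ModuleApi engine, Object moduleType,
                                  Map<String, Object> parameters) throws Exception {
        // Step 1: choose the module class; the call returns the handle.
        String moduleID = engine.setModuleType(moduleType);
        // Step 2: set every parameter explicitly, one call per parameter.
        for (Map.Entry<String, Object> p : parameters.entrySet()) {
            engine.setModuleParameter(moduleID, p.getKey(), p.getValue());
        }
        // Step 3: run the module.
        engine.runModule(moduleID);
        return moduleID;
    }
}
```

Passing the parameters as a LinkedHashMap keeps them in insertion order, which makes debug logs easier to compare against the calling code.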

3. Connecting to WordLink (Windows only)

To access WordLink functionality also from upCast running via the API, you need to tell it where the WordLink component il-gw.exe is to be found before you instantiate an UpcastEngine object. This is done by setting the system property de.infinityloop.exe.location to the folder where il-gw.exe resides:

System.setProperty( "de.infinityloop.exe.location", "/path/to/il-gw-folder/" );

On a typical Windows installation, this is C:\Program Files\infinity-loop\upCast\Resources\EXEs, but you are free to move the application file il-gw.exe anywhere in your filesystem where it is convenient for your deployed application.
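Because a wrong folder would otherwise only show up later, it can help to verify the location before setting the property. The following is a minimal sketch: WordLinkLocation is a hypothetical helper class, and the existence check for il-gw.exe is an added safeguard, not something the upCast API requires. Only the property name comes from the text above.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper; only the property name is taken from the manual. */
final class WordLinkLocation {
    static final String PROPERTY = "de.infinityloop.exe.location";

    private WordLinkLocation() {}

    /** Sets the property after checking that il-gw.exe exists in the folder. */
    static void configure(String folder) {
        if (!Files.isRegularFile(Paths.get(folder, "il-gw.exe"))) {
            throw new IllegalArgumentException("il-gw.exe not found in " + folder);
        }
        System.setProperty(PROPERTY, folder);
    }
}
```

Call WordLinkLocation.configure(…) before the first UpcastEngine is created. When writing a Windows path as a Java string literal, remember to double the backslashes, e.g. "C:\\Program Files\\infinity-loop\\upCast\\Resources\\EXEs".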

Important

Using WordLink in a critical, server-based, unattended environment is not supported and therefore not recommended. WordLink uses an installed copy of Word in component mode. Microsoft explicitly warns against such use in server or server-like applications for technical reasons (let alone any remaining licensing issues).

3.1. Accessing Word as COM object in a restricted environment

WordLink must access and launch Word to do what it needs to do. However, when running in a server environment, rights of running processes are usually tightly restricted. For example, Word might not be allowed to be accessed by the server process as COM object.

To make WordLink work in such restricted environments, you need to explicitly grant the user running the server access to the Word COM object. You can check and do this as follows:

  • On the Windows commandline, start dcomcnfg.exe .

  • Choose the component “Microsoft Word Document” (or similar, depending on localization) and click Properties... .

  • Under Security > Use custom launch permissions, add the account that runs the server using Edit...Add... . (On one of our machines, this e.g. was “ASPNET (ASP.NET Machine Account)”).

After this modification, WordLink should also work in the restricted environment.

4. API Error handling

During a single call to an API method, several problems may occur, some of them quite significant, some of them less significant. In every case, the method will throw a single UpcastException. An UpcastException is a special descendant of a java.lang.Exception that encapsulates a list of errors and/or warnings that occurred during the last call to an API method.

You can query an UpcastException for its individual constituents, which are objects of type LogEvent. A LogEvent encompasses:

  • a numerical message code

  • a message class, one of: FATAL, ERROR, WARN, INFO, DEBUG, VERBOSE, DETAIL

  • a human readable message as String

  • a (possibly null) array of parameters that were used in constructing the message

4.1. Coding pattern

The recommended coding style for error handling is to wrap each call to an API method in its own try{}/catch{} block and to catch UpcastException explicitly. This is useful if, for example, the runModule() call throws an exception whose severity is not high: you may decide to continue processing because it only contains a warning that you do not care about and that does not affect the document integrity. By wrapping each call separately, you get the maximum out of any sequence of API calls by skipping only the portions that did not work.

A typical API call including error handling would look something like:

try {
  ucInst.runModule( moduleID );
} catch( UpcastException e ) {
  if( e.extractSignificantEntries( 
      new int[] { 
        LogEvent.FATAL, 
        LogEvent.ERROR 
      }, 
      null, 
      null ).size() > 0 ) { // we only react on FATAL or ERROR types, but not WARNings
    // ... do some error handling ...
  }
}

4.2. Error handling tidbits

Using the extractSignificantEntries() method, you can specify in great detail which messages you are interested in. For more information on how to use this method, see the javadoc API reference.

The message codes are all constants of a special class, Msg. See the javadoc API reference for a description of the currently defined message codes and the number and semantics of parameters available for a specific message.

Chapter 15. upcast-runner Ant task

upCast’s distribution jar includes an Ant task that lets you run upCast pipelines from files (*.ucdoc) from within Ant. This has the advantage that you usually do not have to create the Ant task code anew whenever you make minor to moderate changes to your pipeline. To use it, you first have to define the task for use by Ant, then create the correct sub-structure of the upcast-runner task.

Tip

To quickly construct an Ant build file code template for the upcast-runner task, use the File > Export to Ant > using 'upcast-runner' Task command. You can then modify this generated code to your liking or include it into an existing build file.

1. Defining the upcast task

To define the task, use the following code:

<taskdef
    classname="de.infinityloop.upcast.ant.UpcastRunnerAntTask"
    name="upcast-runner"
    classpath="upcast.jar"
/>

For upcast.jar, you must specify the path to the distributed upcast.jar file. E.g., if you have a specific tasks folder next to your build file, you should copy the upcast.jar file there and specify ${basedir}/tasks/upcast.jar.

2. Structure of the upcast-runner task

<upcast-runner
    file="/path/to/pipeline.ucdoc"
    logfilter="DEBUG"
    sourceparam="SourceFile"
>
    <source dir="...">
        <include name="pattern" />
        …
    </source>

    <catalogs>
        <catalog file="..." />
    </catalogs>

    <param name="..." value="..." />
    …

</upcast-runner>

On the upcast-runner task itself, some global parameters need to be set, above all file, which is the absolute path to the pipeline to be run by this task.

The upcast-runner task can contain the following elements as nested elements: source (to set the source files; see below for the exact semantics), catalogs (to specify global catalog files to use; needed to resolve the PUBLIC ID of the pipeline in case the task is used to run a parameter set (*.ucpar)), and one or more param elements setting the pipeline's public parameters.

We’ll discuss each of these elements in more detail in the following.