This document is intended as a technical reference manual to upCast RT (in the following called just upCast).
It is not intended as a tutorial on how to use upCast efficiently, how best to create pipelines, or similar questions. These will be covered in a separate tutorial-style document, a How-To section on our website, and a Frequently Asked Questions document. Please turn to these documents as they are published on our website http://www.upcast.de/ in the near future.
This reference document describes upCast RT 7.1.2.
upCast is a module-based document processing pipeline tool, specializing in legacy, “flat” and layout-driven content. It comes with pre-defined, configurable, task-oriented modules (which perform operations such as importing data, XSLT processing, serialization and validation) that you can arrange in any order to create a pipeline. Pipelines can be saved and parameterized as a whole and then be run from within upCast’s UI, from the command line, or directly from Java.
Pipelines can be set up to be fully relative in their file addressing and can therefore be shared without modifications between computers, even across different platforms.
To run upCast, you must meet the following requirements:
Java Runtime Environment 5.0 or later (“Java 5”)
Xerces 2.9.x or later (upCast includes Xerces 2.9.1 and does not work in systems that have earlier versions than Xerces 2.9 in their classpath)
512 MB of RAM available to upCast (depending on document size and pipeline configuration)
Display resolution of at least 1024 x 768 (when running the graphical development environment)
The highest-level component type in upCast usage is a document processing pipeline, or pipeline for short. Pipelines can be saved into documents (file extension .ucdoc) and recalled at any time. Complete pipelines can be exported into several formats, such as a Java source file or source code for an Ant target.
To the user, upCast presents its functionality in two layers: the so-called “Simple View” and the “Edit View”. Think of the Simple View as a simplifying, user-oriented layer over the Edit View, which is developer-oriented and shows the actual, fine-grained and possibly complex implementation of the conversion pipeline.

Simple View as a user-oriented layer over the detailed Edit View on a pipeline’s implementation
Pipelines are made up of modules. Modules each perform a specific and specialized task. Modules can be divided into the three categories importers, processors and exporters based on the tasks they perform.
Importers import documents into the internal document format. upCast currently includes a high-quality RTF/Word importer.
Processors come in two variants for internal and external processing. Internal processors modify the current, internal document representation. This is carried out in-place. External processors can be used to perform general tasks which are not dependent on the internal document, like running a shell command.
Exporters are used to serialize the internal document or part thereof in one of several formats.
Within a pipeline, at any time during execution there’s exactly one internal document representation the tasks are performed on. This means that modifications are in most cases performed in-place, so changes made to the internal document tree by one module are visible to subsequent ones.
While a run of an importer always replaces the internal document, you can have several exporters that serialize the same internal document in different ways. You can also serialize the document at any point in the pipeline and apply additional modifications using processors afterwards.
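The module model described above can be illustrated with a small sketch. It is not upCast code and does not use the upCast API: the function run_pipeline, the module kinds as strings and the list used as the "internal document" are all hypothetical stand-ins chosen for illustration. The sketch only shows the three rules stated in the text: an importer replaces the internal document, processors change it in place, and exporters may serialize it at any point.

```python
def run_pipeline(modules):
    """Run (kind, action) pairs in order against one shared internal document."""
    document = None
    for kind, action in modules:
        if kind == "importer":
            # An importer always replaces the internal document.
            document = action()
        elif kind in ("processor", "exporter"):
            # Processors modify the document in place; exporters only read it.
            action(document)
        else:
            raise ValueError(f"unknown module kind: {kind}")
    return document


# Hypothetical pipeline: import, export, process, export again.
snapshots = []
modules = [
    ("importer", lambda: ["Heading", "body text"]),
    ("exporter", lambda doc: snapshots.append(list(doc))),
    ("processor", lambda doc: doc.append("footnote")),
    ("exporter", lambda doc: snapshots.append(list(doc))),
]
final = run_pipeline(modules)
```

The first exporter sees the document as imported, the second one sees the change made by the processor that ran in between.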
It is often useful to be able to save, quickly recall and share with other users different parameter settings for running a certain, parameterized pipeline. Such a parameter set can be saved into documents (file extension .ucpar) and recalled at any time.
A parameter set document only contains the pipeline parameter values as they are set in the Simple View at the time of saving. Only parameters that have their persistent property set to true are saved in that document. It also contains the Pipeline UID of the pipeline document it is based on so that it can load its implementation for execution.
For details, see the section on Parameter Sets.
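As a rough illustration of what a parameter set holds, the following sketch filters a set of parameters down to the persistent ones and adds the UID of the pipeline they belong to. The dictionary layout, the function name build_parameter_set and the sample values are assumptions made for this example; the actual .ucpar file format is not described here.

```python
def build_parameter_set(pipeline_uid, parameters):
    """Keep only persistent parameter values and remember the pipeline UID."""
    values = {
        name: spec["value"]
        for name, spec in parameters.items()
        if spec["persistent"]  # non-persistent parameters are not saved
    }
    return {"pipeline_uid": pipeline_uid, "values": values}


# Hypothetical parameters as they might be set in the Simple View.
parameters = {
    "sourceFile": {"value": "in/report.rtf", "persistent": True},
    "scratchFolder": {"value": "tmp-1234", "persistent": False},
}
parameter_set = build_parameter_set("example-uid", parameters)
```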
Usually, a conversion is a three-phase process: You import the source data into the application, process the data, and export the result. Sometimes, a fourth, external post-processing step is added. upCast offers various modules, which can be divided into three different classes: Importers, Processors and Exporters.
Here’s a diagram of a typical upCast pipeline (with the internal document indicated over time):

The upCast UI is designed to be simple and effective. An upCast document is a complete pipeline setup and can be saved in a file with the default extension .ucdoc. Each document is shown in its own window, and you can have several pipelines open at the same time.
A document window in edit mode is divided into three parts.
The left pane shows the sequence of modules that make up the pipeline. The position of a selected module in the pipeline can be changed by using the nudge-up/nudge-down controls at the bottom of the list. A module can be deleted from the pipeline with the “–” control, a module can be added by clicking the “+” control and choosing the desired class from the popup. There can be multiple instances of the same module type in a pipeline as required, e.g. two or more XSLT processors.
The right pane shows the parameters for the currently selected module. Only one module can be selected at any time. Changes to a module’s parameters are effective immediately.
At the bottom of the window are the pipeline execution controls for running a pipeline, stopping it while it is running and checking its progress.

This display is replaced by a dynamically generated, forms-like interface when the Simple View option is engaged.

This command lets you create either a parameter set based on a factory-supplied template or a new, independent, self-contained pipeline configuration from one of the available templates.
create parameter set… Creates a parameter set from the respective template’s main pipeline document.
The advantage of just creating a parameter set is this: if you do not need to tweak the implementation, but want to use the pipeline template’s functionality as-is with your own parameter values, you benefit from updates and bugfixes to the template automatically, without any manual intervention. This is because the parameter set only holds a reference to the actual template implementation and is therefore automatically updated when the implementation is.
create independent copy… Creates a full, physical copy of all the pipeline documents and resources the template is made up of. You are asked for the location (folder) and a base name for the new pipeline. Within the selected folder, a new folder by the specified name is created and any resources of the template, including the pipeline document, are copied into that folder.
A pipeline created this way is completely independent of the template it is based on. This means two things:
you get a complete, independent copy of the original template definition and resources
any updates to the template are not propagated forward to any pipelines you already have created based on an older version of the template
You can create your own, specific templates. For details on what makes a pipeline an upCast template and where to put those templates for upCast to recognize them, see the chapter on Pipeline Templates.
This shows a file chooser where you can open an already existing pipeline or parameter set document.
Shows in a sub-menu the most recent pipeline and parameter set documents you had open in the past. The number of items displayed in the sub menu can be set in upCast’s preferences.
Pipeline or parameter set documents you had open recently, but which are no longer available (for example because they have been deleted or the disk they reside on is currently not mounted) are shown in disabled state.
Closes the top-most document. When changes to this document have not yet been saved, you are prompted to save them.
Saves the top-most document, which can be a pipeline or parameter set document, the log window or the system information window.
This allows you to save the top-most window under a new name.
Note that for pipeline documents that refer relatively to needed resources, saving a pipeline document to a different location will usually break those links and the pipeline will not run as expected, since upCast cannot reliably track those resource links and copy them along automatically.
This lets you save the persistent parameters and their values of the top-most pipeline document to a separate file, a parameter set document. This file internally links back to the pipeline document it was created from. This allows you to separately store configurations of parameter values that look like separate pipelines, but share one single pipeline implementation. When the latter gets updated, so do all parameter sets originating from it.
See the section on parameter sets for more information on how the linking to the respective pipeline document works and what the restrictions of parameter sets are.
Saves the current pipeline document in form of an Ant task. Additional parameters can be set for the export operation in the Pipeline Settings dialog under the Export tab.
using ‘upcast-runner’ task exports an Ant task making use of an upCast runner object, which reads the specified pipeline and executes it. This is the recommended export option since you need to generate that task only once and it picks up automatically any changes in the referenced pipeline document.
as self-contained Task this creates a fully self-contained Ant task of the current pipeline’s configuration. This means that the task can be run without having access to the original pipeline document it was generated from. This may be useful when you used the original pipeline document only for prototyping and testing and want to apply changes directly to the Ant task’s definition thereafter, or when you can recreate the task automatically after making changes to the pipeline document (e.g. in an automated build using upCast’s Tools class).
Saves the current pipeline document as Java Source code. Additional parameters can be set for the export operation in the Pipeline Settings dialog under the Export tab.
using RunPipeline class exports Java source code making use of upCast’s RunPipeline class, which reads the specified pipeline and executes it. This is the recommended export option since you need to re-generate the source code for that class only when the pipeline parameter configuration changes (i.e., parameters are added or removed) and it picks up automatically any further changes in the referenced pipeline document.
as self-contained source this creates a fully self-contained Java class of the current pipeline’s configuration, using the methods of the UpcastEngine class of the upCast Java API. This means that the code can be run without having access to the original pipeline document it was generated from. This may be useful when you need fine-grained control over error handling for each individual module’s execution step and/or need to dynamically execute additional code that cannot be integrated into a standard pipeline execution.
Exports the current pipeline document as a human-readable XML source file.
This file is also used internally as the basis for the Ant task and Java Source export options, which are generated by appropriately configured XSLT transformations. With this export, you can create your own export formats (e.g. customized Java code export or extended documentation generation).

The operations Cut, Copy and Paste are supported context-sensitively, depending on where the keyboard focus currently is:
text field When the focus is on a text field, these commands work as usual.
pipeline modules list When the focus is on a module in the pipeline modules list, that module’s complete definition is copied in form of an XML snippet onto the clipboard. Using Paste while the focus is on a module entry, the module description on the clipboard is read and a new module is inserted above the currently selected module with all parameters set as for the module you copied. You can even copy modules conveniently across open pipelines this way.
When the focus is on a module in the pipeline modules list, this command will create UPL source code for running the selected module from UPL using the run-module() function and put it as text onto the clipboard. You can then insert it into a UPL code field within upCast or your favorite external editor where you are writing your UPL code.

With this toggle, you switch between the Simple View and Edit View of a pipeline configuration.
When checked, upCast shows its pipeline window in Simple View mode, hiding the actual pipeline implementation and showing only entry fields for the pipeline parameters that a typical user must supply.
When you want to edit the details of a pipeline, uncheck this item.
The state of this parameter is saved to the pipeline document and automatically restored at opening time. This means that for final distribution to your customers, check this parameter, then save the document again before packaging it into your distribution.

Shows a window with detailed information on the execution environment of the topmost pipeline document and the upCast application, including version information on available XSLT processors, Java, loaded modules, license info etc. You may be asked by infinity-loop support for this info when tracking down problems you may have with upCast.
Shows a window with the external log file or a live view of log events as they are generated from within upCast.

The Source popup menu lets you choose between these two modes:
Logfile shows the current contents of the log file on disk
Live Events shows the log events in the system as they are generated from log sources within upCast
When showing Live Events, you can set a filter describing which log events generated by upCast should be displayed. This is done using the Filter text field. This setting is completely independent from the log level setting in upCast’s preferences. Several pre-defined settings are available from the associated popup menu, but you are free to specify any log event filtering expression you wish. The filter expression syntax is described here and is the same as used in other places within upCast.
All log events are held indefinitely while the window is open or until you click Clear Window, so you should not leave the window open unattended, as you will otherwise eventually run out of heap space. When the window is in Live Events mode, pipeline execution slows down, depending on the number of log events to display. There’s no performance penalty when the window is closed, as it then detaches itself from all log sources automatically.
With Save as Text…, you can save the current contents of the window to a text file. You may be asked by infinity-loop support for this info when tracking down problems you may have with upCast.
Shows this upCast reference documentation manual in the host system’s default web browser.
Shows the UPL reference documentation in the host system’s default web browser.
Shows the upCast API documentation (in javadoc format) in the host system’s default web browser.
Opens a pre-configured email in your default email application, ready for you to add your problem report or question to the infinity-loop support department. The email includes system information, which you can preview in the generated email and, if desired for privacy reasons, trim to your liking.
You should use this function whenever you want to report a bug or problem to infinity-loop.
upCast offers several variable realms. Realms are distinct, non-overlapping value storage spaces. Think of them as different buckets placed next to each other, labelled with the realm name.
Some of the realms are read-only, and some of them calculate the actual value of a variable at the time of retrieval access.
Here’s an overview of the different realms and their names (monospace bold grey print) available in upCast:

To get or set a variable, two components must be specified:
the realm
the variable name
A variable reference is resolved in an upCast parameter field by simply replacing the variable reference by the textual value of the variable referenced.
It is important to always keep in mind that the variable resolution process is an utterly dumb textual replacement process (much like a #define works in the C programming language). Specifically, no quoting or unquoting is performed.
The result of a variable reference to a variable that does not exist or cannot be resolved is the variable reference itself.
A piece of text containing variable references is processed repeatedly for as long as the result keeps changing. This allows you, for example, to have references to the include realm resolved in already included content as well. Consequently, you must make sure that content that looks like a variable reference but must not be resolved is properly quoted (e.g. by doubling the $ sign). To avoid potential infinite recursion, this repeated resolution process is terminated if the result still changes after a certain number of iterations. The limit is currently set to 32 iterations by default. It can be changed by setting the Java property de.infinityloop.application.maxvarrecursion.
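The resolution rules above (plain textual replacement, unresolved references left as they are, repeated passes until the result is stable, and a cap of 32 passes) can be sketched as follows. This is an illustration only, not upCast’s implementation: the function resolve, the flat dictionary of "realm:name" keys and the sample values are assumptions, and the treatment of a doubled $ sign (the reference is simply skipped) is a guess at what "properly quoted" means.

```python
import re

MAX_ITERATIONS = 32  # default limit named in the text

# Matches ${...} up to the first closing brace. A reference preceded by
# another '$' is treated as quoted and is skipped (assumption).
_REFERENCE = re.compile(r"(?<!\$)\$\{([^}]*)\}")


def resolve(text, variables, max_iterations=MAX_ITERATIONS):
    """Replace variable references by their textual values until stable."""

    def substitute(match):
        # An unknown reference evaluates to the reference itself.
        return variables.get(match.group(1), match.group(0))

    for _ in range(max_iterations):
        resolved = _REFERENCE.sub(substitute, text)
        if resolved == text:  # nothing changed in this pass: done
            return resolved
        text = resolved
    return text  # still changing after the limit: stop anyway


# Hypothetical variable values for illustration.
variables = {
    "pipeline:dest": "${pipeline:base}/out",
    "pipeline:base": "file:///work",
    "pipeline:loop": "x${pipeline:loop}",
}
```

With these values, resolving ${pipeline:dest}/a.xml takes two passes, while the self-referencing pipeline:loop variable is cut off after 32 passes instead of running forever.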
All variable names that start with an upper-case letter are reserved for upCast’s own use.
You should therefore name your own variables in such a way that they do not start with an upper-case letter, even when no upCast-defined variable of the same name exists at that time. We might introduce one in a subsequent release, which could stop your pipeline from working correctly.
The syntax to refer to a variable in a specific realm is similar to that of Ant, albeit with a twist:
${realm:name#modifier}
Note the special #modifier part: it lets you transform the stored value of a variable in specific ways before it is returned. This is most useful for file paths, e.g. to retrieve only the name of a file from an absolute path, its base name, or just the path to the file.
As with Ant, variable references cannot be nested, i.e. you cannot write something like ${module:${pipeline:paramname}} to calculate the name of a module variable dynamically.
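To make the reference syntax concrete, here is a minimal parser sketch for the simple ${realm:name#modifier} form. The names Reference, parse_reference and is_reserved_name are hypothetical. Two assumptions are built in: realm and modifier names consist of letters only, and the realm is optional, because the modifier examples in this chapter are written without one. Parameterized include references are not covered.

```python
import re
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    realm: Optional[str]
    name: str
    modifier: Optional[str]


_SIMPLE_REFERENCE = re.compile(
    r"^\$\{(?:(?P<realm>[A-Za-z]+):)?"   # optional realm (assumption)
    r"(?P<name>[^#}]+)"                  # variable name
    r"(?:#(?P<modifier>[A-Za-z]+))?\}$"  # optional modifier
)


def parse_reference(text):
    """Split a simple variable reference into realm, name and modifier."""
    match = _SIMPLE_REFERENCE.match(text)
    if match is None:
        raise ValueError(f"not a simple variable reference: {text!r}")
    return Reference(match["realm"], match["name"], match["modifier"])


def is_reserved_name(name):
    """Names starting with an upper-case letter are reserved for upCast."""
    return name[:1].isupper()
```

A nested reference such as ${module:${pipeline:paramname}} does not match the pattern and is rejected.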
The components of a variable reference are:
Example 4.1. Modifier sample results
With SourceFile having a value of “C:\Documents and Settings\upCast\The file.xml”, the following variable references with modifiers will evaluate to:
${SourceFile#local}
→ C:\Documents and Settings\upCast\The file.xml
${SourceFile#url}
→ file:///C:/Documents%20and%20Settings/upCast/The%20file.xml
${SourceFile#localpath}
→ C:\Documents and Settings\upCast
${SourceFile#urlpath}
→ file:///C:/Documents%20and%20Settings/upCast
${SourceFile#localname}
→ The file.xml
${SourceFile#urlname}
→ The%20file.xml
${SourceFile#localextension}
→ xml
${SourceFile#urlextension}
→ xml
${SourceFile#localbasename}
→ The file
${SourceFile#urlbasename}
→ The%20file
${SourceFile#localbasenamepath}
→ C:\Documents and Settings\upCast\The file
${SourceFile#urlbasenamepath}
→ file:///C:/Documents%20and%20Settings/upCast/The%20file
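The results listed in Example 4.1 can be reproduced with a short sketch. It is not how upCast computes them; the function names to_url and apply_modifier are hypothetical, and the sketch assumes a Windows-style source path, as in the example.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote

# Parts that denote a full location and therefore get the file:/// prefix.
_ABSOLUTE_PARTS = {"", "path", "basenamepath"}


def to_url(local):
    """Convert a Windows path to a file: URL with percent-encoded spaces."""
    return "file:///" + quote(local.replace("\\", "/"), safe="/:")


def apply_modifier(local, modifier):
    """Return the value a modifier such as 'urlbasename' would produce."""
    for prefix in ("local", "url"):
        if modifier.startswith(prefix):
            part = modifier[len(prefix):]
            break
    else:
        raise ValueError(f"unknown modifier: {modifier}")

    path = PureWindowsPath(local)
    values = {
        "": str(path),
        "path": str(path.parent),
        "name": path.name,
        "extension": path.suffix.lstrip("."),
        "basename": path.stem,
        "basenamepath": str(path.parent / path.stem),
    }
    value = values[part]
    if prefix == "local":
        return value
    return to_url(value) if part in _ABSOLUTE_PARTS else quote(value)


source_file = r"C:\Documents and Settings\upCast\The file.xml"
```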
Let’s have a look at the various realms in more detail:
This realm is not yet available and will be implemented in a later release of upCast.
This realm is read-only.
This realm includes upCast application-global values.
The following variables are currently defined:
variable name | description |
SupportFolder | path to the (OS/system-specific) support files folder |
BundledResources | path to the resources folder bundled with the application distribution (when it was installed using one of the system-specific distribution packages) |
| the path to the external logfile as calculated by the application and/or set in the java system property |
By default, all file path values returned are in URL format. You can use all available modifiers on them, of course, to change format or extract parts from them.
Example 4.2.
To retrieve the location of the application’s support folder on the system it is running on, use:
${application:SupportFolder}
This realm is read-only.
This realm includes upCast pipeline-global environment values. Most of them are virtual in the sense that they are not actually stored, but reflect the current state of the execution environment at the time they are retrieved.
The following variables are currently defined:
variable name | Java type | description |
| Integer | the application version (0x0Mmr format) |
| String | the application version in “M.m.r” format |
| Integer | the build number |
| String | the timestamp string of the build in the format “ |
| List | list of Strings of features included in the current license; features not active are enclosed by parentheses ‘(’ and ‘)’ |
| List | list of Strings that only includes features in the current license that are valid at the time of the query |
| String | a string describing the current license |
| Integer | number of days until the license feature |
| String | application installation folder |
| List | list of Strings identifying the locations of currently active XML catalog files in the pipeline |
| String | version information of the included Xerces parser |
xslt-xalan-version | String | version information of the included Xalan XSLT processor |
| String | version information of the included Saxon 9.x XSLT 2 processor |
| String | version information of the included Saxon 6.x XSLT 1 processor |
| Integer | version of the active WordLink component; returns |
| Integer | version of Microsoft Word that WordLink is currently linking to; returns |
| String | absolute path to the application used for implementing the WordLink functionality; returns |
| Integer | version of the active MathLink component (implementing the link to MathType 5); returns |
| Integer | version of the MathType DLL used for implementing MathLink; returns |
| String | the text currently displayed in the progress bar’s sub-label |
| String | the text currently displayed in the progress bar’s label |
| Long | the ordinal number (1-based) of the currently executed module task in the pipeline |
| Long | the total number of tasks defined in the current pipeline |
| Long | the maximum value for completion indication of the current task |
| Long | the current value of completion for the currently running task; the task is completed when this value is equal to |
| String | the folder searched for application support files |
dir-licenses | String | the folder searched for license files |
| String | the absolute path of the log file written to |
| Boolean | This means that the pipeline must be a top-level pipeline (see |
| Boolean | This means that the pipeline is not one that is executed within an External Pipeline Processor as a sub-pipeline |
| Integer | the build number of the latest available version of this application. This information is retrieved from infinity-loop’s servers by fetching the URL http://versioncheck.upcast.de/upcast7.plist. When there is no newer version available, this returns 0. When the information could not be retrieved (e.g. due to a server error or if there is no active connection to the internet), |
By default, all file path values returned are in URL format. You can use all available modifiers on them, of course, to change format or extract parts from them.
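The integer application version in the table above is given in 0x0Mmr format. Assuming that M, m and r each occupy one hexadecimal digit (major, minor and revision), which is an interpretation of the format string rather than something stated explicitly, the value can be converted to the “M.m.r” form like this. The function name decode_version is hypothetical.

```python
def decode_version(value):
    """Turn an integer in 0x0Mmr format into an 'M.m.r' string."""
    major = (value >> 8) & 0xF     # M: third hex digit from the right
    minor = (value >> 4) & 0xF     # m: second hex digit from the right
    revision = value & 0xF         # r: last hex digit
    return f"{major}.{minor}.{revision}"
```

Under this assumption, upCast RT 7.1.2 would report 0x0712, which is 1810 in decimal.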
Example 4.3.
To get information on the version of Xalan currently in use by upCast RT, write:
${environment:xslt-xalan-version}
which might return the value “Xalan Java 2.7.1”.
To access these environment values from UPL, use the environment namespace and treat them like ordinary UPL variables. Java types as listed in the table above are coerced to the respective UPL types.
With a namespace definition of
#namespace environment "http://www.infinity-loop.de/namespace/upcast-realm/environment";
the code
println( $environment:dir-licenses );
might print the following on the console:
/Users/demo/Library/Application Support/infinity-loop/upCast RT/Licenses
and the code
println( $environment:license-features );
might print the following to the console:
{"rtfimportGUI","rtfimportAPI","rtfexportGUI","rtfexportAPI","uplGUI","uplAPI"}
It is often useful to store values that several modules will need as pipeline variables. Examples are the source document to process, the destination folder, the folder where images will be stored or the folder where temporary files should be created if needed by the pipeline.
The pipeline realm contains variables that are available to all modules in a specific pipeline. Each pipeline has its own set of pipeline variables. Modules can only access pipeline variables of the pipeline they are a member of.
The set of pipeline variables is cleared before each execution of a pipeline with the exception of the following special, pre-defined variables:
base
PipelineURI
PipelineBase
ParamURI
ParamBase
PipelineInstanceId
This realm includes all parameters of a single module in a pipeline. This realm can only be accessed from within that module, and only the parameters of the currently executed module at the time of reference resolution can be accessed.
Referencing module variables is generally not recommended since upCast has no defined order of variable resolution and will not determine a suitable one by itself. Referring to module variables can therefore lead to infinite loops or referring to unresolved references.
This realm is read-only, with the exception of the UPL execution context, where you can also set variables in that realm.
The javaproperty realm contains all currently defined Java system properties, either pre-defined ones by the Java Virtual Machine (like user.dir or user.home) or properties explicitly set on launch of the VM running the application.
This realm is read-only.
The include realm returns the contents of the file specified as the name of the variable. The syntax is as follows:
${include:/absolute/filepath/to/file.ext}
${include:relative/path/to/file.ext}
A relatively specified path is always considered to be relative to the value of ${pipeline:base}, i.e. the base URL of the pipeline.
An include reference can take parameters, for example to specify the encoding used for reading the file. The variable reference syntax can therefore take the following, extended form:
${include( paramname: "value" [, paramname: "value" ]* ):filepath}
The following table lists the possible parameters that can be specified for an include reference:
parameter name | value |
encoding | the (Java) name of the encoding to be used for reading the file. When this parameter is not specified, the platform’s default encoding is used. |
source | lets you choose where the data to be included originates from |
Example 4.5.
The value of the variable reference
${include( encoding: "UTF-8" ):Resources/entity.map}
is the text contents of the file pipeline-basedir/Resources/entity.map, read with UTF-8 encoding.
The value of the variable reference
${include( source: "variable" ):pipeline:DestinationFolder}
is the text contents of the variable DestinationFolder in the pipeline realm.
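The two references in Example 4.5 can be mimicked with a small sketch. It is only an illustration of the described behaviour, not upCast code: parse_include and read_include are hypothetical names, the base directory is passed in as a plain folder path rather than a URL, and the variable lookup for source: "variable" uses an ordinary dictionary keyed by "realm:name".

```python
import re
from pathlib import Path

_INCLUDE = re.compile(r'^\$\{include(?:\((?P<params>[^)]*)\))?:(?P<target>[^}]+)\}$')
_PARAM = re.compile(r'(\w+)\s*:\s*"([^"]*)"')


def parse_include(reference):
    """Return (parameters, target) of an include reference."""
    match = _INCLUDE.match(reference)
    if match is None:
        raise ValueError(f"not an include reference: {reference!r}")
    params = dict(_PARAM.findall(match.group("params") or ""))
    return params, match.group("target")


def read_include(reference, base_dir, variables=None):
    """Resolve an include reference to the text it stands for."""
    params, target = parse_include(reference)
    if params.get("source") == "variable":
        # The target names a variable instead of a file.
        return (variables or {})[target]
    path = Path(target)
    if not path.is_absolute():
        # Relative paths are taken relative to the pipeline base.
        path = Path(base_dir) / path
    # encoding=None makes Python fall back to the platform default.
    return path.read_text(encoding=params.get("encoding"))
```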
Parameters and variables are internally stored using standard, appropriate Java or UPL object types. Some parameters can take several different types, which, however, can only be set using native Java code using the upCast API or UPL functions. Parameters that can accept more than one of the following basic types will have this mentioned explicitly and in detail in their respective description section.
The basic parameter types are:
| corresponding Java type (class or interface) | corresponding UPL type |
Bool | | Bool |
Integer | | Numeric |
Double | | Numeric |
String | | String |
List | | List |
|
| — |
Some settings are global to the upCast application and (some of them optionally) affect all pipeline documents loaded.
These can be set in the upCast Preferences dialog, available under the application menu (Mac OS X) or the File menu (other platforms).
To make the settings active, click Apply or the window’s close button.
The parameters are grouped into tabs.
Create new document on launch when no others are open
When selected, a new default pipeline document will be created on upCast launch when no other windows (e.g. from a previous, saved session) are open.
Re-open documents that were open when last quitting
When checked, all documents that were open when upCast was last quit will be re-opened in their previous locations.
Remember the most recent ___ pipeline documents
Here, you can enter the number of recently opened documents that should be listed in the File > Open Recent menu. Decreasing the value will forget any document listings beyond that new number.
To clear the File > Open Recent menu, set the number to 0, close the application preferences window by clicking Apply, then re-open and enter the number of documents you want to be remembered. Setting the value to 0 temporarily will clear the entire internal list of documents, effectively clearing the menu.
Here, you can specify a log event filter expression. Only log events passing the filter expression are actually written to the external log file. Several often-used filter expressions are available from the associated popup menu, however you are free to define your own customized filter expressions. The filter expression syntax is described here.
Check for updates on launch
When checked, upCast will contact the infinity-loop version server to check whether a newer version of upCast is available for download. If there is, you will be notified in an info dialog.
Check for updates now…
Clicking this button will let you manually check for updates. This is particularly useful when Check for updates on launch is not checked.
Pipeline Template Paths
In this text field, you can specify paths where upCast will look for pipeline templates, one path per line. Use this if you store personal or company templates at a central place on your network and want to make those templates available automatically within upCast.
The default path for templates, which points to the templates copied to disk during installation, is
${application:BundledResources}/templates
You must include this path in this field if you want to have access to the application-included templates. On the other hand, if you want some users to not have access to the default templates but want them to be restricted to your specific, customized templates only, make sure that in those users’ installations, the default path is not included in the path list.
To add a path to the list, click Add Path… and navigate to the folder containing the pipeline template definition folders.
Empty lines or lines starting with // are considered comments and are discarded during parsing.
You can use variable references from the include, application and javaproperty realm, but you cannot use the pipeline realm since the setting is application-global.
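The parsing rules for this field (one path per line, empty lines and lines starting with // ignored) are simple enough to sketch. The function name parse_template_paths and the second sample path are hypothetical, and trimming surrounding whitespace before checking for // is an assumption; variable references are left unresolved here.

```python
def parse_template_paths(field_text):
    """Return the template paths from the field, skipping comments."""
    paths = []
    for line in field_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue  # empty lines and // lines count as comments
        paths.append(stripped)
    return paths


field_text = """\
// application-supplied templates
${application:BundledResources}/templates

// hypothetical company templates on a network share
/Volumes/shared/upcast-templates
"""
template_paths = parse_template_paths(field_text)
```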
upCast supports the use of catalog files. In its simplest form, a catalog file is a mapping between PUBLIC DTD identifiers and the location of a physical copy of the respective DTD (or, more generally, entity). The upCast application supports the catalog file format as defined in http://www.oasis-open.org/specs/tr9401.html as well as XML Catalogs.
To add a catalog file, choose Add Catalog… and select the catalog file to add from the file system. The new catalog will be available to all modules immediately after closing the preferences window.
To remove a catalog, just delete its entry line.
Catalogs are considered in the order displayed.
OASIS catalog files are read with platform encoding, XML catalog files with the encoding specified in their XML declaration.
By clicking Insert upCast defaults, code is added to pick up any upCast default catalog possibly delivered with the application. You should have that entry in place for best performance.
You can override the global Catalog setting individually for each pipeline.
Font configuration
Specify the source code for the stdfonts.config override that should be used for this pipeline.
Custom Encodings
To add a custom encoding file, choose Add Encoding… and select the custom encoding file (*.encoding) to add from the file system. Each line in the field specifies a custom encoding location.
To add a folder where upCast should pick all contained custom encoding files, choose Add Encodings Folder… and select it.
To remove a custom encoding entry, delete the text line containing its location specification.
This panel is for importing an upCast license file and reviewing current licensing status.
Certain module types require specific license features. The features available in the currently active license are listed in the license features table at the bottom of this panel. Please refer to the individual module’s documentation to check which license feature it requires to be fully (or: at all) functional.
Import new license
Clicking this button brings up a file chooser where you can find and select the license file you received from infinity-loop’s licensing department upon your license request or purchase. You can choose to store this license in upCast’s Licenses special folder, so it will be available to you automatically at launch.
Pick from available licenses
Clicking this button shows you all licenses from upCast’s Licenses folder, and any licenses packaged into the application itself, that can be used for this version of upCast. This lets you switch, for example, between evaluation and full licenses or between licenses with different features.
Parameters will be described using the following typography:
Name | DeleteEmpties |
Java symbol | |
Type | Boolean |
Value | false, true |
Name gives the internal java.lang.String name of the parameter, which is used for storing in preferences files, and can be used in the Java API as String. However, in Java, the use of the Java symbol is highly recommended instead.
Java symbol names the Java constant definition for the parameter’s Name. All constants are defined in the class de.infinityloop.common.Params.
Type specifies the recommended Java type to use when programming against the API. In the GUI version, this is taken care of automatically. When you use alternative interfaces like an Ant task, which can only pass arguments as character strings, upCast tries to perform an appropriate cast. In these cases, make sure that the data you provide can be cast to the specified type, as otherwise the conversion will fail or produce incorrect results at runtime.
Value describes possible value ranges, supported keywords or other specifics about that parameter’s range.
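To illustrate why string input must be cast-able to the documented Type, here is a minimal Python sketch of such a cast. It is not upCast's casting code: the function name, the accepted spellings of Boolean values and the error behaviour are all assumptions made for illustration.

```python
def cast_parameter(raw, type_name):
    """Cast a string argument (e.g. from an Ant attribute) to a parameter type."""
    text = raw.strip()
    if type_name == "Boolean":
        # assumption: only the documented values "true" and "false" are accepted
        if text.lower() in ("true", "false"):
            return text.lower() == "true"
        raise ValueError("not a Boolean value: %r" % raw)
    if type_name == "Integer":
        return int(text)  # raises ValueError for non-numeric input
    if type_name == "String":
        return raw
    raise ValueError("unsupported type: %r" % type_name)


print(cast_parameter("true", "Boolean"))
print(cast_parameter("42", "Integer"))
```

A value such as "yes" for a Boolean parameter is rejected by this sketch, which corresponds to the kind of runtime failure the paragraph above warns about.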
The pre-defined pipeline variables PipelineBase and base (deprecated) are automatically made available in the GUI version of upCast (read-only) and contain the path to the current pipeline document (*.ucdoc), excluding the actual name, in URL format (including trailing slash ‘/’). It is essential to have the pipeline document saved to a file on disk so upCast can determine this property. If the path can not be determined, the current directory is returned instead (Java property user.dir).
In API mode, this value must be explicitly set before working with pipelines that contain any references to values dependent on ${pipeline:PipelineBase}. Only use the setPipelineBaseURI() API method (class UpcastEngine) for setting the value for this pipeline variable.
You can use this to make the configuration independent of its actual location in the file system by specifying paths relative to the base variable and storing any resources needed for the pipeline in subdirectories of this base URI.
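The shape of the PipelineBase value (folder of the pipeline document, URL format, trailing slash) can be sketched in Python as follows. This is an illustration only: it assumes an absolute POSIX-style path, and the function name, the document path and the resource name are hypothetical.

```python
import posixpath
from urllib.parse import quote


def pipeline_base(document_path):
    """Folder of a pipeline document as a file: URL with a trailing slash."""
    folder = posixpath.dirname(document_path)
    # quote() percent-encodes characters such as spaces but keeps "/"
    return "file://" + quote(folder).rstrip("/") + "/"


base = pipeline_base("/projects/demo/convert.ucdoc")
print(base)
# A resource stored below the pipeline document can then be addressed
# relative to the base, independent of where the folder is moved to.
print(base + "xslt/cleanup.xsl")
```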
For distributing a configuration, we recommend to put it at the root of a folder with required resources in sub-folders according to the following layout:

Name | PipelineBase |
Java symbol | |
Type | String |
Value | absolute path to folder in which the current configuration file is located (if loaded via the GUI; automatically set) or the folder of a configuration distribution root in URL format |
This variable holds the full URI (as a file:-URL) of the pipeline document (*.ucdoc) implementing the current pipeline.
Name | PipelineURI |
Java symbol | |
Type | String |
Value | absolute path to folder in which the current configuration file is located (if loaded via the GUI; automatically set) or the folder of a configuration distribution root in URL format |
This variable holds the path to the current parameter set document (*.ucpar), excluding the actual name, in URL format (including trailing slash ‘/’). For regular pipeline documents, the contents of this variable is the same as that of PipelineBase.
Name | ParamBase |
Java symbol | |
Type | String |
Value | absolute path to folder in which the current parameter set file is located (if loaded via the GUI; automatically set) in URL format |
This variable holds the full URI (as a file:-URL) of the current parameter set document (*.ucpar). For regular pipeline documents, the contents of this variable is the same as that of PipelineURI.
This definition of the contents of the ParamURI pipeline variable can be used in UPL to determine whether the currently running application is run directly from a pipeline document (ucdoc) or via a parameter set (ucpar) with code like the following:
#namespace pipeline "";
...
if( ends-with( $pipeline:ParamURI, "ucpar" ) ) {
    /* we're running from a parameter set document */
} else {
    /* we're running from a regular pipeline document */
}
Name | ParamURI |
Java symbol | |
Type | String |
Value | absolute path to folder in which the current configuration file is located (if loaded via the GUI; automatically set) or the folder of a configuration distribution root in URL format |
This variable holds a UUID string identifying this particular running instance of a pipeline.
Identifying a certain pipeline object instance is necessary in some upCast XSLT extension functions which need to retrieve information from the pipeline object that is running the transformation. The value in this variable is used for these identification purposes and must be passed as a stylesheet parameter when needed there.
Name | PipelineInstanceId |
Java symbol | |
Type | String |
Value | |
Click the Pipeline Settings… button in the pipeline window to access the window for setting pipeline-level settings.
To make the settings active, click Close or the window’s close button.
Many of these settings allow you to override the settings made in the upCast preferences.
When you use the upCast GUI as the prototyping and testing environment for your pipeline development, but intend to export the pipeline later as Java source code or an Ant task, we recommend overriding the global settings with pipeline-specific settings. This gives you consistent output for a specific pipeline document instead of making it depend on the global application preferences at the time of export.
Access the various settings by choosing the respective tab:
Here, you can set up a description of parameters you want your pipeline to be dependent on. The information provided here is used in three ways:
to create a simplified view and data entry UI objects for the user of a pipeline, where you want to hide the details of the implementation (i.e. the kind and order of modules used, calculations etc.),
to define the parameters a pipeline accepts and requires to be able to run from the commandline or via the Java API functions, including the ability to check those parameters for legal values, and
to provide documentation for the semantics of a parameter, which is shown as help tags in the UI, as text on the commandline, and formatted as an HTML document when generating the pipeline documentation.
This is a convenient feature to distribute complete, parameterized pipeline solutions to your customers in an easy-to-use, packaged way. All they need to do is open the pipeline, supply the requested parameters, and click the Run button. They are therefore completely shielded from the (possibly many) modules building up the pipeline and their complexity.
Interface element and parameter definitions
The description code you provide here serves two purposes:
It is the basis for determining the number and name of pipeline parameters.
It specifies the kind of form display element for each of these parameters.
Basically, you specify the name of the pipeline variable you wish to have set to the specified pipeline parameter’s value. This value is supplied as initial, pre-set pipeline variable to your pipeline definition.
The pipeline parameters are only set when the GUI is in Simple View mode. When in full editing mode, the pipeline is executed with a completely clean set of pipeline variables (except for the base variable) – unless you check the Set specified parameter defaults when running a pipeline in edit view option (see above). In the latter case, the default values for those parameters that have a default specified are set.
Before the defining code is interpreted, upCast resolves any contained variable references for the following realms and in that order:
include
javaproperty
application
You cannot (for obvious reasons) access variables in the pipeline or module realm.
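Why the resolution order matters can be seen in the following Python sketch, which resolves references realm by realm. It is a simplified model and not upCast's resolver: the ${realm:name} pattern handling, the include key common.def, the Java property value and the resource path are assumptions for illustration.

```python
import re

REFERENCE = re.compile(r"\$\{(\w+):([^}]+)\}")


def resolve_definition_code(code, realms):
    """Resolve ${realm:name} references realm by realm, in the given order."""
    for realm, values in realms:
        def substitute(match, realm=realm, values=values):
            if match.group(1) != realm:
                return match.group(0)  # other realm: left for a later pass
            return values.get(match.group(2), match.group(0))
        code = REFERENCE.sub(substitute, code)
    return code


realms = [
    ("include", {"common.def": 'logo { type: label; text: "${application:BundledResources}"; }'}),
    ("javaproperty", {"user.name": "alice"}),
    ("application", {"BundledResources": "/opt/upcast/resources"}),
]
code = "${include:common.def}\n// owner: ${javaproperty:user.name}; base: ${pipeline:PipelineBase}"
print(resolve_definition_code(code, realms))
```

Because include is resolved first, the application reference inside the included text is still expanded by the later application pass. The pipeline reference stays unresolved, since that realm is not available at this stage.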
You can use the include variable reference to your advantage in projects where you have to create similar pipelines that should essentially have the same Simple View definitions. To keep those in sync, you can use an external file holding the parameter definition code and include it in all pipelines that should show the same UI and have the same parameters. You then only need to update that single external file, and the UI definitions are updated automatically in all pipelines that include it.
upCast offers several types of UI elements for parameter entry: a decorating label, a text field or box, a filechooser, a popup menu and a checkbox, each one with its own set of dedicated properties.
You must assign one of these entry types to each pipeline parameter you need. The syntax for describing the properties is based on a CSS rule set: The selector part takes the form of an element selector and supplies the name of the pipeline variable to set. The declaration block part specifies the specific display and behavioural properties for that UI element.
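To make the structure of such a definition concrete, the following Python sketch splits CSS-like rule sets into a variable name and a property dictionary. It is a deliberately naive reader and not upCast's parser: it assumes straight double quotes, and it would fail on quoted values that contain a semicolon or a closing brace.

```python
import re

RULE = re.compile(r"([A-Za-z_][\w-]*)\s*\{([^}]*)\}")


def parse_parameter_definitions(code):
    """Return a list of (variable name, properties dict) in definition order."""
    parameters = []
    for name, block in RULE.findall(code):
        properties = {}
        for declaration in block.split(";"):
            if ":" not in declaration:
                continue  # whitespace after the last semicolon
            key, value = declaration.split(":", 1)  # split at the first colon only
            properties[key.strip()] = value.strip().strip('"')
        parameters.append((name, properties))
    return parameters


example = '''
headerText {
    type: text;
    label: "Header Text:";
    persistent: true;
    default: "My Publication";
    lines: 2;
}
'''
for name, properties in parse_parameter_definitions(example):
    print(name, properties)
```

The selector (headerText) becomes the name of the pipeline variable, and the declarations become the display properties of its UI element.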
Here are the properties which you can set for each of the following available types (option values are case-sensitive!):
label | |
type | label |
text | the text to display in the label |
font-family | the name of the font to use; when not specified, the system’s default label font |
font-weight | normal | bold |
font-style | normal | italic |
font-size | size of the font; when not specified, the system’s default label font size |
color | the text color; must be a CSS 2.1 color value |
background-color | the background color; must be a CSS 2.1 color value |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
when | |
locked | when |
Example 6.1.
myLabel {
    type: label;
    text: "Simple View Sample";
    font-size: 20pt;
    font-weight: bold;
    color: olive;
}
creates a label with 20pt font size, bold text and olive text color.
text | |
type | text |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
lines | the number of lines of the text field, the default is 1 |
postfix | the text to display after the text entry field; use this e.g. for displaying a value unit like “dpi” to let the user know the semantics of the number entered in the text field |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.2.
headerText {
    type: text;
    label: "Header Text:";
    persistent: true;
    default: "My Publication";
    lines: 2;
}
creates a field to input header text used in the pipeline. The pipeline variable created will be named headerText, and values the user inputs will be stored across document openings. The input field will show two lines of text, and will be pre-populated with the text “My Publication” on initial creation.
filechooser | |
type | filechooser |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
mode | open displays a chooser for opening file save displays a chooser for saving a file folder displays a chooser for picking a folder file-or-folder displays a chooser for picking a file or a folder |
format | local converts the chosen filepath in local file naming convention url converts the chosen filepath to a URL |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.3.
SourceFile {
    type: filechooser;
    label: "Source file:";
    persistent: true;
    mode: open;
    format: url;
}
creates a field with a button to call a file chooser. The pipeline variable created will be named SourceFile, and values the user inputs will be stored across document openings. The file chooser will allow the user to pick files only, and the result will be stored in URL format in the editable input field.
popup | |
type | popup |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default internal value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
values | a space- or comma-separated list of values to display in the popup and to pass into the pipeline variables |
internal-values | a space- or comma-separated list of internal values. The value set on the pipeline variable is the one from this list whose index matches the selected option from the values property’s list of displayed values. Use this to use descriptive values in the displayed popup, while still getting short enum-type values in your variable. It also allows for easy localization of displayed values without having to change internal processing. |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.4.
targetType {
    type: popup;
    label: "Target type:";
    persistent: true;
    default: "db4";
    values: "DocBook 4", "DocBook 5", "DITA";
    internal-values: "db4", "db5", "dita";
}
creates a popup with three entries, “DocBook 4”, “DocBook 5” and “DITA”. The pipeline variable created will be named targetType, and its value will be one of the values “db4”, “db5” or “dita”, and the value selection will be stored across document openings. The default value of the variable will be “db4” upon field creation.
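The index-based pairing of values and internal-values can be sketched in Python as follows. This illustrates the lookup only; the function names are hypothetical, straight double quotes are assumed, and the list splitting is a simplification of whatever upCast does internally.

```python
import re


def split_list(spec):
    """Split a space- or comma-separated list, honouring double-quoted items."""
    return [quoted or bare for quoted, bare in re.findall(r'"([^"]*)"|([^\s,]+)', spec)]


def internal_value(selected, values, internal_values=None):
    """Map the displayed popup entry to the value set on the pipeline variable."""
    shown = split_list(values)
    # Without internal-values, the displayed value itself is passed on.
    internal = split_list(internal_values) if internal_values else shown
    return internal[shown.index(selected)]


values = '"DocBook 4", "DocBook 5", "DITA"'
internal_values = '"db4", "db5", "dita"'
print(internal_value("DocBook 5", values, internal_values))
```

Keeping the internal list stable while translating only the displayed list is what allows localization without changes to the processing code.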
checkbox | |
type | checkbox |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
text | the checkbox’s label text next to the actual checkbox graphic |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.5.
includeStyle {
    type: checkbox;
    label: "Option:";
    persistent: false;
    default: true;
    text: "Include style information";
}
creates a checkbox with text “Include style information”. The pipeline variable created will be named includeStyle and will have the Boolean value true when the box is checked, false otherwise. The checkbox state will not be remembered across document openings. By default, the option is checked (=on).
list | |
type | list |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
lines | the number of lines of the text field, the default is 4 |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.6.
files {
    type: list;
    label: "List of files:";
    description: "list of files to process; enter one file path per line";
    persistent: true;
}
creates an entry field for a list of values where each single line corresponds to one list item. An empty line creates a list item consisting of the empty string.
The order in which the parameters are defined determines the display order and the order of parameters in created Java code functions.
Example 6.7.
Concatenating all individual parameter definition examples from above into one, the following Simple View for the pipeline would be created:

Name | ParameterDefinitions |
Java symbol | |
Type | String |
Value | |
Reset persistent values
This clears any currently stored persistent values in the pipeline document.
You should clear the values when you make significant changes to the parameter definition and before saving the pipeline configuration for distribution to your customers, so that they do not see the last, private settings you made during development for parameters that have persistence turned on.
Finalization
This parameter lets you specify the condition under which the pipeline signals an error to its parent, which is the application when it is a top-level pipeline, or the executing component, when it is run as a sub-pipeline (e.g. by the External Pipeline Processor).
In the case of a top-level pipeline, signalling an execution error results in an error dialog being shown (if run in the GUI) or an exception being thrown (when run via the Java API).
You can specify the cases in which a pipeline execution failure should be signalled by using several pre-defined, often used conditions, or you can specify a custom condition in UPL:
continue: pipeline execution is always reported as successful
signal on FATAL: signal a pipeline execution failure when a FATAL log message has been received during execution
signal on ERROR: signal a pipeline execution failure when a FATAL or ERROR log message has been received during execution. This is the default for new pipelines.
signal on WARN: signal a pipeline execution failure when a FATAL, ERROR or WARN log message has been received during execution
In all of the four pre-programmed finalization modes above, collected log messages from level WARN and up are forwarded to the parent (usually a pipeline object). See also the section on logging for more details.
custom finalization: this option lets you specify custom UPL function code which, by returning one of the two Id values TERMINATE or CONTINUE, signals the failure state of the pipeline
To edit the UPL code for the custom finalize() function, click the Edit finalization code… button. By returning the Id TERMINATE, you indicate that the execution of the pipeline has failed, and by returning CONTINUE you indicate that the pipeline execution succeeded.
The custom function receives an Id parameter which is TERMINATE when one of its child modules requested explicit, premature pipeline termination, CONTINUE otherwise.
Example 6.8. Finalization function template
function finalize( $childFinalizationResult as Id ) as Id {
variable $result as Id := $childFinalizationResult; // default: CONTINUE
/* Return the Id TERMINATE when you want to terminate the pipeline, CONTINUE otherwise. */
return $result;
}
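For comparison with a custom function, the decision made by the four pre-defined finalization modes described above can be modelled in a few lines of Python. This is a sketch of the documented behaviour, not upCast code; the function name and the representation of log messages as a list of level names are assumptions.

```python
SEVERITY = {"WARN": 1, "ERROR": 2, "FATAL": 3}
THRESHOLD = {"signal on WARN": 1, "signal on ERROR": 2, "signal on FATAL": 3}


def signals_failure(mode, logged_levels):
    """Return True if the pipeline run is reported as failed."""
    if mode == "continue":
        return False  # execution is always reported as successful
    # Levels below WARN (e.g. INFO, DEBUG) never trigger a failure.
    worst = max((SEVERITY.get(level, 0) for level in logged_levels), default=0)
    return worst >= THRESHOLD[mode]


print(signals_failure("signal on ERROR", ["INFO", "WARN"]))
print(signals_failure("signal on ERROR", ["WARN", "FATAL"]))
```

A custom finalize() function is needed as soon as the decision depends on something other than the highest log level received.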
Generating a custom error message
Additionally, in the custom finalization code field, you can optionally specify a second UPL function, message-text(). If this function is defined and returns a non-empty string, that string is shown to the user instead of the default message generated by upCast whenever finalize() returns TERMINATE. This allows you to generate error messages that are tailored specifically to your application and its user base.
Example 6.9. Custom message text function template
function message-text() as String {
    variable $result as String := "";
    /* Return a non-empty message string to display an error dialog or to write the error text to the log. */
    return $result;
}
Name | FinalizationMode |
Java symbol | |
Type | String |
Value | module: continue | terminate-fatal | terminate-error | terminate-warning | custom; pipeline: standard | custom |
Name | FinalizationCode |
Java symbol | |
Type | String |
Value | UPL source code |
Log filter
Sets the logging threshold for messages that the module accepts from children and produces itself (see the logging architecture description for details).
Some default filter expressions are available from the associated popup menu, however you are free to define your own customized filter expressions. The filter expression syntax is described here.
If set to inherit, the logging filter settings are governed by the application's filter settings for the external logger as set in the application's preferences.
Name | LogFilterSpec |
Java symbol | |
Type | String |
Value | inherit | OFF | FATAL | ERROR | WARN | INFO | DEBUG | VERBOSE | DETAIL | TRACE | ALL | |
Edit-Lock password
Here, you can specify a password that prevents switching off the Simple View and hence editing the pipeline from within the GUI. It does not encrypt the pipeline document itself!
To remove the lock, clear the password field. The password “__________” (10 underscores) must not be used.
The password is stored in the pipeline document as base64-encoded MD5 hash.
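The stored form of the password can be reproduced with standard library calls, as in the following Python sketch. The character encoding used before hashing (UTF-8 here) is an assumption, as is the function name; rejecting the reserved password with an exception is merely one way to model the rule stated above.

```python
import base64
import hashlib

RESERVED = "_" * 10  # the 10-underscore password must not be used


def edit_lock_hash(password):
    """Return the Base64-encoded MD5 hash of an edit-lock password."""
    if password == RESERVED:
        raise ValueError("this password is reserved and must not be used")
    digest = hashlib.md5(password.encode("utf-8")).digest()  # 16 raw bytes
    return base64.b64encode(digest).decode("ascii")          # 24 characters


print(edit_lock_hash("hello"))
```

An unsalted MD5 hash offers little protection against a determined attacker, which fits the statement that the lock only guards against editing in the GUI and does not protect the document contents.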
Name | EditLockPassword |
Java symbol | |
Type | String |
Value | |
Pipeline UID
To implement file-location independent linking from Parameter Set documents to their implementation pipeline document, each pipeline document must have a unique ID. This need not be a standard UUID if you can guarantee that these IDs will only be used in a controlled environment. In that case, we suggest using descriptive (“speaking”) IDs, which make it easier for users to manually find the respective pipeline for a given UID value. This may be necessary when a link is broken due to a misconfiguration of the ID resolver.
By default, when a pipeline document is opened that does not yet have a non-empty pipeline ID setting, upCast automatically generates a UID and sets it for that pipeline document.
Name | PipelineUUID |
Java symbol | |
Type | String |
Value | |
Required upCast build number
Enter the build number of the upCast application that this pipeline requires as a minimum to be able to run. When a user tries to run the pipeline with an application version whose build number is less than the one specified here, a dialog is shown that lets the user abort the execution of the pipeline (the default), execute it nevertheless (at the user’s own risk), or abort the execution and automatically check for a newer version of the application at the vendor site.
When you leave the field empty, no minimum requirement check is performed.
When no UI is available (e.g. when running from the commandline or via the Java API), execution is aborted and a FATAL log message with details is generated.
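The three cases described above can be summarized in a short Python sketch. The function name, the returned action labels and the build numbers in the example are hypothetical; only the decision logic is taken from the description.

```python
def check_build_number(required, running, has_ui):
    """Return the action to take before running a pipeline."""
    if required is None:
        return "run"  # empty field: no minimum requirement check
    if running >= required:
        return "run"
    if has_ui:
        return "ask-user"  # dialog: abort (default), run anyway, or check for updates
    return "abort-with-fatal-log"  # commandline or Java API


print(check_build_number(7120, 7100, has_ui=True))
print(check_build_number(7120, 7100, has_ui=False))
```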
Name | RequiredBuildNumber |
Java symbol | |
Type | Integer |
Value | |
When checked, the catalogs set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the Catalog files setting as specified will be used.
Name | UseGlobalCatalogs |
Java symbol | |
Type | Bool |
Value | |
Catalog files
To add a catalog file, choose Add Catalog… and select the catalog file to add from the file system. Each line in the field specifies a catalog location. The new catalog will be available to all modules immediately after closing the preferences window.
When the catalog resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.
When you hold down the Alt key while clicking Add Catalog…, upCast tries to generate a pipeline base URI-relative path, even when the location is outside the directory subtree under the pipeline base URI.
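The difference between the default behaviour and the Alt-key behaviour can be sketched in Python as follows. The sketch assumes POSIX-style paths and assumes that a forced relative path uses ../ segments; the text above only says that upCast tries to generate such a path. The function name and the paths are hypothetical.

```python
import posixpath


def to_pipeline_relative(location, base_dir, force=False):
    """Express location relative to the pipeline base where appropriate."""
    relative = posixpath.relpath(location, base_dir)
    outside = relative == ".." or relative.startswith("../")
    if outside and not force:
        return location  # default: locations outside the base stay absolute
    # The base variable already ends with a slash, so none is added here.
    return "${pipeline:base}" + relative


base_dir = "/projects/demo"
print(to_pipeline_relative("/projects/demo/catalogs/dtd.cat", base_dir))
print(to_pipeline_relative("/projects/shared/dtd.cat", base_dir))
print(to_pipeline_relative("/projects/shared/dtd.cat", base_dir, force=True))
```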
To remove a catalog, delete the text line containing its location specification.
Catalogs are considered in the order displayed.
Name | Catalogs |
Java symbol | |
Type | String |
Value | one path to a catalog per line as string |
Inherit from parent
When checked, the font configuration set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the Font configuration setting as specified will be used.
Name | UseGlobalFontConfig |
Java symbol | |
Type | Bool |
Value | |
Font configuration
Specify the source code for the stdfonts.config override that should be used for this pipeline.
Name | FontConfiguration |
Java symbol | |
Type | String |
Value | font configuration code |
Inherit from parent
When checked, the custom encoding setting set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the Custom Encodings setting as specified will be used.
Name | UseGlobalEncodings |
Java symbol | |
Type | Bool |
Value | |
Custom Encodings
To add a custom encoding file, choose Add Encoding… and select the custom encoding file (*.encoding) to add from the file system. Each line in the field specifies a custom encoding location. When the custom encoding resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.
To add a folder where upCast should pick all contained custom encoding files, choose Add Encodings Folder… and select it. When the folder resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.
When you hold down the Alt key while clicking Add Encoding… or Add Encodings Folder…, upCast tries to generate a pipeline base URI-relative path, even when the location is outside the directory subtree under the pipeline base URI.
To remove a custom encoding entry, delete the text line containing its location specification.
Name | CustomEncodings |
Java symbol | |
Type | String |
Value | paths to custom encodings (either to an individual custom encoding file or to a folder containing *.encoding files), with one entry per line |
Specify the fully qualified class name where the Java source code export option should place the generated code. Any required nested package folders will be created automatically by the File > Export to Java Source… function.
Name | ExportJavaClass |
Java symbol | |
Type | String |
Value | |
Source root folder
Specify the absolute path to the source root folder, i.e. the root of the Java package hierarchy subdirectories. You can use the ${pipeline:base} variable as the first component of the path to specify the source root relative to the pipeline base URI.
When this field is left empty, upCast will ask for the source root folder every time you call the File > Export to Java Source… function. When this field is non-empty, that value will be used silently when calling the Java source export function.
With the Choose… button, you can request a file chooser to pick the Java source root folder. When this is a subdirectory of the pipeline base URI, the path is automatically made relative to it.
When you press the Alt key while clicking Choose…, upCast tries to always make the path relative, even if it is not in the subtree under the pipeline base URI.
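How the fully qualified class name and the source root combine into a file location follows the usual Java convention of one folder per package segment. The Python sketch below shows that mapping; the function name, the class name and the source root are hypothetical, and the exact file naming used by the export function is assumed to follow the standard convention.

```python
import posixpath


def java_source_file(source_root, class_name):
    """Return the source file path for a fully qualified Java class name."""
    *packages, simple_name = class_name.split(".")
    # Each package segment becomes one nested folder below the source root.
    return posixpath.join(source_root, *packages, simple_name + ".java")


print(java_source_file("/work/src", "com.example.pipelines.Convert"))
```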
Name | ExportJavaSourceRoot |
Java symbol | |
Type | String |
Value | |
‘upcast.jar’ location (or Ant expression)
Here you specify the path, or an expression evaluating to the path, of the upcast.jar file containing the actual Java code for the task; it is inserted into the upcast Ant task definition. If you leave this field empty, “upcast.jar” will be used in the created Ant file module when using File > Export as Ant Task….
The text you enter here is first processed by the usual upCast variable resolution mechanism. This has the advantage that you can use upCast variables for calculating the path. However, you must take care to quote the ‘$’ character (dollar sign) when you want it verbatim, e.g. to reference Ant variables.
So you could use something like
$${basedir}/tasks/upcast.jar
to keep the generated Ant file portable by referring to upcast.jar relatively from the Ant build file’s base directory.
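The effect of quoting the dollar sign can be demonstrated with a simplified Python model of the resolution step. It is not upCast's resolver: the ${realm:name} matching, the treatment of unknown references (left untouched) and the resource path are assumptions made for illustration.

```python
import re

TOKEN = re.compile(r"\$\$|\$\{(\w+):([^}]+)\}")


def resolve_upcast_variables(text, variables):
    """Expand ${realm:name} references and turn a quoted $$ into a literal $."""
    def substitute(match):
        if match.group(0) == "$$":
            return "$"  # quoted dollar sign: written verbatim to the build file
        key = (match.group(1), match.group(2))
        return variables.get(key, match.group(0))
    return TOKEN.sub(substitute, text)


variables = {("application", "BundledResources"): "/opt/upcast/resources"}
print(resolve_upcast_variables("$${basedir}/tasks/upcast.jar", variables))
print(resolve_upcast_variables("${application:BundledResources}/upcast.jar", variables))
```

The first call yields a string that still contains the Ant property reference, so Ant can expand it when the build file runs, while the second is fully expanded at export time.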
Name | ExportAntJarLocation |
Java symbol | |
Type | String |
Value | |
literal Ant value for task’s ‘basedir’ attribute
Here you specify the literal code to be used for the upcast task’s basedir attribute in the generated target. This is useful to calculate the pipeline base URI relative to some Ant property and thus make the generated build module position independent. upCast variables are resolved as usual before writing the resulting text to the build file.
Example 6.10.
To calculate the pipeline base URI to be used by the task relative to the position of the build file, you may want to use a setting like
$${basedir}/MyPipelineRoot/
Note how you must quote the ‘$’ character (dollar sign) to avoid upCast trying to treat it as an upCast variable and expand it.
Name | ExportAntBasedir |
Java symbol | |
Type | String |
Value | |
literal Ant code for <source> selection
Here you specify the literal Ant source XML code to be inserted into the generated target code for selecting the source file(s) to be used.
With the Add source… button, you can generate code for a single source file.
When holding down the Meta (Mac OS X: Command) key while clicking the Add source… button, you can generate code for all files in the selected folder. A commented-out line for filtering based on extension is automatically generated, which you can uncomment and fill in as desired.
For both cases, when additionally holding down the Alt key, the reference generated will be relative to the literal value specified in the literal Ant value for task’s ‘basedir’ attribute field. For this, a special local variable ${taskbase} is used, which gets replaced by the resolved contents of the literal Ant value for task’s ‘basedir’ attribute parameter.
For the syntax used for source specification, see the description of the upCast Ant task.
upCast variables are resolved as usual before writing the resulting text to the build file, including the resolution of the special ${taskbase} variable as the last resolution step.
Name | ExportAntSourceCode |
Java symbol | |
Type | String |
Value | |
When checked, the license set in the nearest pipeline ancestor in the call chain is used or, if this is the top-level pipeline, the license set in the upCast application preferences. When unchecked, the License location setting as specified is used.
Name | UseGlobalLicense |
Java symbol | |
Type | Bool |
Value | |
License location
To set the license to be used for running this pipeline, click Choose license file… and select the license file (*.uclicense) to be used.
When the license file resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:PipelineBase} variable.
The license details of the active license are displayed in the fields below for your reference.
Name | LicenseFile |
Java symbol | |
Type | String |
Value | path to license file |
Each module type has its own, dedicated set of parameters to control its behavior. A few parameters are shared by all modules, both in name and semantics. These are listed explicitly below. However, all other parameter names are to be interpreted with the context of the module’s functionality in mind to infer their meaning.
Internally, parameters of modules are dynamically, weakly typed, though each parameter has a recommended or even required (by definition) type.
Parameters will be described using the following typography:
Name | DeleteEmpties |
Java symbol | |
Type | Boolean |
Value | false, true |
Name gives the internal java.lang.String name of the parameter, which is used for storing in preferences files, and can be used in the Java API as String. However, in Java, the use of the Java symbol is highly recommended instead.
Java symbol names the Java constant definition for the parameter’s Name. All constants are defined in the class de.infinityloop.common.Params.
Type specifies the recommended Java type to use when programming against the API. In the GUI version, this is taken care of automatically. When using alternative interfaces like an Ant task, which only allows passing arguments as character strings, upCast tries to perform an appropriate cast. You should therefore make sure that the data you provide in these cases can be cast to the specified type, as otherwise the conversion will fail or produce incorrect results at runtime.
Value describes possible value ranges, supported keywords or other specifics about that parameter’s range.
The following parameters are available on all modules:
Active checkbox
When the “active” checkbox in the upper left corner of the module parameter pane is checked, the module is active in the pipeline.
During pipeline development, it is often useful to have several differently configured modules to switch between, or to have modules inserted in the pipeline that generate some sort of debug output. Unchecking this parameter temporarily disables a module without your having to delete it from the pipeline and insert it again later.
Deactivated modules are completely skipped during a pipeline run and impose only minimal overhead – actually, it’s just writing a line to the log file.
Name | ModuleEnabled |
Java symbol | |
Type | Bool |
Value | true | false |
Name
Here, you can assign a meaningful name to a module instance. By default, modules’ names are their type, like “XSLT Processor” or “RTF Importer”. However, when you have e.g. several XSLT processors in your pipeline, it is desirable to use more meaningful names, like “strip namespaces XSLT” or “TEI conversion transformation”.
Name | InstanceNameUser |
Java symbol | |
Type | String |
Value | an arbitrary string |
Export
When checked, this module is handled (exported) in a File > Export… function.
You can use this to set up a single upCast pipeline document in such a way that for export to Java code or an Ant task, only certain modules will be exported. This lets you use some module instances for debugging in the UI, which then won’t be part of an exported pipeline representation.
For the Export as XML… function, module elements will have an additional attribute export with value true or false, respectively. This allows any custom post-processing of that pipeline export format to decide whether to handle the module in a special way (for example, discarding it completely, as the built-in Ant and Java source export options do).
Name | ModuleExported |
Java symbol | |
Type | Bool |
Value | true | false |
Initialization
This parameter lets you programmatically set module parameters as well as (dynamically) prevent running the module even when its active checkbox is checked. For this, you can write a custom UPL function initialize() by clicking on the Edit initialization code… button.
The text on the Edit initialization code… button will be bold when a custom initialization function has been defined (and therefore the code field is not empty). This lets you quickly see if a module defines a custom initialization function without having to open the code entry dialog.
If the button text is plain, the code field is empty and the module will always be executed.
If you always want to run the module unconditionally, make sure the code field is empty. This also lets you see at a glance in the UI whether a custom function is defined, and it protects you against possible future signature changes, and therefore code incompatibilities, in the initialize() function when you do not actually use its features.
If the initialize() function returns EXECUTE (which is the default), the module’s action is performed.
If the initialize() function returns SKIP, the module’s action is not performed and the subsequent module in the pipeline (if there is one) is run.
If the initialize() function returns TERMINATE, the module’s action is not performed and additionally, further pipeline execution is aborted.
In the initialize() function’s body, you can run arbitrary UPL code. This code is run just before actually performing the module’s functionality. This function hook’s main intent is to give you the possibility to programmatically and dynamically set module parameters’ values based on e.g. pipeline variable values (which in turn may have been set through the Simple View or by an external parameter passed to the pipeline). This way, you can set a parameter that does not allow you to have variable references expanded, like popups or check boxes. Additionally, this function serves as a dynamically evaluated condition specifying whether to run the module or not (in contrast to the module’s static Active checkbox).
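The effect of the three return values on a pipeline run can be sketched as follows. This is an illustrative model only, not upCast's implementation: the class, the ModuleStub type and the run() method are hypothetical, and each stub simply carries the value its initialize() hook would return.

```java
import java.util.ArrayList;
import java.util.List;

class InitializeFlowSketch {
    enum InitResult { EXECUTE, SKIP, TERMINATE }

    // Hypothetical stand-in for a module: a name plus the result of its initialize() hook.
    static class ModuleStub {
        final String name;
        final InitResult init;

        ModuleStub(String name, InitResult init) {
            this.name = name;
            this.init = init;
        }
    }

    // Returns the names of the modules whose action was actually performed.
    static List<String> run(List<ModuleStub> pipeline) {
        List<String> performed = new ArrayList<String>();
        for (ModuleStub m : pipeline) {
            if (m.init == InitResult.TERMINATE) {
                break;                  // action not performed, rest of the pipeline aborted
            }
            if (m.init == InitResult.SKIP) {
                continue;               // action not performed, next module is run
            }
            performed.add(m.name);      // EXECUTE: the module's action is performed
        }
        return performed;
    }
}
```

In this model, a pipeline of five modules returning EXECUTE, SKIP, EXECUTE, TERMINATE and EXECUTE performs only the first and third module.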
Example 7.1.
Assuming you are offering your users the choice between the HTML and CALS table model by way of a pipeline parameter tableType (e.g. in the Simple View), the following code sets the corresponding module parameter TableModel dynamically in the XML Export module. This would not be otherwise possible via that module’s UI since for the selection, a popup is used which has no way to calculate its value based on pipeline variables.
The code assumes that the pipeline parameter tableType can have one of two values: html or cals.
#namespace module "http://www.infinity-loop.de/namespace/upcast-realm/module";
#namespace pipeline "http://www.infinity-loop.de/namespace/upcast-realm/pipeline";
function initialize() as Id {
$module:TableModel := $pipeline:tableType;
return EXECUTE; /* run the module */
}
Name | InitializationCode |
Java symbol | |
Type | String |
Value | UPL source code |
Finalization
This parameter lets you specify the condition under which further pipeline execution should be cancelled after running this module.
This parameter will only be evaluated (and therefore have any effect) if the module action was actually performed, or in other words: if initialize() did not prevent the execution of the module’s action by returning TERMINATE or SKIP.
Normally, pipeline execution continues with the following defined modules even if in the current one there was a warning or error. These messages are collected and then displayed in the final pipeline execution error dialog. However, sometimes this is not a desired behaviour. Specifically, when subsequent modules rely on the proper execution of their predecessors to produce usable or correct results or – even more importantly – to not cause harm to data integrity, it may be necessary to immediately stop further execution of the pipeline when some module produces an error.
You can specify the termination behaviour by using several pre-defined, often used conditions, or you even can specify a custom condition in UPL:
continue: continue pipeline execution no matter what, i.e. even when an ERROR or FATAL error has occurred
signal on FATAL: terminate pipeline execution when during execution of this module, a FATAL error message has been generated
signal on ERROR: terminate pipeline execution when during execution of this module, a FATAL or ERROR error message has been generated. This is the default value for new module instances.
signal on WARN: terminate pipeline execution when during execution of this module, a FATAL, ERROR or WARN message has been generated
custom finalization: this option lets you specify custom UPL function code which, by returning one of the two Id values TERMINATE or CONTINUE, can request pipeline termination or continuation after this module.
To edit the UPL code for the custom finalize() function, click the Edit finalization code… button. By returning the Id TERMINATE, you can request pipeline termination, and by returning CONTINUE as result you can request pipeline continuation.
The custom function takes as Id parameter the termination status of its child component if there is any, CONTINUE otherwise.
Example 7.2. Finalization function template
function finalize( $childFinalizationResult as Id ) as Id {
variable $result as Id := $childFinalizationResult; // default: CONTINUE
/* Return the Id TERMINATE when you want to terminate the pipeline, CONTINUE otherwise. */
return $result;
}
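The pre-defined conditions listed above can be read as a severity threshold. The following sketch models that reading in Java. It is not part of the upCast API: the class, the Severity enum and the method are hypothetical, and it assumes that the UI options signal on FATAL, signal on ERROR and signal on WARN correspond to the FinalizationMode values terminate-fatal, terminate-error and terminate-warning.

```java
class FinalizationModeSketch {
    // Severities ordered from least to most severe; NONE means no message was generated.
    enum Severity { NONE, WARN, ERROR, FATAL }

    // Returns true when the pipeline should stop after the module, given the
    // worst message severity generated while the module was running.
    static boolean shouldTerminate(String mode, Severity worst) {
        if ("continue".equals(mode)) {
            return false;                                    // never terminate
        }
        if ("terminate-fatal".equals(mode)) {
            return worst.compareTo(Severity.FATAL) >= 0;     // FATAL only
        }
        if ("terminate-error".equals(mode)) {
            return worst.compareTo(Severity.ERROR) >= 0;     // ERROR or FATAL
        }
        if ("terminate-warning".equals(mode)) {
            return worst.compareTo(Severity.WARN) >= 0;      // WARN, ERROR or FATAL
        }
        throw new IllegalArgumentException("mode not covered by this sketch: " + mode);
    }
}
```

The custom mode is deliberately left out, since its result comes from the finalize() function rather than from a fixed threshold.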
Name | FinalizationMode |
Java symbol | |
Type | String |
Value | module: continue | terminate-fatal | terminate-error | terminate-warning | custom; pipeline: standard | custom |

Name | FinalizationCode |
Java symbol | |
Type | String |
Value | UPL source code |
Sets the logging threshold for messages that the module accepts from children and produces itself (see the logging architecture description for details).
Some default filter expressions are available from the associated popup menu, however you are free to define your own customized filter expressions. The filter expression syntax is described here.
If set to inherit, the logging filter settings are governed by the module’s execution parent’s settings (that is usually the pipeline it is contained in).
Name | LogFilterSpec |
Java symbol | |
Type | String |
Value | inherit | OFF | FATAL | ERROR | WARN | INFO | DEBUG | VERBOSE | DETAIL | TRACE | ALL | |
This section describes the available modules in more detail, listing available parameters. Filter type identifiers are given in square brackets after the UI module name.
This module allows you to set some commonly used global variables easily for re-use in subsequent modules. It is therefore most useful as the first module in a pipeline.
You can set the global variables pipeline:SourceFile, pipeline:TemporaryItemsFolder, pipeline:DestinationFolder, pipeline:ImageDestinationFolder and pipeline:DebugFolder.
The effect of using this module in API mode is the same as using UpcastEngine.setPipelineVariable().
All parameters have a type of java.lang.String.
When a field of the pre-defined parameters is left empty, that parameter is not set at all. This allows you to place a module of this type somewhere in the middle of a pipeline and have it set or override only certain parameters (either custom parameters or selected pre-defined ones). All parameters with empty values in the list of pre-defined entry fields keep their previously assigned values (or are not created).
This also means that if you want to assign the empty string to some parameter, you can only do so by specifying it in the Custom pipeline variables field.
Custom pipeline variables
Here, you can specify additional global values for use in subsequent modules. The definitions herein are processed after the fixed global parameters described above are evaluated and set, so you can refer to them using the usual ${pipeline:…} variable reference. A parameter definition must follow this syntax:
varname ':=' '"' value '"' ';'
Quotes within the variable value must themselves be quoted using the backslash character ‘\’.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now, any further references to realms other than include are resolved separately for each value part of an assignment.
This algorithm covers the usual cases where you might want to include constant assignment code shared by several pipelines using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
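The parsing step of this algorithm can be sketched in Java as follows. This is a minimal illustration of the assignment syntax only, not upCast's parser: the class and method names are hypothetical, variable references are not resolved, and the handling of whitespace between tokens is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PipelineVariablesSketch {
    // Parses assignments of the form  name := "value";  where quotes inside the
    // value are escaped with a backslash. Returns the variables in input order.
    static Map<String, String> parse(String text) {
        Map<String, String> vars = new LinkedHashMap<String, String>();
        int i = skipWhitespace(text, 0);
        while (i < text.length()) {
            int assign = text.indexOf(":=", i);
            if (assign < 0) {
                throw new IllegalArgumentException("':=' expected after position " + i);
            }
            String name = text.substring(i, assign).trim();
            i = skipWhitespace(text, assign + 2);
            if (i >= text.length() || text.charAt(i) != '"') {
                throw new IllegalArgumentException("opening quote expected for " + name);
            }
            i++;
            StringBuilder value = new StringBuilder();
            while (i < text.length() && text.charAt(i) != '"') {
                char c = text.charAt(i);
                if (c == '\\' && i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    value.append('"');      // backslash-quoted quote inside the value
                    i += 2;
                } else {
                    value.append(c);
                    i++;
                }
            }
            if (i >= text.length()) {
                throw new IllegalArgumentException("unterminated value for " + name);
            }
            i = skipWhitespace(text, i + 1);
            if (i >= text.length() || text.charAt(i) != ';') {
                throw new IllegalArgumentException("';' expected after value of " + name);
            }
            vars.put(name, value.toString());
            i = skipWhitespace(text, i + 1);
        }
        return vars;
    }

    private static int skipWhitespace(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }
}
```

The sketch also accepts an empty value between the quotes, which is the case the text above mentions for assigning the empty string.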
Name | PipelineVariables |
Java symbol | |
Type | String |
Value | (string in same syntax as in corresponding UI field) |
This module requires an appropriate RTF Importer feature included in your license to be fully functional.
This importer module handles conversion from RTF to the internal, unified upCast format, the upCast Internal DTD. With WordLink enabled, the filter also can convert Word binary files (*.doc).
The RTF importer outputs the RTF Optional Hyphen symbol (\-) as codepoint U+E003 in the Unicode Private Use Area. This allows subsequent pipeline steps to distinguish it from Soft Hyphen (U+00AD) characters entered directly in the RTF as Unicode. The distinction matters because downstream rendering engines render the two differently from the way Word displays them.
However, the Unicode Translation Map in effect in the XML Exporter module maps U+E003 to U+00AD by default. If you need or want to change the translation of RTF’s Optional Hyphen symbol to something other than the Soft Hyphen character in Unicode, you must change or override the default mapping of the source codepoint U+E003 in the XML Exporter module.
Parameters are grouped logically into tabs:
General
Source file
Specify the source file in RTF or, if WordLink is available, in Word binary format that should be imported.
Name | SourceFile |
Java symbol | |
Type | Object |
Value | absolute file path in URL or local file system convention |
Hoist common inline properties to parent
If enabled, any inline formatting CSS property that has the same value over all children of a paragraph-level element will be hoisted to the parent element as a style override. Effectively, this makes use of CSS inheritance and optimizes the output by specifying that particular property only once on the parent instead of on each of its child elements.
Name | HoistCommonInlines |
Java symbol | |
Type | Bool |
Value | true | false |
Remove empty inlines
If enabled, any inline style specifications that do not contain any #PCDATA or similar, visually rendered content, are discarded from the document.
The default for this parameter is off based on the assumption that you may want to keep e.g. formatting information for empty cells so that a user may later fill in text and has the correct, originally intended formatting information available at that document location.
Name | RemoveEmptyInlines |
Java symbol | |
Type | Bool |
Value | true | false |
Allow ‘class’ and ‘style’ attributes simultaneously on <inline> elements
When on, this option allows both a class and a style attribute to be present on the same element. Otherwise, the two are separated and an anonymous inline element is created for the style attribute instead.
Option checked:
This is <uci:inline uci:class="slang" uci:style="color: blue;">True Blue</uci:inline>.
Option unchecked:
This is <uci:inline uci:class="slang"><uci:inline uci:style="color: blue;">True Blue</uci:inline></uci:inline>.
You might want to use this option to have named Word styles always separated out in a dedicated element so that additional override styles can be recognized quickly by the additional inline element.
Name | CombineWithLogicalStyle |
Java symbol | |
Type | Bool |
Value | true | false |
Apply list structuring heuristics
If checked, special list structure detection algorithms are performed to create the best logically structured XML output. If unchecked, Word’s internal list IDs are used to track where a list starts and ends and where a new one begins, which may (based on the editing history of a particular list) have virtually no resemblance to what you are actually seeing in the layout.
The default value is on.
Name | ApplyListHeuristics |
Java symbol | |
Type | Bool |
Value | true | false |
Markup revision tracking using <inserted> and <deleted>
When this is checked, document revisions are marked up in the result using the inserted and deleted elements.
If this is off, only the result of the revisions will be exported, i.e. inserted content remains in the document and deleted content is removed.
Name | RevisionTracking |
Java symbol | |
Type | Bool |
Value | true | false |
Use CSS for forced pagebreaks (where possible)
When checked, the importer tries to use CSS code for specifying forced pagebreaks wherever possible by using the page-break-before: always property/value combination.
If this is off, a pagebreak element will always be used.
Name | UseCSSForPagebreaks |
Java symbol | |
Type | Bool |
Value | true | false |
Default font size
Some RTF documents do not specify a default font size for their text content, but rely on the default of the rendering application (like Microsoft Word). This parameter lets you set the default font size for such documents.
Microsoft Word applications up to and including Word 97 used a default value of 10pt, Word 2000 and later use a default of 12pt. When you set this parameter to * (i.e. automatic), upCast tries to guess from the RTF symbols it finds in the document whether it is a Word 2000 (or later) document and then will use 12pt as default font size, 10pt otherwise.
Name | DefaultFontSize |
Java symbol | |
Type | String |
Value | '*' | 1..999 |
Literal pass-through styles
If checked, you can specify, by their exact names, a set of (Word) styles whose content should be treated as literal, separately for the paragraph style and the character style category. This means that all text in the document set using these styles will be written to the output without any interpretation by upCast. This lets you write e.g. XHTML or XML code directly within your document the way it should appear at that location in the output.
Name | LiteralProcessing |
Java symbol | |
Type | Bool |
Value | true | false |
Paragraph style names
When Literal pass-through styles is on, specify here the list of paragraph styles that should be treated as literal content indicators. Enclose the names of the styles in double-quotes, separate styles by a space character.
Name | LiteralParStyle |
Java symbol | |
Type | String |
Value | style name |
Character style names
When Literal pass-through styles is on, specify here the list of character styles that should be treated as literal content indicators. Enclose the names of the styles in double-quotes, separate styles by a space character.
Name | LiteralCharStyle |
Java symbol | |
Type | String |
Value | style name |
Images
Include images
When checked, images contained in the document are processed as configured by the image processing parameters. If unchecked, all images of the source document will be completely discarded from the document.
Name | IncludeImages |
Java symbol | |
Type | Bool |
Value | true | false |
Temporary images folder
This is the location where images in a read document will be temporarily stored while the pipeline is processed. It is e.g. the responsibility of an exporter to copy images intended to be permanently saved across a pipeline run to a different location.
A pipeline keeps track of temporary images created in the above location. After finishing a pipeline run, all these recorded temporary files are automatically deleted.
Name | TemporaryItemsFolder |
Java symbol | |
Type | String |
Value | path to temporary items folder |
Use inline copies instead of references
When this option is checked, for images that have been included in the RTF document using both methods, by reference and by embedding, the module will try to use the embedded substitute representation. This option essentially breaks the link to the original image file, if a substitute representation has been embedded in the RTF file, and instead links to the embedded representation of the original file.
When an image has only been linked and no substitute representation is available in the RTF, however, the original link to the image is preserved and used.
Name | InlineReferencedImages |
Java symbol | |
Type | Bool |
Value | true | false |
Incoming images default resolution
This parameter determines the image resolution in dpi (dots per inch) to use for embedded images that do not specify their resolution explicitly. This is true for all (originally) GIF images and some variants of JPEG and PNG images.
Without any dpi information, the RTF importer (and, as a matter of fact, even Word) cannot determine the absolute size of images, which is necessary to create a fully specified export file. This parameter is then used to establish a default dpi value and corresponds roughly to Word’s Web Options > Image resolution setting.
When setting this to the default ‘*’ value, the RTF importer determines the absolute size of the image from the image properties in the RTF document (if available) and modifies the embedded image data by adding the resolution determined from the (absolute size/number of pixels)-pair to the externalized image. This ensures that subsequent processors can correctly determine absolute sizes and scale any images accordingly.
If you have control over the original document generation process and especially image creation, make sure that each image you add to a Word or RTF document contains explicit resolution information, as this avoids all sorts of platform incompatibilities.
This rule especially forbids importing GIF images as the GIF format does not include resolution information. However, also several Clip Art images in JPEG and PNG format do not contain this desirable information, with displayed image size in a document becoming dependent on platform, Word version or setting of the Web Options > Image resolution parameter – which is generally undesirable.
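The relation between pixel count, absolute size and resolution that this parameter relies on is plain arithmetic. The following sketch spells it out. The class and method names are hypothetical, and it is not a description of how the importer computes these values internally.

```java
class ImageResolutionSketch {
    // Resolution implied by a pixel count and the absolute size stored in the document.
    static double dpiFromSize(int pixels, double inches) {
        return pixels / inches;
    }

    // Absolute size implied by a pixel count and a default resolution.
    static double inchesFromDpi(int pixels, int dpi) {
        return pixels / (double) dpi;
    }
}
```

For example, an image that is 300 pixels wide and stored with an absolute width of 2 inches implies 150 dpi, whereas a 192 pixel wide image without size information is 2 inches wide at a default of 96 dpi.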
Outgoing images rendering resolution
This value affects the WMF to pixmap renderer built into the RTF Importer. This means that WMF (or EMF) images will be rendered into a pixmap with pixel dimensions for width and height that correspond to this value.
The default value is 96 dpi (used e.g. by Microsoft’s Internet Explorer™). You may want to change this when outputting for Netscape Navigator 4.7 on the Mac, which by default displays at 72 dpi and therefore would downscale images written using 96 dpi resolution.
Suppose you have a WMF image in your document that is 2 by 1 inches in size. With 96 dpi output resolution, this will yield a pixmap of size 192 by 96 pixels.
However, if you set the output resolution to only 72 dpi, the resulting pixmap will be 144 by 72 pixels in size.
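The figures in this example follow from multiplying the image size in inches by the rendering resolution. A minimal sketch of that calculation is shown below. The class and method names are hypothetical, and rounding to whole pixels is an assumption rather than documented renderer behaviour.

```java
class RenderingSizeSketch {
    // Pixel extent of one dimension: physical size in inches times dots per inch,
    // rounded to whole pixels (the rounding rule is an assumption).
    static int pixels(double inches, int dpi) {
        return (int) Math.round(inches * dpi);
    }

    public static void main(String[] args) {
        // The 2 by 1 inch WMF image from the text, at 96 dpi and at 72 dpi.
        System.out.println(pixels(2.0, 96) + " x " + pixels(1.0, 96));
        System.out.println(pixels(2.0, 72) + " x " + pixels(1.0, 72));
    }
}
```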
Name | ImageRenderingResolution |
Java symbol | |
Type | Integer |
Value | 20..360 |
Export embedded images of type…
While exporting embedded images, you have the option to convert them to a different format.
The RTF Importer includes a custom WMF to pixmap renderer fully programmed in Java. It is neither intended nor recommended for production quality image conversion! To perform high-quality image conversion, we strongly encourage you to consider specialized third-party products. Nevertheless, the built-in renderer is useful and intended for producing draft image renderings for viewing in a web browser or creating documents for editorial review and should perform well enough for most purposes except final publishing.
Embedded images in an RTF document can be of several image format types: WMF, EMF, JPEG, PNG and Macintosh PICT. The RTF importer lets you specify a handling method for each of these formats, so you can e.g. use already pixel based images like JPEG or PNG unchanged while rendering vector formats like WMF into a pixel-based representation.
The following handling methods are available (some of which are not applicable to all source formats):
(no change): Export the embedded image as binary data without any modification applied.
external cmd: Export the embedded image as binary data without any modification applied, and then run the specified external command on it for further processing. (See below for details.)
*remove*: The image will be completely removed from the document.
JPEG: The image will be converted into JPEG format, using the built-in WMF to pixmap renderer if necessary. Clicking Options… lets you set the JPEG compression quality.
PNG: The image will be converted into PNG format, using the built-in WMF to pixmap renderer if necessary. Clicking Options… lets you set the PNG compression algorithm.
BMP: The image will be converted into Windows bitmap (BMP) format, using the built-in WMF to pixmap renderer if necessary.
PICT: The image will be converted into Macintosh PICT format, using the built-in WMF to pixmap renderer if necessary. Note that only the image map operator is supported. The RTF importer will not translate WMF vector operators into native PICT operators.
When using the option external cmd, two additional parameters can be set:
File extension: This field should receive the destination file extension of the image file as it is after the external conversion. For example, if you want to convert a WMF file to TIFF, the extension should be tif or tiff.
Command: This is the external command to execute for converting the image source file to the desired target format. You must use placeholders for the source and destination file name using the upCast variable syntax. The variables to use are:
${imgsrc#local} | the image source file in local file name convention |
| the image source file in URL format |
${imgdest#local} | the destination file name in local file name convention |
| the destination file name in URL format |
This works as follows: The file to be converted is available at the location in imgsrc#local. The RTF importer then constructs a target file name, using the source file name as basis, but setting the extension to the one specified. Since the RTF importer needs to know the final resulting filename for referring to the externally converted image in the internal document tree, but there is no way to return a string from a shell command easily (just an integer return code), it prescribes the target file name itself. This is what the variable imgdest#local is for. You must make sure that the final, processed image file is available at the location contained in that specific variable.
Example 8.1.
To convert a WMF file to JPEG, use settings like:
WMF to [external cmd:]
File extension: [jpg]
Command: [fileconverter -fmt jpeg -outfile ${imgdest#local} ${imgsrc#local}]
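How the target file name and the final command line come about can be sketched as follows. This is an illustration under assumptions, not the importer's code: the class and method names are hypothetical, only the two local-convention placeholders named in the text are handled, and quoting of paths that contain spaces is not addressed.

```java
class ExternalCommandSketch {
    // Builds the prescribed target file name: the source file name with its
    // extension replaced by the one entered in the File extension field.
    static String destinationFor(String source, String extension) {
        int separator = Math.max(source.lastIndexOf('/'), source.lastIndexOf('\\'));
        int dot = source.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is in the file name itself.
        String stem = dot > separator ? source.substring(0, dot) : source;
        return stem + "." + extension;
    }

    // Substitutes the source and destination placeholders in the command template.
    static String expand(String command, String source, String destination) {
        return command.replace("${imgsrc#local}", source)
                      .replace("${imgdest#local}", destination);
    }
}
```

With a source file /tmp/img/image1.wmf and the extension jpg, the command from Example 8.1 would expand to fileconverter -fmt jpeg -outfile /tmp/img/image1.jpg /tmp/img/image1.wmf in this sketch. The external tool must then write its result to exactly that destination path.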
Name | WMFDestFormat |
Java symbol | |
Type | String |
Value | unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |

Name | EMFDestFormat |
Java symbol | |
Type | String |
Value | unchanged | dispose | UseWMFSubstitute | ExternalCommand |

Name | JPEGDestFormat |
Java symbol | |
Type | String |
Value | unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |

Name | PNGDestFormat |
Java symbol | |
Type | String |
Value | unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |

Name | PICTDestFormat |
Java symbol | |
Type | String |
Value | unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |

Name | WMFDest.JPEG.Quality |
Java symbol | |
Type | Integer |
Value | 0..100 |

Name | WMFDest.PNG.CompressionType |
Java symbol | |
Type | String |
Value | default | fast | max | none |

Name | JPEGDest.JPEG.Quality |
Java symbol | |
Type | Integer |
Value | 0..100 |

Name | JPEGDest.PNG.CompressionType |
Java symbol | |
Type | String |
Value | default | fast | max | none |

Name | PNGDest.JPEG.Quality |
Java symbol | |
Type | Integer |
Value | 0..100 |

Name | PNGDest.PNG.CompressionType |
Java symbol | |
Type | String |
Value | default | fast | max | none |

Name | PICTDest.JPEG.Quality |
Java symbol | |
Type | Integer |
Value | 0..100 |

Name | PICTDest.PNG.CompressionType |
Java symbol | |
Type | String |
Value | default | fast | max | none |
Objects
These parameters specify how embedded objects (OLE) should be handled. The RTF importer generates an uci:object element for each embedded object it finds in the RTF. The child elements of this container object are alternative representations of the object’s data. This can be an uci:image (if available in the source document: represents the current display of that object at the time of saving the document), or an uci:ole element (if available: it contains a base64 representation of the binary data of the OLE object, which makes it possible to reconstruct it to an editable instance using the RTF exporter).
Include image representation
When checked, an image representation alternative will be added to the object element (if available in the source document).
Include binary data
When checked, an uci:ole binary data representation alternative will be added to the object element. The uci:ole element contains the base64-encoded binary data as character data.
Include MathML representation
When MathLink is available, i.e. you have Design Science’s MathType software (version 5.2) installed on your Windows system and are running upCast on that same machine, you can also embed a MathML representation of the formula for MathType OLEs in the object element as an m:math element.
Since MathLink is only available on the Windows platform, this option will only be enabled when a functioning MathLink actually is available to the application.
Name | ObjectHandling |
Java symbol | |
Type | String |
Value | image || embed || mathml (separated by whitespace if more than one) |
WordLink
Set WordLink features.
Since WordLink is only available on the Windows platform, this tab will only be displayed when WordLink actually is available to the application.
Mode
When Process .doc files only is selected, WordLink and all options specified will only be applied to Word binary (*.doc) files.
When Process all files is selected, WordLink and all options specified will be applied to any input document, i.e. even files that are in RTF format already. This lets you automatically update fields or add pagestart and linestart elements.
| Name | WordLinkMode |
| Java symbol | |
| Type | String |
| Value | doc | all |
Run macro named “il_premacro”
When checked, WordLink will first run a Word macro named il_premacro on the source document. This macro must either be defined in the respective document (when it is a Word binary .doc file) or in the global document template file (*.dot).
When this macro is not available, an error will be issued after conversion, though the further conversion process is not affected.
Update fields
When checked, WordLink will update any fields in the source document with current values: date, time, pages, …
Update from linked images
When including an image only by reference (i.e., using Word’s INCLUDEPICTURE field), the RTF importer is not able to determine the actual image size as that information is not part of RTF. By checking this option, the linked image is temporarily included into the document with the effect that image size and possibly applied scaling in the .doc Word binary file can be evaluated by the importer.
This feature offers no benefit for RTF source files, because the necessary information is already missing from them (Word cannot recover it either).
Mark up layout page breaks using <pagestart />
This inserts an empty <pagestart /> inline element at each place where, in the current layout flow, a dynamic page break would occur when rendering the document.
Mark up layout line breaks using <linestart />
This inserts an empty <linestart /> inline element at each place where, in the current layout flow, a dynamic line break would occur when rendering the document.
This is slow for documents bigger than about 100 pages. You may want to increase the Kill timeout value significantly. Also, some document structure constellations may yield wrong line break position results due to limitations in the Word application.
| Name | WordLinkCommand |
| Java symbol | |
| Type | String |
| Value | Pages || Update || Premacro || Lines || Includelinkedimages || Updatelinks (concatenate desired options without any whitespace in between) |
Kill timeout
When hitting a corrupt document, WordLink may have problems and/or hang the application. Therefore, you can set a kill timeout value after which the WordLink functions will be aborted. The default value is 300 seconds.
Killing WordLink may leave an invisible instance of Word running. In case of a timeout, please check the running processes and kill any zombie Word processes manually using the Process Viewer (Ctrl-Alt-Del on Windows 2000/XP).
| Name | WordLinkKillTimeout |
| Java symbol | |
| Type | Integer |
| Value | timeout duration in milliseconds |
Copy temporary .rtf file to debug folder as “basename-tmp.rtf”
This is mainly for debugging purposes. It copies the intermediate RTF file to the specified debug folder with a name of basename-tmp.rtf after having applied all WordLink functions. This is the file that the RTF importer itself takes as source for its actual conversion process.
| Name | WordLinkCopyToOutput |
| Java symbol | |
| Type | Bool |
| Value | true | false |
This module requires an appropriate UPL feature included in your license to be fully functional.
This module lets you run a program written in the Upcast Processing Language (UPL).
UPL code
This contains the UPL code you want to execute. The code must define a function main() as follows:
function main() as Value {
... your code goes here ...
}
The UPL Processor calls this function main() once when it runs and executes the code defined therein (or in any dependent, user-defined functions). For a detailed description of UPL, see the separate documentation, Upcast Processing Language.
The returned result of the function is stored into the pipeline variable ModuleResult.
| Name | UPLCode |
| Java symbol | |
| Type | String |
| Value | UPL source code |
UPL parameters
This contains the assignments for variables to be passed to the UPL program. For a detailed description of how UPL receives parameter values, see the separate documentation, Upcast Processing Language.
A parameter definition must follow this syntax:
paramname ':=' '"' value '"' ';'
Quotes within the parameter value must themselves be escaped using the backslash character ‘\’.
The text in this field and any contained variable references are resolved as follows:
1. Any references to the include realm are resolved.
2. The individual assignments are parsed according to the above syntax.
3. Only now are any remaining references to realms other than include resolved, separately for the value part of each parameter assignment.
This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several UPL modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
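The resolution order described above can be sketched in Python. This is only an illustration under stated assumptions, not upCast’s actual implementation: the reference pattern ${realm:name} is a simplification (it does not cover forms such as an include reference with an encoding argument), lookup is a hypothetical callback that returns the value of a variable, and the whitespace rules of the real parser are not known.

```python
import re

# Assumed shape of one assignment:  name := "value";
# Inside the value, a backslash escapes the following character (e.g. \").
ASSIGNMENT = re.compile(r'\s*([^\s:="]+)\s*:=\s*"((?:[^"\\]|\\.)*)"\s*;', re.DOTALL)

# Simplified variable reference: ${realm:name}
REFERENCE = re.compile(r'\$\{(\w+):([^}]*)\}')


def resolve(text, lookup, only_include):
    """Resolve either the include references or all other references in text."""
    def replace(match):
        realm, name = match.group(1), match.group(2)
        if (realm == "include") != only_include:
            return match.group(0)  # left untouched for the other pass
        return lookup(realm, name)
    return REFERENCE.sub(replace, text)


def parse_parameters(text, lookup):
    """Return a dict mapping parameter names to their resolved values."""
    # Step 1: include references are resolved in the complete field text.
    text = resolve(text, lookup, only_include=True)
    params = {}
    pos = 0
    # Step 2: the individual assignments are parsed.
    while text[pos:].strip():
        match = ASSIGNMENT.match(text, pos)
        if match is None:
            raise ValueError(f"malformed assignment at offset {pos}")
        # Undo the backslash escaping inside the quoted value.
        value = re.sub(r'\\(.)', r'\1', match.group(2), flags=re.DOTALL)
        # Step 3: other references are resolved per value, after parsing,
        # so quotes or semicolons in a variable's value cannot break the syntax.
        params[match.group(1)] = resolve(value, lookup, only_include=False)
        pos = match.end()
    return params
```

Because step 3 runs on each already parsed value, a variable whose content happens to contain quotes or semicolons is passed through verbatim instead of being interpreted as assignment syntax.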
| Name | UPLParameters |
| Java symbol | |
| Type | String |
| Value | (string in same format as in UI) |
This module requires an appropriate UPL feature included in your license to be fully functional.
This module differs from the UPL Processor in that it does not call a single function once, but you can define code to be run upon visiting each node of the current internal document in a depth-first traversal, depending on certain conditions you specify.
This contains the UPL code you want to execute. For a detailed description of the UPL, see the separate documentation, Upcast Processing Language.
| Name | UPLCode |
| Java symbol | |
| Type | String |
| Value | UPL source code |
This contains the assignments for variables to be passed to the UPL program. For a detailed description of how UPL receives parameter values, see the separate documentation, Upcast Processing Language.
A parameter definition must follow this syntax:
paramname ':=' '"' value '"' ';'
Quotes within the parameter value must themselves be escaped using the backslash character ‘\’.
The text in this field and any contained variable references are resolved as follows:
1. Any references to the include realm are resolved.
2. The individual assignments are parsed according to the above syntax.
3. Only now are any remaining references to realms other than include resolved, separately for the value part of each parameter assignment.
This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several UPL modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
| Name | UPLParameters |
| Java symbol | |
| Type | String |
| Value | (string in same format as in UI) |
Grouper
When turned on, the grouping algorithm is run on the internal tree. This happens before the finalize() or finalize-error() UPL method is called.
| Name | RunGrouper |
| Java symbol | |
| Type | Bool |
| Value | |
Grouping processing order
This parameter lets you set the order of the colors in which the grouping should be performed.
With all colors in alphabetical order, all colors that have been used for setting painters or tags are grouped in alphabetical order of the color name. The ordering is the same as the one the Java class java.util.TreeSet uses by default on the platform you are running upCast on.
With colors in specified order, then all remaining:, you can specify a list of colors you want to be grouped in a given order before all others in the text field below, with any remaining ones being grouped in alphabetical order (see all colors in alphabetical order) afterwards.
With only these colors in specified order, you can specify the colors and their order of grouping in the text field below. No other colors will be grouped, even if respective painters or tags have been placed on nodes in the internal document tree.
After a run of the grouper, all painters, tags and paintings of nodes are removed from the internal tree. This means that any further grouper instances running after a grouper has already run will have no effect unless new painters and tags have been placed on nodes of the internal document tree, usually using the UPL processor.
Color order is specified by listing the colors in sequence, separated by whitespace. If a color name includes whitespace (which is deprecated), the full color name must be enclosed in double quotes.
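The three ordering modes can be illustrated with a short Python sketch. It rests on assumptions that the text above does not confirm: that the parameter values alphabetic, first and only correspond to the three options in the order described, that specified colors which were never used are simply skipped, and that Python’s sorted() is an acceptable stand-in for the default TreeSet ordering. The function name grouping_order is hypothetical.

```python
import shlex


def grouping_order(mode, used_colors, color_list=""):
    """Return the colors to group, in processing order.

    mode        -- "alphabetic", "first" or "only"
    used_colors -- colors that painters or tags were actually set with
    color_list  -- whitespace-separated color names; names containing
                   whitespace are enclosed in double quotes
    """
    # shlex.split honors double quotes (it also understands single quotes
    # and backslashes, which goes beyond what the manual describes).
    specified = shlex.split(color_list)
    used = set(used_colors)
    listed_and_used = [color for color in specified if color in used]

    if mode == "alphabetic":
        return sorted(used)
    if mode == "only":
        return listed_and_used
    if mode == "first":
        remaining = sorted(used - set(specified))
        return listed_and_used + remaining
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, with the used colors green, red, blue and "dark blue" and the list red "dark blue", the first mode yields red, dark blue, blue, green.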
| Name | GroupingColorOrder |
| Java symbol | |
| Type | String |
| Value | alphabetic | only | first |
| Name | GroupingColors |
| Java symbol | |
| Type | String |
| Value | ordered list of color names, separated by whitespace; to use color names that themselves contain whitespace, surround them with double quotes |
Element Splitter
When turned on, any split actions attached to nodes in the internal tree using mark-split() will be executed. This happens after running the grouper (if enabled), but before the finalize() or finalize-error() UPL method is called.
| Name | RunSplitter |
| Java symbol | |
| Type | Bool |
| Value | |
The Sectioner module is used for creating a nested, deeper structure based specifically on elements that have a heading level set (via set-heading-level() in UPL), and uci:part elements that have the grouping property set (via set-grouping() in UPL).
Sectioning works only on the direct children of the uci:body element in the upCast Internal DTD.
If the algorithm finds a uci:part element, it checks its grouping property. If this uci:part is a grouping part, all uci:body element children between this uci:part element and the next uci:part element that has the grouping property set will be surrounded by this uci:part element.
Example 8.2. Example:
…
<part is-grouping="true"/>
<par>…</par>
<par>…</par>
<part is-grouping="false"/>
<par>…</par>
<part is-grouping="true"/>
<par>…</par>
…
will be transformed by a run of the Sectioner into
…
<part is-grouping="true">
  <par>…</par>
  <par>…</par>
  <part is-grouping="false"/>
  <par>…</par>
</part>
<part is-grouping="true">
  <par>…</par>
  …
Note that namespace prefixes/definitions have been omitted in the above for better readability.
<part> is grouping (by default)
When checked, even though you may not have specified this explicitly on each uci:part element (e.g. in UPL), all uci:part elements are treated as if they had set the grouping property by default. This mimics the behavior of pre-6.0 versions of upCast.
| Name | PartIsGrouping |
| Java symbol | |
| Type | Bool |
| Value | true | false |
Any elements that have a uci:heading-level attribute with a value greater than 0 are considered headings of the respective structure level. The Sectioner creates sections based on the heading level information on those elements by automatically creating a surrounding uci:section element, taking care to match the section nesting to the element’s heading level. This means that if there is a jump in heading level, the Sectioner will automatically generate additional, grouping uci:section elements.
When an element with the same heading level is found as the current section nesting, the current section is closed and a new one is opened at the same level.
When an element with a higher heading level than the current one is encountered, a new, nested section is created within the current section.
When an element with a lower heading level than the current one is encountered, the appropriate number of open, nested sections is closed (including the one with the same nesting level) and a new one is opened.
Here’s an example demonstrating all possible cases (assume all elements and attributes being in the uci namespace):
Example 8.3. Example: section nesting based on paragraph’s heading level
<par>…</par>
<par heading-level="1">…</par>
<par>…</par>
<par heading-level="2">…</par>
<par>…</par>
<par heading-level="4">…</par>
<par>…</par>
<par heading-level="3">…</par>
<par>…</par>
<par heading-level="1">…</par>
<par>…</par>
will result in the following structure generated:
<par>…</par>
<section level="1">
  <par heading-level="1">…</par>
  <par>…</par>
  <section level="2">
    <par heading-level="2">…</par>
    <par>…</par>
    <section level="3">
      <section level="4">
        <par heading-level="4">…</par>
        <par>…</par>
      </section>
    </section>
    <section level="3">
      <par heading-level="3">…</par>
      <par>…</par>
    </section>
  </section>
</section>
<section level="1">
  <par heading-level="1">…</par>
  <par>…</par>
</section>
Note that namespace prefixes/definitions have been omitted in the above for better readability.
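The nesting rules for heading levels can be sketched with a stack of open sections. The following Python function is a minimal illustration, not the Sectioner’s actual code: the tuple representation of sections is an assumption made for the sketch, the handling of a first heading deeper than level 1 is inferred from the rule about jumps in heading level, and the special treatment of consecutive or empty headings is left out.

```python
def sectionize(elements):
    """Nest a flat list of (name, heading_level) pairs into sections.

    A heading_level of 0 means the element is not a heading.
    Sections are returned as ("section", level, children) tuples.
    """
    root = []
    stack = [(0, root)]  # (section level, list of children), outermost first
    for name, level in elements:
        if level > 0:
            # Close all open sections at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            # Open one section per missing level, so that a jump in heading
            # level produces the additional grouping sections.
            for new_level in range(stack[-1][0] + 1, level + 1):
                children = []
                stack[-1][1].append(("section", new_level, children))
                stack.append((new_level, children))
        # The element itself becomes a child of the innermost open section.
        stack[-1][1].append(name)
    return root
```

Fed with the heading levels 0, 1, 0, 2, 0, 4, 0, 3, 0, 1, 0 of Example 8.3, the function returns the same nesting as shown above, including the additional level 3 section around the level 4 heading.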
The sectioning algorithm can be modified by two options:
Create <section> for empty headings
The default sectioning algorithm only creates a new section for the first of consecutive elements having a uci:heading-level attribute of the same value (if it is not empty).
The idea behind this option is that the user may have created a heading in Word, then hit return (not changing the style) to create visual space, and only then started writing the actual content. You certainly would not want a section of its own for each of the empty heading-styled paragraphs that merely generate visual space, but only for the first one, so section nesting generation is suppressed for the remaining heading-styled paragraphs.
If, however, you want to create section nesting corresponding to each heading-styled paragraph in a document, even if it’s empty, check this option.
| Name | GroupEmptyHeadings |
| Java symbol | |
| Type | Bool |
| Value | true | false |
Create <sectionintro> around leading section content
Sections created using the sectioning algorithm may have leading content before any subsections they may also have. Checking this option groups this leading content, up to the start of the first nested (sub-)section, in an uci:sectionintro element, e.g. for easier post-processing later with XSLT.
You can choose whether you want the uci:sectionintro element to be created in any case (always) or only when the respective uci:section actually has sub-sections (when sub-sections exist).
In this example, assume all elements and attributes being in the uci namespace:
Example 8.4. Grouping the section introduction
<par heading-level="1">…</par>
<par>…</par>
<table>…</table>
<par>…</par>
<par heading-level="2">…</par>
<par>…</par>
will be transformed to the following when Create <sectionintro> around leading section content is checked with the always option:
<section level="1">
  <sectionintro>
    <par heading-level="1">…</par>
    <par>…</par>
    <table>…</table>
    <par>…</par>
  </sectionintro>
  <section level="2">
    <sectionintro>
      <par heading-level="2">…</par>
      <par>…</par>
    </sectionintro>
  </section>
</section>
or it will be transformed to the following when Create <sectionintro> around leading section content is checked with the when sub-sections exist option:
<section level="1">
  <sectionintro>
    <par heading-level="1">…</par>
    <par>…</par>
    <table>…</table>
    <par>…</par>
  </sectionintro>
  <section level="2">
    <par heading-level="2">…</par>
    <par>…</par>
  </section>
</section>
Note that namespace prefixes/definitions have been omitted in the above for better readability.
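The difference between the two modes can be made concrete with a small Python sketch. It is an illustration only: sections are modeled as ("section", level, children) tuples, the function name add_section_intro is hypothetical, and the mapping of the UI choices to the mode names never, always and child is an assumption.

```python
def is_section(node):
    """True for ("section", level, children) tuples."""
    return isinstance(node, tuple) and node[0] == "section"


def add_section_intro(section, mode):
    """Wrap the leading content of a section in a ("sectionintro", ...) tuple.

    mode -- "never", "always", or "child" (only when sub-sections exist)
    """
    kind, level, children = section
    # Position of the first nested section, or the end if there is none.
    first_sub = next(
        (index for index, child in enumerate(children) if is_section(child)),
        len(children),
    )
    leading = children[:first_sub]
    has_subsections = first_sub < len(children)
    # Process nested sections recursively; other trailing nodes are kept.
    rest = [
        add_section_intro(child, mode) if is_section(child) else child
        for child in children[first_sub:]
    ]
    wrap = mode == "always" or (mode == "child" and has_subsections)
    if wrap and leading:
        leading = [("sectionintro", leading)]
    return (kind, level, leading + rest)
```

Applied to the structure of Example 8.4, the always mode wraps the leading content of both sections, whereas the child mode leaves the level 2 section untouched because it has no sub-sections.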
| Name | GroupSectionIntro |
| Java symbol | |
| Type | String |
| Value | never | always | child |
This module is deprecated and must no longer be used in new development of processing pipelines. It will be removed completely in a future version of upCast. Update any of your existing pipeline definitions as soon as possible by transitioning to the use of the functionally equivalent Grouper option of the UPL Tree Processor module.
The Grouper module actually performs a grouping that has been earlier specified during a run of an UPL Tree Processor.
Grouping processing order
This parameter lets you set the order of the colors in which the grouping should be performed.
With all colors in alphabetical order, all colors that have been used for setting painters or tags are grouped in alphabetical order of the color name. The ordering is the same as the one the Java class java.util.TreeSet uses.
With colors in specified order, then all remaining:, you can specify a list of colors you want to be grouped in a given order before all others in the text field below, with any remaining ones being grouped in alphabetical order (see all colors in alphabetical order) afterwards.
With only these colors in specified order, you can specify the colors and their order of grouping in the text field below. No other colors will be grouped, even if respective painters or tags have been placed on nodes in the internal document tree.
After a run of the grouper, all painters, tags and paintings of nodes are removed from the internal tree. This means that any further grouper instances running after a grouper has already run will have no effect unless new painters and tags have been placed on nodes of the internal document tree, usually using the UPL processor.
Color order is specified by listing the colors in sequence, separated by whitespace. If a color name includes whitespace (which is deprecated), it must be enclosed in double quotes.
| Name | GroupingColorOrder |
| Java symbol | |
| Type | String |
| Value | alphabetic | only | first |
| Name | GroupingColors |
| Java symbol | |
| Type | String |
| Value | ordered list of color names, separated by whitespace; to use color names that themselves contain whitespace, surround them with double quotes |
This module imports any XML document into the internal tree variable, replacing any existing document. This is useful when you want to apply some of the specific UPL functions on it and need not rely on styling info (which is currently not imported/recognized and cannot be created within upCast).
Source File
This parameter lets you choose the source XML file to import.
| Name | SourceFile |
| Java symbol | |
| Type | Object |
| Value | absolute file path in URL or local file system convention |
This module serves for serializing the internal tree to XML. It offers a choice for the table model to write (internal, HTML or CALS), debugging and pretty-printing options. It also offers choices for handling images in the document (separate for referenced/linked images and embedded images) and you can use a Unicode Translation Map.
General
Destination File
Choose the full filename into which the result should be written. You can use upCast’s variables for building the path.
| Name | DestinationFile |
| Java symbol | |
| Type | String |
| Value | absolute path to desired result file |
Output resolution
Specify the output resolution in dpi. This value is used for calculating device pixel values, e.g. in HTML tables’ cell widths or images’ sizes.
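As a worked example of how a resolution in dpi turns a physical length into device pixels, the following Python sketch converts a length given in points (1 pt = 1/72 inch). It only illustrates the arithmetic; how upCast rounds the result and which source units it converts are not stated here, and the function name is hypothetical.

```python
POINTS_PER_INCH = 72.0  # 1 pt = 1/72 inch


def to_device_pixels(length_pt, dpi):
    """Convert a length in points to device pixels at the given resolution."""
    if not 1 <= dpi <= 9999:
        raise ValueError("output resolution must be in the range 1..9999")
    return length_pt / POINTS_PER_INCH * dpi
```

A table cell that is 36 pt wide therefore corresponds to 48 pixels at 96 dpi and to 150 pixels at 300 dpi.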
| Name | OutputResolution |
| Java symbol | |
| Type | Double |
| Value | 1..9999 |
Output file encoding
Lets you specify the encoding in which the XML file will be written. If your further tool chain allows it, we strongly recommend using the default, UTF-8.
| Name | OutputEncoding |
| Java symbol | |
| Type | String |
| Value | Java encoding name |
Include generator info as comment
When checked, adds info about when and by which version of upCast the XML file was produced to that file as an XML comment. This may be useful both for infinity-loop support during troubleshooting and for you, when you need to relate some produced XML files to a certain version (in time) of your pipelines.
| Name | IncludeGeneratorInfo |
| Java symbol | |
| Type | Bool |
| Value | true | false |
Mark individual text() nodes using ‘[#[…]’ (for debugging only)
When checked, each single text node in the internal tree will be surrounded by “[#[” and “]” respectively. This allows you to better understand the internal tree and may help in diagnosing problems with any XPath queries or XSLT transformations by showing text node boundaries explicitly.
| Name | MarkTextNodes |
| Java symbol | |
| Type | Bool |
| Value | true | false |
Pretty-print output
Turns on pretty-printing the output for elements whose whitespace handling mode is known explicitly to the serializer.
| Name | PrettyPrint |
| Java symbol | |
| Type | Bool |
| Value | true | false |
Table model
This parameter lets you choose which table model to use for tables. You can either choose the native (upCast) table model, which is a very simple table > row > cell model, the HTML 4 table model, or the OASIS-EM (CALS) (OASIS XML Exchange Table Model, a subset of CALS) table model.
The HTML 4 table model uses the HTML namespace http://www.w3.org/HTML/1998/html4, the CALS table model uses the special, proprietary namespace http://www.infinity-loop.de/namespace/2006/upcast-cals.
| Name | TableModel |
| Java symbol | |
| Type | String |
| Value | HTML | CALS | native |
Style information
Lets you specify how general CSS styles for known elements and named styles for paragraphs and inline elements should be exported. Options are:
none
No style info is exported at all. This does not affect local styles on elements, which will be written in any case according to the “Explode CSS style info” setting.
internal (<style> element)
The style info is written as CSS code in the special element uci:style (in the upCast internal namespace) in the document’s uci:head element.
external (default file)
Writes a stylesheet processing instruction that points to a CSS file named basename.css in the same folder as the resulting XML file. This file can e.g. be created using the CSS Exporter module.
custom stylesheet PI
Lets you specify a custom stylesheet processing instruction, e.g. to link to a general CSS file you wish to use in all of the converted documents.
| Name | StylesheetMode |
| Java symbol | |
| Type | String |
| Value | none | internal | external | custom |
| Name | CustomStylesheetPI |
| Java symbol | |
| Type | String |
| Value | custom stylesheet PI string |
Images
During import, e.g. using the RTF importer, all references to images are made absolute and stored this way in the internal tree as follows:
Embedded images are written to disk into a temporary location and possibly a format conversion is applied. The internal tree at this point holds the absolute path to these temporary image files.
Linked (or referenced) images are stored with their absolute path to the original image; no matching files for linked images are created in the temporary image files location.
At export time, you can decide how the image location information (and possibly the actual image files) should be handled. The handling mode can be set individually for images that were embedded in the original document and (external) images that were only linked to.
Embedded Images
This parameter governs the handling of images that originally had been embedded in the source document.
remove image from Destination File
The uci:image element is completely dropped from the XML output.
copy to Image DestinationFolder (new file)
This option copies the temporary image file to the Image Destination Folder. If a file of the desired name already exists at that location, a unique file name is generated by appending -1, -2 etc. to the basename until a name is found that is not already used. A relative reference to this copy is then used in the uci:image element.
copy to Image DestinationFolder (replacing)
This option copies the temporary image file to the Image Destination Folder. If a file of the desired name already exists at that location, it is overwritten without prompting. A relative reference to this copy is then used in the uci:image element.
internal tree format (don’t copy)
This option writes the absolute path to the temporary file unchanged, as it is currently set in the internal tree. This is useful for checking what the internal tree looks like at a certain point in a module chain.
The temporary image files will be deleted automatically after a pipeline execution for a certain document. This means that when using the internal tree format (don’t copy) option, the referenced image in the generated XML will have been deleted!
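The naming scheme of the “new file” copy option (appending -1, -2 etc. to the basename until an unused name is found) can be sketched as follows. This Python function is an illustration of the scheme as described, not upCast’s implementation; its name and the treatment of the file extension are assumptions.

```python
from pathlib import Path


def unique_destination(folder, filename):
    """Return a path in folder that does not exist yet.

    If filename is already taken, -1, -2, ... is appended to the basename
    (before the extension) until an unused name is found.
    """
    folder = Path(folder)
    candidate = folder / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = folder / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```

If the folder already contains image.png and image-1.png, a further copy of image.png would be stored as image-2.png.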
| Name | EmbeddedImagesHandling |
| Java symbol | |
| Type | String |
| Value | discard | copy | copyreplace | internal |
Referenced images
This parameter governs the handling of linked (referenced) images in the original source file.
remove image from Destination File
The uci:image element is completely dropped from the XML output.
copy original to Image DestinationFolder (new file), update link
This option copies the original, linked-to image file to the Image Destination Folder. If a file of the desired name already exists at that location, a unique file name is generated by appending -1, -2 etc. to the basename until a name is found that is not already used. A relative reference to this copy is then used in the uci:image element.
If the original file is not accessible from the machine that executes the pipeline (be it due to network failure, the file not existing or some other problem), the option update link for Destination File location is used as fallback instead.
copy original to Image DestinationFolder (replacing), update link
This option copies the original, linked-to image file to the Image Destination Folder. If a file of the desired name already exists at that location, it is overwritten without prompting. A relative reference to this copy is then used in the uci:image element.
If the original file is not accessible from the machine that executes the pipeline (be it due to network failure, the file not existing or some other problem), the option update link for Destination File location is used as fallback instead.
keep verbatim link to original
This option writes the reference the same way it was found in the original source document. It may therefore be an absolute or a relative path.
Note that if the original location specification in the source RTF file was relative, but the XML file is not saved into the same folder as the source document, chances are that the link is broken.
update link for Destination File location
This option updates the reference to the original image so that it still points to that very image, even when the XML file is written to a different folder (when the original reference was relative).
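What updating a link for the Destination File location amounts to can be shown with a path computation: the reference is rewritten relative to the folder of the generated XML file. The Python sketch below assumes POSIX-style absolute paths and uses a hypothetical function name; it does not claim to reproduce the exact form of the references upCast writes.

```python
import posixpath


def updated_link(image_path, destination_file):
    """Return a reference to image_path relative to the folder that
    contains destination_file (both given as absolute POSIX-style paths)."""
    destination_folder = posixpath.dirname(destination_file)
    return posixpath.relpath(image_path, start=destination_folder)
```

For an image at /data/source/images/fig1.png and an XML file written to /data/output/doc.xml, the updated reference is ../source/images/fig1.png, which still resolves to the original image.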
| Name | LinkedImagesHandling |
| Java symbol | |
| Type | String |
| Value | discard | copy | copyreplace | keep | update |
Image Destination Folder
You can specify a separate folder dedicated for images. By default, this is set to ${module:DestinationFile#urlpath}, which evaluates to the same folder where the XML file is saved. However, if you want to put images into a separate folder, you can do this here. This is the folder where any of the above options that physically copy the image file will place the file. Any relative references to the image from within the XML file will be adjusted accordingly.
| Name | ImageDestinationFolder |
| Java symbol | |
| Type | String |
| Value | absolute path to folder |
The nodes in the internal tree may have a very rich set of attributes attached, many of which have only been useful while processing the tree within upCast, e.g. with UPL. Serializing all those attributes may create huge files, where only a fraction of the info contained will be used down the further processing chain of the document. To reduce unnecessary memory consumption and processing time, the XML Exporter offers a way to set up a filter on the attributes serialized for each internal tree node. This is achieved by using a specially formed UPL program in conjunction with the dedicated filtering function filter-attrs().
This filter can be effectively used to reduce the set of CSS properties exploded into attributes to a minimal set that you are actually interested in for further processing, e.g. in an XSLT step.
Attribute Filter
This field holds the UPL program to perform the filtering.
As in the UPL Tree-Processor, you can define several UPL rules. The selector part determines for which kind of node (and possibly more complex conditions) the attribute filter applies. This lets you filter attributes differently on different elements.
The action part is applied when the selector matches. Although you can theoretically use the complete range of UPL functionality on such a node, many changes to the node will not be picked up by the serializer (except for changes in the node’s attributes), so we recommend against using this UPL program for anything other than filtering attributes.
It is important to understand what the context node supplied to the UPL program looks like:
The context node supplied to the UPL program is a temporary, newly created, artificial, single node. It lives by itself and has neither a parent, nor siblings, nor children. It is neither the node in the context of its later serialization nor the actual node of the internal tree to be serialized, but merely a lookalike of the former. This means that, among other things, you cannot query its context nodes with XPath using eval-xpath().
The context node does not hold synthesized style info, nor does it hold attached user values.
The filtering UPL code is not called for nodes of other DOM node types than Element.
Clicking the Insert defaults button inserts the current upCast default filter setup for new XML Exporter instances before any existing code in the Attribute Filter text field.
| Name | SerializationFilter |
| Java symbol | |
| Type | String |
| Value | |
Advanced
Unicode translation map
This field lets you enter a Unicode Translation Map. You can enter any mappings directly or include an externally created Unicode Translation Map file using the include realm: ${include(encoding="…"):file}.
| Name | UnicodeTranslationMap |
| Java symbol | |
| Type | String |
| Value | Unicode translation map code |
CSS property unit map
Here, you can specify a mapping table that associates any CSS <length> property with a pair of {unit, precision}. When the module needs to write length or size information in form of CSS properties, it consults this list to determine which length unit to use at which precision. For a description of the format, see CSS property unit table.
You can enter any mappings directly or include an externally created CSS property unit table file using the include realm: ${include(encoding="…"):file}.
| Name | CSSPropertyUnitMap |
| Java symbol | |
| Type | String |
| Value | CSS property unit map code |
This module serves for executing external system commands by way of the standard command-line interpreter available on the respective execution platform.
System command
The command to be executed by the underlying system’s command-line interpreter. You can use upCast variables for building the string.
For platform-independent, common file operations, upCast offers some internal “pseudo” commands:
upcast:delete-file filename*
Deletes all listed files.
upcast:copy-file source dest
Copies the file source to the new file dest.
upcast:move-file from to
Moves the file from to its new destination to. This is equivalent to the sequence of commands upcast:copy-file from to followed by upcast:delete-file from.
upcast:delete-recursively folder-or-file*
Recursively deletes all listed folders and/or files.
This command is potentially dangerous as it can lead to deleting a huge number of files when used carelessly! Please consider using upcast:delete-recursively-restricted instead.
upcast:delete-recursively-restricted deletionboundary folder-or-file*
Recursively deletes all listed folders and/or files that are equal to or reside below the specified deletionboundary folder in the file system hierarchy.
This method is fail-fast, i.e. when a folder or file specified for deletion is not located at or below the deletion boundary in the hierarchy, it is skipped entirely. This prevents the case where you specify a folder that is an ancestor of deletionboundary and, as a result, the complete contents of deletionboundary are deleted. In other words: the root path specified for a recursive deletion operation must itself satisfy the deletion boundary restriction to be considered any further.
Example 8.5.
upcast:delete-recursively-restricted "/user/iloop/temp/" "/user/iloop/temp/test.txt"
deletes the file /user/iloop/temp/test.txt because it is a descendant of the deletion boundary folder /user/iloop/temp/.
upcast:delete-recursively-restricted "/user/iloop/temp/" "/user/iloop/"
deletes nothing because the folder /user/iloop/ is not a descendant of the deletion boundary folder /user/iloop/temp/.
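The boundary rule from Example 8.5 can be illustrated with a small sketch. It is not upCast's implementation: the function names within_boundary and plan_deletion are hypothetical, and the check is purely lexical (it does not resolve symbolic links or ".." segments, which a real implementation would have to consider).

```python
from pathlib import PurePosixPath


def within_boundary(boundary: str, target: str) -> bool:
    """Return True if target equals the boundary folder or lies below it.

    Hypothetical helper: compares path components only, without
    touching the file system.
    """
    boundary_path = PurePosixPath(boundary)
    target_path = PurePosixPath(target)
    return target_path == boundary_path or boundary_path in target_path.parents


def plan_deletion(boundary: str, targets: list) -> list:
    """Return the targets that pass the boundary check; all others are skipped."""
    return [t for t in targets if within_boundary(boundary, t)]


# The two cases from Example 8.5: only the file below the boundary survives the check.
print(plan_deletion("/user/iloop/temp/",
                    ["/user/iloop/temp/test.txt", "/user/iloop/"]))
```

Checking each root path before descending into it is what makes the operation fail-fast: a rejected root is never traversed at all.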
|
Name |
Commandline |
|
Java symbol |
|
|
Type |
String |
|
Value |
commandline to execute, either as String or (in UPL or Java API) as List |
Wait for completion
When checked, the command is executed synchronously, i.e. upCast waits until the external command has completed before continuing execution.
Checking for errors occurring during external command execution can only be performed when this option is on. upCast considers any return value other than 0 (zero) an error.
|
Name |
WaitForCompletion |
|
Java symbol |
|
|
Type |
Bool |
|
Value |
true | false |
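The relationship between waiting for completion and error detection can be sketched in generic terms. The following is an illustration using Python's standard subprocess module, not the module's actual code; run_command is a hypothetical name.

```python
import subprocess
import sys


def run_command(argv, wait_for_completion=True):
    """Run an external command given as a list of arguments.

    Returns True/False for success/failure when waiting, or None when the
    command is started asynchronously (its exit status is then never seen).
    """
    if not wait_for_completion:
        subprocess.Popen(argv)  # started, but nobody checks the result
        return None
    result = subprocess.run(argv)
    # Same convention as described above: any exit status other than 0 is an error.
    return result.returncode == 0


# A child process that exits with status 3 is reported as failed.
print(run_command([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

Passing the command as a list of arguments, as in this sketch, avoids quoting problems with paths that contain spaces.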
Example 8.6.
To create a new directory images in the folder specified by the global variable DestinationFolder on a Unix system, you would use the following command-line:
mkdir "${pipeline:DestinationFolder#localpath}/images"
Note the quotes around the parameter, which accommodate path names that contain e.g. space characters.
This module lets you apply an XSLT transformation to some external file (which might be the result of an earlier exporter module). You can choose between the Xalan XSLT processor from the Apache Software Foundation (ASF; http://xml.apache.org/), Saxon 6.5.5 by Michael Kay, or Saxon-B (version 9) from Saxonica (http://www.saxonica.com).
Source File
Specify the file the transformation should be applied to, most probably an XML file. You can use all upCast variables for dynamically creating the full path to the file.
|
Name |
SourceFile |
|
Java symbol |
|
|
Type |
Object |
|
Value |
absolute file path in URL or local file system convention |
XSL Transformation File(s)
Specify the XSLT transformation (“XSLT file”) to apply.
You can specify several transformations by entering the full path to each XSLT file on a line of its own, i.e. the paths must be separated by newline characters. The transformations are chained: the original source file is processed using the first XSLT file specified, the result is processed by the second, and so on. Note, however, that all transformations share the same XSLT parameters.
|
Name |
Stylesheet |
|
Java symbol |
|
|
Type |
String |
|
Value |
path to stylesheet |
XSLT parameters
Lets you specify parameters to be passed to the transformation. A parameter definition must follow this syntax:
paramname ‘:=’ ‘”’ value ‘”’;
Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.
You may use upCast’s variable system for constructing parameter values.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now, any further references to realms other than include are resolved, separately within the value part of each parameter assignment.
This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several XSLT Processor modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
|
Name |
StylesheetParameters |
|
Java symbol |
|
|
Type |
String |
|
Value |
(string in same format as in UI) |
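The resolution order described for the XSLT parameters field (include references first, then parsing, then all other references per value) can be sketched as follows. This is a simplified illustration, not upCast's parser: the variable lookup table, the reduced ${realm:name} reference syntax and the function names are assumptions made for the example, and include references are looked up in a dictionary instead of being read from files.

```python
import re

# paramname := "value";   (a quote inside value is written as \")
ASSIGNMENT = re.compile(r'\s*([^\s:=]+)\s*:=\s*"((?:\\.|[^"\\])*)"\s*;')
# Simplified variable reference: ${realm:name}, optionally ${realm(args):name}
VARIABLE = re.compile(r'\$\{(\w+)(?:\([^)]*\))?:([^}]+)\}')


def resolve(text, variables, only_include):
    """Replace either the include references or all other references."""
    def substitute(match):
        realm, name = match.group(1), match.group(2)
        if (realm == "include") != only_include:
            return match.group(0)  # not this pass: leave the reference untouched
        return variables[(realm, name)]
    return VARIABLE.sub(substitute, text)


def parse_parameters(field_text, variables):
    # Step 1: expand include references so shared assignment code is in place.
    expanded = resolve(field_text, variables, only_include=True)
    params = {}
    # Step 2: parse the individual assignments.
    for match in ASSIGNMENT.finditer(expanded):
        name = match.group(1)
        value = match.group(2).replace('\\"', '"')
        # Step 3: resolve the remaining references inside the value only, so
        # quotes or semicolons in a variable's value cannot break the syntax.
        params[name] = resolve(value, variables, only_include=False)
    return params


variables = {
    ("include", "common.txt"): 'lang := "de";',
    ("pipeline", "Title"): 'He said "hi"; x := "y"',
}
print(parse_parameters('${include:common.txt} title := "${pipeline:Title}";', variables))
```

Because the pipeline variable is substituted only after parsing, its value may contain quotes and semicolons without being mistaken for further assignments.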
Result file
Specify where the transformation result should be written. You can use all upCast variables for dynamically creating the full path to the file.
|
Name |
DestinationFile |
|
Java symbol |
|
|
Type |
String |
|
Value |
absolute path to desired result file |
XSLT processor
Lets you choose between Xalan and Saxon 6.x or Saxon 9.x as the XSLT processor to use (if available).
|
Name |
XSLTProcessor |
|
Java symbol |
|
|
Type |
String |
|
Value |
xalan | saxon6 | saxon |
This module lets you apply a Unicode Translation Map to an already existing XML document. Additionally, by way of the Output encoding parameter, you can quickly change the character encoding used in an XML file.
Though the implementation tries to preserve the formatting of the original document, there is no guarantee that the result is syntactically identical to the input; structurally, of course, it is equivalent.
The Unicode Translation Map rules are only applied to the XML document’s text and attribute nodes. Comments and PIs are left unchanged.
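To illustrate what "text and attribute nodes only" means in practice, here is a minimal sketch using Python's standard xml.etree.ElementTree (Python 3.8 or later). It is not the module's implementation, and the plain character dictionary merely stands in for a Unicode Translation Map, whose actual syntax is not shown here; translate_xml is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def translate_xml(xml_text, mapping):
    """Apply a character mapping to text and attribute values only."""
    table = str.maketrans(mapping)
    # Keep comments and processing instructions in the tree so that they
    # can be written back unchanged.
    parser = ET.XMLParser(
        target=ET.TreeBuilder(insert_comments=True, insert_pis=True))
    root = ET.fromstring(xml_text, parser=parser)
    for node in root.iter():
        is_comment_or_pi = node.tag in (ET.Comment, ET.ProcessingInstruction)
        if not is_comment_or_pi:
            if node.text:
                node.text = node.text.translate(table)
            for key in list(node.attrib):
                node.set(key, node.get(key).translate(table))
        # The tail is ordinary document text following the node, even when
        # the node itself is a comment or PI, so it is always translated.
        if node.tail:
            node.tail = node.tail.translate(table)
    return ET.tostring(root, encoding="unicode")


print(translate_xml('<a t="\u00e4"><!--\u00e4-->\u00e4<b/>\u00e4</a>',
                    {"\u00e4": "ae"}))
```

In the sample, the attribute value and both text runs are rewritten, while the comment keeps its original character. As with the module itself, serialization details such as empty-element notation may differ from the input.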
Source File
Specify the file the transformation should be applied to, which must be an XML file. You can use all upCast variables for dynamically creating the full path to the file.
|
Name |
SourceFile |
|
Java symbol |
|
|
Type |
Object |
|
Value |
absolute file path in URL or local file system convention |
Unicode Translation Map
This field lets you enter a Unicode Translation Map. You can enter any mappings directly or include an externally created Unicode Translation Map file using the ${include(encoding:=”…”):file} variable reference, which is automatically replaced by the contents of the specified file after reading it using the specified encoding.
When you leave this field completely empty, no Unicode translation is performed. Use this when all you want to do is change the character encoding of the XML file by specifying the desired Output encoding.
|
Name |
UnicodeTranslationMap |
|
Java symbol |
|
|
Type |
String |
|
Value |
Unicode translation map code |
Destination file
Specify where the translation result should be written to. You can use all upCast variables for dynamically creating the full path to the file.
|
Name |
DestinationFile |
|
Java symbol |
|
|
Type |
String |
|
Value |
absolute path to desired result file |
XML version attribute
Specify the value of the version attribute on the XML declaration at the beginning of the result XML file.
If you leave this empty, no XML declaration will be written. The default value is “1.0”.
Note that this is a textual parameter only; specifying e.g. “1.1” does not modify the file written such that it is a valid XML 1.1 file.
|
Name |
XMLVersion |
|
Java symbol |
|
|
Type |
String |
|
Value |
value to be written in the 'version' attribute of the XML declaration; when empty, XML declaration is suppressed |
Output encoding
Lets you specify a name of a supported output file encoding, e.g. UTF-8 or iso-8859-1. This encoding is also specified in the encoding attribute on the XML declaration (if written, see XML Version parameter above).
|
Name |
OutputEncoding |
|
Java symbol |
|
|
Type |
String |
|
Value |
Java encoding name |
DOCTYPE declaration
This lets you add, override or remove an existing doctype declaration in the incoming document.
When this field is a single asterisk (“*”), the doctype declaration in the source document (if present) is passed through as-is.
When this field is empty (“”), any doctype declaration present in the source document is stripped from the output.
When this field contains any other data, that data is written verbatim to the output, replacing any possibly existing doctype declaration in the input document.
|
Name |
DOCTYPEDeclaration |
|
Java symbol |
|
|
Type |
String |
|
Value |
literal value of full DOCTYPE declaration as String; when empty, DOCTYPE declaration is removed, when '*', the DOCTYPE declaration of the source document is passed through |
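The three cases of the DOCTYPE declaration field amount to a small decision function. The sketch below only restates the rules given above; output_doctype is a hypothetical name, and None stands for "no DOCTYPE declaration".

```python
def output_doctype(field_value, source_doctype):
    """Return the DOCTYPE declaration to write, or None to write none.

    field_value    -- content of the DOCTYPE declaration field
    source_doctype -- declaration found in the source document, or None
    """
    if field_value == "*":
        return source_doctype   # pass through as-is (may be None)
    if field_value == "":
        return None             # strip any existing declaration
    return field_value          # written verbatim, replacing the original


print(output_doctype("*", "<!DOCTYPE book SYSTEM 'book.dtd'>"))
print(output_doctype("", "<!DOCTYPE book SYSTEM 'book.dtd'>"))
print(output_doctype("<!DOCTYPE article>", None))
```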
This module serves for validating arbitrary XML documents. The module supports validation against an XML DTD, XML Schema and Relax NG.
Source File
Specify the XML file that should be validated. You can use all upCast variables for dynamically creating the full path to the file.
|
Name |
SourceFile |
|
Java symbol |
|
|
Type |
Object |
|
Value |
absolute file path in URL or local file system convention |
Schema type
Specify the type of Schema you want to validate the file against:
XML DTD validate against an XML DTD; the document to be validated must have a valid DOCTYPE declaration
XML Schema validate against an XML Schema; the document must have the respective schema file location attributes
Relax NG validate against a Relax NG schema; you must specify the location and type of the Relax NG schema file using the specific parameters shown when this type is selected (see below)
|
Name |
SchemaType |
|
Java symbol |
|
|
Type |
String |
|
Value |
dtd | xmlschema | relaxng |
Relax NG Schema file
(for Relax NG schema type only)
Specify the location of the Relax NG schema file to validate the Source File against.
|
Name |
RelaxSchemaLocation |
|
Java symbol |
|
|
Type |
String |
|
Value |
absolute file path |
Relax NG Syntax
(for Relax NG schema type only)
Specify the syntax the Relax NG schema file is written in, either XML syntax or compact syntax.
|
Name |
RelaxSyntax |
|
Java symbol |
|
|
Type |
String |
|
Value |
xml | compact |
This module writes an external Cascading Style Sheets, level 2 (CSS2) file comprising all styles (paragraph styles and character styles) used in the current internal document, matching their visual appearance as closely as reasonably possible. The output also includes information on the page setup like paper size and margins.
The CSS2 file written may for example be referenced by a file created by the XML Exporter module.
Selector syntax
Lets you choose which CSS selector syntax should be used:
CSS1 (‘class’ shorthand) Writes selectors using the ‘class’ attribute shorthand: .classname { ... }
CSS2 Selectors Writes selectors according to CSS2 selector syntax rules: *[class=classname] { ... }
CSS1+CSS2 Writes both ways of expressing the selector so that tools understanding either can pick the one that they understand. First, the shorthand is written, followed by full CSS2 selector.
|
Name |
SelectorSyntax |
|
Java symbol |
|
|
Type |
String |
|
Value |
css1 | css2 | all |
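The effect of the three SelectorSyntax values can be sketched as follows. This is an illustration only: it assumes that the CSS2 form is a plain attribute selector and that the css1+css2 mode writes two separate rules, shorthand first. The function names are hypothetical.

```python
def selectors(class_name, syntax="all"):
    """Return the selector(s) for one style, in output order."""
    css1 = f".{class_name}"           # 'class' attribute shorthand
    css2 = f"*[class={class_name}]"   # attribute selector form
    forms = {"css1": [css1], "css2": [css2], "all": [css1, css2]}
    return forms[syntax]


def rule(class_name, declarations, syntax="all"):
    """Format one rule per selector, e.g. '.Heading1 { ... }'."""
    return "\n".join(f"{selector} {{ {declarations} }}"
                     for selector in selectors(class_name, syntax))


print(rule("Heading1", "font-weight: bold", "all"))
```

Note that in CSS2 the two forms are not strictly equivalent: the shorthand also matches elements whose class attribute lists several space-separated names, whereas [class=classname] matches only an exact attribute value.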
upCast DTD elements namespace prefix
Specify the namespace prefix that the final XML file, which references the CSS file generated by this module, uses for upCast DTD elements.
The default is the empty string, i.e. no namespace prefix used.
Setting this parameter is necessary until widespread support for the CSS Namespaces Module is available. Until then, element names are bound by their qualified name, including namespace prefix plus separating colon (if present). To generate the qualified element name, the module must be told the namespace prefixes it should use.
|
Name |
UpcastDTDNamespacePrefix |
|
Java symbol |
|
|
Type |
String |
|
Value |
prefix for elements in upCast DTD |
HTML4 DTD elements namespace prefix
Specify the namespace prefix that the final XML file, which references the CSS file generated by this module, uses for HTML4 elements. HTML elements are used e.g. for tables (if you opted for the HTML table model).
The default is html.
|
Name |
HTML4DTDNamespacePrefix |
|
Java symbol |
|
|
Type |
String |
|
Value |
the desired namespace prefix |
Output file
Specify where the CSS file should be written. You can use all upCast variables for dynamically creating the full path to the file.
|
Name |
DestinationFile |
|
Java symbol |
|
|
Type |
String |
|
Value |
absolute path to desired result file |
Output encoding
Lets you specify a name of a supported output file encoding, e.g. UTF-8 or iso-8859-1. This encoding is also specified in the @charset rule at the very beginning of the CSS file.
|
Name |
OutputEncoding |
|
Java symbol |
|
|
Type |
String |
|
Value |
Java encoding name |
This module requires an appropriate RTF Exporter feature included in your license to be fully functional.
The RTF Exporter was formerly a separate product called “downCast”. This module is a much improved version of downCast 1.x, especially with respect to performance (up to 300% faster).
This module converts XML documents to Word or, more precisely, RTF documents. For specifying the layout, the module relies on a subset of Cascading Style Sheets, level 2 (CSS2) properties, amended by several proprietary properties where needed. Input XML documents must either be valid against the upCast DTD (note that this is different from the upCast internal DTD!), or they can be any arbitrary XML language for which a transformation into the upCast DTD can (and needs to) be created.
For more details on supported CSS and custom properties and their semantics, see the separate RTF Exporter documentation.
Source File
Specify the XML file that should be converted to RTF. You can use all upCast variables for dynamically creating the full path to the file. This must be an XML file conforming to the upCast DTD or – in experimental status – XSL-FO.
|
Name |
SourceFile |
|
Java symbol |
|
|
Type |
Object |
|
Value |
absolute file path in URL or local file system convention |
Destination file
Specify where the RTF result should be written to. You can use all upCast variables for dynamically creating the full path to the file.
When running on Windows and having WordLink installed and functional, by specifying the destination file extension as .doc, you can have the module automatically convert the generated RTF file into a Word binary file.
|
Name |
DestinationFile |
|
Java symbol |
|
|
Type |
String |
|
Value |
absolute path to desired result file |
Source format
Specify the format the source file is in, either upCast DTD or XSL-FO.
|
Name |
SourceFormat |
|
Java symbol |
|
|
Type |
String |
|
Value |
upcast | xslfo |
Output resolution
When the RTF exporter must include images that do not specify their resolution explicitly in the file, the application uses the value that you specify here to calculate image size and resulting scaling factor to apply in the RTF output.
The default value is 96 dpi.
|
Name |
OutputResolution |
|
Java symbol |
|
|
Type |
Double |
|
Value |
1..9999 |
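How a resolution value turns pixel dimensions into a physical size can be shown with a short calculation. The sketch assumes the usual relationship size = pixels / dpi and expresses the result in twips (1/1440 inch), the length unit used in RTF; how the exporter itself rounds or scales is not documented here, and image_size_twips is a hypothetical name.

```python
TWIPS_PER_INCH = 1440  # RTF measures lengths in twips (1/20 of a point)


def image_size_twips(width_px, height_px, dpi=96.0):
    """Convert pixel dimensions to twips for a given resolution."""
    if not 1 <= dpi <= 9999:
        raise ValueError("resolution must be in the range 1..9999")
    return (round(width_px / dpi * TWIPS_PER_INCH),
            round(height_px / dpi * TWIPS_PER_INCH))


# 192 x 96 pixels at the default 96 dpi is 2 x 1 inch.
print(image_size_twips(192, 96))
# The same image at 192 dpi occupies half the size in each direction.
print(image_size_twips(192, 96, dpi=192))
```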
Specifies what the RTF Exporter should do when it encounters images in a format that it cannot handle or that are not supported in RTF, or when an image file to embed into a document is missing.
discard the image is completely removed from the result
show filename only as inline text error text indicating the file name of the missing image is embedded into the output document, prominently visible to the user
show full path as inline text error text indicating the full, absolute path and file name of the missing image is embedded into the output document, prominently visible to the user
show detailed error message as inline text error text indicating the full, absolute path and file name of the missing or unsupported image, including further error details, is embedded into the output document, prominently visible to the user
replace with generic image a generic replacement image is embedded into the final result document, respecting and scaled to the originally requested image size so it does not break the layout of the document
|
Name |
ImageErrorHandling |
|
Java symbol |
|
|
Type |
String |
|
Value |
discard | filename | filepath | details | image |
User stylesheet
Here, you can specify a CSS stylesheet to use for the conversion instead of the stylesheet (possibly) specified in the XML source. You can use all upCast variables for dynamically creating the full path to that file.
|
Name |
UserStylesheet |
|
Java symbol |
|
|
Type |
String |
|
Value |
path to user stylesheet |
Whitespace handler class
For experts only!
The RTF Exporter makes use of special code to handle whitespace characters in the input stream. This field lets you set a custom whitespace handler if this is required. A whitespace handler must be a Java class that implements the WhitespaceHandler interface. If you think you need to implement your own whitespace handler, please contact us directly at <support@infinity-loop.de> in advance.
The default value is ‘*’ (asterisk) which lets the implementation decide on the most appropriate whitespace handler for the input document and should not be changed for normal use.
The module provides three Whitespace Handlers for different situations. You request their explicit use by specifying their full, qualified class name in the Whitespace Handler class input field.
Except for the NoopWhiteSpaceHandler, all are more or less experimental and we do not guarantee their correctness or usefulness.
de.infinityloop.downcast.rtflib.NoopWhiteSpaceHandler This is the default handler for input documents valid according to the upCast DTD. All whitespace is significant in mixed-content elements. It is automatically used when you specify ‘*’ and the Source format is upCast DTD.
de.infinityloop.downcast.rtflib.XSLFOWhiteSpaceHandler This is a white space minimizing handler, minimizing whitespace in mixed-content elements. It is automatically used when you specify ‘*’ and the Source format is XSL-FO. It tries to mimic XSL-FO required behavior when minimizing whitespace before and around inline elements. Whitespace is collapsed to the left.
de.infinityloop.downcast.rtflib.CSS3WhiteSpaceHandler This handler behaves exactly the same as the XSLFOWhiteSpaceHandler, except that it respects the setting of the white-space CSS3 shorthand property or, more precisely, of its all-space-treatment component when resolved to its constituent properties. When this has the value preserve, whitespace is preserved in that element, unless overridden in a child element. When this is collapse (the default), the handler behaves as described above. Note that you should explicitly specify the desired behavior on the immediate parent element of (possibly) mixed content.
ID rendering mode
For elements having an id attribute of type ID, you can specify if and how this information should be translated into RTF bookmarks.
don’t render The ID information is not used and no bookmarks are created in the resulting RTF based on an id attribute.
before element only A bookmark with the id’s value is created just before the start of the element’s contents.
after element only A bookmark with the id’s value is created immediately after the full contents of the element has been written to the RTF file.
surround element A bookmark with the id’s value is created that starts just before the start of the element’s contents and ends just after the full contents of the element has been written to RTF, i.e. the bookmark spans the contents of the element.
|
Name |
IDRenderMode |
|
Java symbol |
|
|
Type |
String |
|
Value |
surround | ignore | before | after |
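The four IDRenderMode values can be pictured in terms of the RTF bookmark groups {\*\bkmkstart name} and {\*\bkmkend name}. The following sketch shows where the bookmark would sit relative to an element's content in each mode; the exact RTF the exporter emits is an assumption, and render_with_bookmark is a hypothetical name.

```python
def render_with_bookmark(element_id, content_rtf, mode="surround"):
    """Place an RTF bookmark named element_id relative to the content."""
    start = "{\\*\\bkmkstart %s}" % element_id
    end = "{\\*\\bkmkend %s}" % element_id
    if mode == "ignore":
        return content_rtf                  # no bookmark at all
    if mode == "before":
        return start + end + content_rtf    # empty bookmark before the content
    if mode == "after":
        return content_rtf + start + end    # empty bookmark after the content
    if mode == "surround":
        return start + content_rtf + end    # bookmark spans the content
    raise ValueError("unknown mode: %s" % mode)


for mode in ("ignore", "before", "after", "surround"):
    print(mode, render_with_bookmark("sec1", "Hello", mode))
```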
Style name output format
Determines how style names are written to the RTF stylesheet destination. With Unicode, Unicode characters are generally used to express characters such as umlauts; with normal (use document encoding), the document encoding is used wherever possible.
|
Name |
StyleNameFormat |
|
Java symbol |
|
|
Type |
String |
|
Value |
unicode | normal |
Table ‘frame’ attribute overrides cell border definitions
When checked, the frame attribute on table elements overrides any settings of cell borders that border on the outermost surrounding table border.
When not checked, a cell’s border CSS definition takes highest precedence in rendering.
|
Name |
FrameOverridesCells |
|
Java symbol |
|
|
Type |
String |
|
Value |
true | false |
This module lets you execute another, external pipeline document as a sub-pipeline within the current pipeline execution. The execution can be conditional, i.e. based on arbitrary UPL code that finally must return either true or false.
It is not possible to provide the external pipeline in the form of a Java Stream object; it must be an external file residing in the file system.
Source File
The path to the external pipeline document (.ucdoc) to include in the current pipeline.
|
Name |
SourceFile |
|
Java symbol |
|
|
Type |
Object |
|
Value |
absolute file path in URL or local file system convention |
Pipeline variables
This lets you choose how pipeline variables for the included pipeline should be created:
Use independent variables in sub-pipeline the included pipeline gets its own, initially empty set of pipeline parameters. This is the same as running that pipeline as a completely independent pipeline.
Copy variables to sub-pipeline this creates a copy of the current pipeline variables and passes it on to the included pipeline. This lets you pass all the current pipeline variables to the included pipeline. When the included pipeline modifies any variables, this only affects itself, but not the calling pipeline. This way, it is possible to provide values (like “parameters”) to the included pipeline. When execution of the included pipeline finishes, the pipeline variables of the calling pipeline will be in exactly the same state as before running the included pipeline. Effectively, the included pipeline cannot have any side effects on the caller’s set of variables.
Share variables with sub-pipeline in this mode, the included pipeline uses the same instance of pipeline variables as the caller. This means that the included pipeline receives and can modify the pipeline variables of the including pipeline. This way, it is possible to provide values (like “parameters”) to the included pipeline, and have the included pipeline “return values” by setting them in the pipeline variables.
The only exception to this rule is the pipeline:base variable, which is not inherited but set according to the included pipeline’s location on disk so that relative references therein are resolved properly. After the sub-pipeline’s execution, the original value is restored for the pipeline:base variable before continuing in the calling pipeline.
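The three modes and the special treatment of pipeline:base can be modeled with ordinary dictionaries. This is a conceptual sketch, not the module's code: the pool is a plain dict keyed by variable name (with "base" standing for pipeline:base), and run_sub_pipeline and its body callback are hypothetical.

```python
def run_sub_pipeline(mode, caller_pool, sub_base, body):
    """Run body(pool) with a variable pool set up according to mode."""
    if mode == "separate":
        pool = {}                   # own, initially empty set
    elif mode == "copy":
        pool = dict(caller_pool)    # changes stay inside the sub-pipeline
    elif mode == "share":
        pool = caller_pool          # same instance: changes are visible to the caller
    else:
        raise ValueError("unknown mode: %s" % mode)

    saved_base = caller_pool["base"]
    pool["base"] = sub_base         # never inherited: follows the sub-pipeline's location
    try:
        body(pool)
    finally:
        if mode == "share":
            caller_pool["base"] = saved_base  # restore before the caller continues


def body(pool):
    pool["x"] = "changed"
    pool["seen_base"] = pool["base"]


caller = {"base": "/main/", "x": "original"}
run_sub_pipeline("copy", caller, "/sub/", body)
print(caller)   # unchanged by the sub-pipeline
run_sub_pipeline("share", caller, "/sub/", body)
print(caller)   # modified, but "base" is back to "/main/"
```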
For finer control over how each individual parameter behaves in sub-pipeline execution environments, there is a specific property for specifying the setting behaviour. For each parameter, you can specify the initialize-when property, with values never, always, unset or an arbitrary <string-value>. The default value is unset. Here’s an outline of what happens with respect to pipeline parameters during a sub-pipeline call in all of the three cases above:
The pipeline variables pool of the sub-pipeline to be called is initialized or created according to the above parameter.
The pipeline:base variable is set to the appropriate value depending on the storage location of the sub-pipeline.
Then, for each parameter defined in the sub-pipeline:
If the parameter’s initialize-when value is unset (or the property is not defined) and the pipeline variable pool does not already contain a variable by that name:
If the parameter is a persistent parameter, a new variable is created in the pipeline parameters with that parameter’s current value stored in the sub-pipeline document as value.
Otherwise, if the parameter is not a persistent parameter, but has a default value defined, a new variable is created in the pipeline parameters with that parameter’s default value as its value.
Otherwise, if it’s neither a persistent parameter nor has it a default value, that parameter is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.
If the parameter’s initialize-when value is always:
If the parameter is a persistent parameter, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s current value stored in the sub-pipeline document as value.
Otherwise, if the parameter is not a persistent parameter, but has a default value defined, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s default value as its value.
Otherwise, if it’s neither a persistent parameter nor has it a default value, that parameter is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.
If the parameter’s initialize-when value is never, no further actions are taken.
If the parameter’s initialize-when value is a string value and either the pipeline variable pool does not already contain a variable by that name or any existing variable by that name has the same string value as the string specified for the initialize-when value:
If the parameter is a persistent parameter, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s current value stored in the sub-pipeline document as value.
Otherwise, if the parameter is not a persistent parameter, but has a default value defined, a new variable is created (or a possibly existing variable overwritten) in the pipeline parameters with that parameter’s default value as its value.
Otherwise, if it’s neither a persistent parameter nor has it a default value, that parameter is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.
Example 8.7.
By creating a parameter definition
copyright {
type: text;
label: "Copyright notice:";
}
in the main pipeline, which calls a sub-pipeline with the definition
copyright {
type: text;
label: "Copyright notice:";
default: "(c) 2008 My Company";
initialize-when: "";
}
the sub-pipeline will have set the copyright pipeline variable to the “(c) 2008 My Company” default value if the user did not provide a value in the text field for Copyright notice in the main pipeline. This lets you implement some sort of default or fallback value mechanism for values that are used in a sub-pipeline if the value has not been set (or, to be exact: has been set to the empty string “”) in the calling pipeline.
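The initialize-when rules and Example 8.7 can be condensed into a short sketch. It is a model of the described behaviour, not upCast code: the Parameter class and function names are hypothetical, the variable pool is a plain dict, and for simplicity the keywords never, always and unset share one field with the arbitrary string value, so a string value equal to one of the keywords cannot be expressed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parameter:
    name: str
    persistent: bool = False
    stored_value: Optional[str] = None   # value stored in the sub-pipeline document
    default: Optional[str] = None
    initialize_when: str = "unset"       # never | always | unset | any other string


def should_initialize(param, pool):
    when = param.initialize_when
    if when == "never":
        return False
    if when == "always":
        return True
    if when == "unset":
        return param.name not in pool
    # Arbitrary string: initialize if missing or if the current value equals it.
    return param.name not in pool or pool[param.name] == when


def initialize(pool, parameters):
    """Apply the per-parameter initialization to the sub-pipeline's pool."""
    for param in parameters:
        if not should_initialize(param, pool):
            continue
        value = param.stored_value if param.persistent else param.default
        if value is not None:
            pool[param.name] = value
        # Otherwise the variable stays undefined (a programming error).


# Example 8.7: the user left the field empty, so the default is applied.
pool = {"copyright": ""}
initialize(pool, [Parameter("copyright", default="(c) 2008 My Company",
                            initialize_when="")])
print(pool)
```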
|
Name |
PipelineRealmMode |
|
Java symbol |
|
|
Type |
String |
|
Value |
separate | copy | share |
Only run modules with ‘exported’ status
When checked, only modules that have the status exported set will be executed.
You can use this feature like this:
Develop your sub-pipeline on its own. For testing and debugging purposes, you will probably want to provide initial values (using an instance of the Pipeline Variables module) and debugging output within the pipeline using additional instances of the XML (Raw) Exporter modules. Now simply remove the exported status on these modules in the sub-pipeline and check the above option in the importing pipeline.
Effectively, this ensures that all debugging and setup code is only run when you run the sub-pipeline on its own (e.g. during development and isolated debugging), but does not run when the pipeline is included in any other pipelines. No further module activation/deactivation orgies to think of, all done automatically once set up as described – pretty neat, isn’t it?
|
Name |
OnlyRunExportedModules |
|
Java symbol |
|
|
Type |
Bool |
|
Value |
true | false |
Sub-pipeline Parameters
Parameters
Lets you specify parameters to be passed to the called sub-pipeline. This is especially useful when calling the sub-pipeline in Use independent variables in sub-pipeline or Copy variables to sub-pipeline mode. The parameters defined here are explicitly set in the pipeline realm of the sub-pipeline’s variables to the values specified here. This happens before any modules of the sub-pipeline run. Using this mechanism, it is possible to pass certain variable values to the sub-pipeline without having to share the pipeline variable pool with the calling pipeline. Note, however, that the resulting variable values cannot be passed back from a sub-pipeline to the calling pipeline.
A parameter definition must follow this syntax:
paramname ‘:=’ ‘”’ value ‘”’;
Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.
You may use upCast’s variable system for constructing parameter values.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now, any further references to realms other than include are resolved, separately within the value part of each parameter assignment.
This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several External Pipeline Processor modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
|
Name |
PipelineVariables |
|
Java symbol |
|
|
Type |
String |
|
Value |
(string in same syntax as in corresponding UI field) |
When working with pipelines, especially ones that are parameterized, it is often convenient to have different sets of parameter settings at hand to run the pipeline. For example, when you are converting documents in the DocBook DTD to your own, you may want to set different header info depending on whether it is a technical article or a medical article. The conversion itself, however, is the same for both.
In this case, you’d set up the actual conversion pipeline only once (with the benefit that both document types automatically see any improvements in that pipeline), but have different sets of parameters for the text in the document header area. So what you would do is create two parameter sets for that single pipeline document and store them in a document separate from the actual implementation logic. Depending on the document you need to convert, you’d just load the respective parameter set document and start the conversion, with all the parameters for that particular document type already set up correctly, and with only the input file left for you to specify.
This is what parameter set documents are for: they separate actual parameter value storage from the pipeline implementation (where they are normally stored as part of the pipeline document).
Parameter sets contain essentially only two types of data:
the current value of all Simple View parameters that have the persistent flag set to true
the Pipeline UID of the pipeline they are based on
That’s all. Parameter sets, in particular, do not contain any application or pipeline logic.
A parameter set is always derived from a single, specific pipeline document. The link to its implementing pipeline is established by way of the pipeline’s UID.
When loading a parameter set, what actually happens is that the pipeline it is based on is loaded automatically, and then the parameter values stored in the parameter set document are automatically set on that pipeline. To the user, it looks as if they had just opened a pipeline document in Simple View mode and then set the parameter values as stored in the parameter set. There are two differences: first, the user cannot edit the pipeline implementation, or in other words, cannot switch to edit mode. Second, parameter sets open in a window with blue background, whereas real pipelines display in the default system color background.
When parameter values in a parameter set are edited, they can be saved back to the parameter set using the usual commands in the File menu: Save (to save in the same file, overwriting the old values) or Save As… to save the parameter set in a new file. You can also copy parameter set files on the operating system level and/or rename them.
Now, how do they find their pipeline implementation file when opened? This is done by (re-)purposing the well-known and established Catalog system. The only difference is that you do not resolve the PUBLIC identifier of some DTD or entity to an absolute file path, but the pipeline UID, which you can think of as the PUBLIC identifier of the pipeline document. This mechanism allows you to configure your system easily so that these Pipeline UIDs can be resolved to the actual, single implementation file from literally anywhere on your network: Just set up a single catalog for all your pipeline implementations and have your users add that file to upCast’s catalog system in the upCast Preferences, Catalog tab.
Example 9.1. Pipeline UID catalog example
A catalog might contain the following entries:
PUBLIC "d3546614-fb0e-4739-bfea-1f74280d9761" "file:///upcast/pipelines/docbook.ucdoc"
PUBLIC "ACME-XHTML-conversion-pipelineV1.1" "file:///upcast/pipelines/acme2html.ucdoc"
When you add this file to upCast’s catalog system, you can open a parameter set from anywhere on your local disk or even the LAN and have it automatically load and run the pipeline document it depends on.
In the first line, the Pipeline UID has been auto-generated by upCast and is using a standard UUID.
In the second line, a speaking UID has been chosen by the pipeline author, who must of course ensure that this ID is never used by any other pipeline a potential user will want to run using a parameter set file.
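The lookup described above can be illustrated with a minimal Python sketch. It is not upCast's catalog implementation; it only parses PUBLIC entries of the kind shown in the example and maps a Pipeline UID to the registered file URI. The names parse_catalog and resolve_pipeline_uid are hypothetical.

```python
import re

# Matches one catalog entry of the form: PUBLIC "identifier" "uri"
ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')


def parse_catalog(text):
    """Return a dict mapping each PUBLIC identifier to its URI."""
    return {uid: uri for uid, uri in ENTRY.findall(text)}


def resolve_pipeline_uid(catalog, uid):
    """Return the URI registered for uid, or None if there is no entry."""
    return catalog.get(uid)


CATALOG_TEXT = '''
PUBLIC "d3546614-fb0e-4739-bfea-1f74280d9761" "file:///upcast/pipelines/docbook.ucdoc"
PUBLIC "ACME-XHTML-conversion-pipelineV1.1" "file:///upcast/pipelines/acme2html.ucdoc"
'''

catalog = parse_catalog(CATALOG_TEXT)
print(resolve_pipeline_uid(catalog, "ACME-XHTML-conversion-pipelineV1.1"))
```

Because the parameter set stores only the UID, moving the pipeline document requires nothing more than updating the one catalog entry.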
The first parameter set for a certain pipeline must be created by opening the pipeline in upCast and then choosing File > Save to Parameter Set…. You will be prompted for a file name to save the parameter set to. Parameter set files always have the extension .ucpar (short for upCast parameter set). The pipeline document will be closed and the new parameter set file will be opened in its place.
From there on, you can create additional instances either by repeating the above, or simply by saving copies of an open parameter set.
Note that values are saved only for parameters that have their persistent property set to true in the implementing pipeline document. The decision on this property is up to the pipeline author. When opening a parameter set, you will still see all parameters defined in the original pipeline; the values of parameters that were not saved will either be empty or filled with the default values the pipeline author has specified for them.
Even when loading a parameter set, be aware that the pipeline variable reference ${pipeline:base} will resolve to the folder where the implementing pipeline document is located, not where the parameter set document lives.
If you want to specify e.g. file path parameters relative to the location of the parameter set, you can use the new variable ${pipeline:ParamBase}, which is created automatically and holds the absolute path to the folder in which the respective parameter set resides on disk.
Even for pipeline documents, ${pipeline:ParamBase} is always defined. In that case, it has the same value as ${pipeline:base}.
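The relationship between the two variables can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not upCast code; the function name pipeline_variables and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def pipeline_variables(pipeline_file, parameter_set_file=None):
    """Return the values of ${pipeline:base} and ${pipeline:ParamBase}.

    pipeline_file is the implementing pipeline document; parameter_set_file
    is the opened parameter set, or None when the pipeline document itself
    was opened.
    """
    base = PurePosixPath(pipeline_file).parent
    if parameter_set_file is None:
        # For plain pipeline documents both variables have the same value.
        param_base = base
    else:
        param_base = PurePosixPath(parameter_set_file).parent
    return {"pipeline:base": str(base), "pipeline:ParamBase": str(param_base)}


print(pipeline_variables("/upcast/pipelines/docbook.ucdoc",
                         "/home/alice/jobs/report.ucpar"))
```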
In this case, only the parameters that still have a counterpart in the pipeline will be loaded from the parameter set; for the remaining parameters, the parameter set will be automatically updated to the new parameter configuration. This is done on a best-effort basis. Incompatible parameter values will be discarded.
When the changes are not affecting the configuration of parameters, the pipeline implementation will be re-loaded automatically once you click the Run button. This will only work reliably when your file system delivers correct last modified date information for files.
When changes are also affecting the configuration (number, type, text, defaults etc.) of pipeline parameters, the parameter set will detect this when re-loading the pipeline implementation due to the change and instruct you to close, then re-open the parameter set to have it pick up the changes.
Assuming you updated the respective catalog entry, the parameter set will no longer be able to resolve its id to the required pipeline implementation and therefore cannot be used any longer.
Also, when the pipeline document a catalog UID lookup resolves to does not actually match the requested UID, an error dialog will be shown and the parameter set cannot be used.
In this case, the system will try to load the pipeline implementation from the system path additionally stored in the parameter set. This path holds the absolute path to the pipeline document at the time the File > Save to Parameter Set… command was run. When this file still exists and is a pipeline document that has the requested Pipeline UID, that pipeline implementation is loaded. Otherwise, an error is issued and the parameter set cannot be opened.
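The lookup order can be summarized in a short Python sketch. It assumes that the stored path is consulted only when the catalog has no entry for the UID; that ordering is an interpretation of the text, and locate_pipeline, read_uid and the DOCUMENTS table are hypothetical names used for illustration.

```python
def locate_pipeline(uid, catalog, stored_path, read_uid):
    """Return the path of the pipeline document implementing uid.

    catalog:     dict mapping Pipeline UIDs to file paths
    stored_path: absolute path recorded when the parameter set was saved
    read_uid:    callable returning the UID of the pipeline document at a
                 path, or None if there is no pipeline document there
    """
    path = catalog.get(uid)
    if path is not None:
        # A catalog hit must point to a document carrying the requested UID.
        if read_uid(path) != uid:
            raise LookupError(f"{path} does not implement pipeline {uid}")
        return path
    # No catalog entry: fall back to the path stored in the parameter set.
    if read_uid(stored_path) == uid:
        return stored_path
    raise LookupError(f"pipeline {uid} cannot be located")


# Hypothetical file system: path -> UID of the pipeline document stored there.
DOCUMENTS = {
    "/upcast/pipelines/docbook.ucdoc": "d3546614-fb0e-4739-bfea-1f74280d9761",
}

found = locate_pipeline(
    "d3546614-fb0e-4739-bfea-1f74280d9761",
    catalog={},
    stored_path="/upcast/pipelines/docbook.ucdoc",
    read_uid=DOCUMENTS.get,
)
print(found)
```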
Basically, the action of making consecutive sibling nodes based on certain conditions children of a newly created surrounding element is called grouping. These conditions are exposed to you by way of the unique painter concept.
To understand the painter concept, you first of all need to be fully aware of the following, most important fact: Grouping is always performed on a flat, linear list of nodes. Huh? I thought we’re working on a document tree? Though this is of course true, grouping only occurs among sibling nodes, i.e. all direct children nodes of an individual element. Any element’s direct children can be expressed by an ordered, flat list. Of course, we recursively group on a child’s list of children, but this is a completely independent grouping operation. So again, a single, independent grouping operation is always performed on a flat, ordered list of nodes.
Now, for the following let’s think of nodes being white bricks placed in an ordered row on the floor. These bricks can be painted with one (or even several – think: spotty!) colors. The color indicates the element by which the bricks should be grouped.
The grouper does one very simple thing: It wraps all adjacent, likewise colored nodes in a parent element (think of this being some kind of bag) that has the same name as the color of the nodes it wraps.
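As a first approximation, the grouper's job can be sketched in Python as follows. The sketch handles a single color and ignores start and end tags, which are introduced later; the node representation as (name, color) pairs and the function name group_by_color are assumptions made for illustration.

```python
from itertools import groupby


def group_by_color(nodes):
    """Wrap each run of adjacent, likewise colored nodes in a group.

    nodes is a list of (name, color) pairs; color None means unpainted.
    Unpainted nodes are passed through unchanged, painted runs become
    (color, [names]) tuples, i.e. the "bag" named after the color.
    """
    result = []
    for color, run in groupby(nodes, key=lambda node: node[1]):
        names = [name for name, _ in run]
        if color is None:
            result.extend(names)
        else:
            result.append((color, names))
    return result


bricks = [("n1", None), ("n2", "green"), ("n3", "green"),
          ("n4", None), ("n5", "green")]
print(group_by_color(bricks))
```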
So the essential part to be done beforehand is to color the nodes in the desired way. This is a two-step process: First, you need to check the role of each node as far as grouping is concerned and assign it that role by placing a painter on it that knows how to go about painting for this specific role. Second, the painting is actually performed.
In this first step, consider yourself a paint-shop owner, making a work-plan for your painter employees. Equipped with a packet of self-adhesive post-it notes and a pencil, you start figuring out the work to be done at the first node in the list of sibling nodes. For now, you are just interested in determining which nodes should be collected into groups of the color green. You examine the node you are on. For example, you may look at some of its attributes or layout properties, or perform a more complex examination which may include evaluating a boolean XPath expression. After some pondering, you will come to a certain conclusion as to the role of the node you are currently standing on. This can be one of the following:
You know that this node will always start a group of the color you are currently considering (i.e. green). Therefore, you write “start green” on one of your post-its and tack that to the node.
You know that this node will always end (and therefore be the last one in) a group of the color you are currently considering (i.e. green). Therefore, you write “end green” on one of your post-its and tack that to the node.
Now it is time to think of which of your painter employees is best suited for the painting job. For this you have to evaluate the constellations that may happen in your document regarding the nodes that should be grouped.
For example, you may know that if you don’t find a node starting a group and a node ending the group, the grouping should not occur. In other words, the known start and end nodes (i.e. nodes that fulfill the requirements for being tagged as such) are required for a grouping to happen.
Other situations could be as follows: group from a start node to the next start node, group from an end node to the next end node, group adjacent likewise colored nodes, etc. For each of these situations, you have dedicated painters. To have them do their work in the next step, you place them on nodes.
Suppose in our example, we require a start and end node for a grouping to happen, and we have just tagged the current node as a start node. We therefore choose a start-end painter and place it on the current node.
When we have done both, tagged the node (if possible) and placed a painter (if we could determine a suitable one), we move on to the next node in the ordered sequence and start over.
Finally, we’ll reach the last node in the sibling node sequence and will have tagged some nodes and/or placed painters on some of the nodes. Now, all preparation work is done and we can tell the painters to do their work, i.e. start painting.
Now, consider yourself a painter, with a bucket of color of a certain kind (the color-“name” corresponds to the element name that should be the grouping element later). In the previous step, you have been placed on some node in the sequence.
Depending on your kind, you try to paint from your location.
In our example, you are a start-end painter. This means from the place you are at, you look in direction of the start of the sequence and look for the nearest node that has been tagged with a “start green” label. (This may be the node you are standing on.) If you find such a node, you remember it. If you do not find such a node, you cannot fulfill your task (which is “Paint from start node to end node”) and give up, not painting anything.
Next, you look into the direction of the end of the sequence and look for the nearest node tagged with an “end green” label. (This may, again, be the node you are standing on.) If you find that as well, you can fulfill your painting job and start painting all nodes from the start node you found to the end node you found (including both). Then, you are finished.
The above is repeated for all painters that have been placed on nodes in the current node sequence. After this has been finished, the complete sequence got painted in a way that the actual grouping can take place, based on the paint color information on each node and the start and end tagging.
For each color, a node can have either no tag, or it can be tagged as a start node, tagged as an end node, or tagged as both, start and end node for that respective color.
These tags can currently be set using the UPL functions mark-start() and mark-end().
The example in the introduction to the painter concept already mentioned the start-end painter type. Painters can be placed on a node using the UPL function set-painter().
Note that you can place an ordered list of painters for a single color on a node. The idea is to have fallback painters when the first one fails to paint because its requirements cannot be fulfilled (like e.g. for a start-end painter, when there’s either no start tag or end tag). In such a case, painting using the second-specified painter is tried. If that cannot paint as well due to unsatisfied requirements, the next painter is tried and so on until either a painter is able to paint, or the end of the list is reached, in which case no painting occurs.
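The fallback mechanism itself is independent of the individual painter types and can be sketched in Python like this. Painters are modeled as functions that return the list of painted positions or None on failure; that representation, and the names paint_with_fallback, paint_this and always_fails, are assumptions for illustration only.

```python
def paint_this(tags, pos):
    """Stand-in for a painter that only paints its own node."""
    return [pos]


def always_fails(tags, pos):
    """Stand-in for any painter whose requirements are not met."""
    return None


def paint_with_fallback(painters, tags, pos):
    """Try the painters in the given order and return the first result.

    Returns None when the list is exhausted, i.e. nothing is painted.
    """
    for painter in painters:
        painted = painter(tags, pos)
        if painted is not None:
            return painted
    return None


tags = [set(), set(), set()]
print(paint_with_fallback([always_fails, paint_this], tags, 1))
```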
In the examples below for each painter, we use the following symbols:

A description of all available painter types follows:
The start-end painter paints from the nearest start-tagged node of the node sequence (in direction to the start) to the nearest end-tagged node (in direction to the end). The painter’s own node is taken into account in both searches.
There may be no end-tagged node between the painter and the nearest start-tagged node, nor a start-tagged node between the painter and the nearest end-tagged node. In both of these cases, the painter will fail. The “-” in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between start and end.
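The rules for the start-end painter can be written down as a small Python function. This is a sketch for a single color, not upCast's implementation; it assumes that "between" means strictly between the painter and the tagged node it found, and it represents each node's tags as a set containing "start" and/or "end".

```python
def paint_start_end(tags, pos):
    """Return the positions painted by a start-end painter, or None on failure.

    tags is a list with one set per sibling node; pos is the index of the
    node the painter was placed on.
    """
    # Nearest start tag looking towards the start, own node included.
    start = next((i for i in range(pos, -1, -1) if "start" in tags[i]), None)
    # Nearest end tag looking towards the end, own node included.
    end = next((i for i in range(pos, len(tags)) if "end" in tags[i]), None)
    if start is None or end is None:
        return None
    # No end tag may interrupt the stretch back to the start node ...
    if any("end" in tags[i] for i in range(start + 1, pos)):
        return None
    # ... and no start tag may interrupt the stretch up to the end node.
    if any("start" in tags[i] for i in range(pos + 1, end)):
        return None
    return list(range(start, end + 1))


tags = [set(), {"start"}, set(), set(), {"end"}, set()]
print(paint_start_end(tags, 2))
```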

This is the same as start-end, but it allows differently tagged nodes between the nearest start- and end-tagged nodes (see last two examples). The “*” in the name symbolizes a wildcard sequence of tagged nodes between start and end.
The painter will fail if either there’s no start-tagged node earlier in the node list or no end-tagged node later in the node list.

The start-here painter paints from the last start-tagged node up to the node it was placed on.
There may be no end-tagged node between the painter and the nearest start-tagged node. If this is the case, the painter will fail. The “-” in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between start and painter position.
The painter will also fail if there is no start-tagged node earlier in the node list.

This is the same as start-here, but it allows end-tagged nodes between it and the nearest preceding start-tagged node (see last two examples). The “*” in the name symbolizes a wildcard sequence of end-tagged nodes between start and painter node.
The painter will fail if there is no start-tagged node earlier in the node list.

The here-end painter paints from the node it is placed on up to the next end-tagged node.
There may be no start-tagged node between the painter and the next end-tagged node. If this is the case, the painter will fail. The “-” in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between painter position and end.
The painter will fail if there is no end-tagged node later in the node list.

This is the same as here-end, but it allows start-tagged nodes between it and the nearest following end-tagged node (see last example). The “*” in the name symbolizes a wildcard sequence of start-tagged nodes between painter node and next end-tagged node.
The painter will fail if there is no end-tagged node later in the node list.

The start-start painter paints from the nearest preceding start-tagged node to the next start-tagged node (not including the latter).
It fails if there is an end-tagged node in-between. It also fails if there is no preceding or following start-tagged node.

This is the same as the start-start painter, however end-tagged nodes between those marked with a start tag are allowed (see examples 7 and 8).

The end-end painter paints from the nearest preceding end-tagged node to the next end-tagged node (not including the former).
It fails if there is a start-tagged node in-between. It also fails if there is no preceding or following end-tagged node.

This is the same as the end-end painter, however start-tagged nodes between those marked with an end tag are allowed (see examples 2 and 4).

Grouping is performed on the whole document tree in a bottom-up document order. It is performed individually for each element’s children. It is also performed in a defined color order that you can specify, i.e. colors are always processed in a defined order.
Grouping does take into account node start and end tags. This is necessary in order to support directly adjacent groups. If grouping were based only on contiguous coloring, adjacent groups would not be possible, since the grouper would not know where to split contiguously colored nodes into groups. In this, tags live up to their original roles: a start tag always starts a new group on the respective node, and an end tag ends the currently open group after that node.
The following sample graphic shows – for a single color – how grouping takes place in a specific painting/tagging situation:

Group 1 is delimited by the start tag of node #2.
Group 2 is delimited by the end tag of node #2.
Group 3 is delimited by the end tag of node #4.
Group 4 is delimited by the end tag of node #7.
Group 5 is delimited by the non-painted node #9.
Group 6 is delimited by the end of the node sequence.
When placing tags on nodes it is therefore important to always bear in mind that these tags will also govern the final grouping in situations where painted nodes are adjacent.
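The splitting rules can be sketched in Python for a single color. The input below is a reconstruction chosen to match the six groups listed above (ten nodes, all painted except #9, a start tag on #2 and end tags on #2, #4 and #7); it is an assumption, not a copy of the original graphic, and group_painted is a hypothetical name.

```python
def group_painted(nodes):
    """Split painted nodes into groups, honoring start and end tags.

    nodes is a list of dicts with the keys "painted" (bool) and "tags"
    (a set containing "start" and/or "end"). Returns a list of groups,
    each a list of 1-based node numbers.
    """
    groups, current = [], []
    for number, node in enumerate(nodes, start=1):
        if not node["painted"]:
            # An unpainted node closes any open group and is left ungrouped.
            if current:
                groups.append(current)
                current = []
            continue
        if "start" in node["tags"] and current:
            # A start tag always opens a new group on this node.
            groups.append(current)
            current = []
        current.append(number)
        if "end" in node["tags"]:
            # An end tag closes the open group after this node.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


tagged = {2: {"start", "end"}, 4: {"end"}, 7: {"end"}}
nodes = [{"painted": n != 9, "tags": tagged.get(n, set())} for n in range(1, 11)]
print(group_painted(nodes))
```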
Some examples follow that you may encounter in one form or another in your own grouping requirements:
Suppose you want to group adjacent paragraphs that are of class “Note”, because you want to group them using a note element.
The UPL code you should run before the grouper in the UPL processor should look like:
[element(uci:par) and @uci:class="Note"] {
  set-painter( note, {"this"} );
}
This will set a painter of color “note” and type this on all uci:par elements that are of class “Note”. During painting, those nodes will be painted with the specified color, and during grouping each contiguous run of adjacent, likewise colored nodes will be wrapped in a <uci:block uci:type="note">…</uci:block> element.
Suppose you want to group nodes where you know exactly which conditions must be met by a node to start a group, but you don’t know the end. What you additionally do know is which kind of nodes are certainly part of the group (if they exist).
Let’s say we have the following XML fragment of sibling nodes:
➊ <p>Some text.</p>
➋ <p class="example-title">Example</p>
➌ <p class="example-text">Fruits are:</p>
➍ <list>
    <item>apples</item>
    <item>bananas</item>
  </list>
➎ <p class="example-text">All these can be bought at Miller’s.</p>
➏ <p>As you have seen,…</p>
In this example, you know that paragraphs of class “example-text” always are part of an example, and that an example is always started by a paragraph of class “example-title”. You do not know more, i.e. there may be arbitrary elements in-between like the list element in the example.
A suitable UPL code to group the elements #2 to #5 could be:
[element(p) and @class="example-title"] {
  mark-start( example );
  set-painter( example, {"this"} ); /* optional, see below */
}
[element(p) and @class="example-text"] {
  set-painter( example, {"start-here"} );
}
What does this do?
First, a start tag with color “example” is set on node #2, along with a painter that only colors itself. This is necessary when an example is allowed to only consist of an “example-title”-paragraph. If you require an example to at least have one “example-text”-paragraph to be a valid example, don’t use the line of code marked optional in the above.
Then, a painter of color “example” is placed on node #3 that paints from the nearest preceding start tagged node of color “example” up to itself. On the list element (#4), no painter or tag is set. On node #5, we again set a painter of color “example” that paints from the nearest preceding start tagged node of color “example” up to itself.
This happens during the run of the UPL program in the UPL Tree-Processor module.
Now, it’s the grouper’s turn, and it is about to perform the grouping for the color “example”. As we have seen above, the first thing it does is apply the painting through the painters. The painters execute in document order, one after the other, so you get the following sequence of painting and – finally – grouping:

Finally, the painted nodes are grouped by a <uci:block uci:type="example">…</uci:block> element.
Note how the list node #4 is painted by painter P3 even though it has neither been tagged nor has a painter been placed on it. Instead of the list node, any number of nodes not known in advance could have been present between node #3 and #5, and they would have been automatically grouped into an “example”. This is a very important fact to both keep in mind and utilize to your advantage, for example in documents that have no strict, dependable structure but where you must work with only few known node constellations.
But what if…? Surely you have asked yourself, “But what if some badly authored document contains an ‘example-text’-paragraph without a preceding ‘example-title’-paragraph?” Here, the precise definition of the painter types comes into play.
Let’s assume node #2 is removed from the above example sequence. In this case, painter P2 would be the first painter to be executed. It is of type start-here, which fails if no suitable start-tagged node is found – which is the case here: there is no start-tagged node at or earlier in the node sequence. P2 fails, and a painter failing means it does not paint anything. The same is true for painter P3, with the effect that no node gets painted at all if node #2 (i.e., a start-tagged node) does not exist. Consequently, no grouping will occur.
Maybe that is not what you want. Maybe you want semantics like, “If a start-tagged node exists, then use that. If, however, it doesn’t, then at least make the individual ‘example-text’-paragraphs groups.” This is where the painter fallback types come in handy. For the above, you’d need to change the UPL code as follows:
[element(p) and @class="example-title"] {
  mark-start( example );
  set-painter( example, {"this"} ); /* optional, see below */
}
[element(p) and @class="example-text"] {
  set-painter( example, {"start-here", "this"} );
}
Note the added painter type this in the second rule. This has the effect that when the first painter type (start-here) fails, the next – this – is tried, which – as already described – only paints the node the painter was placed on. So if node #2 was missing in our example, with the new UPL code we’d make sure that at least the paragraphs of class “example-text” would get painted, and therefore grouped, either on their own as in our example or, if adjacent, as a whole.
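The effect of the fallback can be checked with a small Python simulation of the two rules. It is a sketch of the described painter semantics for the color “example”, not upCast code; positions are 0-based, "between" is taken to mean strictly between, and all function names are hypothetical.

```python
def paint_this(tags, pos):
    """The this painter paints only the node it was placed on."""
    return [pos]


def paint_start_here(tags, pos):
    """Paint from the nearest start tag (own node included) up to pos."""
    start = next((i for i in range(pos, -1, -1) if "start" in tags[i]), None)
    if start is None:
        return None
    if any("end" in tags[i] for i in range(start + 1, pos)):
        return None
    return list(range(start, pos + 1))


def painted_positions(classes):
    """Apply both rules to a sibling list given as class names (or None)."""
    tags = [{"start"} if c == "example-title" else set() for c in classes]
    painted = set()
    for pos, cls in enumerate(classes):
        if cls == "example-title":
            painters = [paint_this]
        elif cls == "example-text":
            painters = [paint_start_here, paint_this]
        else:
            continue
        for painter in painters:  # fallback order
            result = painter(tags, pos)
            if result is not None:
                painted.update(result)
                break
    return sorted(painted)


with_title = [None, "example-title", "example-text", None, "example-text", None]
without_title = [None, "example-text", None, "example-text", None]
print(painted_positions(with_title))
print(painted_positions(without_title))
```

With the title present, positions 1 to 4 (nodes #2 to #5) are painted, including the untagged list. Without it, only the two “example-text” paragraphs are painted, each on its own.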
More real-world examples will be posted as supplemental material on our website in form of tutorials and how-tos in the following weeks and months.
upCast offers a convenient helper class for running pipeline documents from the commandline. It also allows you to pass parameter values to the pipeline if they have been defined in the Pipeline Settings > Pipeline Parameters tab.
The commandline pipeline document interpreter class reads the specified pipeline document and looks for all defined pipeline parameters in it:
If a parameter has the property required set to true, a value for it must be specified on the commandline. If no value is specified, the execution is stopped and an appropriate error message is output to the console.
If a parameter does not have the required property set to true, and if no value for it is specified in the commandline call, and if it has its default property specified, that specified value is set.
If a parameter does not have the required property set to true, and if its default property has not been specified, and if no value for it is specified in the commandline call, then this parameter will not be set at all. Trying to retrieve the value for such a parameter during pipeline execution will result in an error to the effect that the requested parameter resp. pipeline variable is undefined.
If a parameter is specified on the commandline that is not defined as a parameter in the pipeline, an error is issued to the console and execution is halted.
After these checks, the parameter values that are defined will be set as variables in the pipeline realm (similar to what happens when running the pipeline in Simple View mode), and then the modules will be executed in the order defined in the pipeline document.
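The four rules above can be condensed into a short Python sketch. It is not the code of the commandline interpreter class; the dictionary representation of parameter definitions, the function name resolve_parameters and the parameter names used in the example are assumptions for illustration.

```python
def resolve_parameters(definitions, supplied):
    """Return the parameter values to set in the pipeline realm.

    definitions maps each parameter name to a dict with the optional keys
    "required" (bool) and "default"; supplied maps names to the values
    given on the commandline. Raises ValueError where execution would stop.
    """
    unknown = sorted(set(supplied) - set(definitions))
    if unknown:
        raise ValueError(f"undefined pipeline parameter(s): {unknown}")
    values = {}
    for name, spec in definitions.items():
        if name in supplied:
            values[name] = supplied[name]
        elif spec.get("required", False):
            raise ValueError(f"missing required parameter: {name}")
        elif "default" in spec:
            values[name] = spec["default"]
        # Otherwise the parameter is not set at all and stays undefined.
    return values


definitions = {
    "input": {"required": True},
    "encoding": {"required": False, "default": "UTF-8"},
    "title": {"required": False},
}
print(resolve_parameters(definitions, {"input": "report.rtf"}))
```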
A pipeline document to be run by the commandline interface must be self-contained, i.e. it must explicitly specify
its license file
catalogs to be used
font configuration definitions or overrides
any custom encodings to be used
You should make sure that pipeline documents intended to be run via the commandline do not have their Use application settings checkbox checked on their Pipeline Settings > Catalogs, Pipeline Settings > Font configuration, Pipeline Settings > Encodings and Pipeline Settings > License tabs.
Note that by default, upCast’s built-in templates have this checkbox checked!
For parameters of type popup, the internal value (from the internal-values property list) must be passed in as the parameter value, not the displayed value.
java -cp upcast.jar de.infinityloop.upcast.RunPipeline parameters...
with parameters being:
[0] absolute path to the pipeline document to be run
[1..n] standard options
Standard options are as follows:
-p name value: set the pipeline parameter name to the value value
-debug N: turn on debug output for the conversion with specified level of verbosity; N is a number between 0 (least verbose) and 10 (annoyingly verbose)
-version: display upCast version information
-help: show help on the defined parameters for the specified pipeline document
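When the call is made from a script, the argument list can be assembled programmatically. The following Python sketch only builds the argument vector from the options documented above; the function name run_pipeline_command, the pipeline path and the parameter name input are hypothetical, and the classpath is assumed to be the upcast.jar shown in the call above.

```python
def run_pipeline_command(pipeline_path, parameters=None, debug=None,
                         classpath="upcast.jar"):
    """Build the argument list for a RunPipeline call."""
    command = ["java", "-cp", classpath,
               "de.infinityloop.upcast.RunPipeline", pipeline_path]
    for name, value in (parameters or {}).items():
        # Each pipeline parameter is passed as: -p name value
        command += ["-p", name, str(value)]
    if debug is not None:
        if not 0 <= debug <= 10:
            raise ValueError("debug level must be between 0 and 10")
        command += ["-debug", str(debug)]
    return command


command = run_pipeline_command("/upcast/pipelines/docbook.ucdoc",
                               {"input": "/data/in.rtf"}, debug=3)
print(" ".join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) to start the conversion.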
The use of XML namespaces is a core concept of upCast. Namespaces are essential to the processing pipeline, since they allow the clash-free co-existence of user-defined attributes and elements with upCast’s automatically generated elements and attributes. Clear separation of element and attribute domains allows targeted, semantically clear selection and filtering of the rich information present in the internal tree at serialization time.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-internal | uci |
All elements and attributes of the upCast Internal DTD are members of the http://www.infinity-loop.de/namespace/2006/upcast-internal namespace. The suggested namespace prefix is uci.
Besides the goal of avoiding name clashes, attributes are members of the upcast-internal namespace so that they can be put on any element in the internal tree, even if it is a non-upcast-internal element, and still be recognized easily as such.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-css | css |
The upcast-css namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-css has a recommended prefix of css. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-css namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.
The css namespace contains the current value of all properties at the context node that have been defined by either applying a class to an element or a manual style override. It is assumed that all properties are inherited, and that manual overrides take precedence over class application when occurring on the same node.
The upcast-css namespace contains CSS styling properties mapped to an attribute representation. Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:
property name | virtualized attribute name |
-ilx-<name> | css:ilx-<name> |
<name> | css:<name> |
The only time the virtual attributes in the upcast-css namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.
To export materialized attributes in the upcast-css namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.
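The name translation is needed because an XML name may not begin with a hyphen. A minimal Python sketch of the mapping, based on the table above, could look as follows; the function name virtual_attribute_name and the property name -ilx-foo are hypothetical.

```python
def virtual_attribute_name(property_name, prefix="css"):
    """Map a CSS property name to its virtualized attribute name."""
    if property_name.startswith("-ilx-"):
        # Drop the leading hyphen, which is not allowed at the start of
        # an XML local name: -ilx-foo becomes <prefix>:ilx-foo
        return f"{prefix}:ilx-{property_name[len('-ilx-'):]}"
    return f"{prefix}:{property_name}"


print(virtual_attribute_name("font-size"))
print(virtual_attribute_name("-ilx-foo"))
```

The same mapping applies to the other virtual namespaces when a different prefix is passed.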
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-cssoverride | csso |
The upcast-cssoverride namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cssoverride has a recommended prefix of csso. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-cssoverride namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.
The upcast-cssoverride namespace contains CSS styling properties mapped to an attribute representation. It contains only properties that have been brought into the tree by applying a manual, explicit, anonymous style property override at a certain node, usually by way of a style attribute with local style property settings. The properties available in the csso namespace on a specific context node consist of the union of all such properties having been applied either on the node itself or one of its ancestors in the described fashion, in order from document root to context node, unless they are identical in name and value with a property in the fully calculated cssc namespace on that node, in which case they are not added. (It is assumed that cssc properties are always inherited.)
Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:
property name | virtualized attribute name |
-ilx-<name> | csso:ilx-<name> |
<name> | csso:<name> |
The only time the virtual attributes in the upcast-cssoverride namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.
To export materialized attributes in the upcast-cssoverride namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-cssclass | cssc |
The upcast-cssclass namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cssclass has a recommended prefix of cssc. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-cssclass namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.
The upcast-cssclass namespace contains CSS styling properties mapped to an attribute representation. The cssc namespace contains only properties that have been brought into the tree by applying a named style class from an external stylesheet onto a node, usually by way of a style reference using the class attribute. The properties available in the cssc namespace on a specific context node consist of the union of all such properties having been applied either on the node itself or one of its ancestors in the described fashion, in order from document root to context node. It is assumed that cssc properties are always inherited.
Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:
property name | virtualized attribute name |
-ilx-<name> | cssc:ilx-<name> |
<name> | cssc:<name> |
The only time the virtual attributes in the upcast-cssclass namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.
To export materialized attributes in the upcast-cssclass namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.
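How the css, cssc and csso views relate to each other can be sketched in Python. This is an interpretation of the rules stated in the namespace descriptions, not upCast's implementation: it assumes that the css view is built by applying, node by node from root to context node, first the class properties and then the manual overrides. The node representation and the name computed_namespaces are hypothetical.

```python
def computed_namespaces(chain):
    """Return the (css, cssc, csso) property dicts for a context node.

    chain lists the nodes from the document root to the context node;
    each node is a dict with optional "class" and "override" dicts.
    """
    css, cssc, overrides = {}, {}, {}
    for node in chain:
        class_props = node.get("class", {})
        override_props = node.get("override", {})
        cssc.update(class_props)
        overrides.update(override_props)
        # On the same node, manual overrides win over class application.
        css.update(class_props)
        css.update(override_props)
    # Overrides identical in name and value to a cssc property are dropped.
    csso = {name: value for name, value in overrides.items()
            if not (name in cssc and cssc[name] == value)}
    return css, cssc, csso


chain = [
    {"class": {"color": "black", "font-size": "10pt"}},
    {"override": {"color": "red", "font-size": "10pt"}},
]
css, cssc, csso = computed_namespaces(chain)
print(css, cssc, csso)
```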
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-cals | cals |
The upcast-cals namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cals has a recommended prefix of cals. It is used to differentiate attributes on tables from similarly or likewise named attributes and elements in other table models like HTML or the internal table model. You can therefore already decide at the top-level cals:table element that you are dealing with a CALS table without having to infer this from the further descendant element structure.
namespace name | recommended prefix |
http://www.w3.org/HTML/1998/html4 | html |
The html namespace with the name http://www.w3.org/HTML/1998/html4 has a recommended prefix of html. It is used to differentiate attributes on tables from similarly or likewise named attributes and elements in other table models like CALS or the internal table model. You can therefore already decide at the top-level html:table element that you are dealing with an HTML table without having to infer this from the further descendant element structure.
namespace name | recommended prefix |
http://www.w3.org/1999/xlink | xlink |
The XLink namespace with the name http://www.w3.org/1999/xlink has a recommended prefix of xlink. It is used to identify linking attributes on elements.
namespace name | recommended prefix |
http://www.w3.org/XML/1998/namespace | xml |
The XML namespace with the name http://www.w3.org/XML/1998/namespace has a recommended prefix of xml.
In UPL, you can refer to variables and values in a specific realm using that realm’s namespace. For each realm, there is a corresponding namespace.
For details on UPL variable references, confer the UPL specification.
For details on upCast variable realms, see here.
realm | namespace name | recommended prefix |
application | http://www.infinity-loop.de/namespace/upcast-realm/application | application |
environment | http://www.infinity-loop.de/namespace/upcast-realm/environment | environment |
pipeline | | pipeline |
module | | module |
javaproperty | http://www.infinity-loop.de/namespace/upcast-realm/javaproperty | javaproperty |
include | | include |
map | | map |
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/upl/utility-functions | util |
upCast comes with a library of several UPL utility function definitions. To have those separate from your own function definitions, even if they may share the same local function name, all these functions are located in a specific namespace: http://www.infinity-loop.de/namespace/upl/utility-functions .
For more information, see the UPL specification once it becomes available.
Some settings need to be made early in the startup process of upCast. They are needed so early that they cannot be read by application-internal means, but must already be set and available when upCast starts running. To set those values in cases where their default is not desirable, you can pass them via Java system properties to the JVM running the upCast application.
The following parameters are available. Each is listed with its default, which in some cases is calculated dynamically based on the system/OS the application is running on:
de.infinityloop.exe.location (Windows only)
Default: ${application:BundledResources}/EXEs
Specifies the folder where upCast will look for supporting .exe files (like il-gw.exe, used for WordLink).
de.infinityloop.application.location
Default: (application installation root)
The folder where the application’s installation root lies.
de.infinityloop.application.preferencesdir
Default: (system dependent)
The folder where upCast will write its preferences file to.
de.infinityloop.application.logfile
Default: (system dependent)
The file where upCast will write its external logfile. Whether this value is actually used is dependent on the log subsystem chosen.
upCast uses SLF4J as the logging bridge and comes with a log4j 1.2 bridging implementation and log4j 1.2.x by default.
de.infinityloop.loglevel
Default: 3
Set this to a number greater than or equal to 0 that identifies the threshold below which messages are output to the log subsystem. The range currently in use is 0..7, with 7 being the most detailed debugging level, i.e. “output always”. For verbose logging, set this to a high value. For reduced logging, lower the number.
The default value 3 corresponds to INFO type messages (and above).
When running a pipeline using the de.infinityloop.upcast.RunPipeline class, use the -debug N option described there instead of the de.infinityloop.loglevel property.
de.infinityloop.application.maxvarrecursion
Default: 32
The maximum number of iterations a field value is variable-resolved in search of a point where it no longer changes between iterations before the process is aborted and a fatal error is issued about a possible infinite recursion.
de.infinityloop.logfilterspec
Default: (empty)
This property can be used to specify the log event filter applied at the interface to the external logging system (usually a file or the console). Additionally, it is used in a few selected places within upCast’s code base to prevent the time-consuming creation of complex log events at the point where they originate.
The filter expression syntax is the same as described here.
Example 13.1.
-Dde.infinityloop.logfilterspec=+ERROR,+FATAL,+INFO
only passes messages of type ERROR, FATAL and INFO to the external logging subsystem, but not WARN messages.
At this time, the only supported message constant preventing log message generation already at its origin is CurrentRTFToken.
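The startup properties above can also be set from Java code rather than on the command line. The following is a minimal sketch, assuming (as this manual states for de.infinityloop.exe.location) that values set before the first UpcastEngine object is instantiated are honored; whether this holds for every property listed here is not confirmed. The class StartupPropertiesSketch, the helper applyIfAbsent() and the chosen values are illustrative assumptions, not part of upCast.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper, not part of upCast: sets startup properties unless already given via -D. */
final class StartupPropertiesSketch {

    /** Sets each property only if it has no value yet, so command-line -D settings win. */
    static void applyIfAbsent(Map<String, String> defaults) {
        for (Map.Entry<String, String> entry : defaults.entrySet()) {
            if (System.getProperty(entry.getKey()) == null) {
                System.setProperty(entry.getKey(), entry.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        // Example values only; property names are taken from the list above.
        defaults.put("de.infinityloop.loglevel", "5");
        defaults.put("de.infinityloop.application.maxvarrecursion", "64");
        applyIfAbsent(defaults);
        // Assumption: the application would create its UpcastEngine only after this point.
        System.out.println("loglevel = " + System.getProperty("de.infinityloop.loglevel"));
    }
}
```

Because existing values are left untouched, the same application can still be reconfigured with -D options at launch without recompilation.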
It is strongly recommended to only use the detailed Java API (described in this chapter) when your requirements do not let you use the static de.infinityloop.upcast.UpcastEngine.runPipeline() method. This is the case e.g. when you must dynamically select and parameterize the modules and their sequence of execution, or must react specifically in Java code on error conditions after each single module execution.
If you do not absolutely need these fine-grained control capabilities, which is usually the case when you can set up and run the pipeline you need using just the upCast GUI, please do not use the low-level API described in the following. Instead, use the de.infinityloop.upcast.UpcastEngine.runPipeline() method with the pipeline or parameter set file you developed in the GUI. This lets you change the pipeline without re-compilation and therefore makes maintenance much easier.
Complete sample code (just a few lines of Java) ready for copy and paste into your Java project for each individual pipeline using de.infinityloop.upcast.UpcastEngine.runPipeline() can be obtained from the pipeline documentation in HTML format you get from File > Generate Documentation….
upCast functionality is accessed via one instance of a broker object: UpcastEngine. You should create one instance of that object at startup and use it for many subsequent conversions, since creating this object is rather expensive. Reusing the object for subsequent conversions causes no problems (in contrast to many XML parser implementations, for example). On the contrary, reuse is highly recommended from a performance point of view.
You may create several instances of the UpcastEngine object in order to run multiple conversion threads at the same time in your single application. Please note that the maximum number of parallel threads may be restricted by your license.
We assume that you are familiar with Java programming and its concepts like objects, interfaces and implementations. You should also be fluent with the Java object notion and with Java Streams.
The javadoc API reference can be found here.
The general programming steps are as follows:
1. Instantiate a de.infinityloop.upcast.UpcastEngine object. You can think of this object as the interface to your pipeline.
2. Set the pipeline base URI using the setPipelineBaseURI() method.
3. Register that instance with an appropriate license file using its setLicense() method.
4. Set global pipeline parameters like catalogs to use, overrides to the standard font configuration and custom encodings to use via the appropriate instance methods.
5. Call the initializeConversion() method.
6. Set pipeline variables using the setPipelineVariable() method.
7. Choose a module class via the method setModuleType(), which then internally gets instantiated and becomes the current module.
8. Set module parameters using (possibly repeated calls to) the setModuleParameter() method.
9. Start the module execution by calling runModule().
10. (optional) Repeat from step 7 for subsequent modules in the desired pipeline.
11. Call the cleanupConversion() method.
12. (optional) Repeat from step 5 for converting another document.
Expressed in actual Java code, this might look something like this:
String moduleID = null;
UpcastEngine ucInst = new UpcastEngine( "instance one" );
ucInst.setPipelineBaseURI( "file:///path/to/basefolder/" );
ucInst.setLicense( "file:///path/to/upcast.uclicense" );
// Start the conversion first: initializeConversion() clears the pipeline variable realm.
ucInst.initializeConversion();
ucInst.setPipelineVariable( "DestinationFolder", "/test/out/" );
ucInst.setPipelineVariable( "ImageDestinationFolder", "/test/out/" );
// First module: import the RTF source.
moduleID = ucInst.setModuleType( UpcastEngine.kRTFImporterType );
ucInst.setModuleParameter( moduleID, "OrigNumbering", Boolean.TRUE );
ucInst.setModuleParameter( moduleID, "SourceFile", "/test/in/in.rtf" );
ucInst.runModule( moduleID );
// Second module: export the result as XML.
moduleID = ucInst.setModuleType( UpcastEngine.kXMLExporterType );
ucInst.setModuleParameter( moduleID, "DeleteEmpties", Boolean.FALSE );
ucInst.setModuleParameter( moduleID, "DestinationFile", "${pipeline:DestinationFolder}/out.xml" );
ucInst.runModule( moduleID );
// End the conversion so that temporary files are removed.
ucInst.cleanupConversion();
To quickly construct a slightly more sophisticated Java source code template for a pipeline you have already built using the GUI, use the File > Export to Java source command. You can then modify this generated code to your liking, preferably by subclassing it and overriding methods where needed.
You gain access to all functionality of upCast by means of objects of a single class: UpcastEngine. An instance of this object is what you will use in your application in order to access the full range of upCast API functionality.
Before you can do anything with upCast, you need to instantiate an UpcastEngine object:
UpcastEngine ucInst = new UpcastEngine( "instance one" );
The UpcastEngine class is to be found in the de.infinityloop.upcast package.
You should keep this object stored in a variable which you can access from all places inside your program where you need to access upCast functionality.
You should strive to have only one instance of the UpcastEngine object per physical CPU at any time for performance reasons. Also make sure you only instantiate this object once during the life of your application process, as instantiating and disposing of this object is a relatively costly operation.
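One way to follow this advice is to create a small, fixed number of instances once and hand them out for reuse. The sketch below is a generic pool written against plain JDK classes; the class EnginePoolSketch and its method withEngine() are hypothetical names, and the type parameter E merely stands in for UpcastEngine so that the example is self-contained.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative fixed-size pool: instances are created once and reused for every task. */
final class EnginePoolSketch<E> {

    private final BlockingQueue<E> idle;

    /** Creates all instances up front, so the expensive construction happens only here. */
    EnginePoolSketch(int size, Supplier<E> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows an instance, runs the task with it, and always returns the instance afterwards. */
    <R> R withEngine(Function<E, R> task) {
        E engine;
        try {
            engine = idle.take(); // blocks while all instances are in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a free instance", e);
        }
        try {
            return task.apply(engine);
        } finally {
            idle.add(engine);
        }
    }
}
```

A pool size could be derived from Runtime.getRuntime().availableProcessors(), keeping in mind that this call reports logical rather than physical processors and that the license may restrict the number of parallel threads.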
In the GUI version of upCast, the pipeline base URI is set automatically for you, because the pipeline document determines its value. In the API, however, there is no such document, so you must tell the upCast pipeline processor the value of this property. It serves as the basis for resolving any ${pipeline:base} references you might have in module parameter values or pipeline setting values.
ucInst.setPipelineBaseURI( "file:///path/to/basefolder/" );
This should be called immediately after creating the UpcastEngine object instance.
To use upCast in API mode, a license file is required that includes either or both of the upcast-api and downcast-api features. If in doubt, contact us at licensing@infinity-loop.de.
ucInst.setLicense( "file:///path/to/upcast.uclicense" );
You can set pipeline properties directly on the UpcastEngine instance object. This includes amending or overriding the font configuration (setCustomFontConfiguration()), adding catalog files to be used by XML processing (addCatalog(), discardCatalogs()), and adding custom encodings to the set of built-in ones (addCustomEncoding()). These settings remain valid as long as the UpcastEngine instance lives or until you explicitly clear or set them to different values.
(Do not confuse these settings with the setting of pipeline variables; see below.)
Whereas in the GUI, you build a static pipeline by choosing a specific sequence of modules, the API handles a pipeline differently. In fact, there is no concept of a pre-built pipeline setup to be run; instead, you run modules one at a time. This has the great advantage that you can dynamically and programmatically build the actual pipeline for each single conversion, e.g. based on results of a preceding module execution on that input source.
ucInst.initializeConversion();
/* ... your pipeline code goes here ... */
ucInst.cleanupConversion();
Since upCast has to do some housekeeping for each conceptual pipeline run (independent of the actual number and sequence of modules run within), you need to tell it when you conceptually start a pipeline for a specific input file, and when you are done with it, i.e. when you have run the last module for this specific input file. This is done by the initializeConversion() and cleanupConversion() methods.
For example, initializeConversion() cleans the pipeline variable realm so that subsequent pipeline runs do not see values set by a previous run. And cleanupConversion() makes sure any temporary files created by some module get properly deleted when they are no longer needed.
It is very important that you obey this pipeline bracketing rule at all times, as strange, non-deterministic behaviour may occur otherwise.
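A try/finally block is one way to make sure the closing call is never skipped, even when a module fails. The following sketch uses a hypothetical interface, ConversionBracket, that only mirrors the two method names from the text so that the example compiles on its own; it is not the real UpcastEngine class, and the helper runBracketed() is likewise an assumption.

```java
/** Hypothetical stand-in exposing only the two bracketing methods named in the text. */
interface ConversionBracket {
    void initializeConversion() throws Exception;
    void cleanupConversion() throws Exception;
}

/** Illustrative helper, not part of upCast. */
final class BracketedRunSketch {

    /** The module calls to execute between the two bracketing calls. */
    interface Body {
        void run() throws Exception;
    }

    /** Runs the body between initializeConversion() and cleanupConversion(). */
    static void runBracketed(ConversionBracket engine, Body body) throws Exception {
        engine.initializeConversion();
        try {
            body.run();
        } finally {
            // Executed on success and on failure, so the bracket is always closed.
            engine.cleanupConversion();
        }
    }
}
```

With this shape, an exception thrown by a module still propagates to the caller, but only after the cleanup call has run.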
As in the GUI (by way of the Pipeline Variables module), you can set variables in the pipeline realm to be used by modules run subsequently. The method to use is setPipelineVariable(), e.g.:
ucInst.setPipelineVariable( "DestinationFolder", "/test/out/" );
The pipeline variable realm is cleared by a call to initializeConversion(). You therefore must explicitly (re-)set them at the beginning of a new conversion pipeline execution for a document.
Each module to be run has to be set up individually. This is done in three general steps:
1. Choose and set the module class to use.
2. Set module parameters.
3. Run the module.
First, you choose from one of the available module classes and set that using the setModuleType() method:
moduleID = ucInst.setModuleType( UpcastEngine.kRTFImporterType );
This will create a new instance of this module type and set it as the current module.
The constants to be used (in the UpcastEngine class) for the available module types are:
Module Type | Java constant name |
Pipeline Variables | |
RTF Importer (“upCast”) | kRTFImporterType |
UPL Processor | |
UPL Tree Processor | |
Sectioner | |
XML Exporter | kXMLExporterType |
Commandline Processor | |
XSLT Processor | |
Unicode Translation Processor | |
XML Validator | |
CSS Exporter | |
RTF Exporter (“downCast”) | |
XML Importer | |
External Pipeline Processor | |
The parameters of the newly created module are set to their defaults. The setModuleType() call returns a handle (a String) to that module, which you must pass to subsequent setModuleParameter() calls:
ucInst.setModuleParameter( moduleID, "SourceFile", "/test/in/in.rtf" );
The default parameter settings of modules are not documented. Though usually reasonable, they may change from release to release. We therefore highly recommend setting all parameters of a module explicitly to the desired values so that your code does not break after an upCast update.
Like in the GUI version of upCast, you can use variable references in the parameter values which will be resolved by upCast automatically.
Example 14.1.
To specify the source file relative to the pipeline base directory (whatever value it is currently set to), use a line like
ucInst.setModuleParameter( moduleID, "SourceFile", "${pipeline:base}/source/in.rtf" );
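The automatic resolution of variable references can be pictured as a fixed-point iteration: references are substituted repeatedly until the value stops changing, and the de.infinityloop.application.maxvarrecursion property bounds the number of rounds. The following is only a sketch of that idea, not upCast's actual resolver; the class VariableResolutionSketch, its methods and the flat "realm:name" lookup map are assumptions made for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: resolves ${realm:name} references until the value no longer changes. */
final class VariableResolutionSketch {

    // Matches references such as ${pipeline:base}; group 1 is the text between the braces.
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Performs one substitution pass; unknown references are kept verbatim. */
    static String resolveOnce(String value, Map<String, String> variables) {
        Matcher m = REFERENCE.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = variables.get(m.group(1));
            if (replacement == null) {
                replacement = m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Repeats passes until a pass changes nothing; fails once the iteration limit is used up. */
    static String resolve(String value, Map<String, String> variables, int maxIterations) {
        String current = value;
        for (int i = 0; i < maxIterations; i++) {
            String next = resolveOnce(current, variables);
            if (next.equals(current)) {
                return current; // fixed point reached
            }
            current = next;
        }
        throw new IllegalStateException(
            "Value still changing after " + maxIterations + " iterations; possible infinite recursion");
    }
}
```

A variable whose value refers to itself and adds text on every pass never reaches a fixed point, which is the situation the iteration limit guards against.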
Parameter names for each module are given in the description of the individual modules earlier in this manual. The parameter value has to be passed as a Java Object. The required object class depends on the specific parameter and is documented for each available parameter.
If you set a parameter more than once, the last value set will be used.
To set several parameters, you need to repeatedly call the method setModuleParameter().
If you try to set a parameter that is not supported by the current module, the parameter simply will have no effect, but no error is reported. To track which parameters you set in your application, you should turn on debug logging.
If you use a different Java Object (sub-)class for the parameter value than specified in the reference section, the behaviour is undefined. Some types may be compatible, but in general you will get a Java exception at some point later in the execution of upCast or the operation will not work the way you intended.
Finally, you’ll want to kick off the module’s execution. This is done by the runModule() method:
ucInst.runModule( moduleID );
After this, you can set up the next module exactly as described in this section so far. You can even base the selection of the next module on some condition, such as the value of a pipeline variable that the module just run may have set. You can query the values of variables in the pipeline realm using the getPipelineVariable() method.
To access WordLink functionality also from upCast running via the API, you need to tell it where the WordLink component il-gw.exe is to be found before you instantiate an UpcastEngine object. This is done by setting the system property de.infinityloop.exe.location to the folder where il-gw.exe resides:
System.setProperty( "de.infinityloop.exe.location", "/path/to/il-gw-folder/" );
On a typical Windows installation, this is C:\Program Files\infinity-loop\upCast\Resources\EXEs , but you are free to move the application file il-gw.exe anywhere in your filesystem where it is convenient for your deployed application.
Using WordLink in a critical server-based unattended environment is not supported and therefore not recommended. WordLink uses an installed copy of Word in component mode. Microsoft explicitly warns against such use in server or server-like applications for technical reasons (let alone any remaining licensing issues).
WordLink must access and launch Word to do what it needs to do. However, when running in a server environment, rights of running processes are usually tightly restricted. For example, Word might not be allowed to be accessed by the server process as COM object.
To make WordLink work in such restricted environments, you need to explicitly grant the user running the server access to the Word COM object. You can check and do this as follows:
1. On the Windows commandline, start dcomcnfg.exe .
2. Choose the component “Microsoft Word Document” (or similar, depending on localization) and click Properties... .
3. Under Security > Use custom launch permissions, add the account that runs the server using Edit... and then Add... . (On one of our machines, for example, this was “ASPNET (ASP.NET Machine Account)”.)
After this modification, WordLink should also work in the restricted environment.
During a single call to an API method, several problems may occur, some of them quite significant, some of them less significant. In every case, the method will throw a single UpcastException. An UpcastException is a special descendant of a java.lang.Exception that encapsulates a list of errors and/or warnings that occurred during the last call to an API method.
You can query an UpcastException for its single constituents, which are objects of type LogEvent. A LogEvent encompasses:
a numerical message code
a message class, one of: FATAL, ERROR, WARN, INFO, DEBUG, VERBOSE, DETAIL
a human readable message as String
a (possibly null) array of parameters that were used in constructing the message
The recommended coding style for error handling is to wrap each call to an API method in its own try{}/catch{} block and to catch UpcastException explicitly. This is useful if, for example, the runModule() call throws an exception whose severity is low: you may decide to continue processing because it only contains a warning that you do not care about and that does not affect document integrity. By wrapping each call separately, you get the most out of any sequence of API calls by skipping only the portions that did not work.
A typical API call including error handling would look something like:
try {
    ucInst.runModule( moduleID );
} catch( UpcastException e ) {
    // React only to FATAL or ERROR entries, not to warnings.
    if( e.extractSignificantEntries(
            new int[] { LogEvent.FATAL, LogEvent.ERROR },
            null,
            null ).size() > 0 ) {
        // ... do some error handling ...
    }
}
Using the extractSignificantEntries() method, you can specify in great detail which messages you are interested in. For more information on how to use this method, see the javadoc API reference.
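The idea of selecting entries by message class can be sketched independently of the upCast classes. The example below uses simplified, hypothetical stand-ins (SeverityFilterSketch, its Severity enum and Event class) rather than the real LogEvent and UpcastException types, whose exact signatures are given in the javadoc API reference; it only shows the decision of whether a set of reported events warrants aborting.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: simplified stand-ins, not the upCast LogEvent or UpcastException classes. */
final class SeverityFilterSketch {

    /** The message classes named in the text. */
    enum Severity { FATAL, ERROR, WARN, INFO, DEBUG, VERBOSE, DETAIL }

    /** Hypothetical, reduced log event: numeric code, message class, readable message. */
    static final class Event {
        final int code;
        final Severity severity;
        final String message;

        Event(int code, Severity severity, String message) {
            this.code = code;
            this.severity = severity;
            this.message = message;
        }
    }

    /** Returns the events whose message class is in the wanted set, in their original order. */
    static List<Event> select(List<Event> events, Set<Severity> wanted) {
        List<Event> result = new ArrayList<>();
        for (Event event : events) {
            if (wanted.contains(event.severity)) {
                result.add(event);
            }
        }
        return result;
    }

    /** True if at least one FATAL or ERROR event is present. */
    static boolean shouldAbort(List<Event> events) {
        return !select(events, EnumSet.of(Severity.FATAL, Severity.ERROR)).isEmpty();
    }
}
```

A caller following the per-call try/catch style could apply such a check in each catch block and continue with the next module when only warnings were reported.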
The message codes are all constants of a special class, Msg. See the javadoc API reference for a description of the currently defined message codes and the number and semantics of parameters available for a specific message.
upCast’s distribution jar includes an Ant task that lets you run upCast pipelines from files (*.ucdoc) from within Ant. This has the advantage that usually, you do not have to create the Ant task code anew whenever you make minor to moderate changes to your pipeline. To use it, you have to first define the task for use by Ant, then create the correct sub-structure of the upcast task.
To quickly construct an Ant build file code template for the upcast-runner task, use the File > Export to Ant > using 'upcast-runner' Task command. You can then modify this generated code to your liking or include it into an existing build file.
To define the task, use the following code:
<taskdef
  classname="de.infinityloop.upcast.ant.UpcastRunnerAntTask"
  name="upcast-runner"
  classpath="upcast.jar"
/>
For upcast.jar, you must specify the path to the distributed upcast.jar file. E.g., if you have a specific tasks folder next to your build file, you should copy the upcast.jar file there and specify ${basedir}/tasks/upcast.jar.
<upcast-runner
    file="/path/to/pipeline.ucdoc"
    logfilter="DEBUG"
    sourceparam="SourceFile" >
  <source dir="...">
    <include name="pattern" />
    …
  </source>
  <catalogs>
    <catalog file="..." />
  </catalogs>
  <param name="..." value="..." />
  …
</upcast-runner>
On the upcast-runner task itself, some global parameters need to be set, above all file, which is the absolute path to the pipeline to run by this task.
The upcast-runner task can contain the following elements as nested elements: source (to set the source files; see below for the exact semantics), catalogs (to specify global catalog files to use; needed to resolve the PUBLIC ID of the pipeline in case the task is used to run a parameter set (*.ucpar)), and one or more param elements setting the pipeline's public parameters.
We’ll discuss each of these elements in more detail in the following.