This document is intended as a technical reference manual to upCast RT (in the following called just upCast).
It is not intended as a tutorial on how to use upCast efficiently, on best practices for creating pipelines, or on similar topics. These will be covered in separate tutorial-style documents, a How-To section on our website, and a Frequently Asked Questions document. Please turn to these resources as they are published on our website http://www.upcast.de/ in the near future.
This reference document describes upCast RT 8.0.
upCast is a module-based document processing pipeline tool, specializing in legacy, "flat" and layout-driven content. It comes with pre-defined, configurable, task-oriented modules (performing operations such as importing data, XSLT processing, serialization and validation) that you can arrange in any order to create a pipeline. Pipelines can be saved and parameterized as a whole and then be run within upCast’s UI, from the commandline, or directly from Java.
Pipelines can be set up to be fully relative in their file addressing and can therefore be shared without modifications between computers, even across different platforms.
To run upCast, you must meet the following minimum requirements:
Java Runtime Environment 7.0 or later ("Java 7")
Xerces 2.11.x or later (upCast includes Xerces 2.11.0 and does not work on systems that have a version earlier than Xerces 2.9 in their classpath)
1024 MB of Java heap available to upCast (depending on document size and pipeline configuration, actual memory requirements may be lower or higher)
Display resolution of at least 1280 x 1024 (when running the graphical development environment)
The highest-level component type in upCast usage is a document processing pipeline, or short: pipeline. Pipelines can be saved into documents (file extension .ucdoc) and recalled at any time. Complete pipelines can be exported into several formats, like a Java source file or source code for an Ant target.
To the user, upCast presents its functionality in two layers: the so-called "Simple View" and the "Edit View". Think of the Simple View as a simplifying, user-oriented layer over the Edit View, which is developer-oriented and shows the actual, fine-grained and possibly complex implementation of the conversion pipeline.
Pipelines are made up of modules. Modules each perform a specific and specialized task. Modules can be divided into the three categories importers, processors and exporters based on the tasks they perform.
Importers import documents into the internal document format. upCast currently includes a high-quality RTF/Word importer.
Processors come in two variants for internal and external processing. Internal processors modify the current, internal document representation. This is carried out in-place. External processors can be used to perform general tasks which are not dependent on the internal document, like running a shell command.
Exporters are used to serialize the internal document or part thereof in one of several formats.
Within a pipeline, at any time during execution there’s exactly one internal document representation the tasks are performed on. This means that modifications are in most cases performed in-place, so changes made to the internal document tree by one module are visible to subsequent ones.
While a run of an importer always replaces the internal document, you can have several exporters that serialize the same internal document in different ways. You can also serialize the document at any point in the pipeline and apply additional modifications using processors afterwards.
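The single-internal-document model described above can be sketched in a few lines of Java. This is purely an illustrative sketch of the concept, not upCast's actual API; all class, interface and method names here are hypothetical. An importer replaces the shared document, a processor modifies it in-place, and an exporter serializes it without changing it, so each module sees the changes made by its predecessors.

```java
import java.util.List;

// Hypothetical sketch of the pipeline model: one shared internal document,
// modified in-place by a sequence of modules. Not upCast's real classes.
public class PipelineSketch {
    static class Document { StringBuilder content = new StringBuilder(); }

    interface Module { void run(Document doc); }

    // Runs a small three-stage pipeline and returns the final document text.
    static String demo() {
        Document doc = new Document();
        List<Module> pipeline = List.of(
            // "importer": replaces the internal document wholesale
            d -> d.content.replace(0, d.content.length(), "<doc>imported</doc>"),
            // "processor": modifies the same document in-place;
            // its changes are visible to all subsequent modules
            d -> d.content.insert(5, "processed "));
        for (Module m : pipeline) m.run(doc);
        return doc.content.toString();
    }

    public static void main(String[] args) {
        // An "exporter" would serialize the document without changing it:
        System.out.println("export: " + demo()); // export: <doc>processed imported</doc>
    }
}
```

Running two exporters against the same document at different points of such a list is exactly the "serialize, then keep processing" pattern described above.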
It is often useful to be able to save, quickly recall and share with other users different parameter settings for running a certain, parameterized pipeline. Such a parameter set can be saved into documents (file extension .ucpar) and recalled at any time.
A parameter set document only contains the pipeline parameter values as they are set in the Simple View at the time of saving. Only parameters that have their persistent property set to true are saved in that document. It also contains the Pipeline UID of the pipeline document it is based on so that it can load its implementation for execution.
For details, see the section on Parameter Sets.
Usually, a conversion is a three-phase process: You import the source data into the application, process the data, and export the result. Sometimes, a fourth, external post-processing step is added. upCast offers various modules, which can be divided into three different classes: Importers, Processors and Exporters.
Here’s a diagram of a typical upCast pipeline (with the internal document indicated over time):
Download the Windows installer package and run the installer. It will create a customized Java launcher, create shortcuts and register appropriate file type associations. You then launch upCast by clicking the upCast RT application icon.
Download the disk image file (.dmg), mount it by double-clicking, and copy the upCast RT application to your Applications folder (or any other place you wish). Start upCast by double-clicking the application icon.
Download the upcast.jar file and launch it from the commandline using
java -Xmx1024m -jar upcast.jar
, which is short for
java -Xmx1024m -classpath upcast.jar de.infinityloop.upcast.AppUI
There are a few additional commandline options that are supported. Here's the synopsis:
java -jar upcast.jar parameters...
with parameters being either of the following variants of option sets:
absolute path(s) to the pipeline or parameter set document(s) to be opened initially; any files passed here will override the Re-open documents that were open when last quitting setting in the application's preferences
standard options
Standard options are as follows:
one or more (XML-) catalog files to set up as global upCast catalogs before further processing. Setting this option is essential when you want to initially open parameter sets (*.ucpar) that rely on resolving their PUBLIC identifier to find the corresponding pipeline implementation file.
Example 3.1.
java -jar upcast.jar myfile1.ucdoc myfile2.ucpar -catalog catalog1 catalog2
starts upCast in GUI mode, initially opening myfile1.ucdoc and myfile2.ucpar, with the catalog system set up to use the catalog files catalog1 and catalog2.
will display extensive version and execution environment information of the upCast application as present in the respective upcast.jar file
Example 3.2.
java -jar upcast.jar -info
will print extensive version and environment information to the console.
will display upCast version and build info
Example 3.3.
java -jar upcast.jar -version
will print version number, build number and build date to the console.
Variant 4
will print the upCast raw build number, followed by a newline character, to the console
The upCast UI is designed to be simple and effective. An upCast document is a complete pipeline setup and can be saved in a file with the default extension .ucdoc. Each document is shown in its own window, and you can have several pipelines open at the same time.
A document window in edit mode is divided into three parts.
The left pane shows the sequence of modules that make up the pipeline. The position of a selected module in the pipeline can be changed by using the nudge-up/nudge-down controls at the bottom of the list. A module can be deleted from the pipeline with the "–" control, a module can be added by clicking the "+" control and choosing the desired class from the popup. There can be multiple instances of the same module type in a pipeline as required, e.g. two or more XSLT processors.
A module can have the following decorators:
The right pane shows the parameters for the currently selected module. Only one module can be selected at any time. Changes to a module’s parameters are effective immediately.
At the bottom of the window, the pipeline execution controls are placed for executing a pipeline, stopping it underway and checking its progress.
This display is replaced by a dynamically generated, forms-like interface when the Simple View option is engaged.
This command lets you create either a parameter set based on a factory-supplied template or a new, independent, self-contained pipeline configuration from one of the available templates.
Creates a parameter set from the respective template’s main pipeline document.
The advantage of just creating a parameter set is that if you do not need to tweak the implementation, but just use the pipeline template’s functionality as-is only with variable parameter values, you will benefit from updates and bugfixes to the template automatically without any further manual intervention required. This comes from the fact that the parameter set only holds a reference to the actual template implementation and therefore is automatically updated when the implementation is.
Creates a full, physical copy of all the pipeline documents and resources the template is made up of. You are asked for the location (folder) and a base name for the new pipeline. Within the selected folder, a new folder by the specified name is created and any resources of the template, including the pipeline document, are copied into that folder.
A pipeline created from the chosen template in this way is completely independent of its template. This means two things:
you get a complete, independent copy of the original template definition and resources
any updates to the template are not propagated forward to any pipelines you already have created based on an older version of the template
You can create your own, specific templates. For details on what makes a pipeline an upCast template and where to put those templates for upCast to recognize them, see the chapter on Pipeline Templates.
This shows a file chooser where you can open an already existing pipeline or parameter set document.
Shows in a sub-menu the most recent pipeline and parameter set documents you had open in the past. The number of items displayed in the sub menu can be set in upCast’s preferences.
Pipeline or parameter set documents you had open recently, but which are no longer available (for example because they have been deleted or the disk they reside on is currently not mounted) are shown in disabled state.
Closes the top-most document. When changes to this document have not yet been saved, you are prompted to save them.
Saves the top-most document, which can be a pipeline or parameter set document, the log window or the system information window.
This allows you to save the top-most window under a new name.
Note that for pipeline documents that refer relatively to needed resources, saving a pipeline document to a different location will usually break those links and the pipeline will not run as expected, since upCast cannot reliably track those resource links and copy them along automatically.
This lets you save the persistent parameters and their values of the top-most pipeline document to a separate file, a parameter set document. This file internally links back to the pipeline document it was created from. This allows you to separately store configurations of parameter values that look like separate pipelines but share one single pipeline implementation. When the latter gets updated, so do all parameter sets originating from it.
See the section on parameter sets for more information on how the linking to the respective pipeline document works and what the restrictions of parameter sets are.
Saves the current pipeline document in form of an Ant task. Additional parameters can be set for the export operation in the Pipeline Settings dialog under the Export tab.
exports an Ant task making use of an upCast runner object, which reads the specified pipeline and executes it. This is the recommended export option since you need to generate that task only once and it picks up automatically any changes in the referenced pipeline document.
this creates a fully self-contained Ant task of the current pipeline’s configuration. This means that the task can be run without having access to the original pipeline document it was generated from. This may be useful when you used the original pipeline document only for prototyping and testing and want to apply changes directly to the Ant task’s definition thereafter, or when you can recreate the task automatically when making changes to the pipeline document (e.g. in an automated build using upCast’s Tools class).
Saves the current pipeline document as Java Source code. Additional parameters can be set for the export operation in the Pipeline Settings dialog under the Export tab.
exports Java source code making use of upCast’s RunPipeline class, which reads the specified pipeline and executes it. This is the recommended export option since you need to re-generate the source code for that class only when the pipeline parameter configuration changes (i.e., parameters are added or removed) and it picks up automatically any further changes in the referenced pipeline document.
this creates a fully self-contained Java class of the current pipeline’s configuration, utilizing the methods of the upCast Java API’s UpcastEngine class. This means that the code can be run without having access to the original pipeline document it was generated from. This may be useful when you need fine-grained control over error handling for each individual module’s execution step and/or need to dynamically execute additional code that cannot be integrated into a standard pipeline execution.
Exports the current pipeline document as a human-readable XML source file.
This file is also used internally as the basis for the Ant task and Java Source export options, which are generated by appropriately configured XSLT transformations. With this export, you can create your own formats of export (e.g. customized Java code export or extended documentation generation).
The operations Cut, Copy and Paste are supported context sensitively, depending on where the current keyboard focus is directed to:
When the focus is on a text field, these methods work as usual.
When the focus is on a module in the pipeline modules list, that module’s complete definition is copied onto the clipboard in the form of an XML snippet. When you use Paste while the focus is on a module entry, the module description on the clipboard is read and a new module is inserted above the currently selected module, with all parameters set as for the module you copied. You can even copy modules conveniently across open pipelines this way.
When the focus is on a module in the pipeline modules list, this command will create UPL source code for running the selected module from UPL using the run-module() function and put it as text onto the clipboard. You can then insert it into a UPL code field within upCast or into your favorite external editor where you are writing your UPL code.
With this toggle, you switch between the Simple View and Edit View of a pipeline configuration.
When checked, upCast shows its pipeline window in Simple View mode, hiding the actual pipeline implementation and showing only entry fields for the pipeline parameters that a typical user must supply.
When you want to edit the details of a pipeline, uncheck this item.
The state of this parameter is saved to the pipeline document and automatically restored at opening time. This means that for final distribution to your customers, check this parameter, then save the document again before packaging it into your distribution.
Shows a window with detailed information on the execution environment of the topmost pipeline document and the upCast application, including version information on available XSLT processors, Java, loaded modules, license info etc. You may be asked by infinity-loop support for this info when tracking down problems you may have with upCast.
Shows a window with the external log file or a live view of log events as they are generated from within upCast.
The Source popup menu lets you choose between these two modes:
shows the current contents of the log file on disk
shows the log events in the system as they are generated from log sources within upCast
When showing Live Events, you can set a filter describing which log events generated by upCast should be displayed. This is done using the Filter text field. This setting is completely independent from the log level setting in upCast’s preferences. Several pre-defined settings are available from the associated popup menu, but you are free to specify any log event filtering expression you wish. The filter expression syntax is described here and is the same as used in other places within upCast.
All log events are held indefinitely while the window is open or until you click Clear Window, so you should not leave the window open unattended, as you will otherwise run out of heap space at some point. When the window is in Live Events mode, depending on the number of logging events to display, you will see a performance degradation of pipeline execution. There’s no performance penalty when the window is closed, as it then detaches itself from all log sources automatically.
With Save as Text…, you can save the current contents of the window to a text file. You may be asked by infinity-loop support for this info when tracking down problems you may have with upCast.
Shows this upCast reference documentation manual in the host system’s default web browser.
Shows the UPL reference documentation in the host system’s default web browser.
Shows the upCast API documentation (in javadoc format) in the host system’s default web browser.
Opens a pre-configured email in your default email application, ready to be amended by your problem report or question to infinity-loop Support department. This includes system information which you can preview in the generated email and – when desired for privacy reasons – trim to your liking.
You should use this function whenever you want to report a bug or problem to infinity-loop.
upCast offers several variable realms. Realms are distinct, non-overlapping value storage spaces. Think of them as different buckets placed next to each other, labelled with the realm name.
Some of the realms are read-only, and some of them calculate the actual value of a variable at the time of retrieval access.
Here’s an overview of the different realms and their names (monospace bold grey print) available in upCast:
To get or set a variable, two components must be specified:
the realm
the variable name
A variable reference is resolved in an upCast parameter field by simply replacing the variable reference by the textual value of the variable referenced.
It is important to always keep in mind that the variable resolution process is an utterly dumb textual replacement process (much like a #define works in the C programming language). Specifically, no quoting or unquoting is performed.
The result of a variable reference to a variable that does not exist or cannot be resolved is the variable reference itself.
A piece of text containing variable references is processed as many times as the result changes. This allows you, for example, to have references to the include realm resolved also in already-included content. Consequently, you must make sure that content which looks like a variable reference but may not be resolved is properly quoted (e.g. by doubling the $ sign). To avoid potential infinite recursion, this repeated resolution process is terminated when the result still keeps changing after a certain number of iterations. The limiting number of iterations is currently set at 32 by default; it can be changed by setting the Java property de.infinityloop.application.maxvarrecursion.
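The repeated, purely textual resolution described above can be sketched in Java roughly as follows. This is an illustrative sketch only, not upCast's implementation: it handles a single flat namespace of variables and omits realms, modifiers and the $-doubling quoting.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VarResolver {
    static final int MAX_ITERATIONS = 32; // mirrors the default recursion limit
    static final Pattern REF = Pattern.compile("\\$\\{([^${}]+)\\}");

    // Dumb textual replacement, repeated until the result stops changing
    // (or the iteration cap is hit). Unresolvable references stay verbatim.
    static String resolve(String text, Map<String, String> vars) {
        for (int i = 0; i < MAX_ITERATIONS; i++) {
            Matcher m = REF.matcher(text);
            StringBuffer sb = new StringBuffer();
            boolean changed = false;
            while (m.find()) {
                String value = vars.get(m.group(1));
                if (value != null && !value.equals(m.group(0))) {
                    changed = true;
                    m.appendReplacement(sb, Matcher.quoteReplacement(value));
                } else {
                    // unknown variable: the reference itself is the result
                    m.appendReplacement(sb, Matcher.quoteReplacement(m.group(0)));
                }
            }
            m.appendTail(sb);
            text = sb.toString();
            if (!changed) break; // stable result: stop re-scanning
        }
        return text;
    }

    public static void main(String[] args) {
        Map<String, String> vars = Map.of(
            "greeting", "Hello, ${name}!", // resolved again on the next pass
            "name", "world");
        System.out.println(resolve("${greeting}", vars)); // Hello, world!
        System.out.println(resolve("${unknown}", vars));  // ${unknown}
    }
}
```

Note how the value of greeting itself contains a reference, which is resolved on a subsequent pass; this is the behaviour that makes the iteration cap necessary.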
All variable names that start with an upper-case letter are reserved for upCast’s own use.
You should therefore name your own variables in such a way that they do not start with an upper-case letter, even if a like-named upCast-defined variable does not yet exist. We might introduce it in a subsequent release, which would make your pipeline no longer work correctly.
The syntax to refer to a variable in a specific realm is similar to that of Ant, albeit with a twist:
${realm:name#modifier}
Note the special #modifier part: it is useful when you want to modify the stored value of a variable in specific ways before returning it. This is most useful for file paths, e.g. to retrieve only the name of a file from an absolute path, its base name, or just the path to the file.
As with Ant, variable resolution is not recursive, i.e. you cannot write something like ${module:${pipeline:paramname}} to calculate the name of a module variable dynamically.
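Decomposing such a reference into its components can be sketched as follows. This is a hypothetical parser, not upCast's code; it assumes that both the realm prefix and the #modifier suffix are optional, as the examples in this chapter suggest.

```java
public class VarRef {
    final String realm;    // null when no realm prefix is given
    final String name;
    final String modifier; // null when no #modifier is given

    VarRef(String realm, String name, String modifier) {
        this.realm = realm; this.name = name; this.modifier = modifier;
    }

    // Parses a reference of the form "${realm:name#modifier}".
    static VarRef parse(String ref) {
        if (!ref.startsWith("${") || !ref.endsWith("}"))
            throw new IllegalArgumentException("not a variable reference: " + ref);
        String body = ref.substring(2, ref.length() - 1);
        String modifier = null;
        int hash = body.lastIndexOf('#');
        if (hash >= 0) { modifier = body.substring(hash + 1); body = body.substring(0, hash); }
        String realm = null;
        int colon = body.indexOf(':');
        if (colon >= 0) { realm = body.substring(0, colon); body = body.substring(colon + 1); }
        return new VarRef(realm, body, modifier);
    }

    public static void main(String[] args) {
        VarRef r = parse("${environment:dir-licenses}");
        System.out.println(r.realm + " / " + r.name);    // environment / dir-licenses
        VarRef s = parse("${SourceFile#localname}");
        System.out.println(s.name + " # " + s.modifier); // SourceFile # localname
    }
}
```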
The components of a variable reference are:
Example 4.1. Modifier sample results
With SourceFile having a value of "C:\Documents and Settings\upCast\The file.xml", the following variable references with modifiers will evaluate to:
${SourceFile#local} → C:\Documents and Settings\upCast\The file.xml
${SourceFile#url} → file:///C:/Documents%20and%20Settings/upCast/The%20file.xml
${SourceFile#localpath} → C:\Documents and Settings\upCast
${SourceFile#urlpath} → file:///C:/Documents%20and%20Settings/upCast
${SourceFile#localname} → The file.xml
${SourceFile#urlname} → The%20file.xml
${SourceFile#localextension} → xml
${SourceFile#urlextension} → xml
${SourceFile#localbasename} → The file
${SourceFile#urlbasename} → The%20file
${SourceFile#localbasenamepath} → C:\Documents and Settings\upCast\The file
${SourceFile#urlbasenamepath} → file:///C:/Documents%20and%20Settings/upCast/The%20file
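The #local… results in the table above can be reproduced with ordinary string handling. The following sketch is purely illustrative, not upCast's implementation, and covers only a subset of the modifiers for a backslash-separated local path.

```java
public class PathModifiers {
    // Applies a subset of the "#local..." modifiers to a local path.
    // Assumes the path contains at least one backslash separator.
    static String apply(String path, String modifier) {
        int sep = path.lastIndexOf('\\');
        String name = path.substring(sep + 1);   // e.g. "The file.xml"
        int dot = name.lastIndexOf('.');
        switch (modifier) {
            case "local":          return path;
            case "localpath":      return path.substring(0, sep);
            case "localname":      return name;
            case "localextension": return dot >= 0 ? name.substring(dot + 1) : "";
            case "localbasename":  return dot >= 0 ? name.substring(0, dot) : name;
            default: throw new IllegalArgumentException("unknown modifier: " + modifier);
        }
    }

    public static void main(String[] args) {
        String p = "C:\\Documents and Settings\\upCast\\The file.xml";
        System.out.println(apply(p, "localname"));      // The file.xml
        System.out.println(apply(p, "localbasename"));  // The file
        System.out.println(apply(p, "localextension")); // xml
        System.out.println(apply(p, "localpath"));      // C:\Documents and Settings\upCast
    }
}
```

The url… variants additionally involve file-URL conversion and percent-encoding, which is omitted here.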
Let’s have a look at the various realms in more detail:
This realm is not yet available and will be implemented in a later release of upCast.
This realm is read-only.
This realm includes upCast application-global values.
The following variables are currently defined:
variable name | description |
| path to the (OS/system-specific) support files folder |
| path to the resources folder bundled with the application distribution (when it was installed using one of the system-specific distribution packages) |
| the path to the external logfile as calculated by the application and/or set in the java system property |
By default, all file path values returned are in URL format. You can use all available modifiers on them, of course, to change format or extract parts from them.
Example 4.2.
To retrieve the location of the application’s support folder on the system it is running on, use:
${application:SupportFolder}
This realm is read-only.
This realm includes upCast pipeline-global environment values. Most of them are virtual in the sense that they reflect some current state of the execution environment at the time of recall and are not actually stored.
The following variables are currently defined:
variable name | Java type | description |
| Integer | the application version (0x0Mmr format) |
String | the application version in "M.m.r" format | |
Integer | the build number | |
String | the timestamp string of the build in the format " | |
| List | list of Strings of features included in the current license; features not active are enclosed in parentheses ‘(‘ and ‘)’ |
List | list of Strings that only includes features in the current license that are valid at the time of the query | |
String | a string describing the current license | |
Integer | number of days until the license feature | |
String | application installation folder | |
List | list of Strings identifying the locations of currently active XML catalog files in the pipeline | |
String | version information of the included Xerces parser | |
| String | version information of the included Xalan XSLT processor |
| String | version information of the included Saxon 9.x XSLT 2 processor |
String | version information of the included Saxon 6.x XSLT 1 processor | |
| Integer | version of the active WordLink component; returns |
| Integer | version of Microsoft Word that WordLink is currently linking to; returns |
| String | absolute path to the application used for implementing the WordLink functionality; returns |
| Integer | version of the active MathLink component (implementing the link to MathType 5); returns |
| Integer | version of the MathType DLL used for implementing MathLink; returns |
| String | the text currently displayed in the progress bar’s sub-label |
| String | the text currently displayed in the progress bar’s label |
| Long | the ordinal number (1-based) of the currently executed module task in the pipeline |
| Long | the total number of tasks defined in the current pipeline |
Long | the maximum value for completion indication of the current task | |
| Long | the current value of completion for the currently running task; the task is completed when this value is equal to |
| String | the folder searched for application support files |
| String | the folder searched for license files |
String | the absolute path of the log file written to | |
Boolean |
This means that the pipeline must be a top-level pipeline (see | |
| Boolean |
This means that the pipeline is not one that is executed within an External Pipeline Processor as a sub-pipeline |
String | returns the contents of the pipeline info window as string. This information may prove useful for debugging, as it contains the complete running environment information of this pipeline in human-readable form | 
| String | returns the compatibility version of the current pipeline as string, or the empty string when the info is not available. This information is a copy of the respective parameter setting in the Pipeline Info > Settings tab |
String | returns the build of the current pipeline as string, or the empty string when the info is not available. This information is a copy of the respective parameter setting in the Pipeline Info > Settings tab | 
Integer | the build number of the latest available version of this application. This information is retrieved from infinity-loop’s servers by fetching the URL http://versioncheck.upcast.de/upcast7.plist. When there is no newer version available, this returns 0. When the information could not be retrieved (e.g. due to a server error or if there is no active connection to the internet), |
By default, all file path values returned are in URL format. You can use all available modifiers on them, of course, to change format or extract parts from them.
Example 4.3.
To get information on the version of Xalan currently in use by upCast RT, write:
${environment:xslt-xalan-version}
which might return the value "Xalan Java 2.7.1".
For accessing these environment values from UPL, access them using the environment namespace like ordinary UPL variables. Java types as listed in the table above are coerced to the respective UPL types.
With a namespace definition of
#namespace environment "http://www.infinity-loop.de/namespace/upcast-realm/environment";
the code
println( $environment:dir-licenses );
might print the following on the console:
/Users/demo/Library/Application Support/infinity-loop/upCast RT/Licenses
and the code
println( $environment:license-features );
might print the following to the console:
{"rtfimportGUI","rtfimportAPI","rtfexportGUI","rtfexportAPI","uplGUI","uplAPI"}
It is often useful to store values that several modules will need as pipeline variables. Examples are the source document to process, the destination folder, the folder where images will be stored or the folder where temporary files should be created if needed by the pipeline.
The pipeline realm contains variables that are available to all modules in a specific pipeline. Each pipeline has its own set of pipeline variables. Modules can only access pipeline variables of the pipeline they are a member of.
The set of pipeline variables is cleared before each execution of a pipeline with the exception of the following special, pre-defined, read-only variables:
base
PipelineBase
This realm includes all parameters of a single module in a pipeline. This realm can only be accessed from within that module, and only the parameters of the currently executed module at the time of reference resolution can be accessed.
Referencing module variables is generally not recommended, since upCast has no defined order of variable resolution and will not determine a suitable one by itself. Referring to module variables can therefore lead to infinite loops or to unresolved references.
This realm is read-only, with the exception of the UPL execution context, where you can also set variables in that realm.
The javaproperty realm contains all currently defined Java system properties, either ones pre-defined by the Java Virtual Machine (like user.dir or user.home) or properties explicitly set on launch of the VM running the application.
This realm is read-only.
The include realm returns the contents of the file specified as the name of the variable. The syntax is as follows:
${include:/absolute/filepath/to/file.ext}
${include:relative/path/to/file.ext}
A relatively specified path is always considered to be relative to the value of ${pipeline:base}, i.e. the base URL of the pipeline.
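Resolving a relative include path against such a base URL works like standard URI resolution, as the following illustration shows; the base value used here is hypothetical.

```java
import java.net.URI;

public class BaseResolve {
    public static void main(String[] args) {
        // Hypothetical value of ${pipeline:base}; note the trailing slash,
        // which makes relative references resolve inside this folder.
        URI base = URI.create("file:///C:/pipelines/demo/");
        System.out.println(base.resolve("Resources/entity.map"));
        // file:///C:/pipelines/demo/Resources/entity.map
    }
}
```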
The include realm can take parameters, e.g. for specifying the encoding to be used for reading the file. The variable reference syntax can therefore take the following, extended form:
${include( paramname: "value" [, paramname: "value" ]* ):filepath}
The following table lists the possible parameters that can be specified for an include reference:
parameter name | value |
encoding | the (Java-) name of the encoding to be used for reading the file When this parameter is not specified, the platform’s default encoding is used. |
source | lets you choose where the data to be included should originate |
fallback | a string value which is the fallback replacement value when the include cannot be performed due to an error, e.g. if the referenced file does not exist, is not readable or has an encoding error. Normally, when a variable cannot be resolved in the include realm, the variable reference is left verbatim. With a fallback value, it is possible to conditionally include a file into some other piece of code if it is present (by setting fallback to the empty string), or even to insert some default code when the file is not present. |
Example 4.5.
1. The value of the variable reference ${include( encoding: "UTF-8" ):Resources/entity.map} is the text contents of the file pipeline-basedir/Resources/entity.map, read with UTF-8 encoding.
2. The value of the variable reference ${include( source: "variable" ):pipeline:DestinationFolder} is the text contents of the variable DestinationFolder in the pipeline realm.
3. Assuming the file pipeline-basedir/doesnotexist.txt does not exist or is not readable, the value of the variable reference ${include( fallback: "" ):doesnotexist.txt} is the empty string "", the value of ${include:doesnotexist.txt} is the string "${include:doesnotexist.txt}", and the value of ${include(fallback: "println('File does not exist!')"):doesnotexist.txt} is the string "println('File does not exist!')".
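The fallback semantics of these examples can be sketched as follows. This is an illustrative approximation, not upCast's code; the method name and signature are assumptions.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class IncludeResolver {
    // Mimics the behaviour described above: on a read error, return the
    // fallback if one was given, otherwise leave the reference verbatim.
    static String include(Path base, String relPath, String encoding, String fallback) {
        try {
            Charset cs = encoding != null ? Charset.forName(encoding) : Charset.defaultCharset();
            return Files.readString(base.resolve(relPath), cs);
        } catch (IOException e) {
            if (fallback != null) return fallback;
            return "${include:" + relPath + "}"; // unresolvable: reference stays as-is
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("pipeline-base");
        Files.writeString(base.resolve("entity.map"), "mdash 2014");
        System.out.println(include(base, "entity.map", "UTF-8", null));   // file contents
        System.out.println(include(base, "doesnotexist.txt", null, "")); // empty string
        System.out.println(include(base, "doesnotexist.txt", null, null));
        // ${include:doesnotexist.txt}
    }
}
```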
Parameters and variables are internally stored using standard, appropriate Java or UPL object types. Some parameters can take several different types, which, however, can only be set using native Java code using the upCast API or UPL functions. Parameters that can accept more than one of the following basic types will have this mentioned explicitly and in detail in their respective description section.
The basic parameter types are:
basic type | corresponding Java type (class or interface) | corresponding UPL type
Bool | | Bool
Integer | | Numeric
Double | | Numeric
String | | String
List | | List
 | | —
Some settings are global to the upCast application and (some of them optionally) affect all pipeline documents loaded.
These can be set in the upCast Preferences dialog, available under the application menu (Mac OS X) or the File menu (other platforms).
To make the settings active, click Apply or the window’s close button.
The parameters are grouped into tabs.
Create new document on launch when no others are open
When selected, a new default pipeline document will be created on upCast launch when no other windows (e.g. from a previous, saved session) are open.
Re-open documents that were open when last quitting
When checked, all documents that were open when upCast was last quit will be re-opened in their previous locations.
Remember the most recent ___ pipeline documents
Here, you can enter the number of recently opened documents that should be listed in the File > Open Recent menu. Decreasing the value will forget any document listings beyond that new number.
To clear the File > Open Recent menu, temporarily set the number to 0 and close the application preferences window by clicking Apply; then re-open the window and enter the number of documents you want to be remembered. Setting the value to 0 clears the entire internal list of documents and thereby the menu.
Here, you can specify a log event filter expression. Only log events passing the filter expression are actually written to the external log file. Several often-used filter expressions are available from the associated popup menu, however you are free to define your own customized filter expressions. The filter expression syntax is described here.
Check for updates on launch
When checked, upCast will contact the infinity-loop version server to check whether a newer version of upCast is available for download. If there is, you will be notified in an info dialog.
Clicking this button will let you manually check for updates. This is particularly useful when Check for updates on launch is not checked.
When switching between Simple View and Edit View …
This parameter lets you set the behaviour with regard to window positioning and sizing when switching between Simple View and Edit View of pipeline windows. upCast can remember location and size of the window in each of the two modes and restore those settings when switching between them. The following behaviours are available:
this option tries to keep the current window size and position when switching between Simple View and Edit View if possible (This is the default and mirrors upCast's pre-7.5 behaviour.)
this option restores the last size of the window in each of the two modes when switching between them, keeping the current window position (upper left corner) fixed
this option restores the last size and position of the window in each of the two modes when switching between them
Window sizes and positions for each mode are saved to the pipeline file (*.ucdoc and *.ucpar) and therefore are available again when re-opening it.
Pipeline Template Paths
In this text field, you can specify paths where upCast will look for pipeline templates, one path per line. Use this if you store personal or company templates at a central place on your network and want to make those templates available automatically within upCast.
The default path for templates, which points to the templates copied to disk during installation, is
${application:BundledResources}/templates
You must include this path in this field if you want to have access to the application-included templates. Conversely, if you want some users to not have access to the default templates but want them to be restricted to your specific, customized templates only, make sure that in those users’ installations, the default path is not included in the path list.
To add a path to the list, click Add Path… and navigate to the folder containing the pipeline template definition folders.
Empty lines or lines starting with // are considered comments and are discarded during parsing.
You can use variable references from the include, application and javaproperty realm, but you cannot use the pipeline realm since the setting is application-global.
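For illustration, a complete path list might combine a central company template folder with the application defaults. The network path shown here is purely illustrative:

```
// company-wide templates on a central network share (illustrative path)
/Volumes/Shared/upcast/templates
// keep the application-included templates available as well
${application:BundledResources}/templates
```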
upCast supports the use of catalog files. A catalog file is in its simplest idea a mapping definition between PUBLIC DTD identifiers and the location of a physical copy of that specific DTD (or more general, entity). The upCast application supports the catalog file format as defined in http://www.oasis-open.org/specs/tr9401.html as well as XML Catalogs.
To add a catalog file, choose Add Catalog… and select the catalog file to add from the file system. The new catalog will be available to all modules immediately after closing the preferences window.
To remove a catalog, just delete its entry line.
Catalogs are considered in the order displayed.
OASIS catalog files are read with platform encoding, XML catalog files with the encoding specified in their XML declaration.
By clicking Insert upCast defaults, code is added to pick up any upCast default catalog possibly delivered with the application. You should have that entry in place for best performance.
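As an illustration of the OASIS format, a minimal catalog file consists of PUBLIC entries that map a public identifier to a local copy of the corresponding DTD. The identifier and relative path below are examples only:

```
-- map a public identifier to a local copy of the corresponding DTD --
PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "docbook-4.5/docbookx.dtd"
```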
You can override the global Catalog setting individually for each pipeline.
Font configuration
Specify the source code for the stdfonts.config override that should be used for this pipeline.
Custom Encodings
To add a custom encoding file, choose Add Encoding… and select the custom encoding file (*.encoding) to add from the file system. Each line in the field specifies a custom encoding location.
To add a folder where upCast should pick all contained custom encoding files, choose Add Encodings Folder… and select it.
To remove a custom encoding entry, delete the text line containing its location specification.
This panel is for importing an upCast license file and reviewing current licensing status.
Certain module types require specific license features. The features available in the currently active license are listed in the license features table at the bottom of this panel. Please refer to the individual module’s documentation to check which license feature it requires to be fully (or: at all) functional.
Import new license
Clicking this button brings up a file chooser where you can find and select the license file you received from infinity-loop’s licensing department upon your license request or purchase. You get the chance to store this license in upCast’s Licenses special folder, so it will be available to you automatically at launch.
Pick from available licenses
Clicking this button shows you all licenses from upCast’s Licenses folder, plus any licenses packaged into the application itself, that can be used with this version of upCast. This allows you, for example, to switch between evaluation and full licenses or between licenses with different features.
Parameters will be described using the following typography:
Name | DeleteEmpties |
Java symbol |
|
Type | Boolean |
Value | false, true |
Name gives the internal name of the parameter (a java.lang.String), which is used for storing in preferences files and can be used as a String in the Java API. However, in Java, the use of the Java symbol is highly recommended instead.
Java symbol names the Java constant definition for the parameter’s Name. All constants are defined in the class de.infinityloop.common.Params.
Type specifies the recommended Java type to use when programming against the API. In the GUI version, this is taken care of automatically. When using alternative interfaces like an Ant task, which only allows passing arguments as character strings, upCast tries to perform an appropriate cast. Make sure that the data you provide in these cases is cast-able to the specified type, as otherwise the conversion will fail or produce incorrect results at runtime.
Value describes possible value ranges, supported keywords or other specifics about that parameter’s range.
The pre-defined pipeline variables PipelineBase and base (deprecated) are automatically made available in the GUI version of upCast (read-only) and contain the path to the current pipeline document (*.ucdoc), excluding the actual name, in URL format (including the trailing slash ‘/’). It is essential to have the pipeline document saved to a file on disk so upCast can determine this property. If the path cannot be determined, the current directory (Java property user.dir) is returned instead.
When using the upCast RT Java API (i.e., the UpcastEngine class) directly, this value must be explicitly set before working with pipelines that contain any references to values dependent on ${pipeline:PipelineBase}. Only use the setPipelineBaseURI() API method (class UpcastEngine) for setting the value for this pipeline variable.
You can use this to make the configuration independent from its actual location in the file system, by specifying paths relative to the base variable and storing all resources needed for the pipeline in subdirectories below this base URI.
For distributing a configuration, we recommend putting it at the root of a folder, with the required resources in sub-folders, according to the following layout:
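A layout along the following lines works well; all names except the .ucdoc extension are illustrative, and the only convention that matters is keeping every resource below the pipeline document’s folder:

```
MyPipeline/                  distribution root; ${pipeline:PipelineBase} points here
  MyPipeline.ucdoc           the pipeline document itself
  Resources/                 stylesheets, entity maps and other fixed resources
  Input/                     sample or default input data
  Output/                    default destination for generated output
```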
Name |
PipelineBase |
Java symbol |
|
Type |
String |
Value |
absolute path to folder in which the current configuration file is located (if loaded via the GUI; automatically set) or the folder of a configuration distribution root in URL format |
This variable holds the full URI (as a file:-URL) of the pipeline document (*.ucdoc) implementing the current pipeline.
Name |
PipelineURI |
Java symbol |
|
Type |
String |
Value |
full file:-URL of the current pipeline document (*.ucdoc) (if loaded via the GUI; automatically set) |
This variable holds the path to the current parameter set document (*.ucpar), excluding the actual name, in URL format (including the trailing slash ‘/’). For regular pipeline documents, the contents of this variable is the same as that of PipelineBase.
Name |
ParamBase |
Java symbol |
|
Type |
String |
Value |
absolute path to folder in which the current parameter set file is located (if loaded via the GUI; automatically set) in URL format |
This variable holds the full URI (as a file:-URL) of the current parameter set document (*.ucpar). For regular pipeline documents, the contents of this variable is the same as that of PipelineURI.
This definition of the contents of the ParamURI pipeline variable can be used in UPL to determine whether the pipeline is being run directly from a pipeline document (ucdoc) or via a parameter set (ucpar), with code like the following:
#namespace pipeline "";
...
if( ends-with( $pipeline:ParamURI, "ucpar" ) ) {
    /* we’re running from a parameter set document */
} else {
    /* we’re running from a regular pipeline document */
}
Name |
ParamURI |
Java symbol |
|
Type |
String |
Value |
full file:-URL of the current parameter set document (*.ucpar) (if loaded via the GUI; automatically set) |
This variable holds a UUID string identifying this particular running instance of a pipeline.
Identifying a certain pipeline object instance is necessary in some upCast XSLT extension functions which need to retrieve information from the pipeline object that is running the transformation. The value in this variable is used for these identification purposes and must be passed as a stylesheet parameter when needed there.
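In a stylesheet run by such a pipeline, this typically means declaring a top-level parameter and passing ${pipeline:PipelineInstanceId} as its value. The parameter name below is an arbitrary choice for illustration, not one mandated by upCast; check the documentation of the respective extension function for the expected name:

```
<!-- stylesheet parameter receiving the pipeline instance id
     (parameter name chosen for illustration only) -->
<xsl:param name="pipeline-instance-id" select="''"/>
```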
Name |
PipelineInstanceId |
Java symbol |
|
Type |
String |
Value |
a UUID string identifying the current running pipeline instance (automatically set) |
Click the Pipeline Settings… button in the pipeline window to access the window for setting pipeline-level settings.
To make the settings active, click Close or the window’s close button.
Many of these settings allow you to override the settings made in the upCast preferences.
When you use the upCast GUI as the prototyping and testing environment for your pipeline development but intend to later export the pipeline as Java source code or an Ant task, we recommend overriding the global settings with pipeline-specific settings. This gives you consistent output for a specific pipeline document instead of depending on the current global application preferences at the time of export.
Access the various settings by choosing the respective tab:
Here, you can set up a description of parameters you want your pipeline to be dependent on. The information provided here is used in three ways:
to create a simplified view and data entry UI objects for the user of a pipeline, where you want to hide the details of the implementation (i.e. the kind and order of modules used, calculations etc.),
to define the parameters a pipeline accepts and requires to be able to run from the commandline or via the Java API functions, including the ability to check those parameters for legal values, and
to provide documentation for the semantics of a parameter, which is shown in form of help tags in the UI, as text in the commandline, and formatted as HTML document when generating the pipeline documentation
This is a convenient feature to distribute complete, parameterized pipeline solutions to your customers in an easy-to-use, packaged way. All they need to do is open the pipeline, supply the requested parameters, and click the Run button. They are therefore completely shielded from the (possibly many) modules building up the pipeline and their complexity.
Interface element and parameter definitions
The description code you provide here serves two purposes:
It is the basis for determining the number and name of pipeline parameters.
It specifies the kind of form display element for each of these parameters.
Basically, you specify the name of the pipeline variable you wish to have set to the specified pipeline parameter’s value. This value is supplied as initial, pre-set pipeline variable to your pipeline definition.
The pipeline parameters are only set when the GUI is in Simple View mode. When in full editing mode, the pipeline is executed with a completely clean set of pipeline variables (except for the base variable) – unless you check the Set specified parameter defaults when running a pipeline in edit view option (see above). In the latter case, the default values for those parameters that have a default specified are set.
Before the defining code is interpreted, upCast resolves any contained variable references for the following realms and in that order:
include
javaproperty
application
You cannot (for obvious reasons) access variables in the pipeline or module realm.
You can use the include variable reference to your advantage in projects where you have to create similar pipelines that essentially should have the same Simple View definitions. To keep those in-sync, you can use an external file holding the parameter definition code, then include it in all pipelines that should show the same UI and have the same parameters. You then only need to update that single external file, and the UI definitions are updated automatically in all pipelines that include it.
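As a sketch of this technique, the parameter definition field of each pipeline could then consist of nothing but an include reference to the shared definition file. The file name below is illustrative, and the empty fallback ensures the pipeline still loads (with no parameters) if the file is missing:

```
${include( fallback: "" ):Resources/shared-simple-view.params}
```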
upCast offers several types of UI elements for parameter entry: a decorating label, a text field or box, a filechooser, a popup menu and a checkbox, each one with its own set of dedicated properties.
You must assign one of these entry types to each pipeline parameter you need. The syntax for describing the properties is based on a CSS rule set: The selector part takes the form of an element selector and supplies the name of the pipeline variable to set. The declaration block part specifies the specific display and behavioural properties for that UI element.
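Schematically, a definition therefore looks like a CSS rule set; the names used below are placeholders. The first rule uses an element selector and thus creates (and binds to) a pipeline variable of that name, while the second uses an ID selector and therefore creates a display-only label without a variable:

```
MyVariable { type: text; label: "Some value:"; }
#myHeading { type: label; text: "Section heading"; }
```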
Here are the properties which you can set for each of the following available types (option values are case-sensitive!):
This UI element creates static text. You can use this for headings, parameter grouping or parameter descriptions for the pipeline GUI users.
We recommend creating text labels with an ID type of selector, since using an element selector would create (and reserve) a likewise-named pipeline variable.
Using an ID type of selector prevents this from happening, and the label will just serve to show text in the UI without any further effects.
label | |
type | label |
text | the text to display in the label |
font-family | the name of the font to use; when not specified, the system’s default label font |
font-weight | normal | bold |
font-style | normal | italic |
font-size | size of the font; when not specified, the system’s default label font size |
color | the text color; must be a CSS 2.1 color value |
background-color | the background color; must be a CSS 2.1 color value |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
when | |
locked | when |
Example 6.1.
#myLabel {
    type: label;
    text: "Simple View Sample";
    font-size: 20pt;
    font-weight: bold;
    color: olive;
}
creates a label with 20pt font size, bold text and olive text color.
This UI element creates a text field for arbitrary text.
It will create a parameter and pipeline variable of type String.
text | |
type | text |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
lines | the number of lines of the text field, the default is 1 |
postfix | the text to display after the text entry field; use this e.g. for displaying a value unit like "dpi" to let the user know the semantics of the number entered in the text field |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.2.
headerText {
    type: text;
    label: "Header Text:";
    persistent: true;
    default: "My Publication";
    lines: 2;
}
creates a field to input header text used in the pipeline. The pipeline variable created will be named headerText, and values the user inputs will be stored across document openings. The input field will show two lines of text and will be pre-populated with the text "My Publication" on initial creation.
This UI element creates a text field for entering a file or folder path in local or URL format. It also displays a button to pick a file or folder using the system's file chooser UI.
It will create a parameter and pipeline variable of type String.
type | filechooser |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
mode |
|
format |
|
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.3.
SourceFile {
    type: filechooser;
    label: "Source file:";
    persistent: true;
    mode: open;
    format: url;
}
creates a field with a button to call a file chooser. The pipeline variable created will be named SourceFile, and values the user inputs will be stored across document openings. The file chooser will allow the user to pick files only, and the result will be stored in URL format in the editable input field.
This UI element creates a text field for entering a list of file or folder paths (one per line) in local or URL format. It also displays a button to add a file or folder using the system's file chooser UI at the end of the current list.
It will create a parameter and pipeline variable of type List consisting of one line each of the input field as String value (in the displayed order).
filelist | |
type | filelist |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
mode |
|
format |
|
lines | the number of entries (=lines) of the text field in the display, the default is 4 |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.4.
inputFiles {
    type: filelist;
    label: "Source files:";
    persistent: true;
    mode: open;
    format: url;
    lines: 5;
}
creates an entry field for a list of file specifications where each single line corresponds to one list item, i.e. here: a file path. An empty line creates a list item consisting of the empty string. The pipeline variable created will be named inputFiles, and values the user inputs will be stored across document openings. The file chooser will allow the user to add files only, and the result will be stored in URL format in the editable input field.
This UI element creates a popup menu to pick a single value among a set of pre-defined ones.
It will create a parameter and pipeline variable of type String holding the internal value (see internal-values property for details) representation of the currently selected item.
popup | |
type | popup |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default internal value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
values | a space- or comma-separated list of values to display in the popup and to pass into the pipeline variables |
internal-values | a space- or comma-separated list of internal values. The value set on the pipeline variable is the one from this list whose index matches the selected option from the values property’s list of displayed values. Use this to use descriptive values in the displayed popup, while still getting short enum-type values in your variable. It also allows for easy localization of displayed values without having to change internal processing. |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.5.
targetType {
    type: popup;
    label: "Target type:";
    persistent: true;
    default: "db4";
    values: "DocBook 4", "DocBook 5", "DITA";
    internal-values: "db4", "db5", "dita";
}
creates a popup with three entries, "DocBook 4", "DocBook 5" and "DITA". The pipeline variable created will be named targetType, and its value will be one of the values "db4", "db5" or "dita"; the value selection will be stored across document openings. The default value of the variable will be "db4" upon field creation.
This UI element creates a labelled check box to enable or disable (i.e. set to true or false) a specific boolean-valued option.
It will create a parameter and pipeline variable of type Bool holding the boolean representation of the current state of the checkbox (true when checked, false otherwise).
checkbox | |
type | checkbox |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
text | the checkbox’s label text next to the actual checkbox graphic |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
Example 6.6.
includeStyle {
    type: checkbox;
    label: "Option:";
    persistent: false;
    default: true;
    text: "Include style information";
}
creates a checkbox with text "Include style information". The pipeline variable created will be named includeStyle and will have the Boolean value true when the box is checked, false otherwise. The checkbox state will not be remembered across document openings. The default will be the option being checked (=on).
This UI element creates a text field for entering a list of arbitrary, single-line strings (one per line).
It will create a parameter and pipeline variable of type List consisting of one line each of the input field as String value (in the displayed order).
list | |
type | list |
label | text to display as label at the very left |
persistent | when when The default value is |
default | the default value for the parameter, if newly created or persistent is |
description | the string specified here is displayed as tool-tip text in the UI when hovering over the input element and also used for descriptive messages when running in commandline mode |
required | (only used in upCast commandline mode) when when The default value is |
lines | the number of lines of the text field, the default is 4 |
initialize-when |
Defines parameter values setting and initialization behaviour when the pipeline is run as a sub-pipeline from within an External Pipeline Processor (see there for details); the default value is |
hidden | when |
locked | when |
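Analogous to the examples for the other element types, a list definition could read as follows; the variable name is illustrative:

```
extraKeywords { type: list; label: "Extra keywords:"; persistent: true; lines: 3; }
```

This creates an entry field where each line of input becomes one String item of the resulting List pipeline variable extraKeywords; entered values are stored across document openings, and the field displays three lines.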
The order in which the parameters are defined determines the display order and the order of parameters in created Java code functions.
Example 6.8.
Concatenating all individual parameter definition examples from above into one, the following Simple View for the pipeline would be created:
Name |
ParameterDefinitions |
Java symbol |
|
Type |
String |
Value |
parameter definition source code in the syntax described above |
Reset persistent values
This clears any currently stored persistent values in the pipeline document.
You should clear the values when you make significant changes to the parameter definition, and prior to saving the pipeline configuration for distribution to your customers, so they do not see the last, private settings you made during development for parameters that have persistence turned on.
Initialization
This parameter lets you programmatically set pipeline parameters as well as (dynamically) prevent running the pipeline at all. For this, you can write a custom UPL function initialize() by clicking on the Edit initialization code… button.
The text on the Edit initialization code… button will be bold when a custom initialization function has been defined (and therefore the code field is not empty). This lets you quickly see if a pipeline defines a custom initialization function without having to open the code entry dialog.
If the button text is plain, the code field is empty and the pipeline will always be executed.
If you always want to run the pipeline unconditionally (and don't need the function for parameter overrides), make sure the code field is empty. This lets you see at a glance in the UI whether a custom function is defined, and it protects you against possible future signature changes (and therefore code incompatibilities) in the initialize() function when you don't actually use its features.
If the initialize() function returns EXECUTE (which is the default), the pipeline is further executed.
If the initialize() function returns SKIP, the pipeline’s modules are not executed.
If the initialize() function returns TERMINATE, the pipeline’s action is not performed and additionally, further execution is aborted.
In the initialize() function’s body, you can run arbitrary UPL code. This code is run just before actually performing the pipeline’s programmed functionality. This function hook’s main intent is to give you the possibility to programmatically and dynamically set pipeline parameter values based on e.g. pipeline variable values (which in turn may have been set through the Simple View or by an external parameter passed to the pipeline). This way, you can set a parameter that does not otherwise allow variable references to be expanded, like popups or check boxes. Additionally, this function serves as a dynamically evaluated condition specifying whether to run the pipeline or not.
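A sketch of such a function, following the shape of the finalization template in Example 6.9 (the exact signature may differ; the code entry dialog provides the authoritative skeleton), could look like:

```
function initialize() as Id {
    variable $result as Id := EXECUTE;  // default: run the pipeline
    /* Set or override pipeline parameter values here, or assign
       SKIP or TERMINATE to $result to suppress execution. */
    return $result;
}
```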
Name |
InitializationCode |
Java symbol |
|
Type |
String |
Value |
UPL source code |
Finalization
This parameter lets you specify the condition under which the pipeline signals an error to its parent, which is the application when it is a top-level pipeline, or the executing component, when it is run as a sub-pipeline (e.g. by the External Pipeline Processor).
In the case of being a top-level pipeline, signalling an execution error will result in an error dialog to be shown (if run in the GUI) or an exception being thrown (when run via the Java API).
You can specify the cases in which a pipeline execution failure should be signalled by using several pre-defined, often used conditions, or you can specify a custom condition in UPL:
pipeline execution is always reported as successful
signal a pipeline execution failure when during execution, a FATAL log message has been received
signal a pipeline execution failure when during execution, a FATAL or ERROR log message has been received. This is the default for new pipelines.
signal a pipeline execution failure when during execution, a FATAL, ERROR or WARN log message has been received
In all of the four pre-programmed finalization modes above, collected log messages from level WARN and up are forwarded to the parent (usually a pipeline object). See also the section on logging for more details.
this option lets you specify custom UPL function code which, by returning one of the two Id values TERMINATE or CONTINUE, signals the failure state of the pipeline
To edit the UPL code for the custom finalize() function, click the Edit finalization code… button. By returning the Id TERMINATE, you indicate that the execution of the pipeline has failed, and by returning CONTINUE you indicate that the pipeline execution succeeded.
The custom function receives an Id parameter which is TERMINATE when one of its child modules requested explicit, premature pipeline termination, CONTINUE otherwise.
Example 6.9. Finalization function template
function finalize( $childFinalizationResult as Id ) as Id {
    variable $result as Id := $childFinalizationResult; // default: CONTINUE
    /* Return the Id TERMINATE when you want to terminate the pipeline,
       CONTINUE otherwise. */
    return $result;
}
Generating a custom error message
Additionally, in the custom finalization code field, you can optionally specify a second UPL function, message-text(). When this function is defined and does not return the empty string, and finalize() returns TERMINATE, the string returned by this function is shown to the user instead of the default message generated by upCast. This allows you to generate error messages that are tailored specifically to your application and its user base.
Example 6.10. Custom message text function template
function message-text() as String {
    variable $result as String := "";
    /* Return a non-empty message string to display an error dialog
       resp. write the error text to the log. */
    return $result;
}
Name | FinalizationMode |
Java symbol | |
Type | String |
Value | continue | signal-fatal | signal-error | signal-warning | custom |

Name | FinalizationCode |
Java symbol | |
Type | String |
Value | UPL source code |
Log filter
Sets the logging threshold for messages that the module accepts from children and produces itself (see the logging architecture description for details).
Some default filter expressions are available from the associated popup menu, however you are free to define your own customized filter expressions. The filter expression syntax is described here.
If set to inherit, the logging filter settings are governed by the application's filter settings for the external logger as set in the application's preferences.
Name | LogFilterSpec |
Java symbol | |
Type | String |
Value | inherit | OFF | FATAL | ERROR | WARN | INFO | DEBUG | VERBOSE | DETAIL | TRACE | ALL | |
Edit-Lock password
Here, you can specify a password that prevents switching off the Simple View and hence prevents editing the pipeline from within the GUI. It does not encrypt the pipeline document itself!
To remove the lock, clear the password field. The password "__________
" (10 underscores) must not be used.
The password is stored in the pipeline document as base64-encoded MD5 hash.
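As a sketch, the stored form can be reproduced with standard Java classes. Note that the class and method names below are made up, and the character encoding applied before hashing is an assumption here (the manual does not state it):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class EditLockHash {

    /** Base64-encoded MD5 digest, as described for EditLockPassword.
        Assumes UTF-8 as the pre-hash encoding (not confirmed by the manual). */
    static String hash(String password) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(password.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(digest);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(hash("abc")); // kAFQmDzST7DWlj99KOF/cg==
    }
}
```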
Name | EditLockPassword |
Java symbol | |
Type | String |
Value | |
Pipeline UID
To implement file-location independent linking from Parameter Set documents to their implementation pipeline document, each pipeline document must have a unique ID. This need not be a standard UUID if you can guarantee that these IDs will only be used in a controlled environment; in that case we suggest using speaking IDs, which make it easier for users to manually find the pipeline belonging to a given UID value. This may become necessary when a link gets broken due to a mis-configuration of the ID resolver.
By default, when a pipeline document is opened that does not yet have a non-empty pipeline ID setting, upCast automatically generates a UID and sets it for that pipeline document.
Name | PipelineUUID |
Java symbol | |
Type | String |
Value | |
Required upCast build number
Enter the build number of the upCast application that this pipeline requires as a minimum to be able to run. When a user tries to run the pipeline with an application version whose build number is less than the one specified here, a dialog is shown allowing the user to abort the execution of the pipeline (the default), execute it nevertheless (at their own risk), or abort the execution and automatically check for a newer version of the application at the vendor site.
When you leave the field empty, no minimum requirement check is performed.
When no UI is available (e.g. when running from the commandline or via the Java API), execution is aborted and a FATAL log message with details is generated.
Name | RequiredBuildNumber |
Java symbol | |
Type | Integer |
Value | |
Runnable (by itself)
This option tells upCast if the pipeline can run by itself (the default value, i.e. option checked) or not.
You can use this to identify pipelines that are only useful as sub-pipelines called from other pipelines, especially when they need some setup (for example, they inherit a current document tree on which they perform some specialized operations, but do not build it themselves).
When this option is not checked, the pipeline cannot be run as a top-level pipeline from the GUI. This is accomplished by disabling the Pipeline > Run command and the Run button in the pipeline window.
Inherit from parent
When checked, the catalogs set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the Catalog files setting as specified will be used.
Name | UseGlobalCatalogs |
Java symbol | |
Type | Bool |
Value | |
Catalog files
To add a catalog file, choose Add Catalog… and select the catalog file to add from the file system. Each line in the field specifies a catalog location. The new catalog will be available to all modules immediately after closing the preferences window.
When the catalog resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base}
variable notation.
When you hold down the Alt key while clicking Add Catalog…, upCast tries to generate a pipeline base URI-relative path, even when the location is outside the directory subtree under the pipeline base URI.
To remove a catalog, delete the text line containing its location specification.
Catalogs are considered in the order displayed.
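For reference, a minimal OASIS XML Catalog file might look like the following sketch (the public identifier and the relative DTD path are placeholders, not values prescribed by upCast):

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map a public identifier to a local copy of the DTD -->
  <public publicId="-//OASIS//DTD DocBook XML V4.5//EN"
          uri="dtd/docbookx.dtd"/>
</catalog>
```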
Name | Catalogs |
Java symbol | |
Type | String |
Value | one path to a catalog per line as string |
Inherit from parent
When checked, the font configuration set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the Font configuration setting as specified will be used.
Name | UseGlobalFontConfig |
Java symbol | |
Type | Bool |
Value | |
Font configuration
Specify the source code for the stdfonts.config override that should be used for this pipeline.
Name | FontConfiguration |
Java symbol | |
Type | String |
Value | font configuration code |
Inherit from parent
When checked, the custom encodings set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the Custom Encodings setting as specified will be used.
Name | UseGlobalEncodings |
Java symbol | |
Type | Bool |
Value | |
Custom Encodings
To add a custom encoding file, choose Add Encoding… and select the custom encoding file (*.encoding)
to add from the file system. Each line in the field specifies a custom encoding location. When the custom encoding resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base}
variable notation.
To add a folder where upCast should pick all contained custom encoding files, choose Add Encodings Folder… and select it. When the folder resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base}
variable notation.
When you hold down the Alt key while clicking Add Encoding… or Add Encodings Folder…, upCast tries to generate a pipeline base URI-relative path, even when the location is outside the directory subtree under the pipeline base URI.
To remove a custom encoding entry, delete the text line containing its location specification.
Name | CustomEncodings |
Java symbol | |
Type | String |
Value | paths to custom encodings (either to an individual custom encoding file or to a folder containing *.encoding files), with one entry per line |
Inherit from parent
When checked, the license set in the nearest pipeline ancestor in the call chain or, if it is the top-level pipeline, of the upCast application preferences will be used. When unchecked, the License location setting as specified will be used.
Name | UseGlobalLicense |
Java symbol | |
Type | Bool |
Value | |
License location
To set the license to be used for running this pipeline, click Choose license file… and select the license file (*.uclicense)
to be used.
When the license file resides in the file hierarchy below the pipeline base URI, it is made relative by default using the ${pipeline:base} variable notation.
The license details of the active license are displayed in the fields below for your reference.
Name | LicenseFile |
Java symbol | |
Type | String |
Value | path to license file |
Specify the fully qualified class name of the class into which the Java source code export option should place the generated code. Any required nested package folders will be created automatically by the File > Export to Java Source… function.
Name | ExportJavaClass |
Java symbol | |
Type | String |
Value | |
Source root folder
Specify the absolute path to the source root folder, i.e. the root of the Java package hierarchy subdirectories. You can use the ${pipeline:base}
variable as the first component of the path to specify the source root relative to the pipeline base URI.
When this field is left empty, upCast will ask for the source root folder every time you call the File > Export to Java Source… function. When this field is non-empty, that value will be used silently when calling the Java source export function.
With the Choose… button, you can request a file chooser to pick the Java source root folder. When this is a subdirectory of the pipeline base URI, the path is automatically made relative to it.
When you press the Alt key while clicking Choose…, upCast tries to always make the path relative, even if it is not in the subtree under the pipeline base URI.
Name | ExportJavaSourceRoot |
Java symbol | |
Type | String |
Value | |
‘upcast.jar’ location (or Ant expression)
Here you specify the path or expression that is inserted into the upcast Ant task definition to locate the upcast.jar file containing the actual Java code for the task. If you leave this field empty, "upcast.jar" will be used in the Ant file module created when using File > Export as Ant Task….
The text you enter here is first processed by the usual upCast variable resolution mechanism. This has the advantage that you can use upCast variables for calculating the path; you must, however, take care to quote the ‘$’ character (dollar sign) where you want it to appear verbatim, e.g. to reference Ant variables.
So you could use something like
$${basedir}/tasks/upcast.jar
to keep the generated Ant file portable by referring to upcast.jar relatively from the Ant build file’s base directory.
Name | ExportAntJarLocation |
Java symbol | |
Type | String |
Value | |
literal Ant value for task’s ‘basedir’ attribute
Here you specify the literal code to be used for the upcast task’s basedir attribute in the generated target. This is useful to calculate the pipeline base URI relative to some Ant property and thus make the generated build module position independent. upCast variables are resolved as usual before writing the resulting text to the build file.
Example 6.11.
To calculate the pipeline base URI to be used by the task relative to the position of the build file, you may want to use a setting like
$${basedir}/MyPipelineRoot/
Note how you must quote the ‘$
’ character (dollar sign) to avoid upCast trying to treat it as an upCast variable and expand it.
Name | ExportAntBasedir |
Java symbol | |
Type | String |
Value | |
literal Ant code for <source> selection
Here you specify the literal Ant source
XML code to be inserted into the generated target code for selecting the source file(s) to be used.
With the Add source… button, you can generate code for a single source file.
When holding down the Meta (Mac OS X: Command) key while clicking the Add source… button, you can generate code for all files in the selected folder. A commented-out line for filtering based on extension is automatically generated, which you can uncomment and fill in as desired.
For both cases, when additionally holding down the Alt key, the reference generated will be relative to the literal value specified in the literal Ant value for task’s ‘basedir’ attribute field. For this, a special local variable ${taskbase}
is used, which gets replaced by the resolved contents of the literal Ant value for task’s ‘basedir’ attribute parameter.
For the syntax used for source
specification, see the description of the upCast Ant task.
upCast variables are resolved as usual before writing the resulting text to the build file, including the resolution of the special ${taskbase}
variable as the last resolution step.
Name | ExportAntSourceCode |
Java symbol | |
Type | String |
Value | |
This is a free form text field for adding notes or documentation to this pipeline setup. You can use HTML tags which are copied verbatim into the generated documentation for the pipeline (via File > Generate Documentation…).
Name | ModuleDocumentation |
Java symbol | |
Type | String |
Value | HTML code (will be copied into generated HTML documentation) |
Each module type has its own, dedicated set of parameters to control its behavior. A few parameters are shared by all modules, both in name and semantics. These are listed explicitly below. However, all other parameter names are to be interpreted with the context of the module’s functionality in mind to infer their meaning.
Internally, parameters of modules are dynamically, weakly typed, though each parameter has a recommended or even required (by definition) type.
Parameters will be described using the following typography:
Name | DeleteEmpties |
Java symbol |
|
Type | Boolean |
Value | false, true |
Name gives the internal java.lang.String name of the parameter, which is used for storing in preferences files, and can be used in the Java API as String. However, in Java, the use of the Java symbol is highly recommended instead.
Java symbol names the Java constant definition for the parameter’s Name. All constants are defined in the class de.infinityloop.common.Params.
Type specifies the recommended Java type to use when programming against the API. In the GUI version, this is taken care of automatically. Also, when using alternative interfaces like the Ant task, which allow passing arguments only as character strings, upCast tries to perform an appropriate cast. You should therefore make sure that the data you provide in these cases can be cast to the specified type, as otherwise the conversion will fail or produce incorrect results at runtime.
Value describes possible value ranges, supported keywords or other specifics about that parameter’s range.
The following parameters are available on all modules:
Active checkbox
When the "active" checkbox in the upper left corner of the module parameter pane is checked, the module is active in the pipeline.
During pipeline development, it is often useful to have several differently configured modules to switch between, or to have modules in the pipeline that generate some sort of debug output. This parameter lets you temporarily disable a module by unchecking it, without having to actually delete it and later re-insert it into the pipeline.
Deactivated modules are completely skipped during a pipeline run and impose only minimal overhead – actually, it’s just writing a line to the log file.
Name | ModuleEnabled |
Java symbol | |
Type | Bool |
Value | true | false |
Name
Here, you can assign a meaningful name to a module instance. By default, modules’ names are their type, like "XSLT Processor" or "RTF Importer". However, when you have e.g. several XSLT processors in your pipeline, it is desirable to use more meaningful names, like "strip namespaces XSLT" or "TEI conversion transformation".
Name | InstanceNameUser |
Java symbol | |
Type | String |
Value | an arbitrary string |
Export
When checked, this module is handled (exported) in a File > Export… function.
You can use this to set up a single upCast pipeline document in such a way that for export to Java code or an Ant task, only certain modules will be exported. This lets you use some module instances for debugging in the UI, which then won’t be part of an exported pipeline representation.
For the Export as XML… function, module
elements will have an additional attribute export
with value true
or false
, respectively. This allows you to decide in any custom post-processing of that pipeline export format whether you want to handle that module in a special way (like discarding it completely like the built-in export options Ant and Java source).
Name | ModuleExported |
Java symbol | |
Type | Bool |
Value | true | false |
This parameter lets you programmatically set module parameters as well as (dynamically) prevent running the module even when its active checkbox is checked. For this, you can write a custom UPL function initialize()
by clicking on the Edit initialization code… button.
The text on the Edit initialization code… button will be bold when a custom initialization function has been defined (and therefore the code field is not empty). This lets you quickly see whether a module defines a custom initialization function without having to open the code entry dialog.
If the button text is plain, the code field is empty and the module will always be executed.
If you always want to run the module unconditionally, make sure the code field is empty. This lets you see at a glance in the UI that no custom function is defined, and it protects you against possible future signature changes (and therefore code incompatibilities) in the initialize() function when you effectively do not use its features anyway.
If the initialize()
function returns EXECUTE (which is the default), the module is further executed.
If the initialize()
function returns SKIP, the module’s action is not performed and the subsequent module in the pipeline (if there is one) is run.
If the initialize()
function returns TERMINATE, the module’s action is not performed and additionally, further pipeline execution is aborted.
In the initialize()
function’s body, you can run arbitrary UPL code. This code is run just before actually performing the module’s functionality. This function hook’s main intent is to give you the possibility to programmatically and dynamically set module parameters’ values based on e.g. pipeline variable values (which in turn may have been set through the Simple View or by an external parameter passed to the pipeline). This way, you can set a parameter that does not allow you to have variable references expanded, like popups or check boxes. Additionally, this function serves as a dynamically evaluated condition specifying whether to run the module or not (in contrast to the module’s static Active checkbox).
Example 7.1.
Assuming you are offering your users the choice between the HTML and CALS table model by way of a pipeline parameter tableType
(e.g. in the Simple View), the following code sets the corresponding module parameter TableModel dynamically in the XML Export module. This would not be otherwise possible via that module’s UI since for the selection, a popup is used which has no way to calculate its value based on pipeline variables.
The code assumes that the pipeline parameter tableType
can have one of two values: html
or cals
.
#namespace module   "http://www.infinity-loop.de/namespace/upcast-realm/module";
#namespace pipeline "http://www.infinity-loop.de/namespace/upcast-realm/pipeline";

function initialize() as Id
{
    $module:TableModel := $pipeline:tableType;
    return EXECUTE; /* run the module */
}
Name | InitializationCode |
Java symbol | |
Type | String |
Value | UPL source code |
Finalization
This parameter lets you specify the condition under which further pipeline execution should be cancelled after running this module.
This parameter will only be evaluated (and therefore have any effect) if the module action was actually performed, or in other words: if initialize()
did not prevent the execution of the module’s action by returning TERMINATE or SKIP.
Normally, pipeline execution continues with the following defined modules even if in the current one there was a warning or error. These messages are collected and then displayed in the final pipeline execution error dialog. However, sometimes this is not a desired behaviour. Specifically, when subsequent modules rely on the proper execution of their predecessors to produce usable or correct results or – even more importantly – to not cause harm to data integrity, it may be necessary to immediately stop further execution of the pipeline when some module produces an error.
You can specify the termination behaviour by using several pre-defined, often used conditions, or you even can specify a custom condition in UPL:
continue: continue pipeline execution no matter what, i.e. even when an ERROR or FATAL error has occurred
signal-fatal: terminate pipeline execution when during execution of this module, a FATAL error message has been generated
signal-error: terminate pipeline execution when during execution of this module, a FATAL or ERROR error message has been generated. This is the default value for new module instances.
signal-warning: terminate pipeline execution when during execution of this module, a FATAL, ERROR or WARN message has been generated
custom: this option lets you specify custom UPL function code which, by returning one of the two Id values TERMINATE or CONTINUE, can request pipeline termination or continuation after this module.
To edit the UPL code for the custom finalize()
function, click the Edit finalization code… button. By returning the Id TERMINATE, you can request pipeline termination, and by returning CONTINUE as result you can request pipeline continuation.
The custom function receives an Id parameter holding the termination status of its child component if there is one, CONTINUE otherwise.
Example 7.2. Finalization function template
function finalize( $childFinalizationResult as Id ) as Id
{
    variable $result as Id := $childFinalizationResult; // default: CONTINUE

    /* Return the Id TERMINATE when you want to terminate the pipeline,
       CONTINUE otherwise. */
    return $result;
}
Name | FinalizationMode |
Java symbol | |
Type | String |
Value | continue | signal-fatal | signal-error | signal-warning | custom |

Name | FinalizationCode |
Java symbol | |
Type | String |
Value | UPL source code |
Sets the logging threshold for messages that the module accepts from children and produces itself (see the logging architecture description for details).
Some default filter expressions are available from the associated popup menu, however you are free to define your own customized filter expressions. The filter expression syntax is described here.
If set to inherit, the logging filter settings are governed by the module’s execution parent’s settings (that is usually the pipeline it is contained in).
Name | LogFilterSpec |
Java symbol | |
Type | String |
Value | inherit | OFF | FATAL | ERROR | WARN | INFO | DEBUG | VERBOSE | DETAIL | TRACE | ALL | |
Documentation
This is a free form text field for adding notes or documentation specific to this module instance. You can use HTML tags which are copied verbatim into the generated documentation for the pipeline (via File > Generate Documentation…).
Name | ModuleDocumentation |
Java symbol | |
Type | String |
Value | HTML code (will be copied into generated HTML documentation) |
This section describes the available modules in more detail, listing available parameters. Filter type identifiers are given in square brackets after the UI module name.
This module allows you to set some commonly used global variables easily for re-use in subsequent modules. It is therefore most useful as the first module in a pipeline.
You can set the global variables pipeline:SourceFile, pipeline:TemporaryItemsFolder, pipeline:DestinationFolder, pipeline:ImageDestinationFolder and pipeline:DebugFolder.
The effect of using this module in API mode is the same as using UpcastEngine.setPipelineVariable()
.
All parameters have a type of java.lang.String.
When a field of the pre-defined parameters is left empty, that parameter is not set at all. This allows placing this type of module somewhere in the middle of a pipeline and having it set or override only certain parameters (either custom parameters or selected pre-defined ones). All parameters whose entry fields in the list of pre-defined parameters are left empty keep their previously assigned values (or are not created at all).
This also means that if you want to assign the empty string to some parameter, you can only do so by specifying it in the Custom pipeline variables field.
Custom pipeline variables
Here, you can specify additional global values for use in subsequent modules. The definitions herein are processed after the fixed global parameters described above are evaluated and set, so you can refer to them using the usual ${pipeline:…}
variable reference. A parameter definition must follow this syntax:
varname’:=’ ‘"’ value ‘"’;
Quotes within the variable value must themselves be quoted using the backslash character ‘\
’.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now, any further references to realms other than include are resolved separately for each value part of an assignment.
This algorithm covers the usual cases where you might want to include constant assignment code shared by several pipelines using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
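A hypothetical set of assignments following this syntax could look as follows. The variable names and values are made up, and whether the realm prefix is written inside the field is an assumption here:

```
Customer     := "ACME Corp";
OutputSuffix := "-print";
Quoted       := "He said \"hello\".";
```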
Name | PipelineVariables |
Java symbol | |
Type | String |
Value | (string in same syntax as in corresponding UI field) |
This module requires an appropriate RTF Importer feature included in your license to be fully functional.
This importer module handles conversion from RTF to the internal, unified upCast format, the upCast Internal DTD. With WordLink enabled, the filter also can convert Word binary files (*.doc
).
The RTF importer outputs the RTF Optional Hyphen symbol (\-
) as codepoint U+E003
in the Unicode Private Use Area. This is to allow following pipeline steps to discriminate it from Soft Hyphen (U+00AD
) Unicode characters entered directly in the RTF as Unicode. This has been implemented because the rendering behaviour of the two in downstream rendering engines differs from Word’s display, making it important to be able to differentiate between them.
However, the Unicode Translation Map in effect in the XML Exporter module maps U+E003
to U+00AD
by default. If you need or want to change the translation of RTF’s Optional Hyphen symbol to something other than the Soft Hyphen character in Unicode, you must change or override the default mapping of the source codepoint U+E003
in the XML Exporter module.
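As an illustration of such a mapping performed outside upCast, the following sketch replaces the importer's private-use codepoint with a chosen target character. Class and method names are made up for this example:

```java
public class OptionalHyphenMapper {

    // U+E003: private-use placeholder the RTF importer emits for RTF's \- symbol
    private static final char RTF_OPTIONAL_HYPHEN = '\uE003';

    /** Replace the placeholder with an arbitrary target character,
        e.g. U+00AD (soft hyphen, the default mapping) or a plain '-'. */
    static String mapOptionalHyphens(String text, char replacement) {
        return text.replace(RTF_OPTIONAL_HYPHEN, replacement);
    }

    public static void main(String[] args) {
        String imported = "hy\uE003phen";
        // Map to soft hyphen, mirroring the default Unicode Translation Map
        String mapped = mapOptionalHyphens(imported, '\u00AD');
        System.out.println(mapped.indexOf('\u00AD')); // position of the soft hyphen
    }
}
```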
Parameters are grouped logically into tabs:
General
Source file
Specify the source file in RTF or, if WordLink is available, in Word binary format that should be imported.
Name | SourceFile |
Java symbol | |
Type | Object |
Value | absolute file path in URL or local file system convention |
Hoist common inline properties to parent
If enabled, any inline formatting CSS property that extends over all children of a paragraph-level element with the same value will be hoisted to its parent element as a style override. Effectively, this makes use of CSS inheritance and optimizes the output by specifying that particular property only once on the parent instead of on each of its child elements.
Name | HoistCommonInlines |
Java symbol | |
Type | Bool |
Value | true | false |
Remove empty inlines
If enabled, any inline style specifications that do not contain any #PCDATA
or similar, visually rendered content, are discarded from the document.
The default for this parameter is off based on the assumption that you may want to keep e.g. formatting information for empty cells so that a user may later fill in text and has the correct, originally intended formatting information available at that document location.
Name | RemoveEmptyInlines |
Java symbol | |
Type | Bool |
Value | true | false |
Allow ‘class’ and ‘style’ attributes simultaneously on <inline>
elements
When on, this option allows that both a class
and style
attribute may be present on an element. Otherwise, the two are separated and an anonymous inline element is created for the style
attribute instead.
Option checked:
This is <uci:inline uci:class="slang" uci:style="color: blue;">True Blue</uci:inline>.
Option unchecked:
This is <uci:inline uci:class="slang"><uci:inline uci:style="color: blue;">True Blue</uci:inline></uci:inline>.
You might want to use this option to have named Word styles always separated out in a dedicated element so that additional override styles can be recognized quickly by the additional inline element.
Name | CombineWithLogicalStyle |
Java symbol | |
Type | Bool |
Value | true | false |
Markup revision tracking using <inserted> and <deleted>
When this is checked, document revisions are marked up in the result using the inserted
and deleted
elements.
If this is off, only the result of the revisions will be exported, i.e. inserted content remains in the document and deleted content is removed.
Name | RevisionTracking |
Java symbol | |
Type | Bool |
Value | true | false |
Use CSS for forced pagebreaks (where possible)
When checked, the importer tries to use CSS code for specifying forced pagebreaks wherever possible by using the pagebreak-before: always
property/value combination.
If this is off, a pagebreak
element will always be used.
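To illustrate the difference between the two settings, the output variants might look like this. Only the pagebreak element and the CSS property/value combination are named in the text above; the paragraph-level element name and the namespace prefix on the pagebreak element are placeholders:

```xml
<!-- option checked: CSS property carried on the paragraph-level element -->
<uci:para uci:style="pagebreak-before: always;">New chapter…</uci:para>

<!-- option unchecked: a dedicated element is emitted instead -->
<uci:pagebreak/>
```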
Name | UseCSSForPagebreaks |
Java symbol | |
Type | Bool |
Value | true | false |
Apply list structuring heuristics
If checked, special list structure detection algorithms are performed to create the best logically structured XML output. If unchecked, Word’s internal list IDs are used to track where a list starts and ends and where a new one begins, which may (based on the editing history of a particular list) have virtually no resemblance to what you are actually seeing in the layout.
The default value is on.
Name | ApplyListHeuristics |
Java symbol | |
Type | Bool |
Value | true | false |
If checked, list/item element structures are not created. Instead, all elements that would constitute list item contents get the respective info attached as attributes:
'true' when that element is part of the first list item of the respective list, 'false' if it is in a subsequent item (not the first one)
'true' when this is the first element of a list item (the one that has the list marker before it), 'false' for any subsequent elements in a certain list item
(on the first element of a list item only) the numbering text (effective marker text) of that list item
Additionally, the following list-specific properties are copied onto each of the (virtual) list item's content elements:
list-style-type
-ilx-list-level
-ilx-list-group
-ilx-list-numbering-absolute
-ilx-marker-align
-ilx-marker-follow
-ilx-marker-offset
-ilx-marker-format
-ilx-marker-font-family
-ilx-marker-font-size
-ilx-marker-color
The default value of this parameter is off.
Name | FlattenListStructures |
Java symbol | |
Type | Bool |
Value | |
Flatten options
Here you can control some additional aspects of list flattening:
when checked, the actual list marker text for each item is inserted as first child node in any element that has the uci:list-hasmarker
attribute set to 'true'
, i.e. any element that is first in a list item. The default value is off.
Default font size
Some RTF documents do not specify a default font size for their text content, but rely on the default of the rendering application (like Microsoft Word). This parameter lets you set the default font size for such documents.
Microsoft Word applications up to and including Word 97 used a default value of 10pt, Word 2000 and later use a default of 12pt. When you set this parameter to * (i.e. automatic), upCast tries to guess from the RTF symbols it finds in the document whether it is a Word 2000 (or later) document and then will use 12pt as default font size, 10pt otherwise.
Name | DefaultFontSize |
Java symbol | |
Type | String |
Value | '*' | 1..999 |
Note handling options: Marker mismatch handling
This option lets you specify what should happen when the reference marker to a footnote, endnote or annotation in the body text of the document does not match the marker within the footnote, endnote or annotation definition itself:
The RTF Importer uses pre-defined Word character styles ("footnote reference", "endnote reference", "annotation reference") to find the marker information in a footnote, endnote or annotation definition, respectively. In normal usage patterns of the Word application, the setting of these styles is automatic and correct.
However, when markers are edited manually in a footnote definition (and not paying attention), the marker-identifying style may be inadvertently applied also to some or all of the footnote contents. Since upCast tries to automatically remove the marker portion from the footnote definition (since that info is usually (re-) created at the final output/rendering stage), a wrong setting of the respective character styles may lead to actual footnote content getting lost.
Such authoring mistakes are usually detected by checking if the reference marker text in the document body is the same as the marker in the definition that the RTF Importer detects based on the special character style settings. Issuing an error when that text is different alerts the user that there's probably something wrong with the document that needs checking and/or fixing in order to not lose any content.
Name |
NoteMarkerValidation |
Java symbol |
|
Type |
String |
Value |
none | warn | error |
Note handling options: Marker in note definition
This option lets you specify what should happen with any (repeated) marker text in the actual footnote definition:
The RTF Importer uses pre-defined Word character styles ("footnote reference", "endnote reference", "annotation reference") to find the marker information in a footnote, endnote or annotation definition, respectively. When this option is set, any leading content in a footnote definition that has one of the mentioned styles applied is removed from the note definition content. This is the default setting.
If the note body (uci:content) starts with the same text as is present in the note reference (uci:reference), that part is removed from the beginning of the footnote body content. Differences in style are ignored, and leading and trailing whitespace is removed as well. Automatic note numbering placeholders are handled as expected during this text prefix comparison.
The footnote, endnote or annotation content is kept unmodified (i.e. as present in the Word document) in the XML output, including any contained repetition of the note's reference marker.
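For illustration, the prefix comparison behind the remove-content option can be sketched as follows. This is a minimal, hypothetical Python sketch of the documented behavior (marker match ignoring surrounding whitespace), not upCast's actual implementation:

```python
def strip_marker_prefix(reference, body):
    """Sketch of 'remove-content': if the note body starts with the
    same text as the note reference marker, remove that prefix,
    ignoring leading/trailing whitespace around the marker."""
    marker = reference.strip()
    stripped = body.lstrip()
    if marker and stripped.startswith(marker):
        # drop the repeated marker and any whitespace following it
        return stripped[len(marker):].lstrip()
    return body  # no repeated marker: keep the body unchanged
```

Real note content also carries style information and automatic numbering placeholders, which this sketch ignores.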
Name |
NoteMarkerHandling |
Java symbol |
|
Type |
String |
Value |
remove-style | remove-content | include |
Literal pass-through styles
If checked, you can specify a set of (Word) styles by their exact names, separately for the paragraph style and the character style category, which should be treated as literals. This means that all text in the document set in these styles will be written to the output without any interpretation by upCast. This lets you write e.g. XHTML or XML code directly within your document, exactly the way it should appear at that location in the output.
Name |
LiteralProcessing |
Java symbol |
|
Type |
Bool |
Value |
true | false |
Paragraph style names
When Literal pass-through styles is on, specify here the list of paragraph styles that should be treated as literal content indicators. Enclose the names of the styles in double-quotes, separate styles by a space character.
Name |
LiteralParStyle |
Java symbol |
|
Type |
String |
Value |
style name |
Character style names
When Literal pass-through styles is on, specify here the list of character styles that should be treated as literal content indicators. Enclose the names of the styles in double-quotes, separate styles by a space character.
Name |
LiteralCharStyle |
Java symbol |
|
Type |
String |
Value |
style name |
Images
Include images
When checked, images contained in the document are processed as configured by the image processing parameters. If unchecked, all images of the source document will be completely discarded from the document.
Name |
IncludeImages |
Java symbol |
|
Type |
Bool |
Value |
true | false |
Temporary images folder
This is the location where images in a read document are temporarily stored while the pipeline is processed. It is, for example, the responsibility of an exporter to copy images that should be preserved beyond a pipeline run to a different location.
A pipeline keeps track of temporary images created in the above location. After finishing a pipeline run, all these recorded temporary files are automatically deleted.
Name |
TemporaryItemsFolder |
Java symbol |
|
Type |
String |
Value |
path to temporary items folder |
Use inline copies instead of referenced original images (if available)
When this option is checked, for images that have been included in the RTF document using both methods, by reference and by embedding, the module will try to use the embedded substitute representation. This option essentially breaks the link to the original image file, if a substitute representation has been embedded in the RTF file, and instead links to the embedded representation of the original file.
When an image has only been linked and no substitute representation is available in the RTF, however, the original link to the image is preserved and used.
Name |
InlineReferencedImages |
Java symbol |
|
Type |
Bool |
Value |
true | false |
Incoming images default resolution
This parameter determines the image resolution in dpi (dots per inch) to use for embedded images that do not specify their resolution explicitly. This is true for all (originally) GIF images and some variants of JPEG and PNG images.
Without any dpi information, the RTF importer (and, as a matter of fact, even Word) cannot determine the absolute size of images, which is necessary to create a fully specified export file. This parameter is then used to establish a default dpi value and corresponds roughly to Word’s Web Options > Image resolution setting.
When setting this to the default ‘*’ value, the RTF importer determines the absolute size of the image from the image properties in the RTF document (if available) and modifies the embedded image data by adding the resolution determined from the (absolute size / number of pixels) pair to the externalized image. This ensures that subsequent processors can correctly determine absolute sizes and scale any images accordingly.
If you have control over the original document generation process and especially image creation, make sure that each image you add to a Word or RTF document contains explicit resolution information, as this avoids all sorts of platform incompatibilities.
In particular, this rule forbids importing GIF images, as the GIF format cannot store resolution information. Several Clip Art images in JPEG and PNG format also lack this desirable information, making the displayed image size in a document dependent on platform, Word version, or the setting of the Web Options > Image resolution parameter, which is generally undesirable.
Outgoing images rendering resolution
This value affects the WMF to pixmap renderer built into the RTF Importer. This means that WMF (or EMF) images will be rendered into a pixmap with pixel dimensions for width and height that correspond to this value.
The default value is 96 dpi (used e.g. by Microsoft’s Internet Explorer™). You may want to change this when outputting for Netscape Navigator 4.7 on the Mac, which by default displays at 72 dpi and therefore would downscale images written using 96 dpi resolution.
Suppose you have a WMF image in your document that is 2 by 1 inches in size. With 96 dpi output resolution, this will yield a pixmap of size 192 by 96 pixels.
However, if you set the output resolution to only 72 dpi, the resulting pixmap will be 144 by 72 pixels in size.
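The arithmetic behind this worked example is straightforward and can be sketched in a couple of lines (illustrative only):

```python
def rendered_pixmap_size(width_in, height_in, dpi):
    """Pixel dimensions of a pixmap rendered from a vector image of
    the given absolute size (in inches) at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)
```

For the 2 by 1 inch WMF above, this yields (192, 96) at 96 dpi and (144, 72) at 72 dpi.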
Name |
ImageRenderingResolution |
Java symbol |
|
Type |
Integer |
Value |
20..360 |
Export embedded images of type…
While exporting embedded images, you have the option to convert them to a different format.
The RTF Importer includes a custom WMF to pixmap renderer fully programmed in Java. It is neither intended nor recommended for production-quality image conversion! To perform high-quality image conversion, we strongly encourage you to consider specialized third-party products. Nevertheless, the built-in renderer is useful for producing draft image renderings for viewing in a web browser or for creating documents for editorial review, and it should perform well enough for most purposes short of final publishing.
Embedded images in an RTF document can be of several image format types: WMF, EMF, JPEG, PNG and Macintosh PICT. The RTF importer lets you specify a handling method for each of these formats, so you can e.g. use already pixel based images like JPEG or PNG unchanged while rendering vector formats like WMF into a pixel-based representation.
The following handling methods are available (some of which are not applicable to all source formats):
unchanged: Export the embedded image as binary data without any modification applied.
external cmd: Export the embedded image as binary data without any modification applied, and then run the specified external command on it for further processing. (See below for details.)
dispose: The image will be completely removed from the document.
JPEG: The image will be converted into JPEG format, using the built-in WMF to pixmap renderer if necessary. Clicking Options… lets you set the JPEG compression quality.
PNG: The image will be converted into PNG format, using the built-in WMF to pixmap renderer if necessary. Clicking Options… lets you set the PNG compression algorithm.
BMP: The image will be converted into Windows bitmap (BMP) format, using the built-in WMF to pixmap renderer if necessary.
PICT: The image will be converted into Macintosh PICT format, using the built-in WMF to pixmap renderer if necessary. Note that only the image map operator is supported; the RTF importer will not translate WMF vector operators into native PICT operators.
When using the option external cmd, two additional parameters can be set:
This field should receive the destination file extension of the image file as it is after the external conversion. For example, if you want to convert a WMF file to TIFF, the extension should be tif or tiff.
This is the external command to execute for converting the image source file to the desired target format. You must use placeholders for the source and destination file name using the upCast variable syntax. The variables to use are:
| the image source file in local file name convention |
| the image source file in URL format |
| the destination file name in local file name convention |
| the destination file name in URL format |
This works as follows: the file to be converted is available at the location in imgsrc#local. The RTF importer then constructs a target file name, using the source file name as basis but setting the extension to the one specified. Since the RTF importer needs to know the final resulting filename in order to refer to the externally converted image in the internal document tree, but there is no easy way to return a string from a shell command (only an integer return code), it prescribes the target file name itself. This is what the variable imgdest#local is for: you must make sure that the final, processed image file is available at the location contained in that specific variable.
Example 8.1. Example:
To convert a WMF file to JPEG, use settings like:
WMF to [external cmd:]
File extension: [jpg]
Command: [fileconverter -fmt jpeg -outfile ${imgdest#local} ${imgsrc#local}
]
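The placeholder substitution used in the Command field can be illustrated with a small Python sketch. The ${...} placeholders follow the upCast variable syntax shown above; fileconverter and the file paths are just illustrative placeholders:

```python
import re

def expand_command(template, variables):
    """Replace ${name} placeholders with their values.
    Variable names may contain '#', as in imgsrc#local."""
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: variables[m.group(1)],
                  template)

cmd = expand_command(
    "fileconverter -fmt jpeg -outfile ${imgdest#local} ${imgsrc#local}",
    {"imgsrc#local": "/tmp/pic.wmf", "imgdest#local": "/tmp/pic.jpg"})
# cmd is now the fully resolved command line
```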
Name |
WMFDestFormat |
Java symbol |
|
Type |
String |
Value |
unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |
Name |
EMFDestFormat |
Java symbol |
|
Type |
String |
Value |
unchanged | dispose | UseWMFSubstitute | ExternalCommand |
Name |
JPEGDestFormat |
Java symbol |
|
Type |
String |
Value |
unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |
Name |
PNGDestFormat |
Java symbol |
|
Type |
String |
Value |
unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |
Name |
PICTDestFormat |
Java symbol |
|
Type |
String |
Value |
unchanged | dispose | ExternalCommand | JPEG | PNG | BMP | PICT |
Name |
WMFDest.JPEG.Quality |
Java symbol |
|
Type |
Integer |
Value |
0..100 |
Name |
WMFDest.PNG.CompressionType |
Java symbol |
|
Type |
String |
Value |
default | fast | max | none |
Name |
JPEGDest.JPEG.Quality |
Java symbol |
|
Type |
Integer |
Value |
0..100 |
Name |
JPEGDest.PNG.CompressionType |
Java symbol |
|
Type |
String |
Value |
default | fast | max | none |
Name |
PNGDest.JPEG.Quality |
Java symbol |
|
Type |
Integer |
Value |
0..100 |
Name |
PNGDest.PNG.CompressionType |
Java symbol |
|
Type |
String |
Value |
default | fast | max | none |
Name |
PICTDest.JPEG.Quality |
Java symbol |
|
Type |
Integer |
Value |
0..100 |
Name |
PICTDest.PNG.CompressionType |
Java symbol |
|
Type |
String |
Value |
default | fast | max | none |
Objects
These parameters specify how embedded objects (OLE) should be handled. The RTF importer generates a uci:object element for each embedded object it finds in the RTF. The child elements of this container object are alternative representations of the object’s data: a uci:image element (if available in the source document; it represents the display of that object at the time the document was saved), or a uci:ole element (if available; it contains a base64 representation of the binary data of the OLE object, which makes it possible to reconstruct an editable instance using the RTF exporter).
Include image representation
When checked, an image representation alternative will be added to the object element (if available in the source document).
Embed binary OLE object data as base64-encoded text
When checked, a uci:ole binary data representation alternative will be added to the object element. The uci:ole element contains the base64-encoded binary data as character data.
Extract object from OLE object and serialize to file
When checked, a uci:extobject element within the uci:object element is written with the following attributes:
the raw target specification of the externalized file
the URL (possibly relative) to the externalized file
the MIME type of the externalized object file
Currently, the following OLE CLSIDs are supported for externalization:
serialization to PDF; the mime type used is application/pdf
serialization to *.xls; the mime type used is application/vnd.msexcel
If the OLE object is of some type that is not explicitly supported, a warning is issued to the logging system and the object (unwrapped from the embedded OLE object in the Word document) is written to the file. Note that in most cases you will not be able to use that file with the application that originally created the OLE object, as the data structure of such objects is proprietary to that application, and extracting the correct portion of the data from the OLE object requires knowledge of that particular application's OLE file format.
To get a yet unsupported OLE format (CLSID) supported for externalization, please contact us at support@infinity-loop.de. Please include a small sample file that contains an instance of an OLE object from that particular application, and specify exactly which application and version you used to create it.
Include MathML representation
When MathLink is available, i.e. you have Design Science’s MathType software (version 5.2) installed on your Windows system and are running upCast on that same machine, you can also embed, for MathType OLEs, a MathML representation of your formula in the object element as an m:math element.
Since MathLink is only available on the Windows platform, this option will only be enabled when a functioning MathLink actually is available to the application.
Name |
ObjectHandling |
Java symbol |
|
Type |
String |
Value |
image || embed || mathml || extract (separated by whitespace if more than one) |
WordLink
Set WordLink features.
Since WordLink is only available on the Windows platform, this tab will only be displayed when WordLink actually is available to the application.
When opening a pipeline definition file created on the Windows platform on some other platform, existing settings will be preserved on save, but will have no effect during execution on that non-Windows platform.
Mode
When Process .doc files only is selected, WordLink and all options specified will only be applied to Word binary (*.doc) files.
When Process all files is selected, WordLink and all options specified will be applied to any input document, i.e. even files that are in RTF format already. This lets you automatically update fields or add pagestart and linestart elements.
Name |
WordLinkMode |
Java symbol |
|
Type |
String |
Value |
doc | all |
Run macro named "il_premacro"
When checked, WordLink will first run a Word macro named il_premacro on the source document. This macro must either be defined in the respective document (when it is a Word binary .doc file) or in the global document template file (*.dot).
When this macro is not available, an error will be issued after conversion, though the further conversion process is not affected.
Update fields
When checked, WordLink will update any fields in the source document with current values: date, time, pages, …
Update from linked images
When including an image only by reference (i.e., using Word’s INCLUDEPICTURE field), the RTF importer is not able to determine the actual image size, as that information is not part of RTF. By checking this option, the linked image is temporarily included into the document, so that the image size and any applied scaling in the .doc Word binary file can be evaluated by the importer.
This feature is not beneficial for RTF source files, as in these the necessary information has already been lost (for Word as well).
Mark up layout page breaks using <pagestart />
This inserts a <pagestart /> empty inline element at those places where, in the current layout flow, there would be a dynamic page break when rendering the document.
Mark up layout line breaks using <linestart />
This inserts a <linestart /> empty inline element at those places where, in the current layout flow, there would be a dynamic line break when rendering the document.
This is slow for documents bigger than about 100 pages; you may want to increase the Kill timeout value significantly. Also, some document structures may yield incorrect line break positions due to limitations in the Word application.
Name |
WordLinkCommand |
Java symbol |
|
Type |
String |
Value |
Pages || Update || Premacro || Lines || Includelinkedimages || Updatelinks (concatenate desired options without any whitespace in between) |
Kill timeout
When it encounters a corrupt document, WordLink may run into problems and/or hang the application. You can therefore set a kill timeout value after which the WordLink functions will be aborted. The default value is 300 seconds.
Killing WordLink may leave an invisible instance of Word running. In case of a timeout, please check the running processes and kill any zombie Word processes manually using the Process Viewer (Ctrl-Alt-Del on Windows 2000/XP).
Name |
WordLinkKillTimeout |
Java symbol |
|
Type |
Integer |
Value |
timeout duration in milliseconds |
Copy temporary .rtf file to debug folder as "basename-tmp.rtf"
This is mainly for debugging purposes. It copies the intermediate RTF file to the specified debug folder under the name basename-tmp.rtf after all WordLink functions have been applied. This is the file that the RTF importer itself takes as source for its actual conversion process.
Name |
WordLinkCopyToOutput |
Java symbol |
|
Type |
Bool |
Value |
true | false |
This module requires an appropriate UPL feature included in your license to be fully functional.
This module lets you run a program written in the Upcast Processing Language (UPL).
The single context node for all UPL code in this module is the document root node (XPath: /). Note that this is different from the document root element!
UPL code
This contains the UPL code you want to execute. The code must define a function main() as follows:
function main() as Value { ... your code goes here ... }
The UPL Processor calls this function main() once when it runs and executes the code defined therein (or in any dependent, user-defined functions). For a detailed description of UPL, see the separate documentation, Upcast Processing Language.
The returned result of the function is stored into the pipeline variable ModuleResult.
Name |
UPLCode |
Java symbol |
|
Type |
String |
Value |
UPL source code |
UPL parameters
This contains the assignments for variables to be passed to the UPL program. For a detailed description of how UPL receives parameter values, see the separate documentation, Upcast Processing Language.
A parameter definition must follow this syntax:
paramname ':=' '"' value '"' ';'
Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.
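As an illustration, a parameter assignment following this syntax could be generated like this (a hypothetical helper, not part of upCast; it only escapes double quotes, as described above):

```python
def upl_param(name, value):
    """Serialize one UPL parameter assignment:  name := "value";
    Double quotes inside the value are escaped with a backslash."""
    return '%s := "%s";' % (name, value.replace('"', '\\"'))
```

For example, upl_param("title", 'say "hi"') produces: title := "say \"hi\"";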
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now are any further references to realms other than include resolved, separately for the value part of each parameter assignment.
This algorithm covers the usual cases: you may want to include constant parameter assignment code, shared by several UPL modules in a pipeline, using an include variable reference, while also referencing variable values without having to worry whether their actual value breaks the value assignment syntax described above.
Name |
UPLParameters |
Java symbol |
|
Type |
String |
Value |
(string in same format as in UI) |
This module requires an appropriate UPL feature included in your license to be fully functional.
This module differs from the UPL Processor in that it does not call a single function once. Instead, you can define code to be run upon visiting each node of the current internal document in a depth-first traversal, depending on conditions you specify.
This contains the UPL code you want to execute. For a detailed description of the UPL, see the separate documentation, Upcast Processing Language.
Name |
UPLCode |
Java symbol |
|
Type |
String |
Value |
UPL source code |
This contains the assignments for variables to be passed to the UPL program. For a detailed description of how UPL receives parameter values, see the separate documentation, Upcast Processing Language.
A parameter definition must follow this syntax:
paramname ':=' '"' value '"' ';'
Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now are any further references to realms other than include resolved, separately for the value part of each parameter assignment.
This algorithm covers the usual cases: you may want to include constant parameter assignment code, shared by several UPL modules in a pipeline, using an include variable reference, while also referencing variable values without having to worry whether their actual value breaks the value assignment syntax described above.
Name |
UPLParameters |
Java symbol |
|
Type |
String |
Value |
(string in same format as in UI) |
Grouper
When turned on, the grouping algorithm will be run on the internal tree. This happens before the finalize() or finalize-error() UPL method is called.
Name |
RunGrouper |
Java symbol |
|
Type |
Bool |
Value |
true | false |
Grouping processing order
This parameter lets you set the order of the colors in which the grouping should be performed.
With all colors in alphabetical order, all colors that have been used for setting painters or tags are grouped in alphabetical order of the color name. The ordering is the same as the Java class java.lang.TreeSet uses by default on the platform you are running upCast on.
With colors in specified order, then all remaining, you can specify in the text field below a list of colors to be grouped in the given order before all others; any remaining colors are grouped in alphabetical order afterwards (see all colors in alphabetical order).
With only these colors in specified order, you can specify the colors and their order of grouping in the text field below. No other colors will be grouped, even if respective painters or tags have been placed on nodes in the internal document tree.
After a run of the grouper, all painters, tags and paintings of nodes are removed from the internal tree. This means that any further grouper instances running after a grouper has already run will have no effect unless new painters and tags have been placed on nodes of the internal document tree, usually using the UPL processor.
Color order is specified by listing the colors in sequence, separated by whitespace. If a color name includes whitespace (which is deprecated), the full color name must be enclosed in double quotes.
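The three ordering modes can be paraphrased in a short sketch (an interpretation of the documented behavior, not upCast's implementation):

```python
def grouping_order(mode, used_colors, specified=()):
    """Return the order in which colors are grouped for the three
    GroupingColorOrder modes described above."""
    if mode == "alphabetic":          # all colors, alphabetical order
        return sorted(used_colors)
    listed = [c for c in specified if c in used_colors]
    if mode == "only":                # only the specified colors
        return listed
    if mode == "first":               # specified order first, then the rest
        return listed + sorted(c for c in used_colors
                               if c not in specified)
    raise ValueError("unknown mode: " + mode)
```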
Name |
GroupingColorOrder |
Java symbol |
|
Type |
String |
Value |
alphabetic | only | first |
Name |
GroupingColors |
Java symbol |
|
Type |
String |
Value |
ordered list of color names, separated by whitespace; to use color names that themselves contain whitespace, surround them with double-quotes |
Element Splitter
When turned on, any split actions attached to nodes using mark-split() in the internal tree will be executed. This happens after the grouper has run (if enabled), but before the finalize() or finalize-error() UPL method is called.
Name |
RunSplitter |
Java symbol |
|
Type |
Bool |
Value |
true | false |
The Sectioner module is used for creating a nested, deeper structure based specifically on elements that have a heading level set (via set-heading-level() in UPL), and on uci:part elements that have the grouping property set (via set-grouping() in UPL).
Sectioning works only on the direct children of the uci:body element in the upCast Internal DTD.
If the algorithm finds a uci:part element, it checks its grouping property. If this uci:part is a grouping part, all uci:body element children between this uci:part element and the next uci:part element that has the grouping property set will be surrounded by this uci:part element.
Example 8.2. Example:
…
<part is-grouping="true"/>
<par>…</par>
<par>…</par>
<part is-grouping="false"/>
<par>…</par>
<part is-grouping="true"/>
<par>…</par>
…
will be transformed by a run of the Sectioner into
…
<part is-grouping="true">
  <par>…</par>
  <par>…</par>
  <part is-grouping="false"/>
  <par>…</par>
</part>
<part is-grouping="true">
  <par>…</par>
…
Note that namespace prefixes/definitions have been omitted in the above for better readability.
<part> is grouping (by default)
When checked, even though you may not have specified this explicitly on each uci:part element (e.g. in UPL), all uci:part elements are treated as if they had the grouping property set by default. This mimics the behavior of pre-6.0 versions of upCast.
Name |
PartIsGrouping |
Java symbol |
|
Type |
Bool |
Value |
true | false |
Any elements that have a uci:heading-level attribute with a value greater than 0 are considered headings of the respective structure level. The Sectioner creates sections based on the heading level information on those elements by automatically creating a surrounding uci:section element, taking care to match the section nesting to the element’s heading level. This means that if there is a jump in heading level, the Sectioner will automatically generate additional, grouping uci:section elements.
When an element with the same heading level is found as the current section nesting, the current section is closed and a new one is opened at the same level.
When an element with a higher heading level than the current one is encountered, a new, nested section is created within the current section.
When an element with a lower heading level than the current one is encountered, the appropriate number of open, nested sections is closed (including the one with the same nesting level) and a new one is opened.
Here is an example demonstrating all possible cases (assume all elements and attributes are in the uci namespace):
Example 8.3. Example: section nesting based on paragraph’s heading level
<par>…</par>
<par heading-level="1">…</par>
<par>…</par>
<par heading-level="2">…</par>
<par>…</par>
<par heading-level="4">…</par>
<par>…</par>
<par heading-level="3">…</par>
<par>…</par>
<par heading-level="1">…</par>
<par>…</par>
will result in the following structure generated:
<par>…</par>
<section level="1">
  <par heading-level="1">…</par>
  <par>…</par>
  <section level="2">
    <par heading-level="2">…</par>
    <par>…</par>
    <section level="3">
      <section level="4">
        <par heading-level="4">…</par>
        <par>…</par>
      </section>
    </section>
    <section level="3">
      <par heading-level="3">…</par>
      <par>…</par>
    </section>
  </section>
</section>
<section level="1">
  <par heading-level="1">…</par>
  <par>…</par>
</section>
Note that namespace prefixes/definitions have been omitted in the above for better readability.
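The nesting rules can be sketched as follows. This is an illustrative model of the algorithm (not upCast's actual implementation) that turns a flat sequence of heading levels (0 standing for a plain paragraph) into section open/close events:

```python
def section_events(heading_levels):
    """Emit 'par', 'heading', 'open N' and 'close N' events for a flat
    sequence of heading levels, following the three rules above."""
    events, depth = [], 0
    for h in heading_levels:
        if h == 0:
            events.append("par")
            continue
        # same or lower level: close open sections down to and including h
        while depth >= h:
            events.append("close %d" % depth)
            depth -= 1
        # open sections up to h; a jump generates extra grouping sections
        while depth < h:
            depth += 1
            events.append("open %d" % depth)
        events.append("heading")
    while depth > 0:                     # close any sections still open
        events.append("close %d" % depth)
        depth -= 1
    return events
```

Applied to the heading levels of Example 8.3 ([0, 1, 0, 2, 0, 4, 0, 3, 0, 1, 0]), this produces the same nesting as the generated structure shown there, including the auto-generated level-3 section that bridges the jump to level 4.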
The sectioning algorithm can be modified by two options:
Create <section> for empty headings
The default sectioning algorithm only creates a new section for the first of consecutive elements having a uci:heading-level attribute of the same value (if it is not empty).
The idea behind this option is that the user may have created a heading in Word, then hit return (not changing the style) to create visual space, and only then started writing the actual content. You certainly would not want to have a section on its own for each of the visual space generating empty heading-styled paragraphs, but only for the first one, so section nesting generation is suppressed for the remaining heading-styled paragraphs.
If, however, you want to create section nesting corresponding to each heading-styled paragraph in a document, even if it’s empty, check this option.
Name |
GroupEmptyHeadings |
Java symbol |
|
Type |
Bool |
Value |
true | false |
Create <sectionintro> around leading section content
Sections created using the sectioning algorithm may have leading content before any subsections they may also have. Checking this option groups this leading content, up to the start of the first nested (sub-)section, inside a uci:sectionintro element, e.g. for easier post-processing with XSLT later.
You can choose whether you want the uci:sectionintro element to be created in any case (always) or only when the respective uci:section actually has sub-sections (when sub-sections exist).
In this example, assume all elements and attributes are in the uci namespace:
Example 8.4. Grouping the section introduction
<par heading-level="1">…</par>
<par>…</par>
<table>…</table>
<par>…</par>
<par heading-level="2">…</par>
<par>…</par>
will be transformed to the following when Create <sectionintro> around leading section content is checked with the always option:
<section level="1">
  <sectionintro>
    <par heading-level="1">…</par>
    <par>…</par>
    <table>…</table>
    <par>…</par>
  </sectionintro>
  <section level="2">
    <sectionintro>
      <par heading-level="2">…</par>
      <par>…</par>
    </sectionintro>
  </section>
</section>
or it will be transformed to the following when Create <sectionintro> around leading section content is checked with the when sub-sections exist option:
<section level="1">
  <sectionintro>
    <par heading-level="1">…</par>
    <par>…</par>
    <table>…</table>
    <par>…</par>
  </sectionintro>
  <section level="2">
    <par heading-level="2">…</par>
    <par>…</par>
  </section>
</section>
Note that namespace prefixes/definitions have been omitted in the above for better readability.
Name |
GroupSectionIntro |
Java symbol |
|
Type |
String |
Value |
never | always | child |
This module is deprecated and must no longer be used in new development of processing pipelines. It will be removed completely in a future version of upCast. Update any of your existing pipeline definitions as soon as possible by transitioning to the use of the functionally equivalent Grouper option of the UPL Tree Processor module.
The Grouper module actually performs a grouping that has been earlier specified during a run of an UPL Tree Processor.
Grouping processing order
This parameter lets you set the order of the colors in which the grouping should be performed.
With all colors in alphabetical order, all colors that have been used for setting painters or tags are grouped in alphabetical order of the color name. The ordering is the same as the Java class java.lang.TreeSet uses.
With colors in specified order, then all remaining, you can specify in the text field below a list of colors to be grouped in the given order before all others; any remaining colors are grouped in alphabetical order afterwards (see all colors in alphabetical order).
With only these colors in specified order, you can specify the colors and their order of grouping in the text field below. No other colors will be grouped, even if respective painters or tags have been placed on nodes in the internal document tree.
After a run of the grouper, all painters, tags and paintings of nodes are removed from the internal tree. This means that any further grouper instances running after a grouper has already run will have no effect unless new painters and tags have been placed on nodes of the internal document tree, usually using the UPL processor.
Color order is specified by listing the colors in sequence, separated by whitespace. If a color name includes whitespace (which is deprecated), it must be enclosed in double quotes.
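For illustration, a hypothetical value for the color list field might look like this (the color names are invented for the example):

```
red green "light blue"
```

The last name is quoted only because it contains a space, which, as noted above, is deprecated in color names.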
Name: GroupingColorOrder
Java symbol:
Type: String
Value: alphabetic | only | first
Name: GroupingColors
Java symbol:
Type: String
Value: ordered list of color names, separated by whitespace; to use color names that themselves contain whitespace, surround them with double quotes
This module imports any XML document into the internal tree variable, replacing any existing document. This is useful when you want to apply some of the specific UPL functions to a document and do not need to rely on styling info (which is currently not imported/recognized and cannot be created within upCast).
Special element: uci:text
There is one element, uci:text, that is handled specially on import: it is discarded, and its first Text node child receives the uci:text element's uci:node-id value as the value that will be returned from generate-id() when called on it (unless a different node in the partially constructed document tree already carries that id value).
This parameter lets you choose the source XML file to import.
Name: SourceFile
Java symbol:
Type: Object
Value: absolute file path in URL or local file system convention
If the document to be imported has a DOCTYPE declaration associating a DTD with the document, the parser can identify ignorable whitespace. When this parameter is checked, ignorable whitespace will be discarded from the internal tree.
Name: DiscardIgnorableWhitespace
Java symbol:
Type: Bool
Value:
This parameter lets you choose a fallback for the external DTD subset for the imported document. This is useful if you want to import an XML document that does not have a DOCTYPE declaration, but you have an XML DTD that you know the imported document must satisfy. Specifying the external DTD subset here (DTD file) allows you to supply that info to the parser, which in turn may use that info to determine which whitespace is ignorable in the imported document.
Essentially, specifying a file here has the same effect as if the imported XML document had a
<!DOCTYPE root SYSTEM "specified-file">
declaration, unless it explicitly specifies a DOCTYPE declaration of its own.
Name: ExternalSubsetLocation
Java symbol:
Type: String
Value:
Attach locator info
When checked, each element node of the imported document will get attached the following set of attributes:
the line where the start tag of this element starts in the imported XML document file (position of the opening '<')
the column where the start tag of this element starts in the imported XML document file (position of the opening '<')
the line where the start tag of this element ends in the imported XML document file (position of the first character after the closing '>')
the column where the start tag of this element ends in the imported XML document file (position of the first character after the closing '>')
the line where the end tag of this element starts in the imported XML document file (position of the opening '<')
the column where the end tag of this element starts in the imported XML document file (position of the opening '<')
the line where the end tag of this element ends in the imported XML document file (position of the first character after the closing '>')
the column where the end tag of this element ends in the imported XML document file (position of the first character after the closing '>')
All values are 1-based.
Name: AttachLocatorInfo
Java symbol:
Type: Bool
Value:
This module serves for serializing the internal tree to XML. It offers a choice for the table model to write (internal, HTML or CALS), debugging and pretty-printing options. It also offers choices for handling images in the document (separate for referenced/linked images and embedded images) and you can use a Unicode Translation Map.
General
Destination File
Choose the full filename into which the result should be written. You can use upCast’s variables for building the path.
Name: DestinationFile
Java symbol:
Type: String
Value: absolute path to desired result file
Output resolution
Specify the output resolution in dpi. This value is used for calculating device pixel values, e.g. in HTML tables’ cell widths or images’ sizes.
Name: OutputResolution
Java symbol:
Type: Double
Value: 1..9999
Output file encoding
Lets you specify the encoding in which the XML file will be written. If your further tool chain allows it, we strongly recommend using the default, UTF-8.
Name: OutputEncoding
Java symbol:
Type: String
Value: Java encoding name
Table model
This parameter lets you choose which table model to use for tables. You can either choose the native (upCast) table model, which is a very simple table > row > cell model, the HTML 4 table model, or the OASIS-EM (CALS) (OASIS XML Exchange Table Model, a subset of CALS) table model.
The HTML 4 table model uses the HTML namespace http://www.w3.org/HTML/1998/html4, the CALS table model uses the special, proprietary namespace http://www.infinity-loop.de/namespace/2006/upcast-cals.
Name: TableModel
Java symbol:
Type: String
Value: HTML | CALS | native
Style information
Lets you specify how general CSS styles for known elements and named styles for paragraphs and inline elements should be exported. Options are:
none: No style info is exported at all. This does not affect local styles on elements, which will be written in any case according to the "Explode CSS style info" setting.
internal: The style info is written as CSS code in the special element uci:style (in the upCast internal namespace) in the document's uci:head element.
external: Writes a stylesheet processing instruction pointing to a CSS file named basename.css in the same folder as the resulting XML file. This file can e.g. be created using the CSS Exporter module.
custom: This lets you specify a custom stylesheet processing instruction, e.g. to link to a general CSS file you wish to use in all of the converted documents.
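As an example, a custom stylesheet processing instruction could be a standard xml-stylesheet PI (the CSS file name here is invented for illustration):

```
<?xml-stylesheet type="text/css" href="corporate-styles.css"?>
```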
Name: StylesheetMode
Java symbol:
Type: String
Value: none | internal | external | custom
Name: CustomStylesheetPI
Java symbol:
Type: String
Value: custom stylesheet PI string
Include generator info as comment
When checked, adds info about when and by which version of upCast the XML file was produced, as an XML comment in that file. This may be useful both for infinity-loop support during troubleshooting and for you, when you need to relate produced XML files to a certain version (in time) of your pipelines.
Name: IncludeGeneratorInfo
Java symbol:
Type: Bool
Value: true | false
Wrap individual Text nodes with element <uci:text>
When checked, each single Text node in the internal DOM tree will be surrounded by a <uci:text> element on serialization. That element also carries the uci:node-id attribute holding the original Text node's id as obtained from the generate-id() XPath function when called on that node.
Name: MarkTextNodes
Java symbol:
Type: Bool
Value: true | false
Serialize node id as attribute @uci:node-id
When checked, each serialized element automatically receives an additional attribute uci:node-id, holding the value obtained by calling generate-id() on that element.
Name: SerializeNodeId
Java symbol:
Type: Bool
Value:
Pretty-print output
Turns on pretty-printing the output for elements whose whitespace handling mode is known explicitly to the serializer.
Name: PrettyPrint
Java symbol:
Type: Bool
Value: true | false
Images
During import, e.g. using the RTF importer, all references to images are made absolute and stored this way in the internal tree as follows:
Embedded images are written to disk into a temporary location and possibly a format conversion is applied. The internal tree at this point holds the absolute path to these temporary image files.
Linked (or referenced) images are stored with their absolute path to the original image; no matching files for linked images are created in the temporary image files location.
At export time, you can decide how the image location information (and possibly the actual image files) should be handled. The handling mode can be set individually for images that were embedded in the original document and (external) images that were only linked to.
Embedded Images
This parameter governs the handling of images that originally had been embedded in the source document.
discard: The uci:image element is completely dropped from the XML output.
copy: This option copies the temporary image file to the Image Destination Folder. If a file of the desired name already exists at that location, a unique file name is generated by appending -1, -2 etc. to the basename until a name is found that is not already used. A relative reference to this copy is then used in the uci:image element.
copyreplace: This option copies the temporary image file to the Image Destination Folder. If a file of the desired name already exists at that location, it is overwritten without prompting. A relative reference to this copy is then used in the uci:image element.
internal: This option writes the absolute path to the temporary file as it is currently set in the internal tree, unchanged. This is useful for checking what the internal tree looks like at a certain point in a module chain.
The temporary image files will be deleted automatically after a pipeline execution for a certain document. This means that when using the internal tree format (don’t copy) option, the referenced image in the generated XML will have been deleted!
Name: EmbeddedImagesHandling
Java symbol:
Type: String
Value: discard | copy | copyreplace | internal
Referenced images
This parameter governs the handling of linked (referenced) images in the original source file.
discard: The uci:image element is completely dropped from the XML output.
copy: This option copies the original, linked-to image file to the Image Destination Folder. If a file of the desired name already exists at that location, a unique file name is generated by appending -1, -2 etc. to the basename until a name is found that is not already used. A relative reference to this copy is then used in the uci:image element.
If the original file is not accessible from the machine that executes the pipeline (be it due to network failure, the file not existing or some other problem), the option update link for Destination File location is used as a fallback instead.
copyreplace: This option copies the original, linked-to image file to the Image Destination Folder. If a file of the desired name already exists at that location, it is overwritten without prompting. A relative reference to this copy is then used in the uci:image element.
If the original file is not accessible from the machine that executes the pipeline (be it due to network failure, the file not existing or some other problem), the option update link for Destination File location is used as a fallback instead.
keep: This option writes the reference the same way it was found in the original source document; it may therefore be an absolute or a relative path.
Note that if the original location specification in the source RTF file was relative, but the XML file is not saved into the same folder as where the source document is located, chances are that the link is broken.
update: This option updates the reference to the original image so that it still points to that very image, even when the destination of the XML file is in a different folder (when the original reference was relative).
Name: LinkedImagesHandling
Java symbol:
Type: String
Value: discard | copy | copyreplace | keep | update
Image Destination Folder
You can specify a separate folder dedicated for images. By default, this is set to ${module:DestinationFile#urlpath}, which evaluates to the same folder where the XML file is saved. However, if you want to put images into a separate folder, you can do this here. This is the folder where any of the above options that physically copy the image file will place the file. Any relative references to the image from within the XML file will be adjusted accordingly.
Name: ImageDestinationFolder
Java symbol:
Type: String
Value: absolute path to folder
The nodes in the internal tree may have a very rich set of attributes attached, many of which are only useful while processing the tree within upCast, e.g. with UPL. Serializing all those attributes may create huge files, where only a fraction of the info contained will be used further down the processing chain of the document. To reduce unnecessary memory consumption and processing time, the XML Exporter offers a way to set up a filter on the attributes serialized for each internal tree node. This is achieved by using a specially formed UPL program in conjunction with the dedicated filtering function filter-attrs().
This filter can be effectively used to reduce the set of CSS properties exploded into attributes to a minimal set that you are actually interested in for further processing, e.g. in an XSLT step.
Attribute Filter
This field holds the UPL program to perform the filtering.
As in the UPL Tree-Processor, you can define several UPL rules. The selector part determines for which kind of node (and possibly more complex conditions) the attribute filter applies. This lets you filter attributes differently on different elements.
The action part is applied when the selector matches. Although theoretically you can use the complete range of UPL functionality on such a node, many changes to the node will not be picked up by the serializer (except for changes in the node's attributes), so we recommend against using this UPL program for anything other than filtering attributes.
It is important to understand what the context node supplied to the UPL program looks like:
The context node supplied to the UPL program is a temporary, newly created, artificial, single node. It lives by itself and has neither a parent, nor siblings, nor children. It is neither the node in the context of its later serialization nor the actual node of the internal tree to be serialized, but merely a lookalike of the former. This means that, among other things, you cannot query its context nodes with XPath using eval-xpath().
The context node does not hold synthesized style info, nor does it hold attached user values.
The filtering UPL code is not called for nodes of DOM node types other than Element.
Clicking the Insert defaults button inserts the current upCast default filter setup for new XML Exporter instances before any existing code in the Attribute Filter text field.
Name: SerializationFilter
Java symbol:
Type: String
Value:
Maps
Unicode translation map
This field lets you enter a Unicode Translation Map. You can enter any mappings directly or include an externally created Unicode Translation Map file using the include realm: ${include(encoding="…"):file}.
Name: UnicodeTranslationMap
Java symbol:
Type: String
Value: Unicode translation map code
CSS property unit map
Here, you can specify a mapping table that associates any CSS <length> property with a pair of {unit, precision}. When the module needs to write length or size information in form of CSS properties, it consults this list to determine which length unit to use at which precision. For a description of the format, see CSS property unit table.
You can enter any mappings directly or include an externally created CSS property unit table file using the include realm: ${include(encoding="…"):file}.
Name: CSSPropertyUnitMap
Java symbol:
Type: String
Value: CSS property unit map code
This module serves for executing external system commands by way of the standard command-line interpreter available on the respective execution platform.
System command
The command to be executed by the underlying system’s command-line interpreter. You can use upCast variables for building the string.
For platform-independent, common file operations, upCast offers some internal "pseudo" commands:
upcast:delete-file filename+
Deletes all listed files.
upcast:copy-file source dest
Copies the file source to the new file dest.
upcast:move-file from to
Moves the file from to its new destination to. This is equivalent to the sequence upcast:copy-file from to followed by upcast:delete-file from.
upcast:delete-recursively folder-or-file+
Recursively deletes all listed folders and/or files.
This command is potentially dangerous, as it can lead to deleting a huge number of files when used carelessly! Please consider using upcast:delete-recursively-restricted instead.
upcast:delete-recursively-restricted deletionboundary folder-or-file*
Recursively deletes all listed folders and/or files that are equal to or reside below the specified deletionboundary folder in the file system hierarchy.
This method is fail-fast, i.e. when a specified folder to be deleted is not hierarchically under the deletion boundary, any further actions on it are skipped. This prevents the case where you specify a folder of which deletionboundary itself is a descendant, which would otherwise delete the complete contents of deletionboundary. In other words: the specified root path for a recursive deletion operation must already satisfy the deletion boundary restriction to be considered any further.
Example 8.5.
upcast:delete-recursively-restricted "/user/iloop/temp/" "/user/iloop/temp/test.txt"
deletes the file /user/iloop/temp/test.txt because it is a descendant of the deletion boundary folder /user/iloop/temp/.
upcast:delete-recursively-restricted "/user/iloop/temp/" "/user/iloop/"
deletes nothing because the folder /user/iloop/ is not a descendant of the deletion boundary folder /user/iloop/temp/.
Name: Commandline
Java symbol:
Type: String
Value: commandline to execute, either as String or (in UPL or Java API) as List
This parameter lets you supplement, override or completely replace the environment that the child process started by the module inherits from its parent (i.e., upCast).
For this, we use the following syntax:
paramname ‘:=’ ‘"’ value ‘"’;
Each variable definition takes one line of text.
Note that you must use CSS-style escapes (or numerical character entities of the form &#...;) to generate Unicode characters when specifying values that contain characters outside the ASCII range.
All lines starting with // are comment lines and are ignored, as are empty lines.
You can also specify a mode in which specified entries should be handled. Use either one of the following three mode options at the top of the environment specification on a line of its own:
@mode replace
@mode override
@mode supplement
@mode
This option controls how the specified environment variable entries are applied:
replace: The complete environment is cleared, then the new entries are added.
override: Specified variables are added to the environment, replacing already existing ones.
supplement: Specified variables are added to the environment unless a variable of that name already exists, in which case the already existing one is kept unchanged and the newly specified one is discarded.
The default mode is override.
Example 8.6.
Writing
@mode replace
at the top of the environment variables definition will clear the current environment; only the definitions that follow will be added.
Example 8.7. Environment specification examples
With this inherited environment:
PATH=/usr/bin
USER=iloop
the following specifications for Environment will result in the shown final environment variable sets:
Example 1
@mode override
USER:="johndoe"
SAMPLEVAR:="test"
results in
PATH=/usr/bin USER=johndoe SAMPLEVAR=test
Example 2
@mode replace
USER:="johndoe"
SAMPLEVAR:="test"
results in
USER=johndoe SAMPLEVAR=test
Example 3
@mode supplement
USER:="johndoe"
SAMPLEVAR:="test"
results in
PATH=/usr/bin USER=iloop SAMPLEVAR=test
Note how in supplement mode, existing variables of the same name (here: USER) keep their values unchanged.
Name: CommandlineEnvvars
Java symbol:
Type: String
Value: ["@mode replace" | "@mode supplement" | "@mode override"]
Wait for completion
When checked, the command is executed synchronously, i.e. upCast waits until the external command has completed before continuing execution.
Checking for errors occurring during external command execution can only be performed when this option is on. upCast considers any return value other than 0 (zero) an error.
Name: WaitForCompletion
Java symbol:
Type: Bool
Value: true | false
Timeout
Lets you specify the timeout for the external command in seconds. When the command does not exit within the specified time, it is forcibly killed by the module and an error message (WatchedCommandKilledWithTimeout) is issued.
The timeout value is only active when Wait for completion is checked.
A timeout value of "0" waits indefinitely for the termination of the command.
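upCast's own implementation is not published here, but the wait-with-timeout contract described above can be sketched with the JDK's standard ProcessBuilder API (a Unix shell is assumed; the command strings are illustrative only):

```java
import java.util.concurrent.TimeUnit;

public class CommandTimeoutDemo {
    // Runs a shell command and waits up to timeoutSeconds for it to finish.
    // Mirrors the module's contract: a timeout of 0 waits indefinitely, a
    // timed-out process is forcibly killed, and any exit value other than
    // 0 is treated as an error by the caller.
    static int run(String command, long timeoutSeconds) throws Exception {
        Process p = new ProcessBuilder("sh", "-c", command).start();
        if (timeoutSeconds == 0) {
            return p.waitFor();                 // 0 = wait indefinitely
        }
        if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            p.destroyForcibly();                // forcibly kill on timeout
            p.waitFor();                        // reap the killed process
            return -1;                          // signal the timeout as an error
        }
        return p.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // A non-zero exit value would be reported as an error by the module.
        System.out.println(run("exit 3", 5));
    }
}
```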
Name: WaitForCompletionTimeout
Java symbol:
Type: Integer
Value: seconds (non-negative integer; 0 waits indefinitely)
This parameter lets you redirect the data the command writes to stdout either to a file or into a pipeline variable.
By default, when the field is left empty, the command's output to stdout will be written to upCast's log with a level of INFO.
When you specify an absolute file path, the output will be written to that file. Any existing file content is cleared at each execution of the module.
When you use the special syntax upcast:varname, the data sent to stdout by the command is written to the pipeline variable varname as a String. You can then retrieve it via ${pipeline:varname} in the GUI fields of upCast, or via $pipeline:varname from UPL code.
Writing to a pipeline variable requires that Wait for completion is turned on.
Name: RedirectStdout
Java symbol:
Type: String
Value: '' | |
Redirect 'stderr' to…
This parameter lets you redirect the data the command writes to stderr either to a file or into a pipeline variable.
By default, when the field is left empty, the command's output to stderr will be written to upCast's log with a level of ERROR.
When you specify an absolute file path, the output will be written to that file. Any existing file content is cleared at each execution of the module.
When you use the special syntax upcast:varname, the data sent to stderr by the command is written to the pipeline variable varname as a String. You can then retrieve it via ${pipeline:varname} in the GUI fields of upCast, or via $pipeline:varname from UPL code.
Writing to a pipeline variable requires that Wait for completion is turned on.
Name: RedirectStderr
Java symbol:
Type: String
Value: '' | |
Example 8.8.
To create a new directory images in the folder specified by the global variable DestinationFolder on a Unix system, you would use the following command:
mkdir "${pipeline:DestinationFolder#localpath}/images"
Note the quotes around the parameter, which accommodate path names that contain e.g. space characters.
This module lets you apply an XSLT transformation to some external file (which might be the result of an earlier exporter module). You can choose between the Xalan XSLT processor from the Apache Software Foundation (ASF; http://xml.apache.org/), Saxon 6.5.5 by Michael Kay, or Saxon-B (version 9) from Saxonica (http://www.saxonica.com).
Source File
Specify the file the transformation should be applied to, most probably an XML file. You can use all upCast variables for dynamically creating the full path to the file.
Name: SourceFile
Java symbol:
Type: Object
Value: absolute file path in URL or local file system convention
XSL Transformation File(s)
Specify the XSLT transformation ("XSLT file") to apply.
You can specify several transformations (use one line each to specify the full path to an XSLT file) or, in other words: the paths must be separated by a newline character. These will be chained, i.e. the original source file will be processed using the first XSLT file specified, the result will be processed by the second and so on. Note, however, that all transformations share the same XSLT parameters.
Empty lines are ignored.
Lines starting with a hash mark ('#') or two forward slashes ('//') are considered comments and ignored.
Use this for documentation purposes, or to quickly disable one of the stylesheets in a processing chain by prefixing its line with a #.
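The chaining behavior described above, where the output of one stylesheet feeds the next, can be sketched outside upCast with the JDK's built-in JAXP API; the two inline stylesheets are invented purely for the demonstration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ChainDemo {
    // Applies each stylesheet in order; the output of one step becomes
    // the input of the next, just like a multi-line XSLT file list.
    static String chain(String xml, String... stylesheets) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        String current = xml;
        for (String xslt : stylesheets) {
            Transformer t = tf.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(current)), new StreamResult(out));
            current = out.toString();
        }
        return current;
    }

    public static void main(String[] args) throws Exception {
        // Step 1 renames <a> to <b>, step 2 renames <b> to <c>.
        String step1 = "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='a'><b><xsl:apply-templates/></b></xsl:template></xsl:stylesheet>";
        String step2 = "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='b'><c><xsl:apply-templates/></c></xsl:template></xsl:stylesheet>";
        System.out.println(chain("<a>text</a>", step1, step2).contains("<c>text</c>"));
    }
}
```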
Name: Stylesheet
Java symbol:
Type: String
Value: path to stylesheet
XSLT parameters
Lets you specify parameters to be passed to the transformation. A parameter definition must follow this syntax:
paramname ‘:=’ ‘"’ value ‘"’;
Quotes within the parameter value must themselves be escaped using the backslash character '\'.
You may use upCast’s variable system for constructing parameter values.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now are any further references to realms other than include resolved, separately for the value part of each parameter assignment.
This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several XSLT Processor modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
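A hypothetical parameter block following the syntax above could look like this (all parameter names are invented for illustration):

```
// passed to every stylesheet in the chain
documentLanguage := "en";
reportTitle := "Monthly \"Q3\" Report";
destination := "${pipeline:DestinationFolder}";
```

Note the backslash-escaped inner quotes in the second value and the pipeline variable reference in the third, which is resolved per the algorithm above.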
There is a specially named parameter: il.stylesheet.intermediates.folder. When this parameter is set to a writable folder on disk, the intermediate result after each step of an XSLT processing chain is serialized to a separate file in that folder, with the number in the file name indicating after which step that respective file was written.
Name: StylesheetParameters
Java symbol:
Type: String
Value: (string in same format as in UI)
Result file
Specify where the transformation result should be written. You can use all upCast variables for dynamically creating the full path to the file.
Name: DestinationFile
Java symbol:
Type: String
Value: absolute path to desired result file
XSLT processor
Lets you choose between Xalan and Saxon 6.x or Saxon 9.x as the XSLT processor to use (if available).
Name: XSLTProcessor
Java symbol:
Type: String
Value: xalan | saxon6 | saxon
This module lets you apply a Unicode Translation Map to an already existing XML document. Additionally, by way of the Output encoding parameter, you can quickly change the character encoding used in an XML file.
Although the implementation tries to preserve the formatting of the original document while applying the map, there is no guarantee that the result is syntactically identical to the input; structurally, however, it is equivalent.
The Unicode Translation Map rules are only applied to the XML document’s text and attribute nodes. Comments and PIs are left unchanged.
Source File
Specify the file the transformation should be applied to, which must be an XML file. You can use all upCast variables for dynamically creating the full path to the file.
Name: SourceFile
Java symbol:
Type: Object
Value: absolute file path in URL or local file system convention
Unicode Translation Map
This field lets you enter a Unicode Translation Map. You can enter any mappings directly or include an externally created Unicode Translation Map file using the ${include(encoding:="…"):file} variable reference, which is automatically replaced by the contents of the specified file after reading it with the specified encoding.
When you leave this field completely empty, no Unicode translation is performed. You can use this if all you want to do is change the character encoding of the XML file by specifying the desired Output encoding.
Name: UnicodeTranslationMap
Java symbol:
Type: String
Value: Unicode translation map code
Destination file
Specify where the translation result should be written to. You can use all upCast variables for dynamically creating the full path to the file.
Name: DestinationFile
Java symbol:
Type: String
Value: absolute path to desired result file
XML version attribute
Specify the value of the version attribute on the XML declaration at the beginning of the result XML file.
If you leave this empty, no XML declaration will be written. The default value is "1.0".
Note that this is a textual parameter only; specifying e.g. "1.1" does not modify the file written such that it is a valid XML 1.1 file.
Name: XMLVersion
Java symbol:
Type: String
Value: value to be written in the 'version' attribute of the XML declaration; when empty, XML declaration is suppressed
Output encoding
Lets you specify the name of a supported output file encoding, e.g. UTF-8 or iso-8859-1. This encoding is also specified in the encoding attribute of the XML declaration (if written; see the XML version parameter above).
Name: OutputEncoding
Java symbol:
Type: String
Value: Java encoding name
DOCTYPE declaration
This lets you add, override or remove an existing doctype declaration in the incoming document.
When this field is a single asterisk ("*"), the doctype declaration in the source document (if present) is passed through as-is.
When this field is empty (""), any doctype declaration present in the source document is stripped from the output.
When this field contains any other data, that data is written verbatim to the output, replacing any possibly existing doctype declaration in the input document.
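For example, to force a specific doctype onto the output, the field could contain a full declaration such as the following (the DocBook identifiers are just a sample):

```
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
```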
Name: DOCTYPEDeclaration
Java symbol:
Type: String
Value: literal value of full DOCTYPE declaration as String; when empty, DOCTYPE declaration is removed, when '
This module serves for validating arbitrary XML documents. The module supports validation against an XML DTD, XML Schema and Relax NG.
Specify the XML file that should be validated. You can use all upCast variables for dynamically creating the full path to the file.
Name: SourceFile
Java symbol:
Type: Object
Value: absolute file path in URL or local file system convention
Redirect report to
Specify a destination file where the validation report will be written to.
When you specify an absolute file path, the output will be written to that file. Any existing file content is cleared at each execution of the module.
When you use the special syntax upcast:varname, the report data is written to the pipeline variable varname as a String. You can then retrieve it via ${pipeline:varname} in the GUI fields of upCast, or via $pipeline:varname from UPL code.
The validation report is an XML file in UTF-8 encoding with root element <validation-report>. It has child messages of the following form:
<msg system-id="validated-file-url" line="line-number" col="column-number">...validation message...</msg>
The validation message is constructed from the respective schema type's validation handler's error message (SAXParseException).
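The per-message information (system id, line, column, parser message) comes from SAX. Independently of upCast, a minimal sketch of DTD validation collecting such entries with the JDK's built-in parser looks like this (the document and DTD are invented for the example):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidateDemo {
    // Validates the document against the DTD referenced by its DOCTYPE and
    // collects one "line:col message" entry per validity error, similar in
    // spirit to the <msg> elements of the validation report.
    static List<String> validate(String xml) throws Exception {
        List<String> messages = new ArrayList<>();
        SAXParserFactory f = SAXParserFactory.newInstance();
        f.setValidating(true);                  // turn on DTD validation
        f.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override public void error(SAXParseException e) {
                    messages.add(e.getLineNumber() + ":" + e.getColumnNumber()
                        + " " + e.getMessage());
                }
            });
        return messages;
    }

    public static void main(String[] args) throws Exception {
        // The internal DTD requires <doc> to contain a <title>; this instance omits it.
        String xml = "<!DOCTYPE doc [<!ELEMENT doc (title)>"
            + "<!ELEMENT title (#PCDATA)>]><doc></doc>";
        System.out.println(validate(xml).isEmpty() ? "valid" : "invalid");
    }
}
```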
Name: ReportDestination
Java symbol:
Type: String
Value: '' | |
Schema type
Specify the type of schema you want to validate the file against:
dtd: validate against an XML DTD; the document to be validated must have a valid DOCTYPE declaration
xmlschema: validate against an XML Schema; the document must have the respective schema file location attributes
relaxng: validate against a Relax NG schema; you must specify the location and type of the Relax NG schema file using the specific parameters shown when this type is selected (see below)
Name: SchemaType
Java symbol:
Type: String
Value: dtd | xmlschema | relaxng
External DTD subset fallback
(for XML DTD schema type only)
This parameter lets you choose a fallback for the external DTD subset for the document to be validated. This is useful if you want to validate an XML document that does not have a DOCTYPE declaration, but you have an XML DTD that you know the imported document must satisfy. Specifying the external DTD subset here (DTD file) allows you to supply that info to the parser, which in turn uses that info for validation.
Essentially, specifying a file here has the same effect as if the source XML document had a <!DOCTYPE root SYSTEM "specified-file"> declaration, unless it explicitly specifies a DOCTYPE declaration of its own (in which case the latter takes precedence).
Name | ExternalSubsetLocation
Java symbol |
Type | String
Value |
Relax NG Schema file
(for Relax NG schema type only)
Specify the location of the Relax NG schema file to validate the Source File against.
Name | RelaxSchemaLocation
Java symbol |
Type | String
Value | absolute file path
Relax NG Syntax
(for Relax NG schema type only)
Specify the syntax the Relax NG schema file is written in, either XML syntax or compact syntax.
Name | RelaxSyntax
Java symbol |
Type | String
Value | xml | compact
This module writes an external Cascading Style Sheets, level 2 (CSS2) file comprising all styles (paragraph styles and character styles) used in the current internal document, matching their visual appearance as closely as reasonably possible. The output also includes information on the page setup like paper size and margins.
The CSS2 file written may for example be referenced by a file created by the XML Exporter module.
Selector syntax
Lets you choose which CSS selector syntax should be used:
css1: writes selectors using the ‘class’ attribute shorthand: .classname { ... }
css2: writes selectors according to CSS2 selector syntax rules: *[class=classname] { ... }
all: writes both ways of expressing the selector so that tools understanding either can pick the one they understand; the shorthand is written first, followed by the full CSS2 selector.
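For a hypothetical style named chapter-title, the three settings would produce rules along these lines:

```css
/* css1: class shorthand */
.chapter-title { font-weight: bold }

/* css2: full CSS2 attribute selector */
*[class=chapter-title] { font-weight: bold }

/* all: both forms, shorthand first */
.chapter-title { font-weight: bold }
*[class=chapter-title] { font-weight: bold }
```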
Name | SelectorSyntax
Java symbol |
Type | String
Value | css1 | css2 | all
upCast DTD elements namespace prefix
Specify the namespace prefix used for the upCast DTD elements in the final XML file that will reference the CSS file generated by this module.
The default is the empty string, i.e. no namespace prefix is used.
Setting this parameter is necessary until widespread support for the CSS Namespaces Module is available. Until then, elements are matched by their qualified name, including the namespace prefix plus separating colon (if present). To generate the qualified element name, the module must be told which namespace prefixes to use.
Name | UpcastDTDNamespacePrefix
Java symbol |
Type | String
Value | prefix for elements in upCast DTD
HTML4 DTD elements namespace prefix
Specify the namespace prefix used for the HTML4 elements in the final XML file that will reference the CSS file generated by this module. HTML elements are used e.g. for tables (if you opted for the HTML table model).
The default is html.
Name | HTML4DTDNamespacePrefix
Java symbol |
Type | String
Value | the desired namespace prefix
Output file
Specify where the CSS file should be written. You can use all upCast variables for dynamically creating the full path to the file.
Name | DestinationFile
Java symbol |
Type | String
Value | absolute path to desired result file
Output encoding
Lets you specify the name of a supported output file encoding, e.g. UTF-8 or iso-8859-1. This encoding is also declared in the @charset rule at the very beginning of the CSS file.
Name | OutputEncoding
Java symbol |
Type | String
Value | Java encoding name
This module requires an appropriate RTF Exporter feature included in your license to be fully functional.
The RTF Exporter was formerly a separate product called "downCast". This module is a much improved version of downCast 1.x, especially with respect to performance (up to 300% faster).
This module converts XML documents to Word or, more precisely, RTF documents. For specifying the layout, the module relies on a subset of Cascading Style Sheets, level 2 (CSS2) properties, amended by several proprietary properties where needed. Input XML documents must either be valid against the upCast DTD (note that this is different from the upCast internal DTD!), or they can be any arbitrary XML language for which a transformation into the upCast DTD can (and needs to) be created.
For more details on supported CSS and custom properties and their semantics, see the separate RTF Exporter documentation.
Source File
Specify the XML file that should be converted to RTF. You can use all upCast variables for dynamically creating the full path to the file. This must be an XML file conforming to the upCast DTD or – in experimental status – XSL-FO.
Name | SourceFile
Java symbol |
Type | Object
Value | absolute file path in URL or local file system convention
Destination file
Specify where the RTF result should be written to. You can use all upCast variables for dynamically creating the full path to the file.
When running on Windows with WordLink installed and functional, you can have the module automatically convert the generated RTF file into a binary Word file by specifying the destination file extension as .doc.
Name | DestinationFile
Java symbol |
Type | String
Value | absolute path to desired result file
Source format
Specify the format the source file is in, either upCast DTD or XSL-FO.
Name | SourceFormat
Java symbol |
Type | String
Value | upcast | xslfo
Output resolution
When the RTF Exporter must include images that do not specify their resolution explicitly in the file, it uses the value you specify here to calculate the image size and the resulting scaling factor to apply in the RTF output.
The default value is 96 dpi.
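The underlying calculation can be pictured as follows. This is a sketch of the general pixels-to-points conversion, not the exporter's actual code:

```python
def image_size_pt(width_px, height_px, dpi=96.0):
    """Convert pixel dimensions to points (1 pt = 1/72 inch) at the given
    resolution. Images without an explicit resolution are assumed to have
    the configured Output resolution (default 96 dpi)."""
    return width_px / dpi * 72.0, height_px / dpi * 72.0

# A 960 x 480 px image at the default 96 dpi comes out as 720 x 360 pt.
```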
Name | OutputResolution
Java symbol |
Type | Double
Value | 1..9999
Specifies what the RTF Exporter should do when it encounters images in a format that it cannot handle or that is not supported in RTF, or when an image file to embed into a document is missing:
discard: the image is completely removed from the result
filename: error text indicating the file name of the missing image is embedded into the output document, prominently visible to the user
filepath: error text indicating the full, absolute path and file name of the missing image is embedded into the output document, prominently visible to the user
details: error text indicating the full, absolute path and file name of the missing or unsupported image, including further error details, is embedded into the output document, prominently visible to the user
image: a generic replacement image is embedded into the final result document, scaled to the originally requested image size so that it does not break the layout of the document
Name | ImageErrorHandling
Java symbol |
Type | String
Value | discard | filename | filepath | details | image
issue runtime error
When checked, missing or unsupported images will cause a runtime error. When deselected, only a warning will be generated. The message id, however, will be the same in both cases.
Name | ImageErrorSignalling
Java symbol |
Type | String
Value | error | warning
User stylesheet
Here, you can specify a CSS stylesheet to use for the conversion instead of the stylesheet (possibly) specified in the XML source. You can use all upCast variables for dynamically creating the full path to that file.
Name | UserStylesheet
Java symbol |
Type | String
Value | path to user stylesheet
Whitespace handler class
For experts only!
The RTF Exporter makes use of special code to handle whitespace characters in the input stream. This field lets you set a custom whitespace handler if this is required. A whitespace handler must be a Java class that implements the WhitespaceHandler interface. If you think you need to implement your own whitespace handler, please contact us directly at <support@infinity-loop.de> in advance.
The default value is ‘*’ (asterisk), which lets the implementation decide on the most appropriate whitespace handler for the input document; it should not be changed for normal use.
The module provides three whitespace handlers for different situations. You request their explicit use by specifying their fully qualified class name in the Whitespace handler class input field. Except for the NoopWhiteSpaceHandler, all are more or less experimental and we do not guarantee their correctness or usefulness.
NoopWhiteSpaceHandler: the default handler for input documents valid according to the upCast DTD. All whitespace is significant in mixed-content elements. It is automatically used when you specify ‘*’ and the Source format is upCast DTD.
XSLFOWhiteSpaceHandler: a whitespace-minimizing handler, minimizing whitespace in mixed-content elements. It is automatically used when you specify ‘*’ and the Source format is XSL-FO. It tries to mimic the behavior required by XSL-FO when minimizing whitespace before and around inline elements. Whitespace is collapsed to the left.
The third handler behaves exactly the same as the XSLFOWhiteSpaceHandler, except that it respects the setting of the white-space CSS3 shorthand property, resp. its all-space-treatment component when resolved to its constituent properties. When this has the value preserve, whitespace is preserved in that element unless overridden in a child element. When it is collapse (the default), the handler behaves as described above. Note that you should explicitly specify the desired behavior on the immediate parent element of (possibly) mixed content.
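The preserve/collapse distinction can be modeled on a plain string. This is purely illustrative; the real handlers operate on the parsed document, and the exact XSL-FO edge cases around inline elements are more involved:

```python
import re

def treat_spaces(text, treatment="collapse"):
    """Illustrative model of the all-space-treatment component described
    above: 'preserve' leaves the text untouched, 'collapse' folds every
    run of whitespace into a single space."""
    if treatment == "preserve":
        return text
    return re.sub(r"\s+", " ", text)
```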
ID rendering mode
For elements having an id attribute of type ID, you can specify if and how this information should be translated into RTF bookmarks:
ignore: the ID information is not used and no bookmarks are created in the resulting RTF based on an id attribute.
before: a bookmark with the id’s value is created just before the start of the element’s contents.
after: a bookmark with the id’s value is created immediately after the full contents of the element have been written to the RTF file.
surround: a bookmark with the id’s value is created that starts just before the start of the element’s contents and ends just after the full contents of the element have been written to RTF, i.e. the bookmark spans the contents of the element.
Name | IDRenderMode
Java symbol |
Type | String
Value | surround | ignore | before | after
Style name output format
Determines how style names are written to the RTF stylesheet. With unicode, Unicode characters are generally used to express characters such as umlauts; with normal (use document encoding), the document encoding is used wherever possible.
Name | StyleNameFormat
Java symbol |
Type | String
Value | unicode | normal
Table ‘frame’ attribute overrides cell border definitions
When checked, the frame attribute on table elements overrides any settings of cell borders that border on the outermost surrounding table border.
When not checked, a cell’s border CSS definition takes highest precedence in rendering.
Name | FrameOverridesCells
Java symbol |
Type | String
Value | true | false
This module lets you execute another, external pipeline document as a sub-pipeline within the current pipeline execution.
It is not possible to provide the external pipeline in the form of a Java Stream object; it must be an external file residing in the file system.
The result value of this module is the result value of the executed sub-pipeline.
Source File
The path to the external pipeline document (.ucdoc) to include in the current pipeline.
Name | SourceFile
Java symbol |
Type | Object
Value | absolute file path in URL or local file system convention
Pipeline variables
This lets you choose how pipeline variables for the included pipeline should be created:
separate: the included pipeline gets its own, initially empty set of pipeline parameters. Think of this as running that pipeline as a completely independent pipeline.
copy: creates a copy of the current pipeline variables and passes it on to the included pipeline. This lets you pass all the current pipeline variables to the included pipeline; when the included pipeline modifies any variables, this only affects itself, not the calling pipeline. This way, it is possible to provide values (like "parameters") to the included pipeline. When execution of the included pipeline finishes, the pipeline variables of the calling pipeline will be in exactly the same state as before running the included pipeline. Effectively, the included pipeline cannot have any side effects on the caller’s set of variables.
share: in this mode, the included pipeline uses the same instance of the pipeline variables as the caller. This means that the included pipeline receives and can modify the pipeline variables of the including pipeline. This way, it is possible to provide values (like "parameters") to the included pipeline, and have the included pipeline "return values" by setting them in the pipeline variables.
The only exception to this rule is the pipeline:base variable, which is not inherited but set according to the included pipeline’s location on disk so that relative references therein are resolved properly. After the sub-pipeline’s execution, the original value is restored for the pipeline:base variable before continuing in the calling pipeline.
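The three modes correspond to the usual ways of handing a map of variables to a callee. Here is a minimal model with pipeline variables as a plain dict; everything in it is illustrative, not upCast's API:

```python
def run_sub_pipeline(caller_vars, mode):
    """Model of the 'separate', 'copy' and 'share' variable pool modes."""
    if mode == "separate":
        sub_vars = {}                  # fresh, initially empty pool
    elif mode == "copy":
        sub_vars = dict(caller_vars)   # snapshot: writes stay local to the callee
    elif mode == "share":
        sub_vars = caller_vars         # same instance: writes are visible to caller
    else:
        raise ValueError(mode)
    sub_vars["result"] = "done"        # stand-in for whatever the sub-pipeline sets
    return sub_vars

caller = {"input": "a.xml"}
run_sub_pipeline(caller, "copy")       # caller still has no "result" afterwards
run_sub_pipeline(caller, "share")      # now caller contains "result"
```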
To have more control over how each parameter behaves in sub-pipeline execution environments, there is a specific property for specifying the initialization behaviour. For each parameter, you can specify the initialize-when property, with the values never, always, unset or an arbitrary <string-value>. The default value is unset. Here’s an outline of what happens with respect to pipeline parameters during a sub-pipeline call in all of the three cases above:
The pipeline variables pool of the sub-pipeline to be called is initialized or created according to the above parameter.
The following pipeline variables are set to their appropriate values depending on the storage location of the sub-pipeline: base, PipelineBase, ParamBase, PipelineURI, ParamURI, PipelineInstanceId.
Any sub-pipeline parameters specified in the Parameters field are written to the sub-pipeline's variable realm.
Finally, for each parameter defined in the sub-pipeline:
If the parameter’s initialize-when value is unset (or the property is not defined) and the pipeline variable pool does not already contain a variable by that name:
If the parameter is a persistent parameter, a new variable is created in the pipeline variables with that parameter’s current value as stored in the sub-pipeline document as its value.
Otherwise, if the parameter is not a persistent parameter but has a default value defined, a new pipeline variable is created with that parameter’s default value as its value.
Otherwise, if it is neither a persistent parameter nor has a default value, that pipeline variable is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.
If the parameter’s initialize-when value is always:
If the parameter is a persistent parameter, a new variable is created (or a possibly existing variable overwritten) in the pipeline variables with that parameter’s current value as stored in the sub-pipeline document as its value.
Otherwise, if the parameter is not a persistent parameter but has a default value defined, a new pipeline variable is created (or a possibly existing variable overwritten) with that parameter’s default value as its value.
Otherwise, if it is neither a persistent parameter nor has a default value, that pipeline variable is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.
If the parameter’s initialize-when value is never, no further actions are taken.
If the parameter’s initialize-when value is a string value, and either the pipeline variables do not already contain a variable by that name or an existing variable by that name has the same string value as the one specified for initialize-when:
If the parameter is a persistent parameter, a new variable is created (or a possibly existing variable overwritten) in the pipeline variables with that parameter’s current value as stored in the sub-pipeline document as its value.
Otherwise, if the parameter is not a persistent parameter but has a default value defined, a new pipeline variable is created (or a possibly existing variable overwritten) with that parameter’s default value as its value.
Otherwise, if it is neither a persistent parameter nor has a default value, that pipeline variable is left undefined (and will probably cause execution errors later if the pipeline tries to access that value). This condition is considered a programming error.
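The four cases above can be condensed into one decision procedure. The following sketch represents a parameter as a plain dict; the keys and representation are hypothetical, but the rules are the ones just described:

```python
def initialize_param(pool, name, param):
    """Apply the initialize-when rules to one sub-pipeline parameter.
    `param` may carry: 'initialize-when' ('never' | 'always' | 'unset' |
    some string), 'persistent' (bool), 'value' (stored value), 'default'."""
    when = param.get("initialize-when", "unset")
    if when == "never":
        return                                   # never touch the pool
    if when == "unset":
        if name in pool:                         # only initialize when absent
            return
    elif when != "always":
        # arbitrary string: only initialize when the variable is absent
        # or currently equal to that string
        if name in pool and pool[name] != when:
            return
    # 'always' falls through unconditionally
    if param.get("persistent"):
        pool[name] = param["value"]              # stored current value wins
    elif "default" in param:
        pool[name] = param["default"]            # fall back to the default
    # otherwise: left undefined (a programming error per the rules above)

pool = {"copyright": ""}
initialize_param(pool, "copyright",
                 {"default": "(c) 2008 My Company", "initialize-when": ""})
# the empty string matched, so the default value replaced it
```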
Example 8.9.
By creating a parameter definition
copyright { type: text; label: "Copyright notice:"; }
in the main pipeline, which calls a sub-pipeline with the definition
copyright { type: text; label: "Copyright notice:"; default: "(c) 2008 My Company"; initialize-when: ""; }
the sub-pipeline will have the copyright pipeline variable set to the "(c) 2008 My Company" default value if the user did not provide a value in the Copyright notice text field in the main pipeline. This lets you implement a default or fallback value mechanism for values that are used in a sub-pipeline if the value has not been set (or, to be exact: has been set to the empty string "") in the calling pipeline.
Name | PipelineRealmMode
Java symbol |
Type | String
Value | separate | copy | share
Only run modules with ‘exported’ status
When checked, only modules that have the status exported set will be executed.
You can use this feature like this:
Develop your sub-pipeline on its own. For testing and debugging purposes, you will probably want to provide initial values (using an instance of the Pipeline Variables module) and debugging output within the pipeline using additional instances of the XML (Raw) Exporter modules. Now simply remove the exported status on these modules in the sub-pipeline and check the above option in the importing pipeline.
Effectively, this ensures that all debugging and setup code runs only when you run the sub-pipeline on its own (e.g. during development and isolated debugging), but not when the pipeline is included in another pipeline. No module activation/deactivation gymnastics to keep track of; everything happens automatically once set up as described – pretty neat, isn’t it?
Name | OnlyRunExportedModules
Java symbol |
Type | Bool
Value | true | false
Sub-pipeline Parameters
Parameters
Lets you specify parameters to be passed to the called sub-pipeline. This is especially useful when calling the sub-pipeline in Use independent variables in sub-pipeline or Copy variables to sub-pipeline mode. The parameters defined here are explicitly set in the pipeline realm of the sub-pipeline’s variables to the values specified here, before any modules of the sub-pipeline run. Using this mechanism, it is possible to pass certain variable values to the sub-pipeline without having to share the pipeline variable pool with the calling pipeline. Note, however, that resulting variable values cannot be passed from a sub-pipeline back to the calling pipeline.
A parameter definition must follow this syntax:
paramname ‘:=’ ‘"’ value ‘"’;
Quotes within the parameter value must themselves be quoted using the backslash character ‘\’.
You may use upCast’s variable system for constructing parameter values.
The text in this field and any contained variable references are resolved as follows:
Any references to the include realm are resolved.
The individual assignments are parsed according to the above syntax.
Only now are any further references to realms other than include resolved, separately for the value part of each parameter assignment.
This algorithm covers the usual cases where you might want to include constant parameter assignment code shared by several External Pipeline Processor modules in a pipeline using an include variable reference, but also reference variable values without having to worry whether their actual value breaks the value assignment syntax described above.
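A rough sketch of a parser for this assignment syntax follows. Only the grammar shown above is implemented; details such as whitespace handling and the characters allowed in parameter names are assumptions:

```python
import re

# Matches: paramname := "value";  with backslash-escaped characters in the value.
_ASSIGNMENT = re.compile(r'\s*([\w.-]+)\s*:=\s*"((?:[^"\\]|\\.)*)"\s*;')

def parse_assignments(text):
    """Parse 'name := "value";' assignments into a dict; backslash-escaped
    characters inside the value (e.g. \\") are unescaped."""
    result = {}
    for name, raw in _ASSIGNMENT.findall(text):
        result[name] = re.sub(r'\\(.)', r'\1', raw)
    return result

parse_assignments('infile := "a.xml"; title := "say \\"hi\\"";')
```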
Name | PipelineVariables
Java symbol |
Type | String
Value | (string in same syntax as in corresponding UI field)
When working with pipelines, especially parameterized ones, it is often convenient to have different sets of parameter settings at hand to run the pipeline with. For example, when you are converting documents in the DocBook DTD to your own, you may want to set different header info depending on whether it is a technical article or a medical article. The conversion itself, however, is the same for both. In this case, you would set up the actual conversion pipeline only once (with the benefit that both document types automatically see any improvements to that pipeline), but have different sets of parameters for the text in the document header area.
So what you would do is create two parameter sets for that single pipeline document and store them in documents separate from the actual implementation logic. Depending on the document you need to convert, you would just load the respective parameter set document and start the conversion, with all the parameters for that particular document type already set up correctly, and with you only having to specify the input file.
Well – this is what parameter set documents are for! They separate actual parameter value storage from the pipeline implementation (where parameters are normally stored as part of the pipeline document).
Parameter sets contain essentially only two types of data:
the current value of all Simple View parameters that have the persistent flag set to true
the Pipeline UID they are based on and refer to
That’s all. Parameter sets, in particular, do not contain any application or pipeline logic.
A parameter set is always derived from a single, specific pipeline document. The link to its implementing pipeline is established by way of the pipeline’s UID.
When loading a parameter set, what actually happens is that the pipeline it is based on is loaded automatically, and then the parameter values stored in the parameter set document are automatically set on that pipeline. To the user, it looks as if they had just opened a pipeline document in Simple View mode and then set the parameter values as stored in the parameter set. The only difference is that the user cannot actually edit the pipeline implementation, or in other words: cannot switch to edit mode. The second visual difference is that parameter sets open in a window with a blue background, whereas real pipelines display with the default system background color.
When parameter values in a parameter set are edited, they can be saved back to the parameter set using the usual commands in the File menu: Save (to save in the same file, overwriting the old values) or Save As… to save the parameter set in a new file. You can also copy parameter set files on the operating system level and/or rename them.
Now, how do parameter sets find their pipeline implementation file when opened? This is done by (re-)purposing the well-known and established Catalog system. The only difference is that you do not resolve the PUBLIC identifier of some DTD or entity to an absolute file path, but the pipeline UID, which you can think of as the PUBLIC identifier of the pipeline document. This mechanism allows you to configure your system easily so that these Pipeline UIDs can be resolved to the actual, single implementation file from literally anywhere on your network: just set up a single catalog for all your pipeline implementations and have your users add that file to upCast’s catalog system in the upCast Preferences, Catalog tab.
Example 9.1. Pipeline UID catalog example
A catalog might contain the following entries:
PUBLIC "d3546614-fb0e-4739-bfea-1f74280d9761" "file:///upcast/pipelines/docbook.ucdoc"
PUBLIC "ACME-XHTML-conversion-pipelineV1.1" "file:///upcast/pipelines/acme2html.ucdoc"
When you add this file to upCast’s catalog system, you can open a parameter set from anywhere on your local disk or even the LAN and have it automatically load and run the pipeline document it depends on.
In the first line, the Pipeline UID has been auto-generated by upCast and is using a standard UUID.
In the second line, a speaking UID has been chosen by the pipeline author, who of course must ensure that this ID will never be used in any other pipeline a potential user might want to run via a parameter set file.
The first parameter set for a certain pipeline must be created by opening the pipeline in upCast and then choosing File > Save to Parameter Set…. You will be prompted for a file name to save the parameter set to. Parameter set files always have the extension ucpar (short for upCast parameter set). The pipeline document will be closed and the new parameter set file will be opened in its place.
From there on, you can create additional instances either by repeating the above, or simply by saving copies of an open parameter set.
Note that only the values of parameters that have their persistent property set to true in the implementing pipeline document are saved. The decision on this property is up to the pipeline author. You will see all parameters defined in the original pipeline when opening a parameter set; the remaining values will either be empty or filled with the default values the pipeline author has specified for those parameters.
Even when loading a parameter set, be aware that the pipeline variable reference ${pipeline:base} will resolve to the folder where the implementing pipeline document is located, not where the parameter set document lives.
If you want to specify e.g. file path parameters relative to the location of the parameter set, you can use the new variable ${pipeline:ParamBase}, which is automatically created and holds the absolute path of the folder in which the respective parameter set resides on disk.
Even for pipeline documents, ${pipeline:ParamBase} is always defined. In that case, it has the same value as ${pipeline:base}.
In this case, only the parameters that still have their counterpart will be loaded from the parameter set; the remaining parameters will be automatically updated to the new parameter configuration on a best-effort basis. Incompatible parameter values will be discarded.
When the changes do not affect the configuration of parameters, the pipeline implementation will be re-loaded automatically once you click the Run button. This only works reliably when your file system delivers correct last-modified date information for files.
When changes also affect the configuration (number, type, text, defaults etc.) of pipeline parameters, the parameter set will detect this when re-loading the pipeline implementation and instruct you to close, then re-open the parameter set to have it pick up the changes.
Assuming you updated the respective catalog entry, the parameter set will no longer be able to resolve its id to the required pipeline implementation and therefore cannot be used any longer.
Also, when the pipeline document a catalog UID lookup resolves to does not actually match the requested UID, an error dialog will be shown and the parameter set cannot be used.
In this case, the system will additionally try to load the pipeline implementation from the system path stored in the parameter set. This holds the absolute path of the pipeline document at the time the File > Save to Parameter Set… command was run. When this file still exists and it is a pipeline document with the requested Pipeline UID, that pipeline implementation is loaded. Otherwise, an error is issued and the parameter set cannot be opened.
Basically, the action of making consecutive sibling nodes, selected based on certain conditions, children of a newly created surrounding element is called grouping. These conditions are exposed to you by way of the unique painter concept.
To understand the painter concept, you first of all need to be fully aware of the following, most important fact: Grouping is always performed on a flat, linear list of nodes. Huh? I thought we’re working on a document tree? Though this is of course true, grouping only occurs among sibling nodes, i.e. all direct children nodes of an individual element. Any element’s direct children can be expressed by an ordered, flat list. Of course, we recursively group on a child’s list of children, but this is a completely independent grouping operation. So again, a single, independent grouping operation is always performed on a flat, ordered list of nodes.
Now, for the following let’s think of nodes being white bricks placed in an ordered row on the floor. These bricks can be painted with one (or even several – think: spotty!) colors. The color indicates the element by which the bricks should be grouped.
The grouper does one very simple thing: It wraps all adjacent, likewise colored nodes in a parent element (think of this being some kind of bag) that has the same name as the color of the nodes it wraps.
So the essential part to be done beforehand is to color the nodes in the desired way. This is a two-step process: First, you need to check the role of each node as far as grouping is concerned and assign it that role by placing a painter on it that knows how to go about painting for this specific role. Second, the painting is actually performed.
In this first step, consider yourself a paint-shop owner, making a work-plan for your painter employees. Equipped with a packet of self-adhesive post-it notes and a pencil, you start figuring out the work to be done at the first node in the list of sibling nodes. For now, you are just interested in determining which nodes should be collected into groups of the color green. You examine the node you are on. For example, you may look at some of its attributes or layout properties, or perform a more complex examination which may include evaluating a boolean XPath expression. After some pondering, you will come to a certain conclusion as to the role of the node you are currently standing on. This can be one of the following:
You know that this node will always start a group of the color you are currently considering (i.e. green). Therefore, you write "start green" on one of your post-its and tack that to the node.
You know that this node will always end (and therefore be the last one in) a group of the color you are currently considering (i.e. green). Therefore, you write "end green" on one of your post-its and tack that to the node.
Now it is time to think of which of your painter employees is best suited for the painting job. For this you have to evaluate the constellations that may happen in your document regarding the nodes that should be grouped.
For example, you may know that unless you find both a node starting a group and a node ending the group, the grouping should not occur. In other words, the known start and end nodes (i.e. nodes that fulfill the requirements for being tagged as such) are required for a grouping to happen.
Other situations could be as follows: group from a start node to the next start node, group from an end node to the next end node, group adjacent likewise colored nodes, etc. For each of these situations, you have dedicated painters. To have them do their work in the next step, you place them on nodes.
Suppose in our example, we require a start and end node for a grouping to happen, and we have just tagged the current node as a start node. We therefore choose a start-end painter and place it on the current node.
When we have done both, i.e. tagged the node (if possible) and placed a painter (if we could determine a suitable one), we move on to the next node in the ordered sequence and start over.
Finally, we’ll reach the last node in the sibling node sequence and will have tagged some nodes and/or placed painters on some of the nodes. Now, all preparation work is done and we can tell the painters to do their work, i.e. start painting.
Now, consider yourself a painter, with a bucket of color of a certain kind (the color-"name" corresponds to the element name that should be the grouping element later). In the previous step, you have been placed on some node in the sequence.
Depending on your kind, you try to paint from your location.
In our example, you are a start-end painter. This means from the place you are at, you look in direction of the start of the sequence and look for the nearest node that has been tagged with a "start green" label. (This may be the node you are standing on.) If you find such a node, you remember it. If you do not find such a node, you cannot fulfill your task (which is "Paint from start node to end node") and give up, not painting anything.
Next, you look into the direction of the end of the sequence and look for the nearest node tagged with an "end green" label. (This may, again, be the node you are standing on.) If you find that as well, you can fulfill your painting job and start painting all nodes from the start node you found to the end node you found (including both). Then, you are finished.
The above is repeated for all painters that have been placed on nodes in the current node sequence. After this has been finished, the complete sequence has been painted in such a way that the actual grouping can take place, based on the paint color information on each node and the start and end tagging.
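The tag-then-paint procedure described above can be modeled in a few lines of illustrative Python (this is not upCast code; the data representation and function name are ours). The sketch mirrors the basic search a start-end painter performs: look toward the start of the sequence for the nearest start-tagged node, look toward the end for the nearest end-tagged node, and paint that inclusive range. It deliberately ignores the additional restrictions that distinguish the strict painter variants from their wildcard counterparts.

```python
def start_end_paint(tags, painter_pos):
    """Simplified model of a start-end painter for one color.

    tags: list where each entry is a set of tag strings on that node,
          e.g. {"start"}, {"end"}, or set() for an untagged node.
    painter_pos: index of the node the painter was placed on.
    Returns the set of painted node indices (empty set = painter failed).
    """
    # Look toward the start of the sequence for the nearest start tag
    # (the painter's own node counts as well).
    start = next((i for i in range(painter_pos, -1, -1) if "start" in tags[i]), None)
    if start is None:
        return set()  # no start-tagged node found: the painter fails
    # Look toward the end of the sequence for the nearest end tag.
    end = next((i for i in range(painter_pos, len(tags)) if "end" in tags[i]), None)
    if end is None:
        return set()  # no end-tagged node found: the painter fails
    return set(range(start, end + 1))  # paint start..end inclusive

# Node #1 is tagged "start green", node #3 "end green"; painter on node #1.
tags = [set(), {"start"}, set(), {"end"}]
print(start_end_paint(tags, painter_pos=1))  # → {1, 2, 3}
```

A painter that finds no suitable start or end tag returns an empty result, which corresponds to the "give up, not painting anything" behavior described above.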
For each color, a node can either have no tag, be tagged as a start node, be tagged as an end node, or be tagged as both start and end node for that respective color.
These tags can currently be set using the UPL functions mark-start()
and mark-end()
.
The example in the introduction to the painter concept already mentioned the start-end painter type. Painters can be placed on a node using the UPL function set-painter()
.
Note that you can place an ordered list of painters for a single color on a node. The idea is to have fallback painters when the first one fails to paint because its requirements cannot be fulfilled (like e.g. for a start-end painter, when there’s either no start tag or end tag). In such a case, painting using the second-specified painter is tried. If that cannot paint as well due to unsatisfied requirements, the next painter is tried and so on until either a painter is able to paint, or the end of the list is reached, in which case no painting occurs.
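The fallback mechanism is easy to model. The sketch below is illustrative Python, not upCast code: each painter is represented as a function that either returns the painted node range or an empty result to signal that its requirements were not fulfilled, and the ordered list is tried until one succeeds.

```python
def paint_with_fallbacks(painters, tags, pos):
    """Try an ordered list of painter functions; the first that succeeds wins.

    Each painter takes (tags, pos) and returns a set of painted node
    indices; an empty set means its requirements were not fulfilled.
    """
    for painter in painters:
        painted = painter(tags, pos)
        if painted:        # this painter could fulfill its painting job
            return painted
    return set()           # no painter succeeded: nothing is painted

# Hypothetical painter implementations: "start_here" needs a preceding
# start-tagged node; "this" always paints just the painter's own node.
def start_here(tags, pos):
    start = next((i for i in range(pos, -1, -1) if "start" in tags[i]), None)
    return set(range(start, pos + 1)) if start is not None else set()

def this(tags, pos):
    return {pos}

tags = [set(), set(), set()]  # no start-tagged node anywhere
print(paint_with_fallbacks([start_here, this], tags, 2))  # → {2}
```

With a start tag present, the first painter in the list succeeds and the fallback is never consulted; without one, the `this` fallback still paints the painter's own node.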
In the examples below for each painter, we use the following symbols:
A description of all available painter types follows:
This painter will paint from the nearest start-tagged node of the node sequence (in direction to the start) to the nearest end-tagged node (in direction to the end), observing its own node.
There may be no end-tagged node between the painter and the nearest start-tagged node, nor a start-tagged node between the painter and the nearest end-tagged node. In both of these cases, the painter will fail. The "-" in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between start and end.
This is the same as start-end, but it allows differently tagged nodes between the nearest start- and end-tagged nodes (see last two examples). The "*" in the name symbolizes a wildcard sequence of tagged nodes between start and end.
The painter will fail if either there’s no start-tagged node earlier in the node list or no end-tagged node later in the node list.
This painter will paint from the nearest preceding start-tagged node up to the node it was placed on.
There may be no end-tagged node between the painter and the nearest start-tagged node. If this is the case, the painter will fail. The "-" in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between start and painter position.
The painter will also fail if there is no start-tagged node earlier in the node list.
This is the same as start-here, but it allows end-tagged nodes between it and the nearest preceding start-tagged node (see last two examples). The "*" in the name symbolizes a wildcard sequence of end-tagged nodes between start and painter node.
The painter will fail if there is no start-tagged node earlier in the node list.
This painter will paint from the node it is placed on up to the next end-tagged node.
There may be no start-tagged node between the painter and the next end-tagged node. If this is the case, the painter will fail. The "-" in the name symbolizes a direct, uninterrupted (by other tags) sequence of nodes between the painter position and the end node.
The painter will fail if there is no end-tagged node later in the node list.
This is the same as here-end, but it allows start-tagged nodes between it and the nearest following end-tagged node (see last example). The "*" in the name symbolizes a wildcard sequence of start-tagged nodes between painter node and next end-tagged node.
The painter will fail if there is no end-tagged node later in the node list.
This painter paints from the nearest preceding start-tagged node to the next start-tagged node (not including the latter).
It fails if there is an end-tagged node in-between. It also fails if there is no start-tagged node.
This is the same as the start-start painter, however end-tagged nodes between those marked with a start tag are allowed (see examples 7 and 8).
This painter paints from the nearest preceding end-tagged node to the next end-tagged node (not including the former).
It fails if there is a start-tagged node in-between. It also fails if there is no following end-tagged node.
This is the same as the end-end painter, however start-tagged nodes between those marked with an end tag are allowed (see examples 2 and 4).
Grouping is performed on the whole document tree in a bottom-up document order. It is performed individually for each element’s children. It is also performed in a defined color order that you can specify, i.e. colors are always processed in a defined order.
Grouping does take into account node start and end tags. This is necessary in order to support directly adjacent groups. If grouping were only based on contiguous coloring, adjacent groups would not be possible, since the grouper would not know where to split contiguously colored nodes into groups. Here, tags live up to their original roles: start tags always start a new group on the respective node, and end tags end the currently open group after that node.
The following sample graphic shows – for a single color – how grouping takes place in a specific painting/tagging situation:
Group 1 is delimited by the start tag of node #2.
Group 2 is delimited by the end tag of node #2.
Group 3 is delimited by the end tag of node #4.
Group 4 is delimited by the end tag of node #7.
Group 5 is delimited by the non-painted node #9.
Group 6 is delimited by the end of the node sequence.
When placing tags on nodes it is therefore important to always bear in mind that these tags will also govern the final grouping in situations where painted nodes are adjacent.
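How tags govern the splitting of contiguously painted nodes into groups can be sketched as follows (illustrative Python, not upCast code): walking the painted sequence, an unpainted node closes any open group, a start tag forces a new group to open at its node, and an end tag closes the open group after its node.

```python
def split_into_groups(painted, tags):
    """Split a node sequence into groups of one color.

    painted: list of booleans, True where the node carries the color.
    tags: list of sets per node, possibly containing "start" and/or "end".
    Returns a list of groups, each a list of node indices.
    """
    groups, current = [], []
    for i, is_painted in enumerate(painted):
        if not is_painted:           # unpainted node: close any open group
            if current:
                groups.append(current)
                current = []
            continue
        if "start" in tags[i] and current:
            groups.append(current)   # a start tag always starts a new group
            current = []
        current.append(i)
        if "end" in tags[i]:         # an end tag ends the group after its node
            groups.append(current)
            current = []
    if current:                      # the sequence end closes the last group
        groups.append(current)
    return groups

# Nodes #0..#3 are all painted; node #2 is start-tagged, so two directly
# adjacent groups result instead of one.
print(split_into_groups([True, True, True, True],
                        [set(), set(), {"start"}, set()]))  # → [[0, 1], [2, 3]]
```

Without the start tag on node #2, the four contiguously painted nodes would form a single group; this is exactly why adjacent groups require tags.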
Below are some examples that you may encounter in one form or another in your own grouping requirements:
Suppose you want to group adjacent paragraphs that are of class "Note", because you want to group them using a note
element.
The UPL code you should run before the grouper in the UPL processor should look like:
[element(uci:par) and @uci:class="Note"] { set-painter( note, {"this"} ); }
This will set a painter of color "note" and type this on all uci:par
elements that are of class "Note
". During painting, those nodes will be painted with the specified color, and during grouping all contiguously adjacent, likewise colored node groups will be grouped by an <uci:block uci:type="note">…</uci:block>
element.
Suppose you want to group nodes where you know exactly which conditions must be met by a node to start a group, but you don’t know the end. What you additionally do know is which kind of nodes are certainly part of the group (if they exist).
Let’s say we have the following XML fragment of sibling nodes:
➊ <p>Some text.</p>
➋ <p class="example-title">Example</p>
➌ <p class="example-text">Fruits are:</p>
➍ <list>
<item>apples</item>
<item>bananas</item>
</list>
➎ <p class="example-text">All these can be bought at Miller’s.</p>
➏ <p>As you have seen,…</p>
In this example, you know that paragraphs of class "example-text" always are part of an example, and that an example is always started by a paragraph of class "example-title". You do not know more, i.e. there may be arbitrary elements in-between like the list
element in the example.
A suitable UPL code to group the elements #2 to #5 could be:
[element(p) and @class="example-title"] {
mark-start( example );
set-painter( example, {"this"} ); /* optional, see below */
}
[element(p) and @class="example-text"] {
set-painter( example, {"start-here"} );
}
What does this do?
First, a start tag with color "example" is set on node #2, along with a painter that only colors itself. This is necessary when an example is allowed to only consist of an "example-title"-paragraph. If you require an example to at least have one "example-text"-paragraph to be a valid example, don’t use the line of code marked optional in the above.
Then, a painter of color "example" is placed on node #3 that paints from the nearest preceding start tagged node of color "example" up to itself. On the list element (#4), no painter or tag is set. On node #5, we again set a painter of color "example" that paints from the nearest preceding start tagged node of color "example" up to itself.
This happens during the run of the UPL program in the UPL Tree-Processor module.
Now, it’s the grouper’s turn, and it is about to perform the grouping for the color "example". As we have seen above, the first thing it does is apply the painting through the painters. The painters execute in document order, one after the other, so you get the following sequence of painting and – finally – grouping:
First, painter P1 does its node painting. It is a this painter and therefore only paints the node it was placed on. Next comes painter P2 of type start-here. Then finally, painter P3 starts painting. It is also of the start-here type and therefore paints from the nearest preceding start tag up to the node it was placed on. Finally, the grouping G is created and nodes #2 to #5 are wrapped by a <uci:block uci:type="example">…</uci:block>
element.
Note how the list
node #4 is painted by painter P3 even though it has neither been tagged nor has a painter been placed on it. Instead of the list
node, any number of nodes not known in advance could have been present between nodes #3 and #5, and they would have been automatically grouped into an "example". This is a very important fact to both keep in mind and utilize to your advantage, for example in documents that have no strict, dependable structure and where you must work with only a few known node constellations.
But what if…? Surely you have asked yourself, "But what if some badly authored document contains an ‘example-text’-paragraph without a preceding ‘example-title’-paragraph?" Here, the precise definition of the painter types comes into play.
Let’s assume node #2 is removed from the above example sequence. In this case, painter P2 would be the first painter to be executed. It is of type start-here, which fails if no suitable start-tagged node is found – which is the case here: there is no start-tagged node at or earlier in the node sequence. P2 fails, and a painter failing means it does not paint anything. The same is true for painter P3, with the effect that no node gets painted at all if node #2 (i.e., a start-tagged node) does not exist. Consequently, no grouping will occur.
Maybe that is not what you want. Maybe you want semantics like, "If a start-tagged node exists, then use that. If, however, it doesn’t, then at least make the individual ‘example-text’-paragraphs groups." This is where the painter fallback types come in handy. For the above, you’d need to change the UPL code as follows:
[element(p) and @class="example-title"] {
mark-start( example );
set-painter( example, {"this"} ); /* optional, see below */
}
[element(p) and @class="example-text"] {
set-painter( example, {"start-here", "this"} );
}
Note the added painter type this in the second rule. This has the effect that when the first painter type (start-here) fails, the next – this – is tried, which – as already described – only paints the node the painter was placed on. So if node #2 was missing in our example, with the new UPL code we’d make sure that at least the paragraphs of class "example-text" would get painted, and therefore grouped, either on their own as in our example or, if adjacent, as a whole.
More real-world examples will be posted as supplemental material on our website in the form of tutorials and how-tos in the coming weeks and months.
The use of XML namespaces is a core concept of upCast. Namespaces are essential to the processing pipeline, since they allow the clash-free co-existence of user-defined attributes and elements with upCast’s automatically generated elements and attributes. Clear separation of element and attribute domains allows targeted, semantically clear selection and filtering of the rich information present in the internal tree at serialization time.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-internal | uci |
All elements and attributes of the upCast Internal DTD are members of the http://www.infinity-loop.de/namespace/2006/upcast-internal namespace. The suggested namespace prefix is uci.
Besides the goal of avoiding name clashes, attributes are members of the upcast-internal namespace so that they can be put on any element in the internal tree, even if it is a non-upcast-internal element, and still be recognized easily as such.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-css | css |
The upcast-css namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-css has a recommended prefix of css. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-css namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.
The css namespace contains the current value of all properties at the context node that have been defined by either applying a class
to an element or a manual style override. It is assumed that all properties are inherited, and that manual overrides take precedence over class application when occurring on the same node.
The upcast-css namespace contains CSS styling properties mapped to an attribute representation. Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:
property name | virtualized attribute name |
-ilx- | css:ilx- |
| css: |
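The translation rule can be expressed in a few lines of illustrative Python (the function name is ours, not part of upCast): since an XML attribute local name must not start with "-", the leading hyphen of the vendor prefix "-ilx-" is dropped, while all other property names are taken over unchanged. The same rule applies to the csso and cssc namespaces described below, so the prefix is a parameter here.

```python
def virtualized_attribute_name(property_name, prefix="css"):
    """Map a CSS property name to its virtualized attribute name.

    Attribute local names must not start with "-", so the leading
    hyphen of the vendor prefix "-ilx-" is dropped; all other
    property names are taken over unchanged.
    """
    if property_name.startswith("-ilx-"):
        return f"{prefix}:{property_name[1:]}"  # "-ilx-foo" -> "css:ilx-foo"
    return f"{prefix}:{property_name}"

# "-ilx-foo" is a made-up vendor property name for illustration only.
print(virtualized_attribute_name("-ilx-foo"))     # → css:ilx-foo
print(virtualized_attribute_name("font-weight"))  # → css:font-weight
```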
The only time the virtual attributes in the upcast-css namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.
To export materialized attributes in the upcast-css namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-cssoverride | csso |
The upcast-cssoverride namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cssoverride has a recommended prefix of csso. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-cssoverride namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.
The upcast-cssoverride namespace contains CSS styling properties mapped to an attribute representation. It contains only properties that have been brought into the tree by applying a manual, explicit, anonymous style property override at a certain node, usually by way of a style
attribute with local style property settings. The properties available in the csso namespace on a specific context node consist of the union of all such properties having been applied either on the node itself or one of its ancestors in the described fashion, in order from document root to context node, unless they are identical in name and value with a property in the fully calculated cssc namespace on that node, in which case they are not added. (It is assumed that cssc properties are always inherited.)
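The rule for which override properties are visible in the csso namespace on a context node can be sketched as follows (illustrative Python, not upCast code): collect the manual overrides from the document root down to the context node, with nearer overrides winning on name collisions, then drop every property that is identical in name and value to a fully calculated cssc property on that node.

```python
def csso_properties(override_chain, cssc_props):
    """Compute the csso properties visible on a context node.

    override_chain: list of dicts of manual style overrides, in order
        from document root to context node (ancestors first).
    cssc_props: dict of the fully calculated cssc properties on the node.
    """
    visible = {}
    for overrides in override_chain:  # root-to-node order: nearer node wins
        visible.update(overrides)
    # Drop entries identical in name and value to a cssc property,
    # since cssc properties are assumed to be inherited anyway.
    return {name: value for name, value in visible.items()
            if cssc_props.get(name) != value}

chain = [{"font-weight": "bold"},                      # set on an ancestor
         {"color": "red", "font-weight": "normal"}]    # set on the node itself
print(csso_properties(chain, {"color": "red"}))  # → {'font-weight': 'normal'}
```

Here "color: red" is suppressed because the cssc namespace already carries it with the identical value, while the nearer "font-weight: normal" override shadows the ancestor's "bold".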
Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:
property name | virtualized attribute name |
-ilx- | csso:ilx- |
| csso: |
The only time the virtual attributes in the upcast-cssoverride namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.
To export materialized attributes in the upcast-cssoverride namespace using the supplied XML Exporter, turn on its Explode CSS style info into real attributes option.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-cssclass | cssc |
The upcast-cssclass namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cssclass has a recommended prefix of cssc. It is currently only used for attributes and does not contain any elements. It is essentially a virtual namespace, which means that elements of the internal tree pretend to have attributes in the upcast-cssclass namespace, when in fact these attributes are not actually materialized on the elements for efficiency reasons, but are generated on the fly when queried.
The upcast-cssclass namespace contains CSS styling properties mapped to an attribute representation. The cssc namespace contains only properties that have been brought into the tree by applying a named style class from an external stylesheet onto a node, usually by way of a style reference using the class
attribute. The properties available in the cssc namespace on a specific context node consist of the union of all such properties having been applied either on the node itself or one of its ancestors in the described fashion, in order from document root to context node. It is assumed that cssc properties are always inherited.
Since not all CSS property names used in upCast are also valid XML attribute local names, the following translation applies:
property name | virtualized attribute name |
-ilx- | cssc:ilx- |
| cssc: |
The only time the virtual attributes in the upcast-cssclass namespace are actually materialized is when an exporter wants to serialize the internal tree verbatim, e.g. for reference reasons.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/2006/upcast-cals | cals |
The upcast-cals namespace with the name http://www.infinity-loop.de/namespace/2006/upcast-cals has a recommended prefix of cals. It is used to differentiate attributes on tables from similarly or likewise named attributes and elements in other table models like HTML or the internal table model. You can therefore already decide at the top-level cals:table
element that you are dealing with a CALS table without having to infer this from the further descendant element structure.
namespace name | recommended prefix |
http://www.w3.org/HTML/1998/html4 | html |
The html namespace with the name http://www.w3.org/HTML/1998/html4 has a recommended prefix of html. It is used to differentiate attributes on tables from similarly or likewise named attributes and elements in other table models like CALS or the internal table model. You can therefore already decide at the top-level html:table
element that you are dealing with an HTML table without having to infer this from the further descendant element structure.
namespace name | recommended prefix |
http://www.w3.org/1999/xlink | xlink |
The XLink namespace with the name http://www.w3.org/1999/xlink has a recommended prefix of xlink. It is used to identify linking attributes on elements.
namespace name | recommended prefix |
http://www.w3.org/XML/1998/namespace | xml |
The XML namespace with the name http://www.w3.org/XML/1998/namespace has a recommended prefix of xml.
In UPL, you can refer to variables and values in a specific realm using that realm’s namespace. For each realm, there is a corresponding namespace.
For details on UPL variable references, consult the UPL specification.
For details on upCast variable realms, see here.
namespace name | recommended prefix |
http://www.infinity-loop.de/namespace/upl/utility-functions | util |
upCast comes with a library of several UPL utility function definitions. To keep those separate from your own function definitions, even if they may share the same local function name, all these functions are located in a specific namespace: http://www.infinity-loop.de/namespace/upl/utility-functions.
The library of UPL utility functions is currently not documented as it is still work in progress and therefore not stable enough to be used reliably in your own projects.
Some settings need to be made early in the startup process of upCast. So early, in fact, that they cannot be read by application-internal means but must already be set and available when upCast starts running. To set those values in cases where their default is not desirable, you can pass them to the JVM running the upCast application as Java system properties.
The following parameters are available; their defaults, which are sometimes calculated dynamically based on the system/OS the application is running on, are given as well:
de.infinityloop.exe.location (Windows only)
Default: ${application:BundledResources}/EXEs
Specifies the folder where upCast will look for supporting .exe
files (like il-gw.exe
, used for WordLink).
de.infinityloop.application.location
Default: (application installation root)
The folder where the application’s installation root lies.
de.infinityloop.application.preferencesdir
Default: (system dependent)
The folder where upCast will write its preferences file to.
de.infinityloop.application.logfile
Default: (system dependent)
The file where upCast will write its external logfile. Whether this value is actually used is dependent on the log subsystem chosen.
de.infinityloop.application.logsize
Default: 8388608 (bytes, i.e. 8 MB)
The maximum size for the application log file. When this size is reached, the log file is automatically cleared and filled with new log entries.
de.infinityloop.loglevel
Default: 3
Set to a number greater than or equal to 0 identifying the threshold below which messages will be output to the log subsystem. The currently used range is 0..7, with 7 being the highest debugging level, i.e. "output always". To get verbose logging, set this to a high value. To get reduced logging info, reduce the number.
The default value 3 corresponds to INFO
type messages (and above).
When running a pipeline using the de.infinityloop.upcast.RunPipeline
class, use the -debug
logfilterexp
option described there instead of the de.infinityloop.loglevel
property.
de.infinityloop.logfilterspec
Default: (empty)
This property can be used for specifying the log event filter used at the interface to the external logging system (usually a file or the console). Additionally, it is used in some selected places within upCast’s code base to prevent the time-consuming creation of complex log events already at their originating place.
The filter expression syntax is the same as described here.
Example 12.1.
-Dde.infinityloop.logfilterspec=+ERROR,+FATAL,+INFO
only passes messages of type ERROR, FATAL and INFO to the external logging subsystem, but not WARN messages.
At this time, the only supported message constant preventing log message generation already at its origin is CurrentRTFToken
.
upCast offers a convenient helper class for running pipeline documents from the commandline. It also allows you to pass parameter values to the pipeline if they have been defined in the Pipeline Settings > Pipeline Parameters tab.
The commandline pipeline document interpreter class reads the specified pipeline document and looks for all defined pipeline parameters in it:
If a parameter has the property required set to true
, a value for it must be specified on the commandline. If no value is specified, the execution is stopped and an appropriate error message is output to the console.
If a parameter does not have the required property set to true
, and if no value for it is specified in the commandline call, and if it has its default property specified, that specified value is set.
If a parameter does not have the required property set to true
, and if its default property has not been specified, and if no value for it is specified in the commandline call, then this parameter will not be set at all. Trying to retrieve the value of such a parameter during pipeline execution will result in an error to the effect that the requested parameter (i.e. pipeline variable) is undefined.
If a parameter is specified on the commandline that is not defined as a parameter in the pipeline, an error is issued to the console and execution is halted.
After these checks, the parameter values that are defined will be set as variables in the pipeline realm (similar to what happens when running the pipeline in Simple View mode), and then the modules will be executed in the order defined in the pipeline document.
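The resolution rules above can be sketched as follows (illustrative Python; the parameter-definition structure and function name are hypothetical and not upCast's actual implementation): unknown parameters and missing required parameters are errors, optional parameters fall back to their default if one exists, and otherwise stay unset.

```python
def resolve_parameters(definitions, cli_values):
    """Resolve commandline values against pipeline parameter definitions.

    definitions: dict mapping parameter name to a dict that may contain
        "required" (bool) and "default" keys.
    cli_values: dict of values passed on the commandline.
    Returns the variables to set in the pipeline realm, or raises.
    """
    unknown = set(cli_values) - set(definitions)
    if unknown:  # a parameter was passed that the pipeline does not define
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    resolved = {}
    for name, definition in definitions.items():
        if name in cli_values:
            resolved[name] = cli_values[name]
        elif definition.get("required"):
            raise ValueError(f"required parameter not specified: {name}")
        elif "default" in definition:
            resolved[name] = definition["default"]
        # else: the parameter is simply not set at all
    return resolved

defs = {"SourceFile": {"required": True},
        "Verbose": {"default": "false"}}
print(resolve_parameters(defs, {"SourceFile": "/in/in.rtf"}))
# → {'SourceFile': '/in/in.rtf', 'Verbose': 'false'}
```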
A pipeline document to be run by the commandline interface must be self-contained, i.e. it must explicitly specify
its license file
catalogs to be used
font configuration definitions or overrides
any custom encodings to be used
You should make sure that pipeline documents intended to be run via the commandline do not have their Use application settings checkbox checked on their Pipeline Settings > Catalogs, Pipeline Settings > Font configuration, Pipeline Settings > Encodings and Pipeline Settings > License tab.
Note that by default, upCast’s built-in templates have this checkbox checked!
For parameters of type popup
, the internal value (from the internal-values property list) must be passed in as the parameter value, not the displayed value.
java -classpath upcast.jar de.infinityloop.upcast.RunPipeline parameters...
with parameters
being:
absolute path to the pipeline document to be run
standard options
Standard options are as follows:
set the pipeline parameter name
to the value value
turn on debug output for the conversion with specified level of verbosity with N being a number between 0 (least verbose) and 7 (annoyingly verbose). Alternatively, you can specify a filter expression (as string) that follows the log filter expression syntax as defined here.
display upCast version information
show help on the defined parameters for the specified pipeline document
one or more (XML-) catalog files to set up as global upCast catalogs before further processing. Setting this option is essential when using parameter sets (*.ucpar
) that rely on resolving their PUBLIC identifier to find the corresponding pipeline implementation file.
override the license specified in the pipeline with this specified one (absolute URL path)
To view a current synopsis of the implementation of the commandline interface, issue
java -cp upcast.jar de.infinityloop.upcast.RunPipeline
The application jar (upcast.jar
) contains an embedded evaluation license.
Since the commandline has no knowledge of the system-dependent preferences file storage location, you always need to specify the license manually when calling the CLI (or Java API), unless you have specified the license to use explicitly in the pipeline settings (which is recommended for any pipeline intended to be run via the CLI or Java API).
The path to use for the embedded evaluation license is as follows:
jar:file:!/de/infinityloop/upcast/resources/licenses/upcast-eval.uclicense
When using the CLI, add the parameter
-license "jar:file:!/de/infinityloop/upcast/resources/licenses/upcast-eval.uclicense"
to your call.
The RunPipeline
class calls Java's System.exit()
function after running the pipeline. It returns a numeric exit code. This exit code can either be one of the upCast-reserved exit codes or an exit code the pipeline itself sets.
Exit codes in the range from 0 to 99 are reserved for upCast's own use.
Custom exit codes should be in the range from 100 to 250.
The following exit codes are currently used by upCast:
SUCCESS
: the pipeline was successfully processed
GENERALERROR
: some general error occurred during pipeline processing; for details, see the log file entries
PIPELINENOTFOUND
: the pipeline document to be run was either not found or cannot be read; check the path to the file and whether it is readable by the user that runs upCast
ARGUMENTERROR
: the number or types of arguments passed does not match the pipeline document's parameter definitions
To return a custom exit code, you must make sure that the pipeline variable ModuleResult of the top-level pipeline is set to a corresponding value. That value must be castable to an integer in the range from 0 to 255.
By default, ModuleResult contains the value last set by a module in execution order. You can override that value – if required – in the top-level pipeline's custom finalization function finalize()
by writing the value to the $pipeline:ModuleResult
variable explicitly.
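A caller that launches RunPipeline in a separate process can branch on the documented exit code ranges. The sketch below is illustrative Python (the helper name is ours, and the assumption that SUCCESS corresponds to the conventional value 0 is ours as well); it only encodes the ranges stated above: 0 to 99 reserved for upCast, 100 to 250 available for custom pipeline results.

```python
def classify_exit_code(code):
    """Classify a RunPipeline exit code by the documented ranges."""
    if not 0 <= code <= 255:
        raise ValueError("process exit codes are one byte (0-255)")
    if code == 0:
        return "success"          # assumed SUCCESS value (conventionally 0)
    if code <= 99:
        return "upcast-reserved"  # upCast's own error codes
    if code <= 250:
        return "custom"           # set by the pipeline via ModuleResult
    return "unassigned"           # 251-255: outside both documented ranges

print(classify_exit_code(0))    # → success
print(classify_exit_code(42))   # → upcast-reserved
print(classify_exit_code(120))  # → custom
```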
It is strongly recommended to use the detailed Java API (described in this chapter) only when your requirements do not let you use the static de.infinityloop.upcast.UpcastEngine.runPipeline() method. This is the case e.g. when you must dynamically select and parameterize the modules and their sequence of execution, or must react specifically in Java code on error conditions after each single module execution.
If you do not absolutely need these fine-grained control capabilities, which is usually the case when you can set up and run the pipeline you need using just the upCast GUI, please do not use the low-level API described in the following. Just use the de.infinityloop.upcast.UpcastEngine.runPipeline() method with the pipeline or parameter set file you developed in the GUI instead. This makes changes to the pipeline possible without the need for re-compilation and therefore makes maintenance so much easier!
Complete sample code (just a few lines of Java) ready for copy and paste into your Java project for each individual pipeline using de.infinityloop.upcast.UpcastEngine.runPipeline() can be obtained from the pipeline documentation in HTML format you get from File > Generate Documentation….
Accessing upCast functionality is carried out via one instance of a broker object: UpcastEngine
. You should create one instance of that object at startup and use it for many subsequent conversions, since creation of this object is rather expensive. There are no problems in reusing that object for subsequent conversions (in contrast to many XML parser implementations, for example); on the contrary, it is highly recommended from a performance point of view.
You may create several instances of the UpcastEngine
object in order to run multiple conversion threads at the same time in your single application. Please note that the maximum number of parallel threads may be restricted by your license.
We assume that you are familiar with Java programming and its concepts like objects, interfaces and implementations. You should also be fluent with Java's object model and with Java streams.
The javadoc API reference can be found here.
The general programming steps are as follows:
1. Instantiate a de.infinityloop.upcast.UpcastEngine object. You can think of this object as the interface to your pipeline.
2. Set the pipeline base URI using the setPipelineBaseURI() method.
3. Register that instance with an appropriate license file using its setLicense() method.
4. Set global pipeline parameters – like catalogs to use, overrides to the standard font configuration and custom encodings to use – via the appropriate instance methods.
5. Call the initializeConversion() method.
6. Set pipeline variables using the setPipelineVariable() method.
7. Choose a module class via the setModuleType() method, which then internally gets instantiated and becomes the current module.
8. Set module parameters using (possibly repeated calls to) the setModuleParameter() method.
9. Start the module execution by calling runModule().
10. (optional) Repeat from step 7 for subsequent modules in the desired pipeline.
11. Call the cleanupConversion() method.
12. (optional) Repeat from step 5 for converting another document.
Expressed in actual Java code, this might look something like this:
String moduleID = null;

UpcastEngine ucInst = new UpcastEngine( "instance one" );
ucInst.setPipelineBaseURI( "file:///path/to/basefolder/" );
ucInst.setLicense( "file:///path/to/upcast.uclicense" );

ucInst.setPipelineVariable( "DestinationFolder", "/test/out/" );
ucInst.setPipelineVariable( "ImageDestinationFolder", "/test/out/" );

ucInst.initializeConversion();

moduleID = ucInst.setModuleType( UpcastEngine.kRTFImporterType );
ucInst.setModuleParameter( moduleID, "OrigNumbering", Boolean.TRUE );
ucInst.setModuleParameter( moduleID, "SourceFile", "/test/in/in.rtf" );
ucInst.runModule( moduleID );

moduleID = ucInst.setModuleType( UpcastEngine.kXMLExporterType );
ucInst.setModuleParameter( moduleID, "DeleteEmpties", Boolean.FALSE );
ucInst.setModuleParameter( moduleID, "DestinationFile", "${pipeline:DestinationFolder}/out.xml" );
ucInst.runModule( moduleID );

ucInst.cleanupConversion();
To quickly construct a slightly more sophisticated Java source code template for a pipeline you have already built using the GUI, use the File > Export to Java source command. You can then modify this generated code to your liking, preferably by subclassing it and overriding methods where needed.
You gain access to all functionality of upCast by means of objects of a single class: UpcastEngine
. An instance of this object is what you will use in your application in order to access the full range of upCast API functionality.
Before you can do anything with upCast, you need to instantiate an UpcastEngine object:
UpcastEngine ucInst = new UpcastEngine( "instance one" );
The UpcastEngine class is to be found in the de.infinityloop.upcast
package.
You should keep this object stored in a variable which you can access from all places inside your program where you need to access upCast functionality.
For performance reasons, you should strive to have only one instance of the UpcastEngine object per physical CPU at any time. Also make sure you instantiate this object only once during the life of your application process, as instantiating and disposing of this object is a relatively costly operation.
In the GUI version of upCast, this property is set automatically for you, as there is a pipeline document that determines this value. In the API, however, there is no such document, so you must tell the upCast pipeline processor the value of this property. It serves as the basis for resolving any ${pipeline:base}
references you might have in module parameter values or pipeline setting values.
ucInst.setPipelineBaseURI( "file:///path/to/basefolder/" );
This should be called immediately after creating the UpcastEngine
object instance.
To use upCast in API mode, a license file is required that includes either or both of the rtfimporter-api and rtfexporter-api features. If in doubt, contact us at licensing@infinity-loop.de.
License features encoded in a *.uclicense
upCast license file can be reviewed by opening the license file in upCast (using File > Open... or by double-clicking the license file (Windows and Mac OS X only)).
To set the license, use:
ucInst.setLicense( "file:///path/to/upcast.uclicense" );
The application jar (upcast.jar
) contains an embedded evaluation license.
When running via the Java API, upCast has no knowledge about the system-dependent preferences file storage location. Therefore, you need to always specify the license explicitly when using the Java API.
The path to use for the embedded evaluation license is
jar:file:!/de/infinityloop/upcast/resources/licenses/upcast-eval.uclicense
When using the Java API, you'd then use this line of code to set the evaluation license:
ucInst.setLicense("jar:file:!/de/infinityloop/upcast/resources/licenses/upcast-eval.uclicense");
You can set pipeline properties directly on the UpcastEngine instance object. This includes amending or overriding the font configuration (setCustomFontConfiguration()), adding catalog files to be used by XML processing (addCatalog(), discardCatalogs()), and adding custom encodings to the set of built-in ones (addCustomEncoding()). These settings remain valid as long as the UpcastEngine instance lives or until you explicitly clear or set them to different values.
(Do not confuse these settings with the setting of pipeline variables; see below.)
Whereas in the GUI, you build a static pipeline by choosing a specific sequence of modules, the API handles a pipeline differently. In fact, there is no concept of a pre-built pipeline setup to be run; instead, you run modules one at a time. This has the great advantage that you can dynamically and programmatically build the actual pipeline for each single conversion, e.g. based on results of a preceding module execution on that input source.
ucInst.initializeConversion();

/* ... your pipeline code goes here ... */

ucInst.cleanupConversion();
Since upCast has to do some housekeeping for each conceptual pipeline run (independent of the actual number and sequence of modules run within), you need to tell it when you conceptually start a pipeline for a specific input file, and when you are done with it, i.e. when you have run the last module for this specific input file. This is done by the initializeConversion() and cleanupConversion() methods.
For example, initializeConversion() clears the pipeline variable realm so that subsequent pipeline runs do not see values set by a previous run. And cleanupConversion() makes sure any temporary files created by some module get properly deleted when they are no longer needed.
It is very important that you obey this pipeline bracketing rule at all times, as strange, non-deterministic behaviour may occur otherwise.
As in the GUI (by way of the Pipeline Variables module), you can set variables in the pipeline realm to be used by modules run subsequently. The method to use is setPipelineVariable()
, e.g.:
ucInst.setPipelineVariable( "DestinationFolder", "/test/out/" );
The pipeline variable realm is cleared by a call to initializeConversion(). You must therefore explicitly (re-)set pipeline variables at the beginning of each new conversion pipeline execution for a document.
Each module to be run has to be set up individually. This is done in three general steps:
1. Choose and set the module class to use.
2. Set module parameters.
3. Run the module.
First, you choose from one of the available module classes and set that using the setModuleType()
method:
moduleID = ucInst.setModuleType( UpcastEngine.kRTFImporterType );
This will create a new instance of this module type and set it as the current module.
The constants to be used (in the UpcastEngine class) for the available module types are:
Module Type | Java constant name
Pipeline Variables | 
RTF Importer ("upCast") | kRTFImporterType
UPL Processor | 
UPL Tree Processor | 
Sectioner | 
XML Exporter | kXMLExporterType
Commandline Processor | 
XSLT Processor | 
Unicode Translation Processor | 
XML Validator | 
CSS Exporter | 
RTF Exporter ("downCast") | 
XML Importer | 
External Pipeline Processor | 
Module parameters will be set to defaults. The call will return a handle (a String) to that module which you must pass to subsequent setModuleParameter()
calls:
ucInst.setModuleParameter( moduleID, "SourceFile", "/test/in/in.rtf" );
The default parameter settings of modules are not documented. Though usually reasonable, they may change from release to release. We therefore highly recommend setting all parameters of a module explicitly to the desired values so that your code does not break at an upCast update.
Like in the GUI version of upCast, you can use variable references in the parameter values which will be resolved by upCast automatically.
Example 14.1.
To specify the source file relative to the pipeline base directory (whatever value it is currently set to), use a line like
ucInst.setModuleParameter( moduleID, "SourceFile", "${pipeline:base}/source/in.rtf" );
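To illustrate how such ${pipeline:...} references behave, the following self-contained sketch mimics the expansion of pipeline variable references inside a parameter value. It is purely illustrative and not upCast's actual resolver; the class and method names are our own inventions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableExpansionSketch {

    private static final Pattern REF = Pattern.compile("\\$\\{pipeline:([^}]+)\\}");

    // Expands ${pipeline:NAME} references using the given variable map.
    // Unknown references are left untouched (illustrative behaviour only).
    static String expand(String value, Map<String, String> vars) {
        Matcher m = REF.matcher(value);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String replacement = vars.containsKey(name) ? vars.get(name) : m.group();
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        vars.put("base", "file:///path/to/basefolder");
        // prints file:///path/to/basefolder/source/in.rtf
        System.out.println(expand("${pipeline:base}/source/in.rtf", vars));
    }
}
```

The same substitution principle applies to all variable references in module parameter values and pipeline settings.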
Parameter names for each module are given in the description of the individual modules earlier in this manual. The parameter value has to be passed as a Java Object
. The required object class depends on the specific parameter and is documented for each available parameter.
If you set a parameter more than once, the last value set will be used.
To set several parameters, you need to repeatedly call the method setModuleParameter()
.
If you try to set a parameter that is not supported by the current module, the parameter simply will have no effect, but no error is reported. To track which parameters you set in your application, you should turn on debug logging.
If you use a different Java Object
(sub-)class for the parameter value than specified in the reference section, the behaviour is undefined. Some types may be compatible, but in general you will get a Java exception at some point later in the execution of upCast or the operation will not work the way you intended.
Finally, you’ll want to kick off the module’s execution. This is done by the runModule()
method:
ucInst.runModule( moduleID );
After this, you can either set up the next module exactly as described in this section so far. You could even base the selection of the next module on the value of some pipeline variable which the module might have set to some specific value, or on some other condition. You can query the values of variables in the pipeline
realm using the getPipelineVariable()
method.
To access WordLink functionality also from upCast running via the API, you need to tell it where the WordLink component il-gw.exe
is to be found before you instantiate an UpcastEngine
object. This is done by setting the system property de.infinityloop.exe.location
to the folder where il-gw.exe
resides:
System.setProperty( "de.infinityloop.exe.location", "/path/to/il-gw-folder/" );
On a typical Windows installation, this is C:\Program Files\infinity-loop\upCast\Resources\EXEs, but you are free to move the application file il-gw.exe anywhere in your filesystem where it is convenient for your deployed application.
Using WordLink in a critical, unattended server-based environment is not supported and therefore not recommended. WordLink uses an installed copy of Word in component mode. Such use is explicitly warned against by Microsoft for server or server-like applications for technical reasons (leaving aside any remaining licensing issues).
WordLink must access and launch Word to do what it needs to do. However, when running in a server environment, rights of running processes are usually tightly restricted. For example, Word might not be allowed to be accessed by the server process as COM object.
To make WordLink work in such restricted environments, you need to explicitly grant the user running the server access to the Word COM object. You can check and do this as follows:
1. On the Windows commandline, start dcomcnfg.exe.
2. Choose the component "Microsoft Word Document" (or similar, depending on localization) and click Properties... .
3. Under Security > Use custom launch permissions, add the account that runs the server using Edit... – Add... . (On one of our machines, this e.g. was "ASPNET (ASP.NET Machine Account)".)
After this modification, WordLink should also work in the restricted environment.
During a single call to an API method, several problems may occur, some of them quite significant, some of them less significant. In every case, the method will throw a single UpcastException
. An UpcastException
is a special descendant of a java.lang.Exception
that encapsulates a list of errors and/or warnings that occurred during the last call to an API method.
You can query an UpcastException
for its single constituents, which are objects of type LogEvent
. A LogEvent
encompasses:
a numerical message code
a message class, one of: FATAL
, ERROR
, WARN
, INFO
, DEBUG
, VERBOSE
, DETAIL
a human readable message as String
a (possibly null) array of parameters that were used in constructing the message
The recommended coding style for error handling is to wrap each call to an API method in its own try/catch block and to catch UpcastException explicitly. This is useful if e.g. the runModule() call throws an exception, but the severity is not high and you decide to continue processing because it only contains a warning that you do not care about and that does not affect the document integrity. By wrapping each call separately, you get the maximum out of any sequence of API calls by just skipping the portions that did not work.
A typical API call including error handling would look something like:
try {
    ucInst.runModule( moduleID );
}
catch( UpcastException e ) {
    // we only react on FATAL or ERROR types, but not WARNings
    if( e.extractSignificantEntries( new int[] { LogEvent.FATAL, LogEvent.ERROR }, null, null ).size() > 0 ) {
        // ... do some error handling ...
    }
}
Using the extractSignificantEntries() method, you can specify in great detail which messages you are interested in. For more information on how to use this method, see the javadoc API reference.
The message codes are all constants of a special class, Msg
. See the javadoc API reference for a description of the currently defined message codes and the number and semantics of parameters available for a specific message.
upCast’s distribution jar includes an Ant task that lets you run upCast pipelines from files (*.ucdoc
) from within Ant. This has the advantage that usually, you do not have to create the Ant task code anew whenever you make minor to moderate changes to your pipeline. To use it, you have to first define the task for use by Ant, then create the correct sub-structure of the upcast task.
To quickly construct an Ant build file code template for the upcast-runner task, use the File > Export to Ant > using 'upcast-runner' Task command. You can then modify this generated code to your liking or include it into an existing build file.
To define the task, use the following code:
<taskdef name="upcast-runner"
         classname="de.infinityloop.upcast.ant.UpcastRunnerAntTask"
         classpath="upcast.jar" />
For upcast.jar, you must specify the path to the distributed upcast.jar file. E.g., if you have a specific tasks folder next to your build file, you should copy the upcast.jar file there and specify ${basedir}/tasks/upcast.jar.
<upcast-runner file="/path/to/pipeline.ucdoc"
               logfilter="DEBUG"
               sourceparam="SourceFile" >
    <source dir="...">
        <include name="pattern" />
        …
    </source>
    <catalogs>
        <catalog file="..." />
    </catalogs>
    <param name="..." value="..." />
    …
</upcast-runner>
On the upcast-runner
task itself, some global parameters need to be set, above all file, which is the absolute path to the pipeline to run by this task.
The upcast-runner task can contain the following elements as nested elements: source (to set the source files; see below for the exact semantics), catalogs (to specify global catalog files to use; needed to resolve the PUBLIC ID of the pipeline in case the task is used to run a parameter set (*.ucpar)), and one or more param elements setting the pipeline's public parameters.
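Putting the pieces together, a complete build file using the task might look like the following sketch. All paths, file patterns and the parameter name are hypothetical placeholders; adapt them to your own pipeline.

```xml
<project name="convert" default="run" basedir=".">

    <taskdef name="upcast-runner"
             classname="de.infinityloop.upcast.ant.UpcastRunnerAntTask"
             classpath="${basedir}/tasks/upcast.jar" />

    <target name="run">
        <upcast-runner file="${basedir}/pipelines/convert.ucdoc"
                       logfilter="WARN"
                       sourceparam="SourceFile" >
            <source dir="${basedir}/in">
                <include name="**/*.rtf" />
            </source>
            <!-- doubled $$ so that Ant passes the upCast variable
                 reference through unexpanded -->
            <param name="DestinationFolder" value="$${pipeline:base}/out/" />
        </upcast-runner>
    </target>

</project>
```

The pipeline is then executed once for each file matched by the nested source element.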
We’ll discuss each of these elements in more detail in the following.
Attribute | Description | Required
file | The absolute path to the pipeline or parameter set file to run. | yes
logfilter | The filter specification for emitted logging events. The filter specification syntax is described here. The default is | no
 | The absolute path of a file to write logging output to. When not specified, upCast's default log file is used at {Log File}. | no
sourceparam | Contains the name of the pipeline variable that should be set to each item selected by the source element in turn (for batches). The default variable name used when this attribute is not specified is SourceFile. | no
The source element is a very special element. Normally, an upCast pipeline does not inherently support batching, because a pipeline is described only in terms of a specified, single source file (though this might be different for each module). To make this single source file easily available, the Ant task makes use of the upCast parameter system and allows you to pre-set the pipeline:SourceFile variable (or a differently named variable if the sourceparam attribute on the upcast-runner task element is used).
The source
element is the equivalent to setting this variable. This works as follows:
For each file selected by any nested source element in an upcast-runner task, the pipeline referenced by the upcast-runner element is executed.
In this, the current file of an iteration is set as the pipeline variable SourceFile (the default, or the different variable name optionally specified by the sourceparam attribute on the upcast-runner task element) and is then available as ${pipeline:SourceFile} (resp. ${pipeline:variablename}) for all modules in the pipeline.
Don’t forget to quote upCast variable references in an Ant build-file so that Ant does not try to resolve these variable references (which are otherwise very similar in syntax). See the param element description below for details and an example.
The source
element is of the Ant core type fileset
as described in the Ant manual, where you will find all attributes and allowed nested elements described in detail.
This optional element designates the upCast license file to use. If you specify a license explicitly using this element, it overrides any license setting made directly in the pipeline (or parameter set) document to be run.
This element groups all catalog elements.
This element designates an OASIS or XML catalog file.
The catalog
element has the following attribute:
Attribute | Description | Required
file | The full, absolute path to the catalog file. You cannot use relative addressing based on the pipeline base directory for this parameter, as it is evaluated even before the pipeline document is read and the context is established. | yes
Used to set a pipeline parameter. The parameter names available for use depend on the pipeline definition. To learn the parameters, we recommend running File > Generate Documentation… on the pipeline this task should execute, which generates documentation of the pipeline, including the parameters and value ranges it supports.
upCast’s distribution jar includes an Ant task that lets you run upCast pipelines from within Ant. To use it, you have to first define the task for use by Ant, then create the correct sub-structure of the upcast task.
To quickly construct an Ant build file code template for a pipeline you have already built using the GUI, use the File > Export to Ant > as self-contained Task… command. You can then modify this generated code to your liking or include it into an existing build file.
For your convenience, we recommend using the upcast-runner task instead of this upcast task whenever possible.
To define the task, use the following code:
<taskdef name="upcast"
         classname="de.infinityloop.upcast.ant.UpcastAntTask"
         classpath="upcast.jar" />
For upcast.jar, you must specify the path to the distributed upcast.jar file. E.g., if you have a specific tasks folder next to your build file, you should copy the upcast.jar file there and specify ${basedir}/tasks/upcast.jar.
<upcast basedir="/pipeline-base/"
        name="instancename"
        sourceparam="SourceFile" >
    <source dir="...">
        <include name="pattern" />
        …
    </source>
    <settings>
        <licensefile file="..." />
        <logging file="..." filter="..." />
        <catalogs>
            <catalog file="..." />
            …
        </catalogs>
        <encodings>
            <encoding file="..." />
            …
        </encodings>
        <parameters>
            <parameter name="..." value="..." />
            …
        </parameters>
    </settings>
    <pipeline>
        <module type="..." name="...">
            <param name="..." value="..."/>
            …
        </module>
        …
    </pipeline>
</upcast>
On the upcast
task itself, some global parameters need to be set, above all basedir (which directly translates to upCast’s ${pipeline:base}
variable and is accessible as such in parameter values).
The upcast task can contain the following elements as nested elements: source (to set the source files; see below for the exact semantics), settings (for some essential pipeline settings), and pipeline (describing the pipeline to run).
A pipeline consists of an ordered sequence of module elements (corresponding to the standard upCast modules), each of which can have any number of param elements (to set module parameters).
We’ll discuss each of these elements in more detail in the following.
This is the root element of the upCast Ant task. It encapsulates a complete upCast pipeline with associated parameters.
Attribute | Description | Required
basedir | This is the equivalent to the setPipelineBaseURI("...") API call. You can access this base directory in parameter values that support variable extension using ${pipeline:base}. | yes
name | Sets a descriptive name for this running task instance that is used e.g. in log output. | no
sourceparam | Contains the name of the pipeline variable that should be set to an item as selected by the source element in turn (for batches). The default variable name used when this attribute is not specified is SourceFile. | no
The source
element is a very special element. Normally, an upCast pipeline does not inherently support batching, because a pipeline is described only in terms of a specified, single source file (though this might be different for each module). To make this single source file easily available, the Ant task makes use of the upCast parameter system and allows you to pre-set the pipeline:SourceFile variable (or a differently named variable if the sourceparam attribute on the upcast
task element is used).
The source
element is the equivalent to setting this variable. This works as follows:
For each file selected by any nested source element in an upcast task, the pipeline as described by the pipeline element is executed.
In this, the current file of an iteration is set as the pipeline variable SourceFile (the default, or the different variable name optionally specified by the sourceparam attribute on the upcast task element) and is then available as ${pipeline:SourceFile} (resp. ${pipeline:variablename}) for all modules in the pipeline.
Don’t forget to quote upCast variable references in an Ant build-file so that Ant does not try to resolve these variable references (which are otherwise very similar in syntax). See the param element description below for details and an example.
The source
element is of the Ant core type fileset
as described in the Ant manual, where you will find all attributes and allowed nested elements described in detail.
This element groups all settings that pertain to the whole pipeline execution environment (not just individual modules). It has no attributes.
This element designates the upCast license file to use.
This element specifies the log file destination (if you are using upCast’s logging setup unchanged) and the logging filter specification.
The logging element has the following attributes:
Attribute | Description | Required
file | The full, absolute path to the log file. | no
filter | The filter specification for emitted logging events. The filter specification syntax is described here. The default is | no
This element groups all catalog elements.
This element is a wrapper for one or more encoding
elements.
The encoding
element is used to tell the upcast
task about any custom encoding files you are using. The encoding
element is of the Ant core type fileset
as described in the Ant manual, where you will find all attributes and allowed nested elements described in detail.
This element lets you specify a font configuration override. The override source code is specified as this element’s child text content.
This element has no attributes.
This element is a wrapper for one or more parameter elements. The parameter element in this context (as a descendant of the settings element) specifies any parameters set and defined via the Simple View. The set of pipeline variables is pre-populated with these values before the pipeline element and its module children are evaluated.
Any implicit variable setting by a source element contained within the upcast task is performed after all of the parameter elements contained within this parameters element have been set.
The pipeline element specifies and groups the ordered execution of module elements.
A module element specifies the type and (via its nested param elements) the configuration of an upCast module to be run.
Attribute | Description | Required
type | The module's class ID. One of: | yes
name | Can be used to set a descriptive name for the module resp. its intended usage. This is equivalent to the module's Name parameter. | no
Used to set a module parameter. For a description of the parameters available for a specific module, see its description earlier in this document.
Attribute | Description | Required
name | The parameter's name. | yes
value | The parameter's value. You can use the usual upCast variable references in the value. Note: to differentiate an Ant variable reference (which is expanded by the Ant runtime before being passed as the actual value to the code backing the task) from an upCast variable reference, double the leading $. <param name="P" value="${SourceFile}"/> will set the value of the parameter P to the contents of the Ant property SourceFile, whereas <param name="P" value="$${SourceFile}"/> will set the value of the parameter P to the literal string "${SourceFile}", which upCast will then resolve as a pipeline variable reference. | yes
upCast uses an event-based logging system, which also serves a second purpose, namely transporting warning and error states through the system. There are several hooks where you can intercept and react to those states (= log events) in a programmatic way.
The basic concepts in this architecture are Log Event Sources, which generate new log events; Log Event Processors, which evaluate, process, (possibly) forward, and programmatically change or filter generated log events; and Log Event Filters, which discard log events you are not interested in and therefore serve to conserve storage space and increase performance.
Each component in upCast, which can be a module or pipeline, has its own Log Event Source and Log Event Processor.
The Log Event Source sends a freshly created Log Event down to two distinct paths: the global Log Event path and the component's Log Event path.
The global Log Event path cannot be intercepted or filtered within a component, meaning that all events ever created are sent down this path and will show up in the Central Logging Hub. There, the Log Events of all running component instances and the application are merged into a single stream of Log Events that any number of Log Writers can attach to. Each Log Writer defines its own, independent Log Event Filter (if desired).
Two Log Writers are pre-defined: The Live Log Window and upCast's global log file.
The global application log file is created by a pluggable logging system. For the abstraction layer, upCast uses Log Bridge by Graham Lea. The default logging system implementation under this bridge in upCast is a slightly modified binding to java.util.logging
, the Java 1.4 logging system implementation. The log filter used for the log file is controlled by the setting of the Log filter parameter in the upCast Preferences window. When running in API mode where the UI preferences are not available, you can pass the Log Event Filter setting using the Java system property de.infinityloop.logfilterspec
. Additionally, you can set the log file location using the Java system property de.infinityloop.application.logfile
.
The Log Events available in live log mode of the logging window are all the events generated in the system, which are filtered for display by the respective setting in that window.
In addition to those pre-defined Log Writers, you can programmatically create (via UPL's create-log-writer()
function) any number of custom Log Writers with their own Log Event filter defined that – in contrast to the pre-defined Log Writers – also includes a context. The context is the component instance that was executing when (and wherefrom) the Log Writer was programmatically created. This allows you to filter Log Events not only by level and message code, but also by module or pipeline instance they originated from. This makes those custom Log Writers ideal for creating specialized log files that only cover e.g. a single conversion session or even just a detailed log of a single module's execution for debugging purposes.
The Log Event Processor is situated in the component's Log Event path and therefore allows you to query each component for the events and error states it has generated during its execution. This path is the one you will want to use to react to certain generated Log Events of interest as they often also carry e.g. error state information.
The Log Event Processor of each component only sees and therefore can process Log Events that pass its Log Event Filter (with few exceptions, see below). It collects those Log Events, which either were created by its own Log Event Source or have been actively forwarded from its child components (e.g. in case of a pipeline component, this would be its module children).
A Log Event Processor can raise the Terminate Pipeline Execution Signal. This is a request to immediately terminate further processing of the currently running pipeline instance. This specifically means that any later components in that pipeline will not be run. Furthermore, that pipeline's Log Event Processor is informed that a child component has raised the Terminate Pipeline Execution Signal (it is passed that condition in a parameter to its custom finalization UPL function). It is then free to reset (= ignore) it or forward it to its parent (if any) by returning it as the result value of the function.
By default, new modules and pipelines have their Log Event filter set to "inherit", i.e. they use the same setting as their parent. This effectively means that the Log Event Filter settings in pipelines and modules is the same as (and linked to) the Log Event Filter setting in upCast’s Preferences window. If you want more (or less) logging information to be available in a module’s or pipeline’s Log Event Processor, you need to override the default setting.
When an error occurs during processing of a component's initialization code (which should never happen in a correctly configured or programmed configuration), it will raise the Terminate Pipeline Execution Signal and all of its collected ERROR or FATAL messages are automatically forwarded to the component's parent.
The syntax is an arbitrary sequence of the following, whitespace- or comma-separated tokens, which are executed in the specified sequence:
+LEVEL: enable the respective message type
-LEVEL: disable the respective message type
LEVEL: enable all messages of the specified level and higher (i.e., WARN is equivalent to +FATAL,+ERROR,+WARN)
+msgconstantid: enable that message using its symbolic name
+msgconstantid1..msgconstantid2: enable all messages between msgconstantid1 and msgconstantid2
-msgconstantid: disable that message using its symbolic name
-msgconstantid1..msgconstantid2: disable all messages between msgconstantid1 and msgconstantid2
inherit: a special token that uses the filter settings of the parent object of the one it is specified on or, if that does not exist, ALL.
A log message is filtered by applying all tokens of a log filter expression in the sequence they are written, from left to right.
The message symbolic names can be looked up in the documentation of the defining Java class, de.infinityloop.msg.Msg.
Example 17.1. Log filter spec examples
INFO
lets pass all messages that have a level of INFO or higher (i.e. WARN, ERROR, FATAL).
DEBUG -INFO
lets pass all messages that have a level of DEBUG or higher, but that are not of level INFO
+ERROR +INFO
lets pass all messages that have either the level ERROR or the level INFO
ERROR +ColumnNumbersNotContiguous
lets pass all messages that have a level of ERROR or higher, including the warning for a non-contiguous numbering of colspec’s colnum attribute.
INFO --149
lets pass all messages that have a level of INFO or higher, except for the ColumnNumbersNotContiguous warning (whose numerical value is -149)
This example is for demonstration purposes only, to show how to exclude a message via a negative number. You should always use symbolic names for error messages, since the number assignment may change between releases. The exception is custom messages you use in your pipelines, for which there are no symbolic constants defined; for those, you must always use positive numbers.
DEBUG WARN
is the same as WARN, since the tokens are executed in the order of writing and therefore WARN overrides any settings with regard to levels that DEBUG may have set earlier.
ERROR +1..100 -WARN
lets pass all messages that have a level of ERROR or higher plus all custom log messages with an id between 1 and 100, but excluding from those all whose level is WARN.
Pipeline Templates in general are no different from regular pipeline definitions. However, to be handled correctly within upCast, you must obey the following important points:
the main pipeline configuration file (*.ucdoc) must be named template.ucdoc (exactly like this!) and must reside in the top-level folder of the template
all resources that the pipeline template uses (XSLT files, UPL files, other resources) must be stored below its top-level folder
the name of the top-level folder will be used in upCast’s UI to refer to the template, so it should be short and descriptive
To make creating a parameter set from a template work, the template.ucdoc must have its UID pipeline property set, and ideally upCast’s configuration (upCast preferences, Catalogs tab) should include a reference to a catalog file where that UID is mapped to the physical location of the pipeline template.
To have upCast find, recognize and display a template in its UI, you need to make sure that its top-level folder is placed in one of the folders upCast is looking for pipeline templates. The places upCast looks for pipeline templates can be specified in upCast’s preferences on the Application Settings tab, Pipeline Template Paths parameter.
Example 18.1. Pipeline Template Example: Pipeline template default file layout
The default pipeline template file layout and file naming is as follows:
These files are located within the descriptively named template folder (use a name of your choice), which itself is placed into one of the locations for Pipeline Template Paths.
upCast makes use of certain locations in the running machine’s file system to store support and session information. These locations are different depending on the underlying native operating system that runs the Java Virtual Machine. This manual denotes these standard places by placing the name of the location in curly braces, e.g. {Application Support Folder}. To learn the actual location on the machine upCast is running on, open the View > System Information… window, where the name as enclosed in the curly braces is listed with its actual corresponding file system location.
The following standard folders and locations are currently defined:
The root folder for application support data.
This folder contains installed licenses for the software. License files must end in .uclicense to be recognized.
This is the path in the distribution jar where you may store a file named upcast.uclicense. If this file exists at the specified location, it is used. It is then included in the license picker window.
The path to the application’s log file.
The location of the preferences7.plist file where current application configuration parameters are persistently stored when the application is quit, and from which these settings are restored upon next launch.
The root folder for application documentation data and related files.
The file that is opened when the user chooses View > Built-in Help... . By replacing the default file with a custom one at this location, you can provide custom help files or documentation for specific installations.
The path to the system-specific temporary items folder.
upCast has a built-in mechanism for converting any Unicode character to any other Unicode character, string or even entity notation on export. This is done by means of a Unicode translation map. You can specify a Unicode translation map in various exporter modules as the final stage a character needs to pass through before actually getting written to the output file or stream.
The syntax is simple: one conversion entry per line, and all lines starting with # or // are treated as comments.
A conversion entry has the form (notation similar to BNF):
conversion ::= unicodeNumber ‘=’ replacement
with:
unicodeNumber ::= hexNumber | decimalNumber
replacement ::= string | hexNumber | decimalNumber
hexNumber ::= (‘0x’ | ‘0X’ | ‘$’)[0-9A-Fa-f]+
decimalNumber ::= [0-9]+
string ::= ‘"’ (asciiChar)* ‘"’
asciiChar ::= a one-byte character in the range from 32 to 127, excluding ‘"’
Note that there's no whitespace allowed around the '=' character.
Example 20.1.
Here is a contrived example, with the effect of each entry noted in comments:
// First, we simply convert all spaces to a dot:
32="."
// Then, we convert all capital letter A’s to a
// full, empty tag: <letter_a />
65="<letter_a />"
// And then, we discard all small
// letters ‘u’ completely:
0x75=""
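A sketch of how such conversion entries could be parsed according to the BNF above (hypothetical Python, not upCast's parser):

```python
import re

# Number syntax from the BNF: 0x/0X/$ hex prefixes, or plain decimal.
NUMBER = r'(?:0[xX][0-9A-Fa-f]+|\$[0-9A-Fa-f]+|[0-9]+)'

def parse_number(text):
    if text.startswith(("0x", "0X")):
        return int(text[2:], 16)
    if text.startswith("$"):
        return int(text[1:], 16)
    return int(text)

def parse_entry(line):
    """Return (codepoint, replacement string) or None for comments/blanks."""
    line = line.strip()
    if not line or line.startswith(("#", "//")):
        return None
    # No whitespace is allowed around '=' per the note above.
    m = re.fullmatch(r'(' + NUMBER + r')=(.*)', line)
    if m is None:
        raise ValueError("malformed entry: " + line)
    codepoint = parse_number(m.group(1))
    rhs = m.group(2)
    if rhs.startswith('"') and rhs.endswith('"') and len(rhs) >= 2:
        return codepoint, rhs[1:-1]            # literal string replacement
    return codepoint, chr(parse_number(rhs))   # single replacement character

assert parse_entry('32="."') == (32, ".")
assert parse_entry('65="<letter_a />"') == (65, "<letter_a />")
assert parse_entry('0x75=""') == (0x75, "")
assert parse_entry("# comment") is None
```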
For your convenience, there are the following options (all indicated by a leading '@' character) you can write on a line:
@charref fromCodepoint toCodepoint fillerKey [formatstring]
This specifies how a certain range of code points should be preset. This saves you typing work if you need some range of characters to be output not in UTF-8 encoding but, for example, as character references.
You can specify this option anywhere in a Unicode translation map; it takes effect (meaning: gets expanded and processed) at that specific location. You may use this to initialize a certain code range and then later override selected code points by specifying additional normal translation rules as described above.
fromCodepoint
A decimal integer value specifying the start code point of the code range.
toCodepoint
A decimal integer value specifying the end code point of the range.
fillerKey
A string constant identifying the algorithm to use for filling the specified code point range.
dec
The code point range is filled with character references in decimal notation, e.g. &#1234;.
hex
The code point range is filled with character references in hexadecimal notation, e.g. &#x4D2;.
named
The code point range is filled with the named character entity references as defined in http://www.w3.org/TR/xml-entity-names/. Code points in the specified range for which there is no named character entity reference defined are left as-is. This allows you to either output them as UTF-8 (do nothing), or in a specific character reference notation by preceding the @charref named option with e.g. the @charref dec option.
pattern
The formatstring parameter defines a configurable pattern as replacement. The format string is a standard Java MessageFormat format string, with the following placeholders available:
the Unicode codepoint of the current character in decimal number notation
the Unicode codepoint of the current character in hex number notation
the current character itself
Example 20.2.
@charref 128 32767 dec
This line fills the Unicode translation map for all code points from 128 to 32767 (incl.) with decimal numerical character references.
@charref 128 256 hex
@charref 128 256 named
These two lines effectively fill the Unicode translation map for all code points from 128 to 256 (incl.) with named character entity references and set those code points for which there are no names defined to hexadecimal character references.
@charref $E000 $F8FF pattern "<illegal-char codepoint="{2}" />"
This will preset the PUA area with an illegal-char element that has its codepoint attribute set to the hex value of the code point it represents/replaces.
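The dec and hex fillerKeys can be modeled with a small sketch (hypothetical, not upCast's code). Note how a later fill overrides earlier entries for the same code point, matching the left-to-right processing described above:

```python
# Hypothetical sketch of expanding the dec and hex fillerKeys into
# translation-map entries (not upCast's actual implementation).
def charref_fill(table, from_cp, to_cp, filler_key):
    for cp in range(from_cp, to_cp + 1):       # the range is inclusive
        if filler_key == "dec":
            table[cp] = "&#%d;" % cp           # e.g. &#1234;
        elif filler_key == "hex":
            table[cp] = "&#x%X;" % cp          # e.g. &#x4D2;

table = {}
charref_fill(table, 128, 32767, "dec")         # @charref 128 32767 dec
assert table[1234] == "&#1234;"
charref_fill(table, 128, 256, "hex")           # a later option overrides
assert table[200] == "&#xC8;"
assert table[1234] == "&#1234;"                # outside the second range
```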
@fill fromCodepoint toCodepoint value
This specifies how a certain range of code points should be preset. This saves you typing work if you need some range of characters to all be set to a single output value.
You can specify this option anywhere in a Unicode translation map; it takes effect (meaning: gets expanded and processed) at that specific location. You may use this to initialize a certain code range and then later override selected code points by specifying additional normal translation rules as described above.
The difference between @fill and @charref is that here, the complete range is set to the same specified replacement value.
fromCodepoint
A decimal integer value specifying the start code point of the code range.
toCodepoint
A decimal integer value specifying the end code point of the range.
value
The value to set each of the code points in the specified range to. This can be any value that is allowed on the right side of a normal conversion entry, so it can be either a single Unicode character specification or a fixed string value.
Example 20.3.
@fill 0 7 "[ILLEGAL_XML_CHAR]"
@fill 11 12 "[ILLEGAL_XML_CHAR]"
@fill 55296 57343 "[ILLEGAL_XML_CHAR]"
@fill 65534 65535 "[ILLEGAL_XML_CHAR]"
These lines preset the Unicode translation map such that any occurrence of a character that is not allowed in XML 1.0 is output as the text data "[ILLEGAL_XML_CHAR]".
@invalid-xmlchar formatstring
This specifies how invalid XML 1.0 characters should be mapped when they occur in PCDATA content. formatstring is (after expansion) the replacement string for any invalid XML 1.0 character.
The format string is a standard Java MessageFormat format string, with the same placeholders available as described for @invalid-xmlchar-attr below.
Additionally, each time a character replacement takes place, a log message of type InvalidXMLCharacter (id = -218, level = WARN) is generated.
This option's definition is also used as a fallback for data in attributes when the more specific @invalid-xmlchar-attr option is not defined.
Example 20.4.
To use this, add e.g. the following line to the XML Exporter's Unicode Translation Map:
@invalid-xmlchar "##INVALIDCHAR={0} U+{1}##"
This will generate the following output for the offending character 0x1e in PCDATA content:
...preceding text##INVALIDCHAR=30 U+1e##following text...
@invalid-xmlchar-attr formatstring
This specifies how invalid XML 1.0 characters should be mapped when they occur in attribute content. formatstring is (after expansion) the replacement string for any invalid XML 1.0 character in an element's attribute.
The format string is a standard Java MessageFormat format string, with the following placeholders available:
{0}
the Unicode codepoint of the offending character in decimal number notation
{1}
the Unicode codepoint of the offending character in hex number notation
{2}
the offending character itself
Additionally, each time a character replacement takes place, a log message of type InvalidXMLCharacter (id = -218, level = WARN) is generated.
Example 20.5.
To use this, add e.g. the following line to the XML Exporter's Unicode Translation Map:
@invalid-xmlchar-attr "##INVALIDCHARATTR={0} U+{1}##"
This will generate the following output for the offending character 0x1e in an element elem's attribute attr:
<elem attr="...preceding text##INVALIDCHARATTR=30 U+1e##following text...">...</elem>
This table associates arbitrary CSS <length> properties with a pair of unit and precision information. This is useful when the created style information, in either the CSS style sheet or the style overrides in the XML output, should be human-readable. In that case you would provide a table with a unit of measurement that people are most familiar with (e.g. inches or centimeters) and a reasonable precision such as 2 decimal digits.
The default table uses cm as the default unit, with a precision of 1 or 2 decimal digits, and pt for special properties like font-size.
The syntax is simple: one unit association entry per line, and all lines starting with // are treated as comments.
An association entry has the form (notation similar to BNF):
association ::= propertyName ‘:’ ( (unit ‘,’ precision) | ‘#same’ )
with:
propertyName ::= CSS-property-name-identifier
unit ::= ‘m’ | ‘cm’ | ‘mm’ | ‘pt’ | ‘in’ | ‘pc’ | ‘px’ | ‘emu’ | ‘tw’ | ‘hp’
precision ::= [0-9]+
tw
is a twip ("twentieth of a point") and the basic length unit used in RTF; 1tw = 0.05pt
emu
is a unit used in RTF shape objects; 1cm=360,000emu
hp
is a half-point and the unit used in RTF for specifying font sizes; 1hp = 0.5pt
The keyword #same requests that the unit not be changed.
The use of the #same value is important for properties like line-height, which can be relative, a number or even a keyword, where converting to an absolute length would be impossible. Failing to specify #same on these properties may result in a conversion error.
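The RTF units above relate to points as stated (1 tw = 0.05 pt, 1 hp = 0.5 pt, 1 cm = 360,000 emu). A small sketch of converting between these units, assuming additionally the standard relation 1 in = 72 pt = 2.54 cm (illustrative only, not upCast's conversion code; only a subset of the units is included):

```python
# Points per unit, derived from the relations stated in the text.
PT_PER_UNIT = {
    "pt": 1.0,
    "pc": 12.0,                      # 1 pica = 12 pt
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
    "tw": 0.05,                      # twip, "twentieth of a point"
    "hp": 0.5,                       # half-point
    "emu": (72.0 / 2.54) / 360000,   # 360,000 emu per centimeter
}

def convert(value, src, dst, precision):
    """Convert value from unit src to unit dst, rounded to precision digits."""
    points = value * PT_PER_UNIT[src]
    return round(points / PT_PER_UNIT[dst], precision)

assert convert(12, "pt", "tw", 0) == 240      # 12 pt = 240 twips
assert convert(1, "cm", "emu", 0) == 360000
assert convert(24, "hp", "pt", 1) == 12.0
```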
Example 21.1.
Here’s an example of a CSS property unit table similar to the one used as the default table in upCast:
@option-default-length-unit:mm
@option-default-length-precision:2
font-size:pt,1
border-top-width:pt,1
border-right-width:pt,1
border-bottom-width:pt,1
border-left-width:pt,1
-ilx-border-vertical-inside-width:pt,1
-ilx-border-horizontal-inside-width:pt,1
text-indent:mm,1
width:mm,1
height:mm,1
margin-left:mm,1
margin-right:mm,1
margin-top:mm,1
margin-bottom:mm,1
padding-left:mm,1
padding-right:mm,1
padding-top:mm,1
padding-bottom:mm,1
line-height:#same
border-spacing:pt,2
letter-spacing:pt,2
-ilx-list-marker-offset:tw,0
-ilx-header-offset:mm,1
size:mm,1
-ilx-column-width:mm,1
-ilx-column-gap:mm,1
-ilx-footer-offset:mm,1
size:mm,1
There are two special options to specify default behavior. These options must be specified before any unit association for a specific CSS property.
@option-default-length-unit
This specifies the default unit to use for all <length> units not specified explicitly in the unit table.
@option-default-length-precision
This specifies the default precision (number of decimal digits) to use for all <length> units not specified explicitly in the unit table.
RTF files need to specify which encoding each font uses and what properties it has. A rendering application uses this information to determine the best matching font on a platform where the exact specified font is not available. Additionally, the rendering application uses a font's encoding to correctly interpret the characters found in the RTF file.
However, this mechanism does not support custom fonts with a special mapping of their constituent characters to Unicode code points. This is what the Font Configuration setup is for. upCast comes with a default font configuration embedded in the application. You may extend and/or override it by providing a custom font configuration override or extension, either at the application or at the pipeline level. There, you can specify standard font properties based on the font name, especially any custom encoding or codepage the font uses.
The default Font Configuration can be found at the following location in the package hierarchy of the upcast.jar file:
de/infinityloop/resources/config/stdfonts.config
What follows is an informal description of the font configuration format and the necessary properties, followed by a description of the search algorithm upCast employs to find the properties for a given font.
The following special properties are used in the stdfonts.config file:
rtf-font-family
Determines the general RTF font family a font belongs to based on its design. An RTF rendering application will use this information to find a font with a similar appearance when an exact match cannot be found.
Supported values: roman, swiss, symbol, modern, script, decor, tech, bidi
codepage
This indicates the Windows codepage the font uses for its encoding.
Supported values: codepageAsInteger, -1, 10000, -1000, -1001, -1002, -1004, -1005
The special values have the following meaning:
Uses the font's encoding, specified in its font table entry in the document being processed. This is the best choice for normal fonts.
Uses the document's default encoding. This should only be used by experts who know what and why they need to do this in very rare situations when processing legacy documents!
10000
Identifies the Mac Roman encoding.
-1000
Identifies the Private Use Area mapper.
-1001
Identifies the standard encoding of the Symbol font.
-1002
Identifies the encoding of the Wingdings font.
-1004
Identifies the encoding of the Zapf Dingbats font.
-1005
Identifies Unicode fonts like "Arial Unicode MS" (i.e. this is essentially an identity mapping).
unicode-offset
Specifies the Unicode codepoint offset for this particular font. On platforms like Macintosh and Windows, fonts that have no Unicode mapping defined, like "Webdings" or "Hoefler Text Ornaments", are mapped 1:1 into the PUA (Unicode Private Use Area). Normally, this is the area U+F000…U+F0FF, but by using the U-xxxx notation below, you can set the offset anywhere you require.
Supported values: normal | private | U-xxxx, with private being equivalent to U-F000 and normal being equivalent to U-0000 (the default if the property is not specified). The value is the Unicode codepoint offset, which should be in the Private Use Area (PUA), at which this mapping starts.
This property is only relevant for the RTF Exporter ("downCast") module.
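The offset arithmetic behind unicode-offset can be illustrated with a small sketch (a hypothetical helper, not part of upCast):

```python
# Map a font's byte code 1:1 into Unicode starting at the configured offset
# (private = U+F000, normal = U+0000, or an explicit 'U-xxxx' hex notation).
def map_codepoint(byte_value, unicode_offset="normal"):
    named = {"normal": 0x0000, "private": 0xF000}
    if unicode_offset in named:
        base = named[unicode_offset]
    else:                                  # 'U-xxxx' hex notation
        base = int(unicode_offset[2:], 16)
    return base + byte_value

assert map_codepoint(0x41, "private") == 0xF041   # byte 0x41 lands in the PUA
assert map_codepoint(0x41, "normal") == 0x41
assert map_codepoint(0x41, "U-E000") == 0xE041
```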
renderhint-fontswitch
When the RTF exporter encounters a Unicode character to render to RTF, it first looks whether this character is part of the encoding of the current font. If it is, it is written according to the RTF specification. If it is not, the module tries to look up the character in the fonts specified with the @font-search-list option, in order; the first font that contains it will be used to write the character to RTF. However, for a subsequent RTF reader to pick this up correctly, the module must write a font switch for this specific character. This property specifies which method the RTF exporter should use for this, if possible:
font
The RTF exporter will write a simple RTF font switch {\fx c}.
field
The RTF exporter will write the character using a SYMBOL field. This is only possible for single-byte fonts.
auto
The RTF exporter decides how best to write the character.
This property is only relevant for the RTF Exporter ("downCast") module.
renderhint-unicode
When the RTF exporter needs to write a character, it can do so in two ways: either just the character in the current font’s encoding, or additionally as the original Unicode codepoint. By specifying one of the following values for a font, you can tell the implementation which method it should use (if possible):
always
The RTF Exporter will always write both the character in the current encoding and its Unicode equivalent.
never
The RTF exporter will not write the Unicode equivalent.
auto
The RTF exporter decides how best to write the character.
This property is only relevant for the RTF Exporter ("downCast") module.
The following general options are available:
@font-search-list
This option lets you specify a comma-separated list of font names in which the RTF exporter will search for an incoming Unicode character to be output to RTF if it is not part of the current encoding. This lets you specify precedences; e.g., you may want to list the fonts actually installed on your particular system first.
If the RTF Exporter does not find a match in the listed fonts, it will use Unicode notation with an underscore ‘_’ as the replacement character.
Example 22.1.
@font-search-list "Arial Unicode MS"
will fall back to the Arial Unicode MS font for characters that are not part of the currently active font's encoding table.
@mode
This option controls the behavior of entries in a user-defined stdfonts.config file.
The default value is override. The default stdfonts.config has @mode override specified and is always read first.
replace
Any existing entries at the time this option is read are cleared; new entries are added in sequence.
override
New entries are prepended (as a whole block, in sequence) to any existing ones, effectively overriding already existing font definitions for the same font, since they are found first when searching.
New entries are appended to the list of existing ones, i.e. only those for which there isn’t already a definition in the standard table take any effect
Example 22.2.
Writing
@mode replace
at the top of a font configuration file or code snippet will completely discard (=replace) any existing font mappings with the ones that follow after this option.
The file structure is line-based. Each line identifies a set of font names with a set of properties:
mappings ::= fontlist ‘=’ propertyset
fontlist ::= font ( ‘, ‘ font)*
font ::= fontname | ‘"’ fontname ‘"’
propertyset ::= ‘rtf-font-family: ‘ ffval ‘; codepage: ‘ [0-9]+ ‘;’ ‘ unicode-offset: ‘ uoval ‘; renderhint-fontswitch: ‘ rhfs ‘; renderhint-unicode: ‘ rhuc ‘;’
ffval ::= ‘roman’ | ‘swiss’ | ‘symbol’ | ‘modern’ | ‘script’ | ‘decor’ | ‘tech’ | ‘bidi’
uoval ::= ‘normal’ | ‘private’ | ‘U-‘ [0-9A-F]{4}
rhfs ::= ‘font’ | ‘field’ | ‘auto’
rhuc ::= ‘always’ | ‘never’ | ‘auto’
fontname ::= name of font
Note that you must use CSS-style escapes (or numerical character entities of the form &#...;) to specify font names containing characters outside the ASCII range.
All lines starting with // denote a comment line and are ignored, as are empty lines.
To avoid having to explicitly define in the stdfonts.config file every font that might ever occur in a style sheet, the application implements a multi-stage search for a matching property definition entry, as follows:
First, the default font configuration is read (it has a @mode option value of override).
Then any custom addition/override from the application or pipeline settings is handled as a whole block according to the specified value of the @mode option. If no mode option is specified, a default of override is assumed. Within the resulting concatenated font configuration, the following search algorithm is employed:
A search for the exact name (considering case) is performed. The first matching entry is used if it exists.
A search for the exact name, but ignoring case, is performed. The first matching entry is used if it exists.
A search for a font name is performed that matches the start of the actual name. So if the characteristics for "Univers Bold" are requested, and there is an entry "Univers" in the font configuration, then its properties are used. Case is ignored.
A search for a font name is performed that is contained in the actual name. So if the characteristics for "L Univers 44" are requested, and there is an entry "Univers" in the font configuration, then its properties are used because the string "Univers" is contained in the actual font name. Case is ignored.
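The four search stages above can be sketched as follows (a hypothetical Python model, not upCast's implementation; config entries are assumed to be (fontname, properties) pairs in reading order):

```python
# Multi-stage font-name lookup: exact match, case-insensitive match,
# prefix match, then substring match; within each stage, entries are
# tried in reading order and the first hit wins.
def find_font_entry(config, name):
    for entry_name, props in config:              # 1. exact, considering case
        if entry_name == name:
            return props
    lower = name.lower()
    for entry_name, props in config:              # 2. exact, ignoring case
        if entry_name.lower() == lower:
            return props
    for entry_name, props in config:              # 3. entry is a prefix of the name
        if lower.startswith(entry_name.lower()):
            return props
    for entry_name, props in config:              # 4. entry contained in the name
        if entry_name.lower() in lower:
            return props
    return None

config = [("Univers", {"rtf-font-family": "swiss"})]
assert find_font_entry(config, "Univers Bold") == {"rtf-font-family": "swiss"}
assert find_font_entry(config, "L Univers 44") == {"rtf-font-family": "swiss"}
assert find_font_entry(config, "Helvetica") is None
```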
Here is the default font configuration used by upCast:
// Default reading mode
@mode override
// The default search order: none! Use Unicode instead.
@font-search-list
// Default serif font properties:
"Times New Roman", "L Centennial", Times, serif, Palatino, Georgia = rtf-font-family: roman; codepage: -1;
"Times New Roman Greek" = rtf-font-family: roman; codepage: 1253;
"Times New Roman CE" = rtf-font-family: roman; codepage: 1250;
"Times New Roman Cyr" = rtf-font-family: roman; codepage: 1251;
"Times New Roman Tur" = rtf-font-family: roman; codepage: 1254;
"Times New Roman (Hebrew)" = rtf-font-family: roman; codepage: 1255;
"Times New Roman (Arabic)" = rtf-font-family: roman; codepage: 1256;
"Times New Roman Baltic" = rtf-font-family: roman; codepage: 1257;
// Default sans serif font properties:
Arial, System, Univers, Verdana, Helvetica, Tahoma, Optima, Futura, "Trebuchet MS", Lucida, sans-serif = rtf-font-family: swiss; codepage: -1;
"Arial CE" = rtf-font-family: swiss; codepage: 1250;
"Arial Greek" = rtf-font-family: swiss; codepage: 1253;
"Arial Cyr" = rtf-font-family: swiss; codepage: 1251;
"Arial Tur" = rtf-font-family: swiss; codepage: 1254;
"Arial (Hebrew)" = rtf-font-family: swiss; codepage: 1255;
"Arial (Arabic)" = rtf-font-family: swiss; codepage: 1256;
"Arial Baltic" = rtf-font-family: swiss; codepage: 1257;
"Arial Unicode MS" = rtf-font-family: swiss; codepage: -1005;
// Default monospaced font properties:
Courier, ProFont, Monaco, "Courier New", Pica, monospace = rtf-font-family: modern; codepage: -1;
// Some other fonts
"SimSun Western" = rtf-font-family: roman; codepage: 1252;
// --------------------------------------
// Two-byte fonts we know about
// --------------------------------------
// fcharset134 fonts:
SimSun = codepage: 936; renderhint-unicode: auto; renderhint-fontswitch: font;
PMingLiU, SimHei, "MS Mincho" = codepage: 936; renderhint-unicode: auto; renderhint-fontswitch: font;
\00534e\006587\005f69\004e91, \005b8b\004f53, \009ed1\004f53, \00534e\006587\0096b6\004e66, \0065B9\006B63\008212\004F53 = codepage: 936; renderhint-unicode: auto; renderhint-fontswitch: font;
// fcharset128 fonts:
"MS PGothic" = rtf-font-family: roman; codepage: 932; renderhint-unicode: auto; renderhint-fontswitch: font;
// fcharset129 fonts:
"Gulim" = rtf-font-family: roman; codepage: 949; renderhint-unicode: auto; renderhint-fontswitch: font;
// Symbol fonts
Webdings = rtf-font-family: symbol; codepage: -1000; unicode-offset: private; renderhint-fontswitch: field;
ZapfDingbats, "Zapf Dingbats", "ITC Zapf Dingbats" = rtf-font-family: symbol; codepage: -1004; unicode-offset: private; renderhint-fontswitch: field;
Symbol = rtf-font-family: symbol; codepage: -1001; unicode-offset: private; renderhint-fontswitch: field;
Wingdings = rtf-font-family: symbol; codepage: -1002; unicode-offset: private; renderhint-fontswitch: field;
// Proprietary Fonts with no Unicode mapping of *any* contained character; add your own to the list if required:
Tufa, "Hoefler Text Ornaments", Traffic, StarBats, "Score Font 4.0", Marl, Frets, FretBoard, "Apple Symbols", Anastasia, Aloisen = rtf-font-family: symbol; codepage: -1000; unicode-offset: private; renderhint-fontswitch: field;
upCast comes complete with virtually all default encodings you can use in RTF and Word, including many two-byte encodings. This means that normally, you do not need to provide a custom encoding.
The default encodings are hard-coded with optimizations done for each specific encoding to provide efficient access, since the mapping functions are called for each character that passes through upCast. These default encodings are therefore not directly user-accessible. However, there are sometimes occasions where you’ll need to use a custom encoding, especially when you are using custom fonts.
upCast provides a custom encoding loader and handler which lets you specify your own mappings from character code point in the font to Unicode by means of a simple text file. Both, one-byte and two-byte encodings can be specified in this way.
To create a custom encoding, you need to create an ASCII text file with the extension .encoding
which specifies both the mapping of the individual code points to Unicode and also states which code page it implements. You can give it a name as well for easily spotting it in the UI portions of the application.
By specifying a codepage in a custom encoding that has a default equivalent, you may override any of the factory-supplied encodings.
Since the mapping is built on the fly, specific optimizations cannot be performed, and the use of custom encodings may slow down processing slightly.
A custom encoding per se is not tied to anything but the codepage it implements. To tie a codepage to a specific font, you need to extend or override the font configuration. There, you simply list the font’s name and associate it with the codepage a custom encoding implements using the keywords as described.
It is recommended to use codepage values greater than 40000 for custom encodings, as upCast will not use these codepages internally. Which you use for custom encodings is up to you. upCast reserves the range from 32000 to 35000 for internal use, so you should not use these. Also note that when you override a default encoding, every font that is specified to use that encoding will use the custom one.
File names can be arbitrary, but must have an extension of .encoding.
The file structure is simple: one mapping entry per line, and all lines starting with #, // or ; are treated as comments. To create a two-byte encoding, separate the two bytes by a comma.
A mapping entry has the form (notation similar to BNF):
mapping ::= <srcbyte> [’,’ <srcbyte>] ‘=’ <unicodechar>
with:
srcbyte ::= hexNumber | decimalNumber
unicodechar ::= hexNumber | decimalNumber
hexNumber ::= (‘0x’ | ‘0X’ | ‘$’)[0-9A-Fa-f]+
decimalNumber ::= [0-9]+
Here is a simple example, which maps what is a space in codepage 1252 fonts to the at-sign:
@codepage 42001
@encodingname Silly Encoding
$20=$40
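A sketch of how such mapping lines could be parsed according to the BNF above (hypothetical, not upCast's loader; option lines are simply skipped here):

```python
# Parse one .encoding mapping line into ((src byte, [src byte 2]), unicode char).
def parse_num(text):
    text = text.strip()
    if text.startswith(("0x", "0X")):
        return int(text[2:], 16)
    if text.startswith("$"):
        return int(text[1:], 16)
    return int(text)

def parse_mapping(line):
    """Return (src_bytes tuple, unicode char) or None for comments/blanks/options."""
    line = line.strip()
    if not line or line[0] in "#;" or line.startswith("//") \
            or line.startswith("@"):
        return None
    src, _, uni = line.partition("=")
    # A two-byte encoding separates the two source bytes by a comma.
    src_bytes = tuple(parse_num(b) for b in src.split(","))
    return src_bytes, chr(parse_num(uni))

assert parse_mapping("$20=$40") == ((0x20,), "@")       # the example above
assert parse_mapping("0x81,0x40=0x4E00") == ((0x81, 0x40), "\u4e00")
```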
Two special options are supported:
@codepage decimalNumber
This specifies the codepage this encoding represents.
You can either specify the codepage of an existing encoding to override its definition, or create custom codepages for specific fonts, in which case you should choose a codepage number higher than 40000.
@encodingname asciistring
This is a descriptive name for the encoding so you can easily spot it in upCast’s UI.
upCast has some useful small hooks for troubleshooting basic problems in complex installations.
To get version and build info from an upcast.jar
at hand, run the following on the commandline:
java -cp upcast.jar de.infinityloop.upcast.AppVersion
To retrieve extended information about the running environment as upCast would see it on launch, run the following on the commandline:
java -jar upcast.jar -version
To launch upCast with extended log info turned on even before the respective setting from its preferences file is read (GUI mode), or when running in a server environment, define the following custom Java property:
-Dde.infinityloop.loglevel=7
You can even redirect the log file location. Please see here for all supported custom properties.
upCast and accompanying support material, code and the upCast DTD are Copyright © 1999-2015 by infinity-loop GmbH, Munich, Germany.
The application includes a slightly modified version of steadystate’s CSS2 parser. The complete modified source code can be downloaded in accordance with the requirements of its Lesser GPL from our website at: http://www.infinity-loop.de/download/legal/CSS2Parser.tgz.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The full copyright notice is available here.
This application includes Apache Commons, which is covered under the Apache License Version 2.0.
This application includes swing-layout, an extension to Swing for creating professional cross-platform layouts. It is covered by the LGPL license.
The application uses work done by the W3C, which is Copyright © 2002 World Wide Web Consortium, (Massachusetts Institute of Technology, Institut National de Recherche en Informatique et en Automatique, Keio University). All Rights Reserved. http://www.w3.org/Consortium/Legal/ -- You can view the full Copyright Notice here.
XML- and OASIS Catalog support code is Copyright © 2001 Sun Microsystems, Inc. All Rights Reserved. -- You can view the full Copyright Notice here.
This product includes "The SAXON XSLT Processor from Michael Kay", http://saxon.sourceforge.net/, in compliance with its conditions of use and its license found here.
This product includes "The Saxon XSLT and XQuery Processor from Saxonica Limited", http://www.saxonica.com/, in compliance with its license as described here.
This product includes MRJAdapter.jar 1.0.9 in unmodified form by Steve Roy, which is distributed under an Artistic License. In compliance with this license, here’s the link to the package’s site for downloading the full distribution: http://homepage.mac.com/sroy/mrjadapter/.
Copyright 2003-2006 The Werken Company. All Rights Reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the Jaxen Project nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Copyright (c) 2001-2003 Thai Open Source Software Center Ltd
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the Thai Open Source Software Center Ltd nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
This product includes the Redstone XML-RPC Library v. 1.1.1 in compliance with the LGPLv2.
The source code can be obtained from here.
This application includes the ant.jar library to allow for being integrated into Ant builds. The library is covered under the Apache License Version 2.0.
Log-bridge is a subproject of Javatools. The code is covered under the Apache License Version 2.0.
JTimepiece is an advanced library for working with dates and times in Java. The code is covered under the Apache License Version 2.0.
XMLUnit enables JUnit-style assertions to be made about the content and structure of XML. It is covered under the BSD License:
Copyright (c) 2001-2014, Jeff Martin, Tim Bacon
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the xmlunit.sourceforge.net nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
RSyntaxTextArea is covered under a modified BSD license:
Copyright (c) 2012, Robert Futrell All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Apache POI, Copyright 2009 The Apache Software Foundation. This software component is covered under the Apache License Version 2.0.
Simple, an embeddable Java based HTTP engine, is covered under the Apache License Version 2.0.
The Flying Saucer XML/XHTML renderer library is covered by the GNU Lesser General Public License.
This application uses BrowserLauncher2, which is covered by the LGPLv2 license.
Thanks go out to the members of the java-dev Mailing List, without whom we would not have come as far as we have regarding the Mac OS X user experience.
We would also like to thank all users and beta testers for the very helpful feedback and problem reports we receive. This is what helps us make upCast the best conversion tool in its field.