DataPrep Elements
DataPrep elements in the Process Designer tool are used for executing data preparation operations upon process inputs. These elements consist of Workspace and DataPrep Export.
Note: A Hub element can be added between a Workspace element and a DataPrep Export element to process multiple input files individually or by group on a workspace.
The table below specifies functionality of the DataPrep group elements.
Element Name |
Description |
Properties and Comments |
Workspace |
Performs operations upon process inputs. |
Actions to select the workspace path:
|
DataPrep Export |
Exports data from data preparation workspace. |
The user can select several load plans for the export by ticking their corresponding checkboxes. To export all load plans, select the All Load Plans check box.
Note: You may also use the naming macros to name the exported files and destination folder. To do this, click the Insert Macro icon () and select the naming macros from the drop-down list. NOTE: The Input File Name macro is available to specify one or more input file names. If there are several inputs in a Load Plan (Join/Union), the names of the export files will contain a combination of these input files. When the Input files are processed by group, the Input File Name macro will only hold the first file name. |
Advanced options for Workspace element
Additional options for excel worksheet design
-
When dealing with multiple inputs, the Parse inputs for sheet names option becomes available. Select the check box if you want to parse the inputs for specific sheets.
-
When selected, changes can be applied to the following:
-
All worksheets
-
First sheet
-
Last sheet
-
Selected Sheet by Name
-
When not selected, changes can be applied to the following:
-
Selected Sheet by Name
-
Selected Sheet by Number
-
When dealing with only one input, modifications can be applied to the following:
-
All worksheets
-
First sheet
-
Last sheet
-
Selected Sheet by Name
Additional options for SQL Server, OLE DB, and Oracle source types
-
In the column list in the SQL Server/OLE DB/Oracle dialog box, select the column to use.
Select from the following options:
-
Aggregate: Select the check box if the column data should be aggregated (i.e., grouped together) and then select the aggregation method to apply from the drop-down menu. The following aggregation methods are available:
-
Sum: Applies to numeric data; returns the total value of all values in the column
-
Count: Applies to numeric data and dates; returns the number of entries in the column
-
Min: Applies to numeric data and dates; returns the lowest value in the column
-
Max: Applies to numeric data and dates; returns the highest value in the column
-
Group by: Applies to numeric data, dates, and text; returns grouped values and is often used with functions such as SUM and COUNT.
-
Apply Filter: Select the check box if the column data should be filtered and then select the filter method to apply from the drop-down menu. Enter the filter value to use in the corresponding Filter Value box.
The following filter methods are available:
-
Equalto: Applies to numeric data, dates, and text; returns all values that contain the specified filter value
-
NotEqualTo: Applies to numeric data, dates, and text; returns all values except those that contain the specified filter value
-
GreaterThan: Applies to numeric data and dates; returns all values that are greater than the specified filter value
-
GreaterThanEqual: Applies to numeric data and dates; returns all values that are greater than or equal to the specified value
-
LessThan: Applies to numeric data and dates; returns all values that are less than the specified filter value
-
LessThanEqual: Applies to numeric data and dates; returns all values that are less than or equal to the specified filter value
-
Contains: Applies to text; returns all values that contain the specified filter value
-
StartsWith: Applies to text; returns all values that start with the specified filter value
-
EndsWith: Applies to text; returns all values that end with the specified filter value
Advanced options for DataPrep Export element
Additional options for altair knowledge studio file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
Additional options for CSV file typeS
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
-
Delimiter sets the separator symbol between columns. The delimiter characters available are comma, semicolon, tab, pipe, and space. Choose Other to specify a different character.
-
Qualifier sets the wrapper symbol type.
-
Choose the Encoding Type to apply to the CSV file. The following options are available:
-
Code Page
-
UTF-8
-
UTF-16LE
-
UTF-16BE
-
Set the Code Page to apply to the CSV file. The following options are available:
|
|
|
-
Select the Include Column Header check box to include headers in your file.
-
Click Summary Export to specify the summary that you want to export.
In the Summary dialog box that displays,
-
select the summary to export from the Summary drop-down menu,
-
select the active measure to apply to the summary from the Active Measure drop-down menu,
-
select the drill level to apply to the summary from the Drill Level drop-down menu.
NOTES
Drilling up and drilling down is to collapse and expand a summary data set, respectively.
A lower drill level (e.g., Level 1) results in fewer columns in the exported table.
Click Ok.
ADditional options for fixed text file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
-
Click Column sizes to adjust the sizes of the columns in your file.
additional options for microsoft excel file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
-
Enter your Table Name.
NOTE: Click the Existing Tables button () to display the names of the tables in the export file when exporting to a file that already exists. Select the desired table name.
-
In the When table exists drop-down menu,
-
select Overwrite to substitute the data table,
-
select Append to append to new rows,
-
select Update to update existing rows,
-
select Update and append to update existing rows and append new ones.
-
When Overwrite, Update, Update and append, or Skip is selected, select the Suppress column header row check box if you prefer to hide the column headers in the exported Exel file.
-
Click Summary Export to specify the summary that you want to export.
NOTE: If Summary Export is selected, mapping and the Do not match column names checkbox are not available.
In the Summary dialog box that displays,
-
select the summary to export from the Summary drop-down menu,
-
select the active measure to apply to the summary from the Active Measure drop-down menu,
-
select the drill level to apply to the summary from the Drill Level drop-down menu.
NOTES
Drilling up and drilling down is to collapse and expand a summary data set, respectively.
A lower drill level (e.g., Level 1) results in fewer columns in the exported table.
Click Ok.
-
When Append is selected, modify the following:
-
Suppress column header row: Column headers in the exported Excel file are hidden.
-
Do not match column names: Columns from potentially different tables are left aligned regardless of column names or data types.
Additional options for Microsoft Access file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
-
Enter your Table Name.
NOTE: Click the Existing Tables button () to display the names of the tables in the export file when exporting to a file that already exists. Select the desired table name.
-
In the When table exists drop-down menu,
-
select Overwrite to substitute the data table,
-
select Update to update existing rows,
-
select Update and append to update existing rows and append new ones.
-
When Update or Update and append is selected, modify the following:
-
Column Mapping: Maps the export column to the destination column.
-
When update match not found: Select the Export to delimited text file check box to export the unmatched record(s) to a delimited text file when updating existing rows in a table if a matching row is not found in the destination table.
-
When Overwrite, Append, or Skip is selected, modify the following:
-
Export all: Exports all the columns in the export table.
-
Column Mapping: Maps the export column to the destination column.
Additional options for ALTAIR PANOPTICON designer file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
-
Enter your Table Name.
NOTE: Click the Existing Tables button () to display the names of the tables in the export file when exporting to a file that already exists. Select the desired table name.
-
In the When table exists drop-down menu,
-
select Overwrite to substitute the data table,
-
select Update to update existing rows,
-
select Update and append to update existing rows and append new ones.
Additional options for Qlik FILE TYPES
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
Additional options for Tableau file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
Note: Export to Tableau is performed in .tde format.
-
To publish the Tableau export file to Tableau Server, click the Publish to Tableau Server check box under the Publish to Tableau Server options. This displays the following Tableau Server settings.
-
Data Source Name: The name of the Data Source on the Tableau Server.
-
Site: The address of the data source.
-
Path to TABCMD: The route to the executable file of the TableauServerTabcmd application.
-
Server Address: The address of the Tableau Server.
-
Username: The login name for the Tableau Server.
-
Use Password File: Allows the use of .txt file with the password.
-
Password: The password to the Tableau Server (if the option Use Password File is enabled, it allows to specify the path to the Password File).
-
Proxy Address: The address of the proxy server (in HOST:PORT format).
-
Publish Type: Allows to specify an action for the Data Source:
-
Overwrite Data Source: Overwrites the data in the Data Source with the data from the export table (date modified is changed);
-
Replace Data: Replaces the data in the Data Source with the data from the export table (date modified is not changed);
-
Append Data: Adds data from the export table to the end of the Data Source table.
Additional options for OleDb and ODBC file types
-
Type in the Connection String or select it via the Data Link Wizard with all the required data.
-
Select the Include password check box to input a password.
-
Select the Bulk Behavior check box to enhance export performance.
NOTE: This option is applicable for Oracle Database and Microsoft SQL Server OLEDB providers only.
-
Select the Use Transaction check box to revert the export via rollback if an error occurs during the export.
-
The Use Transaction check box is only available when the Bulk Behavior check box is selected.
-
Selecting this option will affect the performance of the table export and may require a large amount of system resources, especially for the target destination. Therefore, make sure to test this option first and then make the necessary adjustments.
-
The transaction is valid for the export operation only and not for the overall process.
-
Enter your Table Name.
-
In the When table exists drop-down menu,
-
select Overwrite to substitute the data table,
-
select Update to update existing rows,
-
select Update and append to update existing rows and append new ones.
-
When Update or Update and append is selected, modify the following:
-
Click Column Mapping to map the export column to the destination column.
-
When update match not found: Select the Export to delimited text file check box to export the unmatched record(s) to a delimited text file when updating existing rows in a table if a matching row is not found in the destination table.
-
When Overwrite, Append, or Skip is selected, modify the following:
-
Export all: Exports all the columns in the export table.
-
Click Column Mapping to map the export column to the destination column.
Additional options for Microsoft Power BI file types
-
Enter your Table Name.
-
Select a value from the When table exists drop-down list.
-
Click the Set Power BI Login Data button to open the Power BI Login form.
Note: If the data on logged in user has been already received, then the button name becomes Update Power BI Login Data.
Additional options for IBM Cognos Analytics file types
-
Enter your Table Name.
-
Select some value from the When table exists drop-down list.
-
Type the value of the Cognos Analytics server URL to the Server Url field.
-
Click the Test button to get all namespaces.
-
Select the corresponding Namespace.
-
Type the correct Username and Password.
-
Click the Check Credentials button to validate credentials.
Note: If you want to log in anonymously, select the Is Anonymous check box.
Additional options for SAs transport format file types
-
In the When output file exists drop-down menu, select Overwrite to substitute files with the same file names.
-
Enter your Table Name.
NOTE: Click the Existing Tables button () to display the names of the tables in the export file when exporting to a file that already exists. Select the desired table name. The table name is limited to 8 characters in length.
Additional options for JSON file types
-
In the When table exists drop-down menu,
-
select Overwrite to substitute files with the same file names,
-
select Add data to add the new data to the existing one,
-
select Skip to stop the export operation from overwriting an existing data.
Automator supports the following plug-ins
Datawatch.DataPrep.Engine
-
PDF Report
-
Text Report
-
Delimited Text
-
JSON
-
XML
-
HTML
-
Microsoft Access
-
Microsoft Excel
-
ODBC
Panopticon.DataDirect
-
DB2
-
Hadoop Hive
-
Cloudera Impala
-
Informix
-
MonogoDB
-
MySql
-
Oracle
-
PostgreSQL
-
Amazon Redshift
-
Salesforce
-
SqlServer
-
SybaseIQ
-
Teradata
Panopticon.CloudantPlugin
-
IBM Cloudant
Panopticon.DatawatchPlugin
-
Monarch Server - Content
Panopticon.ODataPlugin
-
Odata
Panopticon.SplunkPlugin
-
Splunk
Panopticon.BusinessObjectsUniversePlugin
-
Business Objects Universe
Panopticon.GoogleAnalyticsPlugin
-
Google Analytics