Creating a Workflow

A workflow is normally built and run (executed) as follows:

  1. Create a blank workflow as described in Create a Workflow.

  2. Create the node that starts the workflow, that is, a node that provides one or more sources of data for mining operations. Such a node identifies a database object. For example, the starting node might be one of the Data Nodes.

  3. Create nodes that perform mining tasks, such as data preparation or model build and test. Nodes are described in Workflow Nodes.

  4. Connect nodes as described in Link.


    Note:

    Steps 3 and 4 are usually performed together: you create a node, edit it, and connect it to an existing node.

  5. Run nodes, as described in Run Node.

  6. Examine results.

  7. Iterate the steps as necessary.
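The build-and-run sequence above can be sketched in a few lines of Python. This is an illustration only, not the Data Miner API: the `Workflow` class and its `add_node`, `link`, and `run_order` methods are hypothetical names, and the sketch assumes that "running" a workflow means visiting each node after all of its parents.

```python
from graphlib import TopologicalSorter


class Workflow:
    """Toy stand-in for a workflow: named nodes plus parent -> child links."""

    def __init__(self, name):
        self.name = name
        self.parents = {}  # node name -> set of parent node names

    def add_node(self, node):
        self.parents.setdefault(node, set())

    def link(self, parent, child):
        # Both ends must already exist before they can be connected.
        if parent not in self.parents or child not in self.parents:
            raise KeyError("both nodes must be created before linking")
        self.parents[child].add(parent)

    def run_order(self):
        # Assumption: a node can run only after all of its parents have run.
        return list(TopologicalSorter(self.parents).static_order())


wf = Workflow("demo")                         # step 1: blank workflow
wf.add_node("MINING_DATA_BUILD_V")            # step 2: starting data node
wf.add_node("ClassBuild")                     # step 3: mining node
wf.link("MINING_DATA_BUILD_V", "ClassBuild")  # step 4: connect the nodes
order = wf.run_order()                        # step 5: run, parents first
print(order)
```

In this sketch the data node always comes before the node that consumes it, which is the same dependency that the left-to-right layout of a workflow expresses visually.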

A workflow must contain at least one source of data, such as a table or a model. For example, to build a Naive Bayes model, you first identify the input data with a Data Source node. You then create a Classification node to build and test the model.

Workflows are built so that they read from left to right. The following is a simple workflow:

[Figure: A workflow that builds and applies a classification model]

The Data Source node MINING_DATA_BUILD_V is at the start of the workflow; normally, it identifies a table or a view. The Data Source node is the parent of the Classification node ClassBuild; conversely, ClassBuild is the child of the Data Source node.

This is an example of a very simple workflow. In Data Miner, a workflow is not required to be a connected graph: a single workflow can consist of several independent computations. For example, you could build a classification model using one data source and build a clustering model using a different data source.
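To make the idea of a workflow that is not a connected graph concrete, the following Python sketch groups the nodes of a workflow into its independent parts. It is an illustration under stated assumptions, not Data Miner code: the function `independent_parts` and the node names `SourceA`, `SourceB`, and `ClustBuild` are hypothetical, and links are treated as plain (parent, child) pairs.

```python
from collections import defaultdict


def independent_parts(nodes, links):
    """Group nodes into connected components, ignoring link direction."""
    neighbors = defaultdict(set)
    for parent, child in links:
        neighbors[parent].add(child)
        neighbors[child].add(parent)

    seen, parts = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk that collects every node reachable from start.
        stack, part = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            part.append(node)
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        parts.append(sorted(part))
    return parts


# Hypothetical node names: two data sources feeding two unrelated models.
nodes = ["SourceA", "ClassBuild", "SourceB", "ClustBuild"]
links = [("SourceA", "ClassBuild"), ("SourceB", "ClustBuild")]
parts = independent_parts(nodes, links)
print(parts)
```

Each inner list is one computation that shares no nodes with the others, matching the case of a classification model and a clustering model built from different data sources within the same workflow.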

Workflow Terminology describes how workflow components are named.