Data used for building a model must be properly prepared. Different algorithms have different requirements for input; for example, Naive Bayes requires binned data.
Automatic Data Preparation (ADP) transforms the build data according to the requirements of the algorithm, embeds the transformation instructions in the model, and uses those instructions to transform the test or scoring data when the model is applied.
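The idea of embedding transformation instructions in the model can be illustrated with a minimal Python sketch. This is not how Oracle implements ADP; the class `MinMaxModel`, its methods, and the sample columns are hypothetical names chosen for illustration, and min-max scaling stands in for whatever transformation an algorithm actually needs.

```python
from dataclasses import dataclass, field


@dataclass
class MinMaxModel:
    """Toy 'model' that stores its own preparation instructions (hypothetical)."""
    # Per-column (min, max) pairs learned from the build data.
    ranges: dict = field(default_factory=dict)

    def build(self, rows):
        """Learn normalization parameters from the build data and embed them."""
        for col in rows[0].keys():
            values = [row[col] for row in rows]
            self.ranges[col] = (min(values), max(values))
        return self

    def prepare(self, row):
        """Apply the embedded instructions to a build, test, or scoring row."""
        prepared = {}
        for col, (low, high) in self.ranges.items():
            span = high - low
            # A constant column carries no information; map it to 0.0.
            prepared[col] = (row[col] - low) / span if span else 0.0
        return prepared


build_data = [{"age": 20, "income": 30000}, {"age": 60, "income": 90000}]
model = MinMaxModel().build(build_data)

# Scoring data is transformed with the parameters learned at build time,
# so the caller never has to re-derive or carry them separately.
scored = model.prepare({"age": 40, "income": 60000})
print(scored)
```

Because the parameters live inside the model object, applying the model to new data cannot accidentally use a different scaling than the one used at build time.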
If you are connected to Oracle Database 12c, ADP also prepares text data.
Here are some examples of how ADP prepares numerical data:
For algorithms that require binned data (like Naive Bayes), ADP performs supervised binning. Supervised binning is a special binning approach that takes into account the target to find good cut-points in the predictor.
For algorithms that require normalized data (like Support Vector Machines), ADP normalizes the numerical data.
For algorithms that can handle untransformed data (like Decision Tree), the numerical data is used to find splitters in the tree with an approach similar to supervised binning.
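To make the idea of supervised binning concrete, here is a minimal Python sketch that picks a single cut-point in a numeric predictor by maximizing information gain against the target. The exact method ADP uses is not described here, so the entropy-based criterion, the function names, and the sample data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_cut_point(values, targets):
    """Return the cut-point on `values` that maximizes information gain."""
    pairs = sorted(zip(values, targets))
    base = entropy([t for _, t in pairs])
    best_gain, best_cut = 0.0, None
    for i in range(1, len(pairs)):
        left_value, right_value = pairs[i - 1][0], pairs[i][0]
        if left_value == right_value:
            continue  # no boundary exists between equal values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best_gain:
            # Place the cut halfway between the two neighbouring values.
            best_gain, best_cut = gain, (left_value + right_value) / 2
    return best_cut


# Illustrative data: the target flips between ages 38 and 47.
ages = [22, 25, 31, 38, 47, 52, 58, 63]
bought = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
cut = best_cut_point(ages, bought)
print(cut)
```

An unsupervised scheme such as equal-width binning would place boundaries without looking at the target; the supervised version puts the boundary where the target actually changes. Searching for the best split of a numeric attribute in a decision tree follows the same pattern.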
See Oracle Data Miner Concepts for detailed information about Automatic Data Preparation.
Manual data preparation is complicated because you must understand the requirements of each algorithm, and you must carry the transformations with you so that you can properly prepare test or scoring data.
Unless there is a good reason to bin manually (such as recoding a numeric column of ages into ranges like 'YOUTH' and 'ADULT' that have business meaning), automatic data preparation is recommended.
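As an illustration of the manual case, the following Python sketch recodes ages into business-defined ranges. The cut-offs (18 and 65), the 'SENIOR' label, and the names `AGE_BINS` and `recode_age` are hypothetical and chosen only for this example; real ranges would come from your own business rules.

```python
# Hypothetical business ranges: (exclusive upper bound, label).
AGE_BINS = [(18, "YOUTH"), (65, "ADULT")]
# Hypothetical label for ages at or above the last upper bound.
TOP_LABEL = "SENIOR"


def recode_age(age):
    """Map a numeric age to its business label using AGE_BINS."""
    for upper, label in AGE_BINS:
        if age < upper:
            return label
    return TOP_LABEL


# With manual preparation, the same mapping must be applied by hand
# to the build data and again to any test or scoring data.
build_ages = [12, 40, 70]
scoring_ages = [17, 18, 65]
build_labels = [recode_age(a) for a in build_ages]
scoring_labels = [recode_age(a) for a in scoring_ages]
print(build_labels, scoring_labels)
```

The need to keep `AGE_BINS` in sync across build, test, and scoring runs is the bookkeeping burden that automatic data preparation removes.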