ANNCodes

So you want to build your own ANN predictive model

Great idea! Below you will find a framework for building and testing ANN predictive models, along with code snippets that should speed up development. We develop and run our code in Matlab, so you will also need a Matlab license (http://www.mathworks.com/). If you use our code and find it useful, please send us your comments and/or post code snippets that could be helpful to others.

Code snippets for each of the steps below will be posted soon. The snippets can be run individually or assembled into a single Matlab m-file. The naming conventions required for key parameters will also be specified.

The framework and codes are developed by faculty, researchers and students from TAMUCC, the Division of Nearshore Research and the Center for Water Supply studies (Russel Carden, Scott Duff, Rick Hay, Dan Prouty, Philippe Tissot and Beate Zimmer).

Raw Data Gathering/Archival

In this step you write code that reads the data needed to develop the predictive model; the data will often come from several sources. We recommend building a single data file that holds all the raw data together with metadata indicating where the data came from, when it was acquired, and any other potentially useful information.
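As an illustration of the recommended raw-data file, here is a minimal sketch in Python (the framework's own code is in Matlab and is not reproduced here). The record fields, the function names `make_record`, `archive_to_json` and `save_archive`, and the choice of JSON as the file format are all assumptions made for the example.

```python
import json
from datetime import datetime, timezone


def make_record(name, values, source, units, acquired_utc, notes=""):
    """Bundle one raw series with metadata describing where it came from."""
    return {
        "name": name,
        "values": list(values),        # raw samples; None marks a missing sample
        "source": source,              # where the data comes from
        "units": units,
        "acquired_utc": acquired_utc,  # when the data was acquired
        "notes": notes,                # any other possibly useful information
    }


def archive_to_json(records):
    """Serialize all records, plus the archive creation time, to a JSON string."""
    archive = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "records": records,
    }
    return json.dumps(archive, indent=2)


def save_archive(records, path):
    """Write the archive to a single file (hypothetical helper)."""
    with open(path, "w") as fh:
        fh.write(archive_to_json(records))
```

Keeping the raw values untouched in this file means the conditioning step can be rerun with different methods without going back to the original sources.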

Data Conditioning/Analysis

In this step you write code that reads the raw data, computes statistics on it such as the gap percentage, fills in gaps, and removes spikes using your preferred methods. The percentage of data that has been interpolated should stay very low; otherwise you should discard the data set. In either case, keep a record of this percentage.
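The conditioning operations can be sketched as follows in Python. This is only one possible set of "favorite methods": the neighbor-jump spike rule, the linear interpolation, the `max_jump` threshold and the use of `None` for a missing sample are assumptions chosen for the example, not the authors' Matlab routines.

```python
def gap_percent(values):
    """Percentage of samples that are missing (None)."""
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None)
    return 100.0 * missing / len(values)


def remove_spikes(values, max_jump):
    """Mark a sample as missing when it jumps away from both neighbors
    by more than max_jump in the same direction."""
    cleaned = list(values)
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if prev is None or cur is None or nxt is None:
            continue
        same_side = (cur - prev) * (cur - nxt) > 0
        if same_side and abs(cur - prev) > max_jump and abs(cur - nxt) > max_jump:
            cleaned[i] = None
    return cleaned


def fill_gaps_linear(values):
    """Linearly interpolate interior gaps; leading and trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled
```

Calling `gap_percent` on the despiked series, before filling, gives the percentage of data that will end up interpolated, which is the number to record and to check against your own acceptance limit.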

Input Matrices and Target Vectors

You write code that reads the conditioned data and builds at least the input matrix and the target vector/matrix for training the ANN. Each column of the input matrix represents one input vector and must correspond to the matching element (or column) of the target vector/matrix. The input elements must always appear in the same order within each input vector. This is usually the most time-consuming part of the coding. If you are planning to use verification during the training of your model, you need to build the verification input matrix and target vector/matrix following the same process.
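To make the column convention concrete, here is a small Python sketch that builds an input matrix and a target vector from a single time series. The use of lagged values as inputs and a fixed lead time for the target is an assumption made for the example (a common setup for forecasting, but not something this page prescribes), and `build_input_target`, `lags` and `lead` are hypothetical names.

```python
def build_input_target(series, lags, lead):
    """Return (inputs, targets).

    inputs is a list of rows, one row per input element, so that
    column c (inputs[r][c] for all r) is one input vector.
    targets[c] is the value `lead` steps after the time of column c.
    """
    max_lag = max(lags)
    columns, targets = [], []
    for t in range(max_lag, len(series) - lead):
        # Input elements always appear in the order given by `lags`.
        columns.append([series[t - lag] for lag in lags])
        targets.append(series[t + lead])
    # Transpose so that each column of the matrix is one input vector.
    inputs = [[col[r] for col in columns] for r in range(len(lags))]
    return inputs, targets
```

For example, with `series = [0, 1, 2, 3, 4, 5]`, `lags = [0, 1, 2]` and `lead = 1`, the first column is `[2, 1, 0]` and its target is `3`. Running the same function on a separate stretch of data would produce the verification matrices in the same format.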

Graphical Analysis of Input Data

This step is not essential, but it is often useful for visualizing the hypothesized relationships between inputs and targets.

ANN Design/Parameters

In this step you first indicate the name and location of the input and target vectors/matrices set up earlier. The code then automatically extracts the number of ANN inputs and outputs from them. You then specify the number of hidden layers, the number of hidden neurons per layer, the type of training algorithm, and the error function.
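The automatic extraction of the input and output counts might look like the Python sketch below. It assumes the input matrix is stored as a list of rows with one column per input vector, and that the targets are either a single vector or a list of rows; `describe_network` and the dictionary keys are hypothetical names, and the default algorithm and error-function labels are placeholders.

```python
def describe_network(inputs, targets, hidden_layers,
                     training_algorithm="gradient descent",
                     error_function="mean squared error"):
    """Collect the ANN design parameters in one dictionary.

    hidden_layers lists the number of neurons in each hidden layer,
    e.g. [5, 3] for two hidden layers.
    """
    n_inputs = len(inputs)        # one row per input element
    n_samples = len(inputs[0])    # one column per input vector
    if any(len(row) != n_samples for row in inputs):
        raise ValueError("all input rows must have the same number of columns")

    # Accept a plain target vector (one output) or a matrix (one row per output).
    target_rows = targets if isinstance(targets[0], (list, tuple)) else [targets]
    if any(len(row) != n_samples for row in target_rows):
        raise ValueError("targets must have one column per input vector")

    return {
        "n_inputs": n_inputs,
        "n_outputs": len(target_rows),
        "n_samples": n_samples,
        "layer_sizes": [n_inputs] + list(hidden_layers) + [len(target_rows)],
        "training_algorithm": training_algorithm,
        "error_function": error_function,
    }
```

Checking the shapes at this point catches a mismatch between inputs and targets before any training time is spent.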

ANN Training

Once all the preparation steps are completed, you run a short piece of code containing the training statement, which reads in the input and target data as well as the ANN parameters. This is usually the shortest piece of code but the one that takes the longest to run. After training, the code also runs a simulation on the training inputs and compares the ANN output with the targets. The trained ANN is saved in a file for later use.
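For readers without the Matlab toolbox, the train, simulate, compare and save sequence can be sketched in plain Python. This is a stand-in, not the authors' training code: it assumes one hidden layer of tanh neurons, a single linear output, and plain stochastic gradient descent on the squared error. The weight layout (`w1`, `b1`, `w2`, `b2`), the learning rate and the epoch count are illustrative choices.

```python
import json
import math
import random


def init_network(n_inputs, n_hidden, seed=0):
    """Small random weights; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return (hidden activations, output) for one input vector x."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def mean_squared_error(net, inputs, targets):
    """Simulate on every column of inputs and compare with the targets."""
    total = 0.0
    for c, target in enumerate(targets):
        x = [row[c] for row in inputs]
        total += (forward(net, x)[1] - target) ** 2
    return total / len(targets)


def train(net, inputs, targets, epochs=500, rate=0.05):
    """Stochastic gradient descent, updating after each input vector."""
    for _ in range(epochs):
        for c, target in enumerate(targets):
            x = [row[c] for row in inputs]
            hidden, output = forward(net, x)
            err = output - target
            for j, h in enumerate(hidden):
                # Gradient reaching hidden neuron j (uses w2 before its update).
                grad_hidden = err * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= rate * err * h
                net["b1"][j] -= rate * grad_hidden
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= rate * grad_hidden * xi
            net["b2"] -= rate * err
    return net


def save_network(net, path):
    """Store the trained weights in a JSON file for later use."""
    with open(path, "w") as fh:
        json.dump(net, fh)
```

Comparing `mean_squared_error` before and after `train` gives a quick check that training reduced the error on the training set.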

ANN Simulation/Predictions

A previously trained ANN is recalled and applied to one or more input matrices. The input matrices must have the correct format for the code to run.
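A Python sketch of recalling a saved network and applying it is shown below. It assumes the network was stored as a JSON file holding a hypothetical weight layout (`w1`, `b1` for a tanh hidden layer; `w2`, `b2` for a single linear output), and it checks the input matrix format before simulating; none of these names come from the original Matlab code.

```python
import json
import math


def load_network(path):
    """Recall a previously trained network from its JSON file."""
    with open(path) as fh:
        return json.load(fh)


def simulate(net, inputs):
    """Apply the network to every column of the input matrix.

    inputs is a list of rows; column c is one input vector.
    """
    n_inputs = len(net["w1"][0])
    if len(inputs) != n_inputs:
        raise ValueError(
            "expected %d input rows, got %d" % (n_inputs, len(inputs)))
    n_samples = len(inputs[0])
    if any(len(row) != n_samples for row in inputs):
        raise ValueError("all input rows must have the same number of columns")

    outputs = []
    for c in range(n_samples):
        x = [row[c] for row in inputs]
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(net["w1"], net["b1"])]
        outputs.append(sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"])
    return outputs
```

The same `simulate` call can be repeated for each input matrix, as long as every matrix has the number and order of input elements used during training.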

ANN Performance Analysis

Load the targets and the ANN simulation output, then compute the ANN performance. We have such code for water levels.
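As a generic illustration, the Python sketch below computes a few common error statistics from a target vector and the matching simulation output. The choice of statistics and the name `performance` are assumptions for the example; the water-level code mentioned above may use different criteria.

```python
import math


def performance(targets, simulated):
    """Return basic error statistics for simulated values against targets."""
    if len(targets) != len(simulated):
        raise ValueError("targets and simulated must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")

    errors = [s - t for s, t in zip(simulated, targets)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,                       # bias
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),   # root mean square error
        "max_absolute_error": max(abs(e) for e in errors),
    }
```

Reporting the mean error alongside the RMSE helps separate a systematic offset from scatter around the targets.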

ANN Simulation Plots

Load the targets and the ANN simulation output and build comparative graphs.

Page last modified on November 20, 2005, at 09:50 AM