Summary
Artificial neural networks (ANNs) have been used widely for prediction and
classification problems. In particular, many methods for building ANNs have
appeared in the last 2 decades. One of the continuing important limitations of
using ANNs, however, is their poor ability to analyze small data sets because
of overfitting. Several methods have been proposed in the literature to
overcome this problem. On the basis of our study, we conclude that ANNs
that use radial basis functions (RBFs) can reduce the prediction error
effectively when there is an underlying relationship between the variables. We
have applied this and other methods to determine the factors controlling and
related to fracture spacing in the Lisburne formation, northeastern Alaska.
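To make the RBF idea concrete, the following is a minimal sketch of a Gaussian RBF network on a small synthetic data set. It is not the configuration used in this study: the 1D sine target, the six evenly spaced centers, the width of 0.2, and the small ridge term are all illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def rbf_design(x, centers, width):
    """Gaussian RBF activations, one column per center (assumed basis)."""
    d = x[:, None] - centers[None, :]
    return np.exp(-(d ** 2) / (2.0 * width ** 2))

def fit_rbf(x, y, centers, width, ridge=1e-3):
    """Output weights from ridge-regularized linear least squares."""
    phi = rbf_design(x, centers, width)
    a = phi.T @ phi + ridge * np.eye(len(centers))
    return np.linalg.solve(a, phi.T @ y)

def predict_rbf(x, centers, width, weights):
    """Network output: weighted sum of the RBF activations."""
    return rbf_design(x, centers, width) @ weights

# Small synthetic data set (assumption): 15 noisy samples of a sine curve.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 15)
y_train = np.sin(2.0 * np.pi * x_train) + rng.normal(0.0, 0.1, x_train.size)

centers = np.linspace(0.0, 1.0, 6)   # hypothetical choice of centers
width = 0.2                          # hypothetical basis width
weights = fit_rbf(x_train, y_train, centers, width)

# Compare predictions against the noise-free target on a dense grid.
x_test = np.linspace(0.0, 1.0, 200)
y_pred = predict_rbf(x_test, centers, width, weights)
test_rmse = float(np.sqrt(np.mean((y_pred - np.sin(2.0 * np.pi * x_test)) ** 2)))
print(f"test RMSE: {test_rmse:.3f}")
```

Because the basis functions are fixed, only the output weights are fitted, and that fit is a linear problem. With few samples this keeps the number of free parameters small, which is one plausible reason such networks cope well with limited data.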
By comparing the RBF results with those from other ANN methods, we find that
the former method gives a substantially smaller error than many of the
alternative methods. For example, the errors in predicted fracture spacing for
the Lisburne formation with conventional ANN methods are approximately 50 to
200% larger than those obtained with RBFs. With a method that predicts fracture
spacing more accurately, we were able to identify more reliably the effects on
the spacing of such factors as bed thickness, lithology, structural position,
and degree of folding.
By comparing performances of all the methods we tested, we observed that
some methods that performed well in one test did not necessarily do as well in
another test. This suggests that, while RBF networks can be expected to be among the
best methods, there is no “best universal method” for all cases, and
different methods must be tested for each case. Nonetheless, through this
study, we were able to identify several candidate methods and, thereby, narrow
the work required to find a suitable ANN.
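One common way to test several candidate methods on a small data set is leave-one-out cross-validation, sketched below. This is an illustration and not necessarily the test procedure used in this study; the two candidates (a straight line and a cubic polynomial) are simple stand-ins for different ANN variants, and all names are hypothetical.

```python
import numpy as np

def loo_rmse(x, y, fit, predict):
    """Leave-one-out RMSE: refit without each point, then predict it."""
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        model = fit(x[keep], y[keep])
        errors.append(predict(model, x[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Stand-in candidates (assumptions): polynomial fits of degree 1 and 3.
def fit_line(x, y):
    return np.polyfit(x, y, 1)

def fit_cubic(x, y):
    return np.polyfit(x, y, 3)

def predict_poly(coef, x):
    return np.polyval(coef, x)

# Small synthetic data set (assumption): 12 noisy samples of a sine curve.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 12)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, x.size)

candidates = {"line": fit_line, "cubic": fit_cubic}
scores = {name: loo_rmse(x, y, fit, predict_poly)
          for name, fit in candidates.items()}
for name, score in scores.items():
    print(f"{name}: leave-one-out RMSE = {score:.3f}")
```

Every sample serves once as a held-out point, so no data are set aside permanently, which matters when the sample count is small. The ranking can still change from one data set to another.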
In petroleum engineering and geosciences, the amount of available data is limited in
many cases because of expense or logistical limitations (e.g., limited core,
poor borehole conditions, or restricted logging suites). Thus, the methods used
in this study should be attractive in many petroleum-engineering contexts in
which complex, nonlinear relationships need to be modeled by use of small data
sets.
Introduction
An ANN is “an information-processing system that has certain performance
characteristics in common with biological neural networks” (Fausett 1994). According
to the universal approximation theorem, multilayer neural networks (Fig. 1) with a
sufficient number of hidden nodes can approximate any continuous function to
arbitrary accuracy (Haykin 1999). ANNs are widely used in prediction and
classification problems and have numerous applications in geosciences and
petroleum engineering, including permeability prediction (Aminian et al. 2003),
fluid-properties prediction (Sultan and Al-Kaabi 2002), and well-test-data
analysis (Osman and Al-Marhoun 2005).
Given a basic network structure, there is a wide variety of ANNs that can be
produced. For example, different methods or criteria used to train the network
produce ANNs that provide different predictions (e.g., the early-stopping and
weight-decay methods). Also, two or more neural networks can be combined to
produce an ANN with better error performance or other qualities, giving the
so-called “ensemble learning methods,” a term that covers a large variety of
methods, including stacked generalization and ensemble averaging. An additional
problem is introduced when the data sets are small. This is a common situation
in petroleum-engineering and geosciences applications, in which the cost of
data or collection logistics may limit the number of measurements. In such
instances, the use of ANNs can result in overfitting, where the model is fitted
to the training data points but performs poorly for prediction of other points
(Fig. 2).
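The overfitting described above, and the effect of a weight-decay penalty, can be illustrated with a small sketch. Instead of a neural network it uses a degree-9 polynomial fitted to 10 noisy points, with a ridge (squared-weight) penalty as the simplest analogue of weight decay. The data, the polynomial degree, and the penalty value are illustrative assumptions.

```python
import numpy as np

def poly_features(x, degree):
    """Columns x^0 .. x^degree: a flexible model standing in for a network."""
    return np.vander(x, degree + 1, increasing=True)

def fit(features, y, decay):
    """Minimize ||features @ w - y||^2 + decay * ||w||^2."""
    if decay == 0.0:
        return np.linalg.lstsq(features, y, rcond=None)[0]
    n = features.shape[1]
    return np.linalg.solve(features.T @ features + decay * np.eye(n),
                           features.T @ y)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Small synthetic data set (assumption): 10 noisy samples of a sine curve.
rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2.0 * np.pi * x_train) + rng.normal(0.0, 0.2, x_train.size)
x_test = np.linspace(0.0, 1.0, 200)
y_test = np.sin(2.0 * np.pi * x_test)

f_train = poly_features(x_train, 9)
f_test = poly_features(x_test, 9)

# Without decay, the 10 coefficients can pass through all 10 noisy points.
w_plain = fit(f_train, y_train, decay=0.0)
# With decay (hypothetical value), large coefficients are penalized.
w_decay = fit(f_train, y_train, decay=1e-3)

train_plain = rmse(f_train @ w_plain, y_train)
train_decay = rmse(f_train @ w_decay, y_train)
print(f"no decay:   train {train_plain:.3f}, test {rmse(f_test @ w_plain, y_test):.3f}")
print(f"with decay: train {train_decay:.3f}, test {rmse(f_test @ w_decay, y_test):.3f}")
```

The unpenalized fit reaches a near-zero training error by fitting the noise, which is the behavior described as overfitting. The penalty trades a larger training error for smaller weights, which typically, though not always, gives a smoother curve between the data points.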
In this study, we try to identify—among myriad possibilities—a few ANNs that
provide good error performance with limited sample numbers. After a brief
review of various types of ANNs, we use a synthetic data set to discuss, apply,
and compare the methods that have been proposed in the literature to overcome
the small-data-sets problem. Finally, we apply these methods to an actual data
set—fracture-spacing data from the Lisburne Group, northeastern Alaska—and
evaluate the results.
© 2008. Society of Petroleum Engineers
History
- Original manuscript received: 28 June 2006
- Meeting paper published: 24 September 2006
- Revised manuscript received: 7 May 2007
- Manuscript approved: 24 November 2007
- Version of record: 20 June 2008