## Pseudodensity Log Generation by Use of Artificial Neural Networks


The challenges of reservoir characterization can be addressed accurately and efficiently with computer-based intelligence methods such as neural networks, fuzzy logic, and genetic algorithms. This paper describes a comprehensive methodology that integrates data-mining techniques and artificial neural networks (ANNs) to predict and regenerate reservoir petrophysical properties.

## Introduction

ANNs—machine-learning models that provide the potential to establish multidimensional, nonlinear, and complex models—can be powerful tools with which to analyze experimental, industrial, and field data.

It is crucial to find the optimal data from one well to build the model with ANNs for pseudowell-log generation of a target well. Manual stratigraphic interpretation, though labor-intensive, is regarded as one approach. Data-mining techniques are another applicable approach, involving the automatic processing of data associated with nonlinearity by use of a statistical method to discover data patterns. One application in petrophysics is facies (or electrofacies) classification, which is widely used to divide well-log data to obtain target information. Clustering analysis, an adjunct to artificial intelligence, can determine electrofacies and categorize lithological profiles quite efficiently.

Porosity, one of the more important petrophysical properties, can be obtained from density logs. In this study, a three-step approach was developed. First, the authors preprocess the log data by standardization and dimension reduction [principal-component analysis (PCA)]. Second, they apply clustering [model-based clustering (MBC)] to recognize specific patterns and interpret stratigraphic information. Finally, a similar pattern is chosen as input to generate the target pseudodensity log with ANNs.

## Well-Log-Data Preprocessing

**Normalization.** Well-log data are constructed in matrix form, whereby each row represents a depth record and each column a different type of well log. Each well yields one well-log matrix, and a field with multiple wells yields a large data set. The initial step of the first stage is to normalize the well-log data. Normalization is necessary because different types of well-log data have different units; for instance, the spontaneous-potential log is given in millivolts, whereas the gamma ray log is given in API units.
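As a concrete illustration of this step, the sketch below applies column-wise z-score standardization to a well-log matrix; the function name and array layout are illustrative, not from the paper:

```python
import numpy as np

def standardize_logs(logs):
    """Column-wise z-score standardization of a well-log matrix.

    Each row is a depth sample; each column is one log type
    (e.g., SP in mV, gamma ray in API units), so every column is
    scaled to zero mean and unit standard deviation.
    """
    logs = np.asarray(logs, dtype=float)
    mean = logs.mean(axis=0)
    std = logs.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (logs - mean) / std
```

With all columns on a common scale, no single log dominates the distance calculations used by the later clustering steps.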

**PCA.** This is a statistical procedure that uses an orthogonal transformation to convert a set of well-log vectors (attributes) of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. These principal components form a new matrix with far fewer dimensions than the original. To compute them, the authors propose singular-value decomposition (SVD). PCA reduces the dimensionality (the number of columns) of the well-log data while retaining most of the variability of those data.
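The SVD-based reduction can be sketched in a few lines of NumPy; the helper name and return values are assumptions for illustration:

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA by singular-value decomposition.

    X: standardized well-log matrix (depth samples x log types).
    Returns the projection onto the first n_components principal
    components and the fraction of total variance retained.
    """
    Xc = X - X.mean(axis=0)                     # center each log column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T           # reduced-dimension matrix
    retained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return scores, retained
```

Inspecting the retained-variance fraction is how one checks that the lower-dimensional matrix still captures most of the variability of the original logs.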

## Well-Log-Data Mining

Though well logs are a record of rock and formation properties vs. depth, they are not a straightforward representation of the formation, and the amount of well-log data is usually extremely large. In addition, stratigraphic interpretation and classification are necessary to select appropriate wells for data post-processing. Data mining is the computational process of discovering patterns in large data sets, and it involves methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.

**Lithofacies and Electrofacies.** A lithofacies is a body of rock with specified characteristics, and different lithofacies types produce distinct signal responses on well logs. For example, a gamma ray log of sandstone can reflect the clay content. If a gamma ray log of sandstone is used to build a model, and that model is then used to predict the gamma ray log of shale, severe mismatch problems arise that ensure prediction failure. It is therefore of paramount importance to identify lithofacies correctly. Lithofacies identification falls into three general approaches:

- Core-data analysis
- Knowledge-based well-log analysis by the expert system
- Electrofacies

Compared with the first and second methods, electrofacies analysis is relatively inexpensive and more efficient.

**Gaussian Mixture Model (GMM).** From a statistical point of view, each well-log matrix is assumed to be one mixture model. A mixture model is a probabilistic model representing the presence of subpopulations within an overall population; here, a subpopulation is the well-log data from a specific rock type. The mathematical expression and calculation of the GMM are described in detail in the complete paper.
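The complete paper carries the GMM mathematics; as an illustration only, scikit-learn's `GaussianMixture` (a library the paper does not name) can fit such a mixture to a reduced well-log matrix. The data here are synthetic stand-ins for two rock subpopulations:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Treat the (standardized, PCA-reduced) well-log matrix as draws from a
# mixture: each Gaussian component corresponds to one rock subpopulation.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(200, 2)),   # e.g., a "shale" cluster
               rng.normal(+2.0, 0.5, size=(200, 2))])  # e.g., a "sand" cluster

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)        # hard electrofacies assignment per depth sample
probs = gmm.predict_proba(X)   # soft membership (responsibilities)
```

The soft memberships are what distinguish a mixture model from a hard partitioner such as k-means: each depth sample carries a probability of belonging to every rock subpopulation.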

**MBC.** The foundational assumption of MBC is that the data are generated by a mixture of probability distributions (or mixture models) in which each part is a different cluster or model. In the GMM, the number of clusters has to be defined by the user. However, this number is rarely known in advance, because one seldom has access to accurate rock properties before processing the well-log data. A practical way to set this imperative parameter is therefore to extend the GMM with two components: (1) four basic models of the covariance matrix and (2) the agglomerative algorithm. This upgraded method forms the main structure of MBC.

Before applying these two components, the likelihood function must be extended into multivariate form; then, the log likelihood is concentrated.

The four basic models are scenarios that have four related criteria and the following characteristics:

- The first model has diagonal and equal covariance matrices and the same value in diagonal elements.
- The second model also has diagonal covariance matrices with a constant value along each matrix's diagonal, but that value can differ from component to component.
- The third model has equal covariance matrices that have nonzero off-diagonal elements.
- The fourth model’s covariance matrices can vary among components.

The agglomerative algorithm is implemented by use of single linkage, which is the similarity of the closest pair. If two clusters are close enough, the two will be merged.
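scikit-learn does not implement MBC itself, but its `covariance_type` options (`"spherical"`, `"diag"`, `"tied"`, `"full"`) loosely parallel the four covariance scenarios above, and scoring fits with the Bayesian information criterion gives a rough stand-in for MBC's model selection. The sketch below is written under those assumptions and is not the authors' algorithm:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def best_mixture(X, max_k=6):
    """Pick the cluster count and covariance model with the lowest BIC.

    The four covariance_type options loosely parallel the four basic
    covariance scenarios described in the text.
    """
    best, best_bic = None, np.inf
    for cov in ("spherical", "diag", "tied", "full"):
        for k in range(1, max_k + 1):
            gmm = GaussianMixture(n_components=k, covariance_type=cov,
                                  random_state=0).fit(X)
            bic = gmm.bic(X)
            if bic < best_bic:
                best, best_bic = gmm, bic
    return best
```

Because the criterion penalizes extra parameters, richer covariance models and larger cluster counts win only when the data genuinely support them, which is how the number of clusters can be chosen without prior rock-property knowledge.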

**Well Correlation.** Well correlation is enabled by generating electrofacies after applying MBC. Before using MBC, it is necessary to merge the individual well-log matrices into a combined well-log matrix comprising the data from all wells of interest. However, a serious problem for well correlation can arise because of the low efficiency of the expectation-maximization algorithm. The authors proposed two steps to solve the problem: (1) sampling and (2) discriminant analysis (DA). In the sampling step, the size of the sample is determined first; then, on the basis of the stratigraphic properties of the formation, systematic random sampling is used to select sample points.
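A minimal sketch of systematic random sampling over depth indices follows; the helper is hypothetical, since the paper gives no code:

```python
import numpy as np

def systematic_sample(n_rows, n_samples, seed=0):
    """Systematic random sampling of depth indices.

    A random start is drawn, then every k-th row is taken, which
    preserves the regular depth spacing of the stratigraphic record.
    """
    k = n_rows // n_samples                          # sampling interval
    start = np.random.default_rng(seed).integers(0, k)
    return np.arange(start, n_rows, k)[:n_samples]
```

Fitting MBC on this smaller, evenly spaced subset sidesteps the expectation-maximization efficiency problem while still sampling every part of the depth column.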

The MBC method can then be implemented to generate electrofacies from sampling data. For the rest of the data set, the use of DA is suggested to put data into the existing electrofacies generated from the previous step.

DA is a classification method wherein groups or clusters from populations are known to exist a priori, and other new observations are classified into one of these on the basis of the measured characteristics. DA assumes that different classes generate data on the basis of different Gaussian distributions. For this study, quadratic DA was chosen, in which the covariance matrix can be different for each class. The sample is then classified into the groups that have the largest quadratic score function.
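Quadratic DA of this kind is available off the shelf. The sketch below uses synthetic data and scikit-learn's `QuadraticDiscriminantAnalysis` (a library the paper does not name) to assign unlabeled depth samples to electrofacies learned from the sampled subset:

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

# Electrofacies labels from the MBC/sampling step act as known groups;
# QDA then classifies the remaining depth samples, allowing each class
# its own covariance matrix (hence "quadratic").
rng = np.random.default_rng(2)
X_sampled = np.vstack([rng.normal(-2.0, 0.3, size=(100, 2)),
                       rng.normal(+2.0, 0.8, size=(100, 2))])
facies = np.repeat([0, 1], 100)

qda = QuadraticDiscriminantAnalysis().fit(X_sampled, facies)
X_rest = np.array([[-2.0, -2.0], [2.0, 2.0]])  # unlabeled depth samples
assigned = qda.predict(X_rest)
```

Each new observation lands in the class with the largest quadratic score, matching the rule described above.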

## Well-Log-Data Post-Processing

At this stage, the optimal well for building a model with the ANN has already been selected. An ANN is a biologically inspired dynamic computation system that processes data and learns in an inherently parallel and distributed fashion. It is capable of extracting and recognizing the underlying dominant patterns and structural relationships among data. Once properly trained, the network can implicitly classify new patterns and generalize an output on the basis of the learned patterns.

Typically, ANNs are arranged into three types of layers: input, hidden, and output. Each layer comprises a different number of processing elements (PEs) that are massively interconnected. Feed-forward backpropagation is a common scheme for training the network.

## Field Application and Results

In the case study covered in the complete paper, there are eight wells in one field, each of which has 38 different types of well logs. After the data-mining process, an ideal well is available to use as a training tool for the artificial-neural-network model; the trained model is then used to generate a density log for the target well. Momentum is a parameter added to the generalized delta rule to prevent the learning process from converging to a local minimum; it normally varies between zero and unity.

The learning rate determines the learning speed of the neural network. In most cases, the number of PEs in the input and output layers is set by the dimensionality of the input and output data, but the number of hidden layers, and the number of PEs in each, is somewhat arbitrary. One rule of thumb states that the number of hidden-layer neurons should be approximately 75% of the number of input variables. Combining that rule with trial and error, the authors chose a one-hidden-layer architecture with 40 neurons in the hidden layer.

After data preprocessing, the input and output numerical values are normalized to the range −1 to 1. Weights are initially assigned randomly and are updated after each iteration.

A weighted sum of the input variables at each PE is then modified by the hyperbolic tangent sigmoid transfer function. To avoid overfitting during training, the data set introduced into the neural network is divided further into training, validation, and test subsets in a 0.7/0.15/0.15 ratio.
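The stated configuration (one hidden layer of 40 tanh neurons, momentum on the weight updates, a 0.7/0.15/0.15 split) can be approximated with scikit-learn's `MLPRegressor`. This is a sketch on synthetic stand-in logs, not the authors' implementation, and the momentum and learning-rate values are placeholders:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in: inputs mimic other normalized logs; the target
# plays the role of the density log.
rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(600, 5))
y = np.tanh(X @ rng.normal(size=5))  # placeholder nonlinear relationship

# 70% training, 15% validation, 15% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.7, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, train_size=0.5, random_state=0)

# One hidden layer of 40 tanh neurons; SGD with momentum echoes the
# momentum-augmented generalized delta rule described above.
net = MLPRegressor(hidden_layer_sizes=(40,), activation="tanh",
                   solver="sgd", momentum=0.9, learning_rate_init=0.01,
                   max_iter=2000, random_state=0)
net.fit(X_train, y_train)
pseudo_density = net.predict(X_test)  # pseudolog values for unseen samples
```

The held-out validation subset would be monitored during training to stop before overfitting; the test subset is touched only once, to report final accuracy.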

After building up the model, the next critical task is prediction or pseudowell-log generation by use of an existing model.

Finally, a comprehensive prediction based on the obtained well-log data is implemented.

This article, written by *JPT* Technology Editor Chris Carpenter, contains highlights of paper SPE 180439, "Pseudodensity-Log Generation by Use of Artificial Neural Networks," by **Wennan Long,** University of Southern California; **Di Chai,** University of Kansas; and **Fred Aminzadeh,** University of Southern California, prepared for the 2016 SPE Western Regional Meeting, Anchorage, 23–26 May. The paper has not been peer reviewed.

01 May 2017
