Deep Learning Allows for Near-Real-Time Hydraulic Fracture Event Recognition

Historically, real-time hydraulic fracturing analytics systems have relied heavily on manually labeled data. The manual tasks, including fracture-stage start and end labeling and ball-pumpdown and -seat event labeling, are prone to human bias and inconsistency and can take days to finish. This paper provides the technical details of developing an automated stagewise key-performance-indicator (KPI) report generator that fills the manual task gaps. The generator is constructed with two machine-learning models that detect the stage start and end and identify the ball-pumpdown and -seat operation. These tasks are performed on the basis of the reliably available measurements of slurry rate and wellhead pressure, which enable real-time automated stagewise KPI analysis. They also lay the foundation for further advanced analysis regarding real-time hydraulic fracturing operational decision-making.


Multistage hydraulic fracturing is a common operation to maximize reservoir contact and enhance fluid flow for shale oil and gas wells. During hydraulic fracturing, pumping data (e.g., pressures, rates, volumes) are recorded for real-time monitoring and analytics. The in-house real-time hydraulic fracturing analytics system, the Real-Time Completion (RTC) system, pulls these pumping data from the fracturing van, analyzes them with advanced analytics models, generates meaningful analytics results, and finally pushes the results to a web-based user interface, which aids real-time decision-making and drives well-completion efficiency.

With today’s communication technology and powerful computation hardware, the RTC system with advanced analytics models plays an increasingly important role in fracture job optimization. For example, Paryani et al. (2018) propose real-time fracture modeling based on the real-time hydraulic fracturing pumping data to optimize the fracture job in real time; Ben et al. (2020a, 2020b) predict the hydraulic fracturing pressure using machine learning and conduct fracture job cost optimization in real time based on pressure prediction.

This paper presents one of the advanced analytics modules within the RTC system, the real-time hydraulic-fracture stagewise KPI autogenerator. The KPI autogenerator is constructed with two machine-learning models that detect the stage start and end and identify the ball pumpdown and seat. Each of these models is built with deep-learning techniques, including convolutional neural networks (CNN) and the U-Net architecture. One purpose of these models is to generate meaningful KPI reports to enable data-driven optimization of fracturing jobs. The other purpose is to automate the current RTC pipeline, removing the need for manual labeling of stage start and end and eliminating human bias and errors.

State of the Art

Ramirez et al. (2019a, 2019b) provide an initial study on stage start/end detection. They proposed using logistic regression and a support vector machine to classify each timestamp as inside or outside a stage and then detect the stage start/end accordingly. They achieved an accuracy of 90% on training and validation sets (nonblind test). Accuracy, however, is not a reliable metric in this case. We advanced the technique using deep learning, and we designed a more reliable metric called flagwise accuracy to evaluate the performance of the models in this problem. Our model achieved an F1 score of 0.95 and a flagwise accuracy of 98.5%. To the authors’ knowledge, no previous study has been published on identifying the ball pumpdown/seat operation.

Methods and Key Results

Stage Start/End Detection. When humans try to identify whether a timestamp is inside or outside a stage, they look not only at the data at that specific time but also at the previous and subsequent data.

We need to look at continuous streams of data to determine the properties of each timestamp. We follow this strategy to structure the problem. First, we manually label each timestamp as 0 or 1, indicating whether it is outside or inside a stage, respectively. With these labels, we can easily identify the start and the end of a stage by searching for the time when the label changes from 0 to 1 or from 1 to 0. After the labeling, we extract the samples from the data using a fixed-length sliding window. In addition to the original two features, we add the first and second derivatives of both the slurry rate and the wellhead pressure to give the model additional information. The samples extracted from the data are now matrices with six rows and a fixed number of columns. We label each sample using the timestamp label corresponding to the middle column of the matrix. These sample matrices resemble the data structure of images and later will be fed into a CNN for the image classification task. Because the raw data contains many spikes that will seriously affect the performance, we apply a median filter to get rid of the spikes before sending the data into the model. Fig. 1 shows the procedure of extracting the data.

Fig. 1—Sample extraction and labeling.
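The extraction procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the median-filter kernel length, the window length, the sampling interval `dt`, the backward-difference derivative, and the choice to filter before differentiating are all assumptions made for illustration.

```python
from statistics import median


def median_filter(signal, kernel=5):
    """Replace each value with the median of a centered window.

    The kernel length is an assumed value; edges use a truncated window.
    """
    half = kernel // 2
    return [
        median(signal[max(0, i - half): i + half + 1])
        for i in range(len(signal))
    ]


def derivative(signal, dt=1.0):
    """Backward difference, padded with 0.0 so the length is unchanged."""
    return [0.0] + [(b - a) / dt for a, b in zip(signal, signal[1:])]


def build_features(rate, pressure):
    """Stack six rows: both despiked signals plus first and second derivatives."""
    rate, pressure = median_filter(rate), median_filter(pressure)
    d_rate, d_pres = derivative(rate), derivative(pressure)
    return [rate, pressure, d_rate, d_pres,
            derivative(d_rate), derivative(d_pres)]


def extract_samples(features, labels, window=61):
    """Slide a fixed-length window over the six rows.

    Each sample is a 6 x window matrix labeled with the timestamp label
    of its middle column. The window length is an assumed value.
    """
    samples = []
    for start in range(len(labels) - window + 1):
        matrix = [row[start:start + window] for row in features]
        samples.append((matrix, labels[start + window // 2]))
    return samples
```

With per-timestamp lists `rate`, `pressure`, and `labels` of equal length, `extract_samples(build_features(rate, pressure), labels)` returns one labeled matrix per window position.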

The model we are using is the CNN. After the data preprocessing, the training samples are fed into the network for training. The network structure consists of an input layer, two convolutional layers, a max pool layer, a dense layer, and an output layer.

The convolutional layers capture the higher-level features from the training samples and pass them to the output layer, which gives the probability of a sample belonging to either of the two classes according to the softmax function. The first convolutional layer has eight filters with size 10×6, and the second convolutional layer has 16 filters with the same size. The dense layer has 64 neurons with the rectified linear unit activation.

To quantify the performance, we use two different metrics to measure the accuracy. The first one is what we call timestampwise accuracy. The timestampwise accuracy is calculated as the number of timestamps with the correct label divided by the total number of timestamps. It measures the accuracy in terms of the duration of a stage. The second metric is the flagwise accuracy. To calculate the flagwise accuracy, we define a time window called the tolerance window, which has a fixed duration centered at a true flag. Any predicted flags located within the tolerance window are considered accurate. The flagwise accuracy is calculated as the number of predicted flags within the tolerance window divided by the total number of flags. This metric sheds light on the performance of the model in terms of the start and end flags, and we can control the resolution by adjusting the length of the tolerance window. The two metrics are illustrated in Fig. 2.

Fig. 2—Left, timestampwise accuracy. Right, flagwise accuracy.
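A minimal Python sketch of the two metrics follows. It assumes one sample per second, so the tolerance in seconds equals a number of samples, and it reads "centered" as allowing half the tolerance on each side of a true flag. The function names are hypothetical. Matching start flags only with start flags, and counting each true flag at most once, are interpretation choices rather than details stated in the text.

```python
def find_flags(labels):
    """Return (index, kind) for every label change.

    kind is 1 for a 0 -> 1 change (stage start) and 0 for a 1 -> 0 change
    (stage end).
    """
    return [
        (i, labels[i])
        for i in range(1, len(labels))
        if labels[i] != labels[i - 1]
    ]


def timestampwise_accuracy(true_labels, pred_labels):
    """Fraction of timestamps whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return correct / len(true_labels)


def flagwise_accuracy(true_labels, pred_labels, tolerance=10):
    """Fraction of true flags matched by a predicted flag of the same kind
    inside a tolerance window centered at the true flag."""
    true_flags = find_flags(true_labels)
    pred_flags = find_flags(pred_labels)
    half = tolerance / 2
    hits = sum(
        any(kind_p == kind_t and abs(i_p - i_t) <= half
            for i_p, kind_p in pred_flags)
        for i_t, kind_t in true_flags
    )
    return hits / len(true_flags)
```

Shrinking `tolerance` tightens the resolution of the flagwise metric while the timestampwise value stays the same.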

Although flagwise accuracy better measures the start/end identification performance, the training process is designed to correspond to the timestampwise accuracy to keep the data set from becoming unbalanced. If we used the flagwise accuracy for the training process, we would have to create labels that directly correspond to the start and end flags, of which there are only hundreds among millions of data points. Conversely, the timestampwise accuracy allows for labeling as a balanced data set because it corresponds to the duration of the stage. (Approximately 60% of data points are within a stage, and approximately 40% are outside a stage.) Training on this balanced data set yields much better performance. Fig. 3 shows a stage with its true label (blue solid line) and the predicted label (yellow dashed line). The predicted label accurately marks the stage. To make full use of the data set for the blind test, we use the five-fold cross-validation technique to evaluate the performance.

Fig. 3—A stage with its true labels and predicted labels.

The data set is partitioned into five parts. In each trial, the model is trained on four of the five parts then produces predictions on the remaining part. With five trials, we have a full blind test result on the whole data set, and we use that result to indicate the performance of our model. Table 1 shows the flagwise accuracies with different tolerance windows. The model achieves an F1 score of 0.95, and a flagwise accuracy of 98.5% with a tolerance window of 10 seconds.

Tolerance     Flagwise Accuracy   Accurate Flags   Total Flags
25 seconds    99.7%               646              648
20 seconds    99.3%               644              648
15 seconds    99.0%               642              648
10 seconds    98.5%               639              648
5 seconds     92.3%               598              648
3 seconds     68.1%               441              648

Table 1—Flagwise accuracies of the blind test results with different tolerance windows.
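The five-fold blind test behind Table 1 can be sketched in Python as below. How the authors partition the data (for example, by well, by stage, or by contiguous time range) is not stated, so the contiguous split here is an assumption, and `fit` and `predict` are hypothetical placeholders for the model training and inference routines.

```python
def k_fold_indices(n_items, k=5):
    """Yield (train_idx, test_idx) pairs over contiguous folds.

    Every item appears in exactly one test fold.
    """
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = (list(range(0, start))
                     + list(range(start + size, n_items)))
        yield train_idx, test_idx
        start += size


def blind_predictions(items, fit, predict, k=5):
    """Predict every item with a model that never saw it during training."""
    preds = [None] * len(items)
    for train_idx, test_idx in k_fold_indices(len(items), k):
        model = fit([items[i] for i in train_idx])
        for i in test_idx:
            preds[i] = predict(model, items[i])
    return preds
```

Collecting the held-out predictions from all five trials gives one blind prediction per item, which is what allows the accuracies to be reported over the whole data set.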

Ball Pumpdown/Seat Detection. The ball pumpdown/seat event recognition follows a two-step strategy. The first step is to tell whether there is a ball pumpdown/seat in a stage, and the second step is to locate the end of the ball pumpdown/seat if there is one. To achieve the first step, we borrow the idea from Cao et al. (2020), where the authors used an image segmentation method to locate downlink sequences. Compared with the downlink recognition, the pattern for the ball pumpdown/seat is more obscure and accounts for only a small portion of a whole stage. The image segmentation method is less likely to locate the ball pumpdown/seat event precisely, but it is sufficient to tell whether a ball pumpdown/seat event exists in a stage. The second step can be achieved by a rule-based selection given the information from the image segmentation.

Because the first step of the ball pumpdown/seat recognition is formulated as an image segmentation problem, we need to reconstruct the time series data into images. In contrast with the stage start/end detection, we do not feed the values of the time series data into the deep-learning model. Instead, we feed the time series plots as images into the model. For each plot image, we assign a corresponding mask to indicate the location of the ball pumpdown/seat pattern. Because the ball pumpdown/seat event always happens at the beginning of a stage, we take the data within the first hour of each stage as the samples. Fig. 4 shows an example of a ball pumpdown/seat event, and the red vertical line indicates where the ball pumpdown/seat ends. Because, in the first step, we only need to tell whether there is a ball pumpdown/seat, there is no need to mask and recognize the whole pattern. A clear Z pattern (brown rectangle) can be seen in the slurry rate curve, and the wellhead pressure signal is unstable compared with the slurry rate. We therefore use the slurry rate as our only input, and we look for the Z pattern.

Fig. 4—The ball pumpdown/seat at the beginning of a stage.

We follow the idea of Cao et al. (2020) and use a deep-learning model with the U-Net architecture (Ronneberger et al. 2015) to perform the image segmentation task. Because our training data set consists of only 72 samples, which is not enough to train the whole model, we apply transfer learning and build our model on the pretrained ResNet-34 model. All 72 samples have a ball pumpdown/seat in them. After training the model, we introduce an extra set of testing data. Among the 107 samples in the extra testing data set, nine samples have a ball pumpdown/seat phase. Our evaluation therefore has two parts. The first part is the blind test of the training data set using cross validation. The second part comes from the extra testing data set, for which the model is trained on the entire training data set (without validation). The results show that this model alone achieves a very good true positive rate (79 out of 81) in determining whether there is a ball pumpdown/seat, but it also has a high false positive rate (8 out of 98). Fig. 5 shows an example of the prediction mask. To lower the false positive rate, we borrow the second-opinion mechanism from Cao et al. (2020) and introduce a second model to vote on the decision.

Fig. 5—An example of the prediction mask.

The second model is the same as the first except that it considers the wellhead pressure signal along with the slurry rate through an additional channel. The input image is now an RGB image, where the R channel is the wellhead pressure, the G channel is the slurry rate, and the B channel is left blank. With the opinion from the second model, the two models together achieve an F1 score of 0.97, with a true positive rate of 0.96 (78 out of 81) and a false positive rate of 0.03 (3 out of 98), which is a better performance than a single model.
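As a quick arithmetic check, the reported counts can be turned into rates with a few lines of Python. The 81 positive samples are presumably the 72 training samples plus the nine positive samples in the extra testing set, and the 98 negative samples are the remaining 107 − 9 extra testing samples. The helper name `rates` is hypothetical.

```python
def rates(true_pos, total_pos, false_pos, total_neg):
    """Return (true positive rate, false positive rate) from raw counts."""
    return true_pos / total_pos, false_pos / total_neg


# Counts as reported in the text.
single_tpr, single_fpr = rates(79, 81, 8, 98)      # first model alone
combined_tpr, combined_fpr = rates(78, 81, 3, 98)  # two models voting

print(f"single:   TPR={single_tpr:.3f}  FPR={single_fpr:.3f}")
print(f"combined: TPR={combined_tpr:.3f}  FPR={combined_fpr:.3f}")
```

The second opinion costs one true positive (79 to 78) while removing five of the eight false positives.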

After determining the existence of the ball pumpdown/seat, the second step is to locate the end of the event. The masks given by the U-Net model provide an approximation of where the ball pumpdown/seat event happens, and we pinpoint the end of the event on the basis of that information. To get the location information from the mask, we first project the prediction mask to 1D along the time axis. For each column of a mask, if the fraction of pixels that are marked as positive exceeds a certain threshold, we consider the timestamp that corresponds to the column as positive. The threshold for our tests is 95%. Fig. 6 shows the mask from Fig. 5 after cleaning the edge.

Fig. 6—Mask from Fig. 5 after cleaning the edge.
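The column-wise projection can be sketched in Python as follows. The mask is represented here as a list of pixel rows holding 0 or 1, and each pixel column is assumed to map to one timestamp. That representation, the function names, and taking the midpoint as the center of the first and last positive columns are illustrative assumptions.

```python
def project_mask(mask, threshold=0.95):
    """Project a 2D binary mask onto the time axis.

    A column is positive when its fraction of positive pixels exceeds
    the threshold.
    """
    n_rows = len(mask)
    n_cols = len(mask[0])
    return [
        sum(mask[r][c] for r in range(n_rows)) / n_rows > threshold
        for c in range(n_cols)
    ]


def mask_midpoint(column_flags):
    """Return the middle column of the positive span, or None if empty."""
    positive = [c for c, flag in enumerate(column_flags) if flag]
    if not positive:
        return None
    return (positive[0] + positive[-1]) // 2
```

Columns that are only partly covered, such as ragged mask edges, fall below the threshold and drop out of the 1D projection.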

Next, we scan the signals from the midpoint of the mask using the following rules:

  • The first derivative of the wellhead pressure at the current timestamp is greater than 20 psi/s
  • The first derivative of the wellhead pressure at the following timestamp is greater than 15 psi/s
  • The first derivative of the wellhead pressure at the second next timestamp is greater than 10 psi/s
  • The first derivative of the slurry rate at the current timestamp is smaller than 0.01

If such a timestamp exists, we mark it as the end of the ball pumpdown/seat. The blue vertical line in Fig. 7 shows the predicted ball pumpdown/seat ending point for the sample in Fig. 4. Our two-step strategy achieves 94% accuracy (173 out of 179 samples correct).

Fig. 7—The final ball pumpdown/seat ending point prediction for Fig. 4.
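The four rules can be expressed as a short Python sketch. Several details are assumptions rather than statements from the text: the scan runs forward in time from the mask midpoint, derivatives are backward differences, the data are sampled at a fixed interval `dt` in seconds, and the slurry-rate rule is applied to the signed derivative exactly as written. The function name is hypothetical.

```python
def find_event_end(pressure, rate, start, dt=1.0):
    """Return the first index at or after `start` that meets all four rules.

    Returns None when no timestamp qualifies.
    """
    for i in range(max(start, 1), len(pressure) - 2):
        # Pressure derivative at the current, next, and second next timestamps.
        dp = [(pressure[j] - pressure[j - 1]) / dt for j in (i, i + 1, i + 2)]
        d_rate = (rate[i] - rate[i - 1]) / dt
        if dp[0] > 20 and dp[1] > 15 and dp[2] > 10 and d_rate < 0.01:
            return i
    return None
```

Given the mask midpoint `mid` from the segmentation step, `find_event_end(pressure, rate, start=mid)` returns the candidate end index, or `None` when no timestamp satisfies all four rules.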


The work in this paper automates the manual tasks of labeling the start and end of a stage and the end of the ball pumpdown/seat event with high accuracy. It fills the manual task gaps in the RTC work flow, lays the foundation for further advanced analysis, and paves the way for a fully automated RTC system.


Paryani, M., Sia, D., Mistry, B., et al. 2018. Real-Time Completion Optimization of Fracture Treatment Using Commonly Available Surface Drilling and Fracking Data. SPE Canada Unconventional Resources Conference, Calgary, 13–14 March. SPE-189810-MS.

Ben, Y., Perrotte, P., Mistry, B., et al. 2020a. Real-Time Hydraulic Fracturing Pressure Prediction with Machine Learning. Prepared for the SPE Hydraulic Fracturing Technology Conference and Exhibition, The Woodlands, Texas, 4–6 February. SPE-199699-MS.

Ben, Y., Sankaran, S., Harlin, C., and Perrotte, P. 2020b. Real Time Completion Cost Optimization Using Model Predictive Control. Prepared for the SPE Hydraulic Fracturing Technology Conference and Exhibition, The Woodlands, Texas, 4–6 February. SPE-199688-MS.

Ramirez, A. and Iriarte, J. 2019a. Event Recognition on Time Series Frac Data Using Machine Learning. SPE Western Regional Meeting, San Jose, California, SPE-195317-MS.

Lopez, J. and Ramirez, A. 2019b. Machine Learning Helps Pinpoint Events From Fracturing Data. Data Science and Digital Engineering in Upstream Oil and Gas, 10 July 2019 (accessed 10 September 2019).

Ronneberger, O., Fischer, P., and Brox, T. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. International Conference on Medical Image Computing and Computer Assisted Intervention. October 5. pp. 234–241. Springer, Cham.

Cao, D., Hender, D., Ariabod, S., James, C., Ben, Y., and Lee, M. 2020. The Development and Application of Real-Time Deep Learning Models To Drive Directional Drilling Efficiency. Prepared for SPE/IADC Drilling Conference and Exhibition, Galveston, Texas, 3–5 March. SPE-199584-MS.

Yuchang Shen is a machine-learning engineer in Anadarko’s Advanced Analytics and Emerging Technology (AAET) organization. He pioneers and develops the RTC KPI autogenerator. He holds a PhD degree in physics from the University of Houston and a master’s degree in computer science from Rice University.

Dingzhou Cao is a data-science manager in Anadarko’s AAET organization. He leads the data science efforts on AAET’s real-time drilling and real-time completion initiatives. He holds a PhD degree in industrial engineering from Wayne State University.

Kate Ruddy is a senior drilling and completion data analytics engineer in Anadarko’s US Onshore Drilling and Completions Data Analytics group. She provides domain expertise to operations and subsurface analytics teams and is responsible for optimizing field and office data capture work flows for drilling and completions teams. She holds a bachelor of engineering degree in mechanical engineering from Vanderbilt University.

