Do Data-Mining Methods Matter? A Wolfcamp Shale Case Study

Topics: Data and information management; Risk management/decision-making
Fig. 1: The gross structure of the Delaware basin and central-basin-platform features of the Permian Basin of west Texas. Study wells are bubble mapped.

Data mining for production optimization in unconventional reservoirs brings together data from multiple sources with varying levels of aggregation, detail, and quality. The objective of this study was to compare the relative utility of several univariate and multivariate statistical and machine-learning methods in predicting the production quality of Permian Basin Wolfcamp shale wells. The methods considered were standard univariate and multivariate linear regression and three advanced machine-learning techniques: support vector machines, random forests, and boosted regression trees.


Over the last few decades, as conventional oil reserves have become less available, unconventional reservoirs have fast become a mainstream source of energy. Meanwhile, with advances in data collection, storage, and processing technology, the oil and gas industry, like every other technical industry, is experiencing a data explosion. A frequently asked question is, “How can the massive amounts of data from unconventional jobs be used to better understand the relationship between operational parameters and well production?” Answering it usually requires pulling raw data from multiple sources, cleansing the data, and merging the results into a joint data set for subsequent analysis.
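The pull, cleanse, and merge workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's actual pipeline: the two sources, the field names (`well_id`, `lateral_ft`, `proppant_lb`, `best_month_bbl`), the sample records, and the cleansing rules are all hypothetical.

```python
# Hypothetical records from two separate sources, keyed by a well identifier.
# All names and values here are made up for illustration.
completions = [
    {"well_id": "W-001 ", "lateral_ft": "4500", "proppant_lb": "3200000"},
    {"well_id": "w-002", "lateral_ft": "", "proppant_lb": "2800000"},  # missing value
    {"well_id": "W-003", "lateral_ft": "5100", "proppant_lb": "4100000"},
]
production = [
    {"well_id": "W-001", "best_month_bbl": "18000"},
    {"well_id": "W-003", "best_month_bbl": "-1"},    # invalid placeholder value
    {"well_id": "W-004", "best_month_bbl": "9500"},  # no completion record
]


def clean(records, numeric_fields):
    """Normalize well IDs and drop rows with missing or non-positive numbers."""
    cleaned = {}
    for row in records:
        well_id = row["well_id"].strip().upper()
        try:
            values = {field: float(row[field]) for field in numeric_fields}
        except ValueError:
            continue  # a field was blank or not numeric
        if any(value <= 0 for value in values.values()):
            continue  # assumed rule: physical quantities must be positive
        cleaned[well_id] = values
    return cleaned


def merge(left, right):
    """Inner join: keep only wells that survive cleansing in both sources."""
    return {
        well_id: {**left[well_id], **right[well_id]}
        for well_id in sorted(left.keys() & right.keys())
    }


joint = merge(
    clean(completions, ["lateral_ft", "proppant_lb"]),
    clean(production, ["best_month_bbl"]),
)
print(joint)
```

In this toy example only one well survives both cleansing and the join, which hints at how quickly a joint data set can shrink when sources differ in coverage and quality.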

Here, a data set from Permian Basin Wolfcamp shale wells is used as a case study to illustrate the implementation of several popular data-mining methods. The study area is within the Delaware basin in west Texas. Fig. 1 (above) is color-contoured on the top of the Wolfcamp and highlights the basin structure in the greater area, with the deep basin in purple and the shallowest Wolfcamp contours in red. Well locations are bubble mapped on top of the contours. Bubbles are colored by best monthly oil production per completed foot of lateral (log10 scale), with the best wells shown in red and the poorest wells shown in purple.
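As a rough illustration of the mapped quantity, the sketch below computes a log10 production-per-foot value for a single well. It assumes that "best monthly oil production" means the highest single-month oil volume in the well's history; the function name, units, and sample numbers are hypothetical and not taken from the paper.

```python
import math


def production_quality(monthly_oil_bbl, completed_lateral_ft):
    """Return log10 of best-month oil volume per completed foot of lateral.

    Assumption: the "best month" is the maximum monthly volume on record.
    """
    best_month = max(monthly_oil_bbl)
    if best_month <= 0 or completed_lateral_ft <= 0:
        raise ValueError("volumes and lateral length must be positive")
    return math.log10(best_month / completed_lateral_ft)


# Hypothetical well: best month of 18,000 bbl over 4,500 ft of completed lateral,
# i.e. 4 bbl/ft, so the mapped value is log10(4).
value = production_quality([12000, 18000, 15000], 4500)
print(round(value, 3))
```

Normalizing by completed lateral length makes wells of different lengths comparable, and the log scale keeps a few very strong wells from dominating the color range.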

This article, written by Special Publications Editor Adam Wilson, contains highlights of paper SPE 173334, “Do Data-Mining Methods Matter? A Wolfcamp Shale Case Study,” by Ming Zhong, SPE, Baker Hughes; Jared Schuetter and Srikanta Mishra, SPE, Battelle Memorial Institute; and Randy F. LaFollette, SPE, Baker Hughes, prepared for the 2015 SPE Hydraulic Fracturing Technology Conference, The Woodlands, Texas, USA, 3–5 February. The paper has not been peer reviewed.


01 October 2015

Volume: 67 | Issue: 10