Technical Topics

Data-Driven Reservoir Modeling: A Fact-Based Alternative to Numerical Simulation

Shahab D. Mohaghegh, Intelligent Solutions

Efficient development and operation of a petroleum reservoir require a reservoir model. Currently, numerical reservoir simulation is the accepted and widely used technology for this purpose. Data-driven reservoir modeling (also known as top-down modeling or TDM) is an alternative or a complement to numerical simulation. TDM uses the “big data” solutions of machine learning and data mining to develop (train, calibrate, and validate) full-field reservoir models based on measurements rather than on a mathematical formulation of our current understanding of the physics of fluid flow through porous media.

Unlike other empirical technologies that forecast production, such as decline curve analysis, or that use only production/injection data for analysis, such as the capacitance resistance model, TDM integrates all available field measurements: well locations and trajectories, completions and stimulations, well logs, core data, well tests, and seismic, as well as production/injection history, including wellhead pressure and choke setting. The measurements are used to build a cohesive, comprehensive reservoir model using artificial intelligence technologies. TDM is defined as a full-field model in which production, including gas/oil ratio (GOR) and water cut (WC), is conditioned to all measured reservoir characteristics and operational constraints. TDM matches the historical production, is validated through blind history matching, and is capable of forecasting a field’s future behavior.
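
As a rough illustration of blind history matching, the sketch below holds out the most recent part of the production history so that a model trained on the earlier part can be checked against measurements it never saw. The article does not spell out the procedure; the time-based split, the record layout, and the names `split_blind_history` and `mean_abs_error` are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]  # hypothetical layout: {"t": time-step, "oil": rate}


def split_blind_history(records: List[Record],
                        blind_start: int) -> Tuple[List[Record], List[Record]]:
    """Split by time: steps before blind_start are for training, the rest stay blind."""
    train = [r for r in records if r["t"] < blind_start]
    blind = [r for r in records if r["t"] >= blind_start]
    return train, blind


def mean_abs_error(predicted: List[float], measured: List[float]) -> float:
    """Average absolute mismatch between a forecast and the held-out measurements."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)


# Synthetic history for one well (made-up numbers, not field data).
history = [{"t": float(t), "oil": 1000.0 - 50.0 * t} for t in range(10)]
train, blind = split_blind_history(history, blind_start=7)
```

The held-out records play no part in training; a model's forecast for those time-steps would be compared with the measured values, for example with `mean_abs_error`.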

The novelty of data-driven reservoir modeling stems from its departure from traditional approaches to reservoir modeling. The fact-based method represents a paradigm shift in how reservoir engineers and geoscientists model fluid flow through porous media. In this new paradigm, the current understanding of physics and geology in a given reservoir is replaced by facts (data/field measurements) as the foundation of the model. This characteristic makes TDM a viable modeling technology for unconventional assets, where the physics of hydrocarbon production in the presence of massive hydraulic fractures is not yet well understood.

The Role of Physics and Geology

Although it does not start from first-principles physics, TDM is a physics-based reservoir model. The incorporation of physics in TDM is nontraditional. Reservoir characteristics and geological aspects are incorporated in the model insofar as they can be measured. While interpretations are intentionally left out during model development, they are used extensively during the analysis of model results. Although fluid flow through porous media is not explicitly (mathematically) formulated during the development of data-driven reservoir models, successful development of such models is unlikely without a solid understanding of, and experience in, reservoir engineering and the geosciences. Physics and geology are the foundation and the framework for the assimilation of the data that are used to develop the TDM.

Formulation and Computational Footprint

A top-down model is built by correlating flow rate, reservoir pressure, and fluid saturation at each well and at each time-step to a set of measured static and dynamic variables, with the correlation conditioned to causation. The static variables include reservoir characteristics such as well logs (gamma ray, sonic, density, and resistivity), porosity, and formation tops and thickness at the following locations:

  1. At and around the well 

  2. The average from the drainage area 

  3. The average from the drainage area of the offset producers 

  4. The average from the drainage area of the offset injectors
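
A minimal sketch of how one measured property could be summarized at the four locations above is given below. How a drainage area is delineated and which wells count as offsets are not described here, so the function simply takes the relevant values as already-selected lists; the name `static_features` and the feature keys are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def static_features(name: str, at_well: float, drainage: List[float],
                    offset_producers: List[float],
                    offset_injectors: List[float]) -> Dict[str, float]:
    """Summarize one measured property at the four locations listed above."""
    features = {
        f"{name}_at_well": at_well,
        f"{name}_drainage_avg": mean(drainage),
    }
    # A well may have no offset producers or injectors; skip that feature then.
    if offset_producers:
        features[f"{name}_offset_prod_avg"] = mean(offset_producers)
    if offset_injectors:
        features[f"{name}_offset_inj_avg"] = mean(offset_injectors)
    return features


# Made-up porosity values (fractions), for illustration only.
row = static_features("porosity", 0.25, [0.125, 0.25, 0.375], [0.125, 0.375], [])
```

Repeating the call for each log or property would give the static part of the input for one well.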

The dynamic variables include operational constraints and production/injection characteristics at appropriate time-steps, such as

  1. Wellhead or bottomhole pressure, or choke size, at time-step t 

  2. Completion modification (operation of the inflow control valve, squeeze off) at time-step t

  3. Number of days of production at time-step t

  4. GOR, WC, and oil production at time-step t−1

  5. GOR, WC, and oil production of the offset wells at time-step t−1 

  6. Water and/or gas injections at time-step t

  7. Well stimulation details
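
The pairing of inputs at time-step t with production at time-step t−1 can be sketched as follows. This is an assumed data layout rather than the author's implementation: the field names (`whp`, `days`, `oil`, `gor`, `wc`) and the helper `build_rows` are hypothetical, and offset-well, injection, completion, and stimulation inputs are omitted for brevity.

```python
from typing import Dict, List

Step = Dict[str, float]  # one well's measurements at one time-step


def build_rows(static: Dict[str, float], steps: List[Step]) -> List[Dict[str, float]]:
    """Pair inputs at t (plus production at t-1) with the production at t."""
    rows = []
    for prev, cur in zip(steps, steps[1:]):
        row = dict(static)                    # static reservoir characteristics
        row["whp_t"] = cur["whp"]             # operational constraint at t
        row["days_t"] = cur["days"]           # days of production at t
        for key in ("oil", "gor", "wc"):
            row[f"{key}_prev"] = prev[key]    # lagged production at t-1
            row[f"target_{key}"] = cur[key]   # value the model should reproduce at t
        rows.append(row)
    return rows


# Made-up monthly records for one well, for illustration only.
static = {"porosity_at_well": 0.2}
steps = [
    {"whp": 300.0, "days": 30.0, "oil": 1000.0, "gor": 500.0, "wc": 0.10},
    {"whp": 280.0, "days": 31.0, "oil": 950.0, "gor": 520.0, "wc": 0.12},
    {"whp": 260.0, "days": 28.0, "oil": 900.0, "gor": 540.0, "wc": 0.15},
]
rows = build_rows(static, steps)
```

Each row is one training example; the first time-step has no t−1 record, so three time-steps yield two rows.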

The breadth of data incorporated into TDM distinguishes it from other empirically formulated models. Once the development of the TDM is completed, its deployment in forecast mode is computationally efficient. A single run of the TDM usually takes seconds or, in some cases, minutes. This small computational footprint makes TDM a practical tool for reservoir management, uncertainty quantification, and field development planning. The development and deployment costs of TDM are a fraction of those of numerical simulation.
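
Because production at time-step t−1 is among the inputs, one plausible way to run such a model in forecast mode is to feed each predicted step back in as the lagged input for the next one. The sketch below assumes this roll-forward scheme; `roll_forward` is a hypothetical helper, and `toy_model` is a stand-in with a fixed decline, not a trained TDM.

```python
from typing import Callable, Dict, List

Model = Callable[[Dict[str, float]], Dict[str, float]]


def roll_forward(model: Model, last: Dict[str, float],
                 plan: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Forecast step by step, feeding each prediction back in as the t-1 input."""
    forecasts = []
    prev = dict(last)
    for controls in plan:                     # planned constraints at each future step
        inputs = dict(controls)
        inputs.update({f"{k}_prev": v for k, v in prev.items()})
        prev = model(inputs)                  # predicted oil, GOR, and WC at this step
        forecasts.append(prev)
    return forecasts


def toy_model(x: Dict[str, float]) -> Dict[str, float]:
    """Stand-in for a trained model: a fixed 10% oil decline per step, nothing more."""
    return {"oil": 0.9 * x["oil_prev"], "gor": x["gor_prev"], "wc": x["wc_prev"]}


last_measured = {"oil": 1000.0, "gor": 500.0, "wc": 0.5}
out = roll_forward(toy_model, last_measured, [{"whp_t": 200.0}] * 3)
```

Each step is a single model evaluation, which is consistent with run times of seconds; many such runs with varied inputs could then support uncertainty quantification.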

Expected Outcomes

Data-driven reservoir modeling can accurately model a mature hydrocarbon field and successfully forecast its future production behavior. The outcomes of TDM are forecasts of oil production, GOR, and WC of existing wells, as well as static reservoir pressure and fluid saturation, all of which are used for field development planning and infill drilling. When TDM is used to identify the communication between wells, it generates a map of reservoir conductivity, defined as a composite variable that includes the multiple geologic features and rock characteristics contributing to fluid flow in the reservoir. This is accomplished by deconvolving the effects of operational issues on production from those of reservoir characteristics.

Limitations of the Technology

Data-driven reservoir modeling requires a sufficient amount of production history; TDM is therefore not applicable to greenfields or to fields with a small number of wells and a short production history. Another limitation is that a TDM is not valid once the physics changes completely; for example, a TDM developed for a field under primary recovery cannot be applied to an enhanced recovery phase without retraining.


Shahab D. Mohaghegh is the president and CEO of Intelligent Solutions and a professor of petroleum and natural gas engineering at West Virginia University. He has carried out more than 60 projects for national and international oil companies. Considered a pioneer in the application of artificial intelligence and data mining in the exploration and production industry, he is the founder of the SPE Petroleum Data-Driven Analytics Technical Section, dedicated to machine learning and data mining. He was a member of the US Secretary of Energy’s Technical Advisory Committee on Unconventional Resources from 2008 to 2014 and represented the US in the International Organization for Standardization on carbon capture and storage during 2014–2016. An SPE Distinguished Lecturer, he has authored more than 170 technical papers and books, including Data-Driven Reservoir Modeling and Shale Analytics, and teaches the SPE training courses on these topics. Mohaghegh holds BS, MS, and PhD degrees in petroleum and natural gas engineering.