Publication

SEP-194 (2024)

Rustam Akhmadiev, Hassan Almomin, Biondo Biondi, Thomas Cullison, Ivan Deiana, Robert G. Clapp, Paige Given, Min Jun Park, Seunghoo Kim, Haipeng Li, Julio Oliva Frigerio, Michele Pipan, Giacomo Roncoroni, Joseph Stitt, Phillip Teng, Gi Ung Lee, Andy Vo

Download 
SEP-194 (Password Protected) (May, 2024)

Distributed acoustic sensing (DAS)
 

The Impact of Preprocessing on DAS Event Detection: A Comparative Study
Hassan Almomin and Min Jun Park
Distributed Acoustic Sensing (DAS) is revolutionizing subsurface monitoring, particularly for micro-seismic event detection. However, DAS datasets often vary widely, requiring tailored preprocessing techniques for optimal event identification. This study addresses this challenge by investigating how different preprocessing levels affect the accuracy of a convolutional neural network (CNN) model designed to detect micro-seismic events. By training our CNN on labeled data with varying preprocessing, we aim to uncover the ideal balance for maximizing event detection while potentially reducing the need for extensive, dataset-specific preprocessing in the future. Our research contributes to refining DAS data analysis and ultimately improving subsurface monitoring techniques.

Enhancing Microseismic Event Detection in Imbalanced DAS Data Using Deep Learning Embeddings
Min Jun Park and Hassan Almomin
Detecting microseismic events from Distributed Acoustic Sensing data presents significant challenges, especially under conditions of label imbalance prevalent in seismic datasets. This study addresses the imbalance where noise data significantly outnumbers event labels, leading to a high false negative ratio when using conventional convolutional neural networks. Our methodology involves an initial training phase of convolutional neural networks followed by the extraction of embeddings to define class centers in the embedding space. We subsequently fine-tune the embedding model to minimize the distance between each class center and its corresponding data points.
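The class-center step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: plain Python lists stand in for CNN embeddings, the function names are hypothetical, and squared Euclidean distance is an assumption, since the abstract does not specify the metric.

```python
from collections import defaultdict


def class_centers(embeddings, labels):
    """Return the mean embedding vector for each class label."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        for k, value in enumerate(vec):
            sums[lab][k] += value
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def center_loss(embeddings, labels, centers):
    """Mean squared distance between each embedding and its own class center.

    Fine-tuning, as described in the abstract, would adjust the embedding
    model so that this quantity decreases.
    """
    total = sum(
        squared_distance(vec, centers[lab])
        for vec, lab in zip(embeddings, labels)
    )
    return total / len(embeddings)


def nearest_center(vec, centers):
    """Assign a vector to the class whose center is closest."""
    return min(centers, key=lambda lab: squared_distance(vec, centers[lab]))
```

Because each class contributes one center regardless of how many samples it has, a nearest-center decision of this kind is less dominated by the majority (noise) class than a decision boundary fit directly to imbalanced counts.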

Exploring Model Transferability: Full FORGE 2022 DAS Microseismic Detection using Pre-trained Convolutional Neural Network
Paige Given, Robert Clapp, Biondo Biondi
This study explores the utilization of our pre-trained unconventional-reservoir-based model on the complete and continuous 2022 FORGE Enhanced Geothermal System DAS dataset. Initial findings from the first day of stimulation show that our model detected 385 events where the public Silixa catalogue only identified 25. Subsequently, when running the entire dataset, our model detected 7,330 microseismic events (2,646 new events), while Silixa reported a total of 1,309 events. Furthermore, when analyzing the entire dataset with the two fibers segregated and processed independently, we found 3,502 events. The significant disparity in the number of events detected by our model compared to those catalogued by Silixa underscores the model’s capacity to identify previously undetected microseismic events, surpassing the initial STA/LTA method employed by Silixa. Additionally, the findings of this research contribute to the overall understanding of the adaptability and generalization capabilities of our pre-trained model to different geophysical fields.
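For context on the STA/LTA baseline mentioned above, the following is a minimal sketch of a classic energy-ratio trigger on a single trace. It is not Silixa's implementation or the authors' code; the function names, window lengths, and threshold are illustrative assumptions.

```python
def sta_lta(trace, n_sta, n_lta):
    """Ratio of short-term to long-term average energy.

    Both windows end at the current sample. Samples that do not yet have a
    full long-term window behind them are assigned a ratio of 0.0.
    """
    if not 0 < n_sta < n_lta <= len(trace):
        raise ValueError("require 0 < n_sta < n_lta <= len(trace)")

    # Prefix sums of squared amplitude make each window average O(1).
    csum = [0.0]
    for x in trace:
        csum.append(csum[-1] + x * x)

    ratio = [0.0] * len(trace)
    for i in range(n_lta - 1, len(trace)):
        sta = (csum[i + 1] - csum[i + 1 - n_sta]) / n_sta
        lta = (csum[i + 1] - csum[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def trigger_onsets(ratio, threshold):
    """Indices where the ratio rises from below the threshold to at or above it."""
    return [
        i for i in range(1, len(ratio))
        if ratio[i] >= threshold > ratio[i - 1]
    ]
```

A trigger of this kind depends only on a local energy contrast, so weak events whose amplitude stays close to the background level do not cross the threshold. That is one plausible reason a learned detector can report more events than an STA/LTA catalogue.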

Dual Signals, Better Data: Optimizing DAS with Counter-propagation
Seunghoo Kim, Robert G. Clapp and Biondo L. Biondi
Distributed acoustic sensing (DAS) offers unique potential for near-surface geophysical investigations, but factors such as gauge length and signal amplitude can impact its sensitivity and resolution. To explore potential improvements in DAS data quality, we utilized fiber-optic cables on the Stanford University campus to analyze bi-directional DAS signals, employing a counter-propagation setup in two separate fibers with varying power levels. Leveraging the open-source DASCore package within the DAS Data Analysis Ecosystem (DASDAE) initiative, we aim to investigate how manipulating these parameters in a bi-directional DAS system could influence signal-to-noise ratios (SNR) and spatial resolution. Our goal is to identify the potential advantages and limitations of using counter-propagating signals compared to traditional single-direction DAS, with the ultimate aim of contributing to advancements in near-surface geophysical imaging and monitoring applications.

Using a Convolutional Autoencoder to compress Distributed Acoustic Sensing data
Thomas Cullison, Haipeng Li, and Hassan Almomin
We explored using a Convolutional Autoencoder (CAE) for compressing large-scale distributed acoustic sensing (DAS) data acquired from telecom fiber-optic cables. This type of DAS data can be useful for near-surface geophysical analysis in urban areas. However, these data can be quite large, and finding or maintaining the resources required to store these data can be challenging. As such, we adapted a CAE initially designed for general image compression to explore the potential compression capabilities of such an encoder for compressing DAS data. Our primary goal was to achieve a high compression rate while preserving signal integrity. Initial results were promising, showing a preliminary compression rate of approximately 30X and a reduction in random noise. However, some important signals, such as quasi-static signals (e.g., vehicles), show minor changes.

GenericCable: A Python Toolkit for Waveform-based DAS Seismology
Haipeng Li, Robert G. Clapp and Biondo L. Biondi
GenericCable is a Python toolkit developed to improve Distributed Acoustic Sensing (DAS) data modeling for irregularly shaped fiber cables. DAS, applied in diverse geophysical fields such as energy exploration and earthquake monitoring, requires accurate waveform modeling to image and monitor subsurface structures. The toolkit addresses the challenge posed by DAS’s principle of measuring averaged tangential strain along the fiber, which complicates data modeling. GenericCable operates as a standalone tool compatible with existing wave equation modeling and inversion frameworks. It utilizes Frenet Coordinates for calculating tangential strain wavefields, enabling forward and adjoint operations. Demonstrations include (1) DAS waveform modeling in a volcano setting with an irregular fiber cable using SPECFEM3D-Cartesian, a spectral-element wave propagator; (2) elastic full-waveform inversion (FWI) for monitoring CO2 plumes using DAS in a vertically deviated well; and (3) surface-wave FWI for characterizing the near-surface with smart DAS fiber deployment. It is expected that the developed GenericCable toolkit, still in its early stages, will become a useful tool for DAS waveform data modeling and inversion in geophysical research.

Imaging and inversion

One-way full-waveform inversion using frequency-domain model extension (FWIX): slowness-impedance parametrization
Rustam Akhmadiev
The problem of full-waveform inversion using a frequency-extended model and the one-way wave equation (FWIX) is posed and solved using the slowness and impedance parametrization. The proposed method addresses and overcomes the challenges of sensitivity to the initial model choice and reliance on low-frequency data. By combining the two extended models, it is capable of accurately capturing the data residuals while driving the inversion toward the physically correct solution. A multi-scale spatial approach with variable spline grid parametrization is employed to refine both extended models effectively. The advantages of using the slowness-impedance parametrization, as well as the importance of a proper multi-scale strategy, are demonstrated on synthetic examples.

Extended modeling explained using method of multiple scales and renormalization group
Rustam Akhmadiev
Accurate velocity reconstruction from seismic data is a complex, non-linear inverse problem. Full-waveform inversion (FWI) is sensitive to the starting model, and its success depends on the validity of the Born approximation (small velocity perturbations). Methods like multiple scales and renormalization group (RG) offer tools to address the limitations of conventional perturbation analysis. I show how RG and multiple scales concepts applied to the wave equation lead to the framework of “extended images” and “extended modeling”. This framework provides a theoretical explanation for the data-matching advantages of extended modeling. It also reveals how extended images relate to the underlying velocity perturbation through coarse-graining. The derived RG flow equation illustrates how extended images change across scales, suggesting potential avenues for new discoveries.

Tomographic waveform inversion (TWI) with space- and time-lags
Julio Oliva Frigerio
Previously, tomographic waveform inversion (TWI) has been formulated with space- and time-lags as alternative domains, which were employed separately. Here, strategies to combine them are presented along with examples of applications that show the benefits of this combination. The joint inversion of transmission events, with the time-lag formulation, and reflection events, with time- and/or space-lag formulations, is able to produce models with accurate kinematics under complex geologic settings. Moreover, the examples also illustrate the advantages of staggering TWI and FWI in the process of velocity model building.

Integrated Feasibility Study of 4D Elastic Full-waveform Inversion in CCUS
Haipeng Li, Robert G. Clapp and Biondo L. Biondi
Seismic monitoring is crucial for optimizing CO2 sequestration processes and ensuring the safety of long-term geological CO2 storage. Time-lapse elastic Full-waveform Inversion (4D FWI) provides high-resolution characterization of subsurface changes, yet it is challenged by its ill-posed nature and difficulties in uncertainty quantification. This study undertakes an integrated 4D feasibility analysis, beginning with the construction of a realistic CO2 Earth model, followed by the application of 4D elastic FWI under conditions that mirror real-world complexities, such as sparse survey configurations, random noise, and near-surface variations. Our findings underscore the complexity of applying elastic FWI with sparse and noisy time-lapse seismic data. To resolve these challenges, we explore a probabilistic 4D elastic FWI methodology employing Hamiltonian Monte Carlo (HMC) for efficient sampling, alongside a Model Order Reduction (MOR) strategy through Radial Basis Function (RBF) interpolation to enhance computational efficiency. The expected outcome of this research is a set of robust time-dependent distributions of the CO2 plume with uncertainty quantification, valuable for risk analysis and decision-making in CO2 geological storage projects.

Mitigating Near Surface Challenges for Land-data Applications of 4D Elastic Full Waveform Inversion in CCUS
Haipeng Li, Robert G. Clapp and Biondo L. Biondi
Time-lapse Full Waveform Inversion (4D FWI) plays an important role in monitoring subsurface changes, especially in Carbon Capture, Utilization, and Storage (CCUS) projects. However, the application of 4D FWI to land field data is notably challenged by the non-repeatability of near-surface conditions. This study investigates the impact of near-surface changes on elastic FWI results, revealing that seasonal variations can compromise elastic FWI practices. Various strategies to mitigate this issue are explored, including the inversion of the surface waves recorded by surface Distributed Acoustic Sensing (DAS) and a local migration method. A field-data case study using the Stanford DAS-2 experiment illustrates that the seasonal variations introduced by rainfall in near-surface conditions can be revealed by surface-wave inversion. The findings underscore the importance of understanding and improving near-surface repeatability in applying 4D elastic FWI for onshore CCUS initiatives.

Machine learning
 

Validating CO2 Plume Detection in Seismic Data Using DeepNRMS and t-SNE Embedding Space Analysis
Min Jun Park
This study extends the application of DeepNRMS, an unsupervised deep learning framework previously developed for monitoring carbon capture and storage sites, by employing t-distributed Stochastic Neighbor Embedding (t-SNE) for a comprehensive validation of its CO2 plume detection capabilities within seismic data. Focusing on the Aquistore CO2 storage site, we utilize t-SNE to analyze the embedding spaces generated by DeepNRMS, aiming to visually validate the distinctiveness of CO2 plume signals from seismic background noise. Through detailed embedding space analysis, this report presents an approach to corroborate the effectiveness of DeepNRMS in identifying and monitoring CO2 injections. This approach not only confirms the precision of DeepNRMS in distinguishing CO2 signals but also enhances our understanding of the model’s operational dynamics in real-world CO2 monitoring scenarios.

Bi-directional LSTM Neural Network for Extending Frequency Bandwidth
Ivan Deiana, Giacomo Roncoroni, Antoine Guitton, Robert Clapp, and Michele Pipan
This study investigates the potential of bi-directional LSTM neural networks for geophysical signal processing, specifically focusing on frequency bandwidth extension and inference tasks. Three distinct scenarios, namely low- and high-frequency inference and deconvolution, are explored, each utilizing a dedicated LSTM model. The primary objective of each scenario is to address key challenges within the field. Accurate low-frequency estimation aims to mitigate cycle skipping in FWI. High-frequency predictions seek to enhance resolution and reconstruct sections with noisy high frequencies. The deconvolution task aims to develop a fast, robust, and easily applicable model that does not necessitate extensive retraining, parameter tuning, or preprocessing. These models are extensively trained using synthetic data derived from enhanced 1D convolutional models, with tailored training strategies employed for each specific scenario.

Unmasking the Non-Linear Basis Function of the Embedding Space of NN Operators through t-SNE Analysis
Joseph Stitt and Robert G. Clapp
This research examines the learning mechanisms of neural network operators (NNO) within their embedding spaces, arguably a critical aspect of validating deep learning models at SEP. We analyze latent space dynamics and explore how Convolutional Neural Network (CNN) operators process and categorize synthetic geophysical model distributions. Employing a modern PyTorch framework, we transition from TensorFlow 1 to improve code organization and align with contemporary practices. t-SNE analysis aids in visualizing and understanding the categorization process within the embedding spaces, revealing distinct patterns and clustering phenomena that underscore the NNs’ capability to interpret complex velocity models. These findings highlight the potential of NNOs to enhance geophysical inversion processes, particularly in Full Waveform Inversion (FWI), and lay the groundwork for future advancements in network testing and validation.

Exploring the Methods for Training and Developing a Neural Network Regularizer to Assist in Resolving Discrete Subsurface Features in Acoustic FWI
Joseph Stitt and Robert Clapp
This study explores the enhancement of regularization methods for Full Waveform Inversion (FWI) in geophysical exploration, employing neural network (NN) operators to effectively impose realistic non-stationary changes in subsurface model estimation. FWI, crucial for detailed subsurface insights, faces accuracy challenges in complex settings like beneath salt structures. While traditional regularization methods like total variation have improved boundary delineation, accurately updating deep subsurface features remains difficult. Our approach employs NN operators to leverage non-linear basis functions, aiming to update subsurface models more effectively. Initial results from synthetic models and FWI experiments suggest that NN regularization has the potential to address some traditional FWI limitations, showing promising indications of improved subsurface modeling. Future work will optimize NN strategies, explore unsupervised learning for preconditioning, deploy synthetic salt model experiments, and refine training models to improve performance further.

Synthesis: An AI-Driven Platform for Efficient File Management and Collaboration
Min Jun Park, Andy Vo, Gi Ung Lee, and Phillip Teng
The exponential growth of digital files and the heterogeneous nature of data management practices pose significant challenges for effective information organization and retrieval in collaborative environments. This technical report introduces Synthesis, an innovative AI-driven platform designed to revolutionize file management by leveraging large language models for automatic tagging, summarizing, and intuitive searching of text-based files. By enabling natural language search capabilities, facilitating discussions directly on data sets, and providing dynamic visualizations of stored information, Synthesis aims to address the inefficiencies found in conventional group drive systems by minimizing manual work. This report details the motivation behind Synthesis, the technological solutions employed in its development, and the anticipated outcomes for users and collaborative teams.

Deep convolutional neural networks applied to earthquake detection in the image-domain
Julio Oliva Frigerio
Earthquake detection and location can be challenging for events whose magnitude is low relative to the background noise level. An algorithm that combines seismic imaging with machine learning techniques to detect and locate earthquakes under poor signal-to-noise conditions is proposed. The application carried out in this work was successful with synthetic data and showed that this methodology has the potential to be employed with field data.

High-performance computing (HPC)
 

Efficient memory management for parallel processing in Python
Thomas Cullison
Python has become a pivotal tool in the scientific and research communities, benefiting from a robust computational ecosystem and an open-source development approach. Its straightforward syntax eases the integration of existing scientific libraries into new research projects. Despite these advantages, the Python Global Interpreter Lock (GIL) creates challenges when parallelizing tasks, particularly in processing large data such as seismic and distributed acoustic sensing (DAS) data. The GIL controls memory access and ensures thread safety, creating challenges for multi-threaded applications and workflows. The GIL can be circumvented via the multiprocessing package to leverage multiple cores by creating separate subprocesses, each with its own interpreter. However, memory sharing between subprocesses can be problematic in environments like Jupyter Notebook, where the default object serialization in Python may hinder the ability to take advantage of the multiprocessing package. An alternative, the multiprocess package, offers enhanced serialization and maintains a compatible API. This report introduces methods to manage shared memory and demonstrates a technique for parallelizing workflows in a Python-based Jupyter Notebook environment, thus optimizing the processing of large datasets like seismic or DAS data.
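As an illustration of the shared-memory idea in the abstract above, the sketch below places an array of floats in a named shared block so that worker processes can read slices of it without the whole array being serialized for each task. It is not the report's technique: it uses only the standard-library `multiprocessing.shared_memory` module, and the helper names (`create_shared`, `sum_squares_chunk`, `parallel_sum_squares`) are hypothetical. In a notebook, worker functions defined in a cell may fail to pickle with the standard `multiprocessing` package, which is the situation the abstract says the `multiprocess` package is meant to help with.

```python
import struct
from multiprocessing import Pool, shared_memory

FLOAT_SIZE = struct.calcsize("d")  # bytes per float64


def create_shared(values):
    """Copy a sequence of floats into a new named shared-memory block."""
    shm = shared_memory.SharedMemory(
        create=True, size=max(1, len(values)) * FLOAT_SIZE
    )
    struct.pack_into(f"{len(values)}d", shm.buf, 0, *values)
    return shm


def sum_squares_chunk(task):
    """Worker: attach to the block by name and reduce one slice of it.

    Only the block name and two integers are sent to the worker; the
    worker copies out just the slice it needs.
    """
    name, start, stop = task
    shm = shared_memory.SharedMemory(name=name)
    try:
        chunk = struct.unpack_from(
            f"{stop - start}d", shm.buf, start * FLOAT_SIZE
        )
    finally:
        shm.close()  # detach; the creator is responsible for unlink()
    return sum(x * x for x in chunk)


def parallel_sum_squares(values, n_workers=2):
    """Split the shared array into slices and reduce them in a process pool."""
    shm = create_shared(values)
    try:
        step = max(1, -(-len(values) // n_workers))  # ceiling division
        tasks = [
            (shm.name, i, min(i + step, len(values)))
            for i in range(0, len(values), step)
        ]
        with Pool(n_workers) as pool:
            return sum(pool.map(sum_squares_chunk, tasks))
    finally:
        shm.close()
        shm.unlink()  # free the block once all workers are done
```

When run as a script, `parallel_sum_squares` should be called from under an `if __name__ == "__main__":` guard so that start methods which re-import the main module do not spawn pools recursively.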

Python packaging, reproducibility, and performance testing using parallelized ZFP as example
Robert G. Clapp
The proliferation of computing technologies has fundamentally transformed the dissemination of scientific research. This transition is exemplified through the enhancement of the ZFP compression library via parallelization, showcasing the convergence of computational efficiency and accessibility. A Python package, installable via pip from GitHub and utilizing CMake, represents a leap in software distribution, embodying the modernization of research tools deployment. In addition, a GitHub repository shows how Google Colab links can facilitate reproducible research, further democratizing scientific inquiry. The assessment of performance across cloud environments, utilizing Docker within a testing library framework, underscores the scalability of these approaches. A novel development is the adoption of JupyterHub, leveraging Kubernetes, to provide an orchestrated environment that broadens the horizon for reproducible research practices. This paper navigates through these advancements, underlining their collective impact on the accessibility, efficiency, and reproducibility of scientific research, marking a pivotal shift towards a more open and collaborative scientific ecosystem.

Author(s)
R. Akhmadiev
H. Almomin
B. Biondi
T. Cullison
I. Deiana
R. Clapp
P. Given
M.J. Park
S.H. Kim
H. Li
J. Frigerio
M. Pipan
G. Roncoroni
J. Stitt
P. Teng
G.U. Lee
A. Vo
Publication Date
May, 2024