GEO 5920/6920: Machine Learning Methods in Geosciences

Instructors: Jerry Schuster (gerard.schuster2022@gmail.com) and TA Shi Yongxiang (shiyongxiang@pku.edu.cn)

Book: Machine Learning in Geosciences by G.T. Schuster

Grading: Homework + Labs 40%, Exams 40%, Project 20%

Lecture Days + Time: Tuesday & Thursday (1PM-2:30PM)

Objective: Learn basic methods for supervised and unsupervised machine learning in geosciences.

Format: 3 credit hours, meeting twice/week for 1.5 hours/lecture.

Recorded Lectures: Zoom lectures from Fall 2020 were recorded and can be accessed here.

Registration: UU students register through the normal UU registration process for either credit or audit credit. Invited alumni and friends register by sending Shi Yongxiang (shiyongxiang@pku.edu.cn) a request to be put on the class email list.

MATLAB and Keras Information: Computational Labs and Keras tutorial

Term Projects: Projects, Microcrack Labeling of Micrographs, and Turkana Geochem Project. The Geochem clustering Colab is here.

2020 Term projects: 2020 Projects

WEEK | SUBJECT | RECORDINGS | READINGS/HMWK/LABS/YOUTUBE

Week Jan. 9 Course Overview* (90 minutes) and Clustering* (90-120 minutes) Chapters 1 and 17

Thursday Zoom:

Readings: Eduardo's Geophysics paper.

Labs:

    Do one of the labs below.
  1. Lab for Exploration Geophysicists: NMO Cluster Lab (Due Jan. 25).
  2. Lab for Non-Geophysicists: Create your own data set or use data from your research. Repeat the K-means and DBSCAN labs below with this new data. Compare the effectiveness of K-means and DBSCAN clustering. Useful references for DBSCAN are here. Devise a data set where K-means fails while DBSCAN succeeds. What are the advantages and disadvantages of each? (Due Jan. 25).
  3. Hamid's 3D DBSCAN code.
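
As a starting point for lab 2, here is a minimal Python sketch (not one of the course labs) of a data set where K-means fails while DBSCAN succeeds: two concentric rings. The ring sizes, the starting centroids, and the DBSCAN settings `eps=0.8` and `min_pts=3` are assumptions chosen for this toy example, and both algorithms are written out in plain Python rather than taken from a library.

```python
import math

def make_rings():
    """Two concentric rings: 20 points at radius 1 and 60 points at radius 5."""
    points, truth = [], []
    for ring_id, (radius, count) in enumerate([(1.0, 20), (5.0, 60)]):
        for i in range(count):
            angle = 2.0 * math.pi * i / count
            points.append((radius * math.cos(angle), radius * math.sin(angle)))
            truth.append(ring_id)
    return points, truth

def kmeans(points, centroids, n_iter=100):
    """Lloyd's algorithm started from the given centroids."""
    labels = []
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Move each centroid to the mean of its members (keep it if empty).
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return labels

def dbscan(points, eps, min_pts):
    """Basic DBSCAN; returns cluster ids 0, 1, ... with -1 marking noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1                      # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster             # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:    # core point: keep expanding
                queue.extend(neighbors[j])
    return labels

def matches(labels, truth):
    """True if labels equal truth up to a renaming of the cluster ids."""
    mapping = {}
    for lab, t in zip(labels, truth):
        if mapping.setdefault(lab, t) != t:
            return False
    return len(mapping) == len(set(mapping.values())) == len(set(truth))

points, truth = make_rings()
kmeans_labels = kmeans(points, [points[0], points[20]])
dbscan_labels = dbscan(points, eps=0.8, min_pts=3)
print("K-means recovers the rings:", matches(kmeans_labels, truth))
print("DBSCAN recovers the rings:", matches(dbscan_labels, truth))
```

K-means with two centroids splits the plane with a straight line, so it cannot put the inner ring in one cluster and the surrounding ring in the other. DBSCAN follows chains of nearby points and is not restricted to convex clusters, but its result depends on the choice of `eps` and `min_pts`.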

YouTube:
  1. Machine Learning Overview
Week Jan. 16 NMO IW Clustering (60 minutes) NMO and SOM (60 minutes with lab) Chapter 17

Tuesday Zoom

Thursday Zoom

Labs:
  1. How to download your own data into Colab
  2. SOM CoLab (Due Jan. 25) or do the MATLAB lab with MATLAB commands

    x = simplecluster_dataset;
    net = selforgmap([8 8]);
    net = train(net,x);
    view(net)
    y = net(x);
    classes = vec2ind(y);

    where you input a multidimensional data set of your choosing. It would be interesting to input a multidimensional data set that does not separate into non-overlapping clusters in any of the two-dimensional coordinate planes, yet whose SOM 2D plot shows points that belong to distinct non-overlapping clusters. As an example, take points in distinct clusters and rotate them in higher-dimensional space so the points do not cleanly separate in any of the coordinate planes.

  3. Either Exercise 5 or 6 in section 17.14 (Due Jan. 25)
Week Jan. 23 NMO IW Clustering (cont.) and Fuzzy Clustering for Traveltime Picks (60 minutes with lab) Chapter 17

Tuesday Zoom

Thursday Zoom

Readings:
  1. Fuzzy vs Hard Clustering paper
  2. Traveltime Picking with Fuzzy Clustering Geophysics paper.
Labs: Do only one lab from labs 1, 2 and 3.
  1. Optional for Geophysicists: Picking Traveltimes by Fuzzy Clustering (Due Feb. 2).
  2. Use the Fuzzy Logic MATLAB program for Fuzzy Clustering and try it on the Yellowstone data (Due Feb. 1). Compare the Fuzzy clustering results with those from a K-means cluster analysis of the Yellowstone data (docx file for Yellowstone data is here). What are the advantages and disadvantages of Fuzzy clustering vs K-means clustering? A tutorial for using Fuzzy.m is here. You will need the Fuzzy Logic Toolbox to run this MATLAB code. A "Fuzzy vs Hard Clustering" paper you should read is here (Due Feb. 3).
  3. Download the MATLAB K-Means code run.m and the rock images Rock1.jpg, Rock2.jpg, and Rock3.jpg. Find the optimal number of clusters that separates one type of rock from another. Use the Silhouette and Elbow techniques to determine the best number of clusters. Only do this for your favorite rock photo. (Due Feb. 3).
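
For lab 3, the Silhouette and Elbow techniques can be sketched in a few lines of Python. This is an illustration only, not the course's run.m: the nine `gray_levels` values are made-up stand-ins for pixel intensities, the K-means routine is a simple 1D version with a deterministic start, and a point alone in its cluster is given a silhouette of 0 by convention.

```python
def kmeans_1d(values, k, n_iter=100):
    """Lloyd's algorithm in 1D with a deterministic, evenly spread start."""
    ordered = sorted(values)
    n = len(ordered)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(n_iter):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members (keep it if empty).
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def inertia(values, labels, centroids):
    """Within-cluster sum of squared distances (the Elbow quantity)."""
    return sum((v - centroids[lab]) ** 2 for v, lab in zip(values, labels))

def mean_silhouette(values, labels):
    """Average of (b - a) / max(a, b) over all points."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)          # singleton cluster
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(abs(v - u) for u in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical gray levels with three visually obvious groups.
gray_levels = [8, 10, 12, 48, 50, 52, 88, 90, 92]

results = {}
for k in range(2, 5):
    labels, centroids = kmeans_1d(gray_levels, k)
    results[k] = (inertia(gray_levels, labels, centroids),
                  mean_silhouette(gray_levels, labels))
    print(f"k={k}: inertia={results[k][0]:.1f}, silhouette={results[k][1]:.3f}")

best_k = max(results, key=lambda k: results[k][1])
print("Best k by silhouette:", best_k)
```

For this toy data the silhouette peaks at three clusters, and the inertia drops sharply from k=2 to k=3 and only slightly afterwards, which is the "elbow". Real rock photos will rarely give such a clean answer.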
Week Jan. 30 Silhouette Validation (25 minutes) and Least Squares Inversion*. Chapter 2

Tuesday Zoom

Thursday Zoom

Exercises: Exercises 2.1-2.7 and 2.9 (Optional Extra Credit for those who plan to further their education & understanding in ML: Due Feb. 7).

Labs:

  1. Least Squares labs (Exs. 1, 2 and 3) (Due Feb. 7).
  2. Silhouette lab. Read this paper to see how the optimal number of clusters can be determined by the Silhouette method. (Due Feb. 7).
  3. 1D Optimization MATLAB lab (Due Feb. 7).

YouTube: Machine Learning Overview, Linear Model I: Linear Regression, and Learning.

Week Feb. 7 Gradient Descent and Non-linear Inversion Chapters 2-3

Tuesday Zoom

Thursday Zoom

Yuan's Taylor Series Lecture

Exercises: Chapter 3: 3.1, 3.2, 3.3, 3.7, 3.8, and 3.9 (Due Feb. 23).

Labs (only one of the two labs needs to be done; doing both earns extra credit):

  1. 2D Rosenbrock Optimization MATLAB lab (Due Feb. 19).
  2. Visualizing the Hessian MATLAB lab (Due Feb. 23).
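
To give a feel for the two labs above, here is a minimal Python sketch (the labs themselves are in MATLAB) that minimizes the 2D Rosenbrock function by fixed-step gradient descent and by Newton's method, which uses the Hessian. The starting point (-1.2, 1), the step length 1e-3, and the iteration counts are assumptions chosen for illustration.

```python
def rosenbrock(x, y):
    """f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, with its minimum at (1, 1)."""
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2

def gradient(x, y):
    return (-2.0 * (1.0 - x) - 400.0 * x * (y - x * x),
            200.0 * (y - x * x))

def hessian(x, y):
    return ((2.0 - 400.0 * y + 1200.0 * x * x, -400.0 * x),
            (-400.0 * x, 200.0))

def gradient_descent(x, y, step=1e-3, n_iter=5000):
    """Steepest descent with a fixed step length."""
    for _ in range(n_iter):
        gx, gy = gradient(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

def newton(x, y, n_iter=20):
    """Full Newton steps: solve H d = -g with the explicit 2x2 inverse."""
    for _ in range(n_iter):
        gx, gy = gradient(x, y)
        (hxx, hxy), (hyx, hyy) = hessian(x, y)
        det = hxx * hyy - hxy * hyx
        dx = -(hyy * gx - hxy * gy) / det
        dy = -(hxx * gy - hyx * gx) / det
        x, y = x + dx, y + dy
    return x, y

start = (-1.2, 1.0)
gd_point = gradient_descent(*start)
newton_point = newton(*start)
print("start:            f =", rosenbrock(*start))
print("gradient descent:", gd_point, "f =", rosenbrock(*gd_point))
print("Newton:          ", newton_point, "f =", rosenbrock(*newton_point))
```

Gradient descent drops quickly into the curved valley and then creeps along it, because the step length is limited by the large curvature across the valley. Newton's method rescales the step with the inverse Hessian and reaches the minimum in far fewer iterations, although a plain Newton step is not guaranteed to reduce the objective at every iteration.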
YouTube: ML: Training vs Testing and Linear Model II.
Week Feb. 14 Neural Networks Chapter 4

Tuesday Zoom

Thursday Zoom

Exercises: Exercises 4.1-4.7 (Due Feb. 28).

Lab:

  1. Fully Connected NN MATLAB Lab as Binary Classifier (Due Feb. 28). The PPT that explains the code is here.
YouTube: Neural Networks.
Week Feb. 21 Neural Networks and Multiple Node NN Chapters 4-5 and Intro to Probability Theory

Tuesday Zoom

Thursday Zoom

Lab (only one of the two labs needs to be done):
  1. Fully Connected NN MATLAB Lab as Multinary Classifier. (Due Feb. 28)
  2. NN and inconsistent data in Exercise 4.12 that uses the MATLAB codes. (Due Feb. 28)

Tutorials:

  1. MATLAB NN tutorial
  2. GIMP Labeling of Images. Install GIMP and go to the lab. Label the photos per the instructions. This lab is not required, but it might be useful for your term project if you label photos. GIMP Zoom Lecture Passcode: cv5V!o#Q
Papers: Fukushima (1980), Hubel+Wiesel (1958), LeCun et al. (1998), 9 key papers, and Aramco (2018) papers.
Week of Feb. 28 Intro. to CNN. Chapters 6, 8 & 9

Tuesday Zoom

Thursday Zoom

Exercises:
  1. Exercises 8.4.1, 8.4.2, 8.4.3, 8.4.5, 8.4.6 (Due March 16)

Labs:

  1. AlexNet Number Reading CNN Lab. Tutorials on implementing an AlexNet architecture in Keras are here and here. (Due March 21).
Tutorials:
  1. Python Tutorial in CoLab.
  2. Introduction to CoLab.

Week of March 14 Intro. to CNN (cont.)*. Chapters 8 & 9

Tuesday Zoom

Thursday Zoom

Exercise:
  1. Exercises 8.4.1, 8.4.2, 8.4.3, 8.4.5, 8.4.6 (Due Date Delayed until March 23)

Labs:

    Do one of the following labs. (Due March 23)
  1. Identifying Artifacts in Migration by NN and SVM Lab (MATLAB)
  2. Rock Crack Picking by AlexNet (MATLAB)
  3. AlexNet Bird Picking Lab (CoLab)
  4. AlexNet Fault Picking Lab (MATLAB)
  5. U-Net: Old U-Net Salt Picking Lab (don't use) and New U-Net Salt Picking Lab (CoLab)

YouTube Videos:

  1. AlexNet implementation in Keras/TensorFlow is here.
Week of March 20 AlUla Crack Picking by U-Net

CNN Examples

Chapter 10

Tuesday Zoom

Thursday Zoom

Exercise:
  1. Exercise 9.7.1 (Due March 30)

Labs:

  1. Choose a project and make a 3-minute PPT presentation: Title, Goal+Motivation, Procedure, Expected Results, Work Timeline. (Due: March 28)
  2. In-class AlUla Crack Picking Colab
  3. Surface wave dispersion lab.

Papers:

  1. The Shi et al. paper on AlUla crack picking is here.
  2. Shi et al. paper on extracting dispersion curves is here.
Week of March 27 Object Detection, Localization, Classification*.

Yolo*(1-hour)

Chapter 10

Tuesday Zoom

Thursday Zoom

Paper
  1. Overview Blog of Yolo
Labs:
  1. In-class Yolo Lab.
  2. Optional Blood Cell Detection Lab by R-CNN.
  3. PPT Progress Report of Project (Due April 4)

YouTube:

  1. Youtube: Yolo
  2. Yolo 1 w/more details
  3. Yolo 2 w/more details
  4. Yolo3/4 with Colab
  5. Yolo4
  6. Real-time detection
Week of April 3 Support Vector Machines* (Overview).

Support Vector Machines* (Part 1).

Support Vector Machines* (Part 2).

Support Vector Machines* (Part 3).

Support Vector Machines Soft Margin+ (Part 4)

Chapter 7

Tuesday Zoom

Thursday Zoom

Paper
  1. Yuqing's SVM Lab.
  2. Read SVM and Medical Imaging.
Labs:
  1. SVM, NN, and Logistic Regression Denoising of Migration Images Lab
  2. Different Classifier Comparisons.
  3. Hinge Loss Synthetic Data
  4. Hinge Loss SVM Cancer Data.
  5. Download Yellowstone Data into CoLab.
  6. PPT Progress Report of Project (Due April 4)

YouTube:

  1. Mostafa's Support Vector Machines
  2. Kernel Methods.
Week of April 10 Hinge-Loss SVM.

PCAa

Chapters 16 and 21

Tuesday Zoom

Papers:
  1. NN vs SVM
  2. Jeeva, M., 2018: The scuffle between two algorithms: Neural network vs Support Vector Machine.

Labs:

  1. Hinge Loss Synthetic Data or Geochem Lab: Discover Best Strategy to Separate 6 Geochem classes (Due April 18)
YouTube:
  1. Unconstrained vs Constrained Optimization Problems
Week of April 17 PCAa, Chapters 16 and 21

Tuesday Zoom

Thursday Zoom

Papers:
  1. PCA 1982 Geophysics
  2. PCA Review.

Labs:

  1. Learn the importance of normalizing your data here.
  2. Basic PCA lab.
  3. PCR vs PLS lab.
  4. PCA CoLab with Nashville Carbonate Data
  5. CoLab K-Means & Elbow method (requires downloading the Yellowstone Data into CoLab).
  6. PCA Lab PDF
  7. Simple PCA MATLAB Lab.
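
To see why lab 1 stresses normalizing data before PCA, here is a small Python sketch with two features on very different scales. The `depth_m` and `porosity` values are invented for illustration and are not taken from any course data set; the first principal component of two features is computed from the closed-form orientation of a 2x2 covariance matrix.

```python
import math

def standardize(column):
    """Z-score a column: subtract the mean, divide by the sample standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / (len(column) - 1))
    return [(v - mean) / std for v in column]

def covariance(u, v):
    """Sample covariance of two equally long columns."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def first_pc(u, v):
    """Unit vector along the leading principal axis of two features."""
    a, b, c = covariance(u, u), covariance(u, v), covariance(v, v)
    # Orientation of the major axis of the 2x2 covariance matrix [[a, b], [b, c]].
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    return math.cos(theta), math.sin(theta)

# Hypothetical measurements: depth in meters, porosity as a fraction.
depth_m = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
porosity = [0.30, 0.24, 0.22, 0.15, 0.10]

raw_pc = first_pc(depth_m, porosity)
scaled_pc = first_pc(standardize(depth_m), standardize(porosity))
print("first PC, raw data:         ", raw_pc)
print("first PC, standardized data:", scaled_pc)
```

Without scaling, the first principal component points almost entirely along depth, simply because depth is measured in large numbers. After standardizing, both features carry equal weight and the component reflects their correlation instead of their units.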
Week of April 24 PCAb, and PCAc. Chapter 16

Tuesday Zoom

Papers:
  1. TLE Radiometric paper.

Labs:

May 2
  1. Sean's Project
  2. Tessa's Project
  3. Changdi's Project
  4. Santiago's Project

Video of Project Presentations

TBA GANs: Xiangliang's Lecture. Lab: GANs Lab

Papers: GAN Tutorial and Goodfellow Talk

TBA R-CNN Object Detect+Localization and Yolo Chapter 10 Lab:
  1. R-CNN Detection of Blood Cells.