Automatic Semblance Picking using Clustering Algorithms

by Yuqing Chen
figure pick_result.png
Figure 1: (a) The velocity spectrum and (b) the picked result. The red points indicate the central location of each cluster and the black line is the picked result.
0 Objective
In this lab, we use the K-means clustering method to pick velocities from the velocity spectrum automatically.
1 Reading Materials
Automatic picking paper: autosembpick.pdf
2 Prerequisites
3 Procedure of the K-means Method
  1. Set the number of clusters as K.
  2. Place K points in the data space to serve as the initial centroids of the clusters.
  3. Assign each data point to the cluster that has the closest centroid.
  4. After all points have been assigned, recompute the centroid of each cluster.
  5. Repeat steps 3-4 until convergence is reached.
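The five steps above can be sketched in a few lines of Python. This is a minimal illustration and not the lab's Matlab code in K_mean_cluster.m. The function names, the choice of K random data points as initial centroids, and the stopping test (assignments no longer change) are assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups.

    Returns (centroids, labels), where labels[n] is the cluster index
    assigned to points[n].
    """
    rng = random.Random(seed)
    # Step 2: pick K distinct data points as initial centroids (an assumption;
    # other initializations are possible).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each point to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(sorted(centroids))
```

For this toy data set the first three points end up in one cluster and the last three in the other. On real data the result can depend on the initial centroids.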
4 How to Use the Lab
Download the code package K_Means_Method.zip and unzip it. Change your Matlab working directory to the unzipped folder so that all the necessary sub-functions are available. The main script is K_mean_cluster.m; open it in the Matlab editor and run it. The comments in the program will help you understand the lab.
5 Theory of the K-means Method
Let $x = \{x^{(i)}\}$, $i = 1, 2, \ldots, N$, be the set of N-dimensional feature vectors to be clustered into a set of K clusters, with the centroid points at $C$. The misfit function is defined as
$$\epsilon = \frac{1}{2}\sum_{i=1}^{K}\frac{1}{m_i}\sum_{j=1}^{m_i}\left\| C_i - x_{ij} \right\|^2,$$
where $K$ is the number of clusters and $m_i$ indicates the number of data points in the $i$th cluster. These cluster centroids can be iteratively updated by
$$C_i^{(k+1)} = C_i^{(k)} - \frac{1}{m_i^{(k)}}\sum_{j=1}^{m_i^{(k)}}\left( C_i^{(k)} - x_{ij}^{(k)} \right),$$
$$C_i^{(k+1)} = \frac{1}{m_i^{(k)}}\sum_{j=1}^{m_i^{(k)}} x_{ij}^{(k)},$$
where the updated centroid $C_i^{(k+1)}$ is simply the average of the data points in the $i$th cluster.
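The two forms of the centroid update can be checked numerically. The short Python sketch below is an illustration only: the function names and the sample points are hypothetical and are not taken from the lab code. It evaluates the misfit for a single cluster and confirms that subtracting the averaged residual from any starting centroid gives the same result as averaging the cluster members directly.

```python
def centroid_mean(members):
    """Second update form: the plain average of the cluster members."""
    m = len(members)
    return [sum(x) / m for x in zip(*members)]


def centroid_gradient_step(centroid, members):
    """First update form: subtract the averaged residual (C_i - x_ij)."""
    m = len(members)
    return [
        c - sum(c - x for x in coords) / m
        for c, coords in zip(centroid, zip(*members))
    ]


def misfit(centroids, clusters):
    """0.5 * sum over clusters of the mean squared distance to the centroid."""
    total = 0.0
    for c, members in zip(centroids, clusters):
        total += sum(
            sum((ci - xi) ** 2 for ci, xi in zip(c, x)) for x in members
        ) / len(members)
    return 0.5 * total


# Hypothetical cluster members and an arbitrary starting centroid.
members = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
start = [10.0, -5.0]

stepped = centroid_gradient_step(start, members)
averaged = centroid_mean(members)
print(stepped, averaged)

# The misfit does not increase when the centroid moves to the cluster mean.
print(misfit([start], [members]), misfit([averaged], [members]))
```

Both update forms return the same centroid up to rounding, because the $C_i^{(k)}$ terms cancel in the first form.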
6 Questions
  1. Comment out the “thresholding” procedure in the program and re-run it. How does the K-means method perform, and why?
  2. How should the number of clusters be chosen?
7 PS
If you find any errors in this lab, please contact: yuqing.chen@kaust.edu.sa
Regards,
Yuqing Chen