Shi Yongxiang
2020/11/03
In this lab, we are going to train a YOLOv2 network to do object detection. An online version of this lab is also available.
Before running the following code blocks in Google Colab, please prepare a Python 3 environment with tensorflow 1.15, keras 2.2, opencv-python, matplotlib, sklearn, h5py, numpy and pandas. Open Yolo.ipynb in this environment with jupyter-notebook to see all commands.
Run the following cell, and then restart the host (by Runtime/Factory reset runtime) to refresh the tensorflow version.
%tensorflow_version 1.x
import tensorflow as tf
version = tf.__version__
if version.startswith('2.'):
    print('Failed to switch to tensorflow 1.x, some unpredictable errors may occur')
    print('You have to reset the runtime to refresh the tensorflow version')
else:
    print('Successfully activated tensorflow 1.x')
print(tf.__version__)
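The weight file is yolov2.weights; I have shared it with everyone by email and Google Drive. There are three ways to get it into the working folder:
Method 1: The working folder is /content/, and you can upload the local file to this folder. It will be deleted when you log out of the notebook. Also, make sure you have a fast connection to Colab.
Method 2: Download it with ! wget https://pjreddie.com/media/files/yolov2.weights, but the speed is only about 128 KB/s.
Method 3: Mount your Google Drive and copy the file from there.
Choose one and uncomment its commands with ctrl+/.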
# Method 1
# upload your local file by Files/upload
# Method 2
# ! wget https://pjreddie.com/media/files/yolov2.weights
# Method 3
import os
if 'drive' in os.listdir('./'):
    print('You have successfully mounted your Google Drive')
else:
    from google.colab import drive
    drive.mount('/content/drive')
    print('Successfully mounted your Google Drive, please restart the runtime')
! cp drive/My\ Drive/yolov2.weights ./
# check that the weight file is now in the working folder
try:
    f = open('yolov2.weights', 'rb')
    f.close()
except IOError:
    raise ValueError('You have to upload yolov2.weights into the working folder')
We use two datasets: one for blood cell detection and another for license plate detection. More datasets are listed in the table below.
Dataset | mAP | Demo | Config | Model |
---|---|---|---|---|
Kangaroo Detection (1 class) (https://github.com/experiencor/kangaroo) | 95% | https://youtu.be/URO3UDHvoLY | check zoo | https://bit.ly/39rLNoE |
License Plate Detection (European in Romania) (1 class) (https://github.com/RobertLucian/license-plate-dataset) | 90% | https://youtu.be/HrqzIXFVCRo | check zoo | https://bit.ly/2tIpvPl |
Raccoon Detection (1 class) (https://github.com/experiencor/raccoon_dataset) | 98% | https://youtu.be/lxLyLIL7OsU | check zoo | https://bit.ly/39rLNoE |
Red Blood Cell Detection (3 classes) (https://github.com/experiencor/BCCD_Dataset) | 84% | https://imgur.com/a/uJl2lRI | check zoo | https://bit.ly/39rLNoE |
VOC (20 classes) (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/) | 72% | https://youtu.be/0RmOI6hcfBI | check zoo | https://bit.ly/39rLNoE |
! pwd
! git clone https://github.com/experiencor/keras-yolo2.git
# first script, we need to use some functions inside
! mv keras-yolo2/preprocessing.py ./
# second script
! mv keras-yolo2/utils.py ./
# blood images
! git clone https://github.com/Shenggan/BCCD_Dataset.git
# license plate dataset
! git clone https://github.com/RobertLucian/license-plate-dataset.git
from keras.models import Sequential, Model
from keras.layers import Reshape, Activation, Conv2D, Input, MaxPooling2D, BatchNormalization, Flatten, Dense, Lambda
from keras.layers.advanced_activations import LeakyReLU
from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard
from keras.optimizers import SGD, Adam, RMSprop
from keras.layers.merge import concatenate
import matplotlib.pyplot as plt
import keras.backend as K
import tensorflow as tf
import imgaug as ia
from tqdm import tqdm
from imgaug import augmenters as iaa
import numpy as np
import pickle
import os, cv2
from preprocessing import parse_annotation, BatchGenerator
from utils import WeightReader, decode_netout, draw_boxes
import keras
print("Keras version used: ", keras.__version__)
print("GPU information:", tf.test.gpu_device_name())
from tensorflow.python.client import device_lib
print(tf.__version__)
device_lib.list_local_devices()
Define dataset paths and weights path
wt_path = 'yolov2.weights'
# BCCD dataset
train_image_folder = 'BCCD_Dataset/BCCD/JPEGImages/'
train_annot_folder = 'BCCD_Dataset/BCCD/Annotations/'
valid_image_folder = 'BCCD_Dataset/BCCD/JPEGImages/' # no separate validation set, so the training images are reused
valid_annot_folder = 'BCCD_Dataset/BCCD/Annotations/'
# license plate dataset
# train_image_folder = '/content/license-plate-dataset/dataset/train/images/'
# train_annot_folder = '/content/license-plate-dataset/dataset/train/annots/'
# valid_image_folder = '/content/license-plate-dataset/dataset/valid/images/'
# valid_annot_folder = '/content/license-plate-dataset/dataset/valid/annots/'
Define the classes, the image size, and the other parameters for training and prediction
# Blood Cells
LABELS = ["RBC",'WBC','Platelets']
# license plate dataset
# LABELS = ['license-plate']
# Image size
IMAGE_H, IMAGE_W = 416, 416
# Grid number in feature map, also the grids of Yolo
GRID_H, GRID_W = 13 , 13
# Class number and weights
CLASS = len(LABELS)
CLASS_WEIGHTS = np.ones(CLASS, dtype='float32')
# How many boxes are proposed for each grid cell
BOX = 5
# sizes of the 5 anchor boxes, as (width, height) pairs
ANCHORS = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828]
# Thresholds for object confidence and non-maximum suppression
OBJ_THRESHOLD = 0.3 # 0.5
NMS_THRESHOLD = 0.3 # 0.45
NO_OBJECT_SCALE = 1.0
OBJECT_SCALE = 5.0
COORD_SCALE = 1.0
CLASS_SCALE = 1.0
# parameters for training
BATCH_SIZE = 16
WARM_UP_BATCHES = 0
TRUE_BOX_BUFFER = 50 # pass all true boxes of an image to the no-object part of the loss function
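To make the grid and anchor settings more concrete, here is a small sketch that converts the anchor list into (width, height) pairs in pixels. It assumes, as the loss function later in this notebook does, that the anchors are given in grid-cell units, so one unit is IMAGE_W / GRID_W = 32 pixels. The helper name `anchor_sizes_in_pixels` is hypothetical and not part of keras-yolo2.

```python
import numpy as np

IMAGE_H, IMAGE_W = 416, 416
GRID_H, GRID_W = 13, 13
ANCHORS = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
           5.47434, 7.88282, 3.52778, 9.77052, 9.16828]

def anchor_sizes_in_pixels(anchors, image_w, image_h, grid_w, grid_h):
    """Hypothetical helper: anchors in grid-cell units -> (w, h) in pixels."""
    pairs = np.reshape(anchors, (-1, 2))                     # one (w, h) row per anchor
    cell = np.array([image_w / grid_w, image_h / grid_h])    # pixel size of one grid cell
    return pairs * cell

sizes = anchor_sizes_in_pixels(ANCHORS, IMAGE_W, IMAGE_H, GRID_W, GRID_H)
print(sizes.shape)  # (5, 2): BOX anchors, each with a width and a height
print(sizes)
```

The smallest anchor is well under one cell wide and the largest covers most of the image, which is why the same 5 anchors can serve small platelets and large white blood cells.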
Our images and labels are in different folders. Each label is stored as an XML file.
import xml.etree.ElementTree as ET
import csv
from random import seed
import os.path
from random import randint
import random
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from matplotlib import patches
image_idx = 1
# parse the annotation file of one image
annot_path = train_annot_folder + "BloodImage_%05d.xml" % (image_idx)
tree = ET.parse(annot_path)
root = tree.getroot()
filename = root.find('filename').text
print(filename)
# write one csv row per annotated object
with open('./train.csv', 'w', newline='') as csv_data:
    csvwriter = csv.writer(csv_data)
    csv_head = ['Image_names', 'cell_type', 'xmin', 'xmax', 'ymin', 'ymax']
    csvwriter.writerow(csv_head)
    for region in root.findall('object'):
        name = region.find('name').text
        bndbox = region.find('bndbox')
        # the list is called 'row' (not 'csv') so that the csv module is not shadowed
        row = [filename, name]
        for tag in ('xmin', 'xmax', 'ymin', 'ymax'):
            row.append(bndbox.find(tag).text)
        csvwriter.writerow(row)
# read the csv file using read_csv function of pandas
train = pd.read_csv('./train.csv')
train.head()
That is the ground-truth box information for this image. Let's display the boxes.
# plot the image
fig = plt.figure(figsize=[10,10], dpi=100)
plt.subplot(121)
newimage = train_image_folder + 'BloodImage_%05d.jpg' % (image_idx)
image = plt.imread(newimage)
plt.imshow(image)
plt.title('image')
# plot label
ax = plt.subplot(122)
image = plt.imread(newimage)
plt.imshow(image)
# iterate over the objects annotated in this image
for _, row in train[train.Image_names == "BloodImage_%05d.jpg" % (image_idx)].iterrows():
    xmin = row.xmin
    xmax = row.xmax
    ymin = row.ymin
    ymax = row.ymax
    width = xmax - xmin
    height = ymax - ymin
    # assign a different color to each class of objects
    if row.cell_type == 'RBC':
        edgecolor = 'r'
        ax.annotate('RBC', xy=(xmax-40,ymin+20))
    elif row.cell_type == 'WBC':
        edgecolor = 'b'
        ax.annotate('WBC', xy=(xmax-40,ymin+20))
    elif row.cell_type == 'Platelets':
        edgecolor = 'g'
        ax.annotate('Platelets', xy=(xmax-40,ymin+20))
    # add bounding boxes to the image
    rect = patches.Rectangle((xmin,ymin), width, height, edgecolor = edgecolor, facecolor = 'none')
    ax.add_patch(rect)
plt.title('True Bounding Box')
Here, we build a YOLO network in Keras with 23 convolutional layers.
# the function to implement the reorganization layer (thanks to github.com/allanzelener/YAD2K)
def space_to_depth_x2(x):
    return tf.nn.space_to_depth(x, block_size=2)
input_image = Input(shape=(IMAGE_H, IMAGE_W, 3))
true_boxes = Input(shape=(1, 1, 1, TRUE_BOX_BUFFER , 4))
# Layer 1
x = Conv2D(32, (3,3), strides=(1,1), padding='same', name='conv_1', use_bias=False)(input_image)
x = BatchNormalization(name='norm_1')(x)
x = LeakyReLU(alpha=0.1)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
# Layer 2
x = Conv2D(64, (3,3), strides=(1,1), padding='same', name='conv_2', use_bias=False)(x)
x = BatchNormalization(name='norm_2')(x)
x = LeakyReLU(alpha=0.1)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
# Layer 3
x = Conv2D(128, (3,3), strides=(1,1), padding='same', name='conv_3', use_bias=False)(x)
x = BatchNormalization(name='norm_3')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 4
x = Conv2D(64, (1,1), strides=(1,1), padding='same', name='conv_4', use_bias=False)(x)
x = BatchNormalization(name='norm_4')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 5
x = Conv2D(128, (3,3), strides=(1,1), padding='same', name='conv_5', use_bias=False)(x)
x = BatchNormalization(name='norm_5')(x)
x = LeakyReLU(alpha=0.1)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
# Layer 6
x = Conv2D(256, (3,3), strides=(1,1), padding='same', name='conv_6', use_bias=False)(x)
x = BatchNormalization(name='norm_6')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 7
x = Conv2D(128, (1,1), strides=(1,1), padding='same', name='conv_7', use_bias=False)(x)
x = BatchNormalization(name='norm_7')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 8
x = Conv2D(256, (3,3), strides=(1,1), padding='same', name='conv_8', use_bias=False)(x)
x = BatchNormalization(name='norm_8')(x)
x = LeakyReLU(alpha=0.1)(x)
x = MaxPooling2D(pool_size=(2, 2))(x)
# Layer 9
x = Conv2D(512, (3,3), strides=(1,1), padding='same', name='conv_9', use_bias=False)(x)
x = BatchNormalization(name='norm_9')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 10
x = Conv2D(256, (1,1), strides=(1,1), padding='same', name='conv_10', use_bias=False)(x)
x = BatchNormalization(name='norm_10')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 11
x = Conv2D(512, (3,3), strides=(1,1), padding='same', name='conv_11', use_bias=False)(x)
x = BatchNormalization(name='norm_11')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 12
x = Conv2D(256, (1,1), strides=(1,1), padding='same', name='conv_12', use_bias=False)(x)
x = BatchNormalization(name='norm_12')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 13
x = Conv2D(512, (3,3), strides=(1,1), padding='same', name='conv_13', use_bias=False)(x)
x = BatchNormalization(name='norm_13')(x)
x = LeakyReLU(alpha=0.1)(x)
skip_connection = x
x = MaxPooling2D(pool_size=(2, 2))(x)
# Layer 14
x = Conv2D(1024, (3,3), strides=(1,1), padding='same', name='conv_14', use_bias=False)(x)
x = BatchNormalization(name='norm_14')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 15
x = Conv2D(512, (1,1), strides=(1,1), padding='same', name='conv_15', use_bias=False)(x)
x = BatchNormalization(name='norm_15')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 16
x = Conv2D(1024, (3,3), strides=(1,1), padding='same', name='conv_16', use_bias=False)(x)
x = BatchNormalization(name='norm_16')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 17
x = Conv2D(512, (1,1), strides=(1,1), padding='same', name='conv_17', use_bias=False)(x)
x = BatchNormalization(name='norm_17')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 18
x = Conv2D(1024, (3,3), strides=(1,1), padding='same', name='conv_18', use_bias=False)(x)
x = BatchNormalization(name='norm_18')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 19
x = Conv2D(1024, (3,3), strides=(1,1), padding='same', name='conv_19', use_bias=False)(x)
x = BatchNormalization(name='norm_19')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 20
x = Conv2D(1024, (3,3), strides=(1,1), padding='same', name='conv_20', use_bias=False)(x)
x = BatchNormalization(name='norm_20')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 21
skip_connection = Conv2D(64, (1,1), strides=(1,1), padding='same', name='conv_21', use_bias=False)(skip_connection)
skip_connection = BatchNormalization(name='norm_21')(skip_connection)
skip_connection = LeakyReLU(alpha=0.1)(skip_connection)
skip_connection = Lambda(space_to_depth_x2)(skip_connection)
x = concatenate([skip_connection, x])
# Layer 22
x = Conv2D(1024, (3,3), strides=(1,1), padding='same', name='conv_22', use_bias=False)(x)
x = BatchNormalization(name='norm_22')(x)
x = LeakyReLU(alpha=0.1)(x)
# Layer 23
# This layer generates 5 boxes for each grid cell, and we have 13x13 grid cells
# each box predicts an object with center coordinates (X, Y), W and H, plus a confidence and class scores
# so for each box, the network gives us a vector [X, Y, W, H, confidence, is_class1, is_class2, ..., is_classn]
# In the end, the output shape of the yolo network should be [13, 13, 5, 4+1+NUM_class]
# CLASS is used instead of a hard-coded 3, so the license-plate setup (1 class) also works
x = Conv2D(BOX * (4 + 1 + CLASS), (1,1), strides=(1,1), padding='same', name='conv_23')(x)
output = Reshape((GRID_H, GRID_W, BOX, 4 + 1 + CLASS))(x)
# small hack to allow true_boxes to be registered when Keras builds the model
# for more information: https://github.com/fchollet/keras/issues/2790
output = Lambda(lambda args: args[0])([output, true_boxes])
model = Model([input_image, true_boxes], output)
model.summary()
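The reorganization step in layer 21 can be hard to picture, so here is a NumPy sketch of what a space-to-depth operation with block size 2 does to an NHWC tensor. It is only an illustration of the idea behind `tf.nn.space_to_depth`, not the TensorFlow implementation; the name `space_to_depth_x2_np` is made up for this example.

```python
import numpy as np

def space_to_depth_x2_np(x):
    """Illustrative NumPy version: move each 2x2 spatial block into the channel axis."""
    n, h, w, c = x.shape
    x = x.reshape(n, h // 2, 2, w // 2, 2, c)   # split H and W into (blocks, offset in block)
    x = x.transpose(0, 1, 3, 2, 4, 5)           # bring the two block offsets next to the channels
    return x.reshape(n, h // 2, w // 2, 4 * c)  # merge offsets and channels

# the skip connection after conv_21 has 64 channels on a 26x26 grid
skip = np.zeros((1, 26, 26, 64), dtype='float32')
reorganized = space_to_depth_x2_np(skip)
print(reorganized.shape)  # (1, 13, 13, 256)

# a tiny example where the values can be followed by hand
tiny = np.arange(16).reshape(1, 4, 4, 1)
print(space_to_depth_x2_np(tiny)[0, 0, 0])  # [0 1 4 5], the top-left 2x2 block
```

After this step the skip connection has the same 13x13 grid as the main branch, so the two can be concatenated into 256 + 1024 = 1280 channels before conv_22.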
Since we do not have many images to train such a big network, we load pre-trained weights and only randomize the last layer, so that it can be fitted to the objects in the BCCD dataset.
# load weights for all layers
weight_reader = WeightReader(wt_path)
weight_reader.reset()
nb_conv = 23
for i in range(1, nb_conv+1):
    conv_layer = model.get_layer('conv_' + str(i))
    print('load weight for conv_{}'.format(i))
    # weights in BN (the last convolution has no batch normalization)
    if i < nb_conv:
        norm_layer = model.get_layer('norm_' + str(i))
        size = np.prod(norm_layer.get_weights()[0].shape)
        beta = weight_reader.read_bytes(size)
        gamma = weight_reader.read_bytes(size)
        mean = weight_reader.read_bytes(size)
        var = weight_reader.read_bytes(size)
        norm_layer.set_weights([gamma, beta, mean, var])
    # weights in convolutions
    if len(conv_layer.get_weights()) > 1:
        bias = weight_reader.read_bytes(np.prod(conv_layer.get_weights()[1].shape))
        kernel = weight_reader.read_bytes(np.prod(conv_layer.get_weights()[0].shape))
        kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape)))
        kernel = kernel.transpose([2,3,1,0])
        conv_layer.set_weights([kernel, bias])
    else:
        kernel = weight_reader.read_bytes(np.prod(conv_layer.get_weights()[0].shape))
        kernel = kernel.reshape(list(reversed(conv_layer.get_weights()[0].shape)))
        kernel = kernel.transpose([2,3,1,0])
        conv_layer.set_weights([kernel])
# Randomize weights of the last layer
layer = model.layers[-4] # the last convolutional layer
weights = layer.get_weights()
new_kernel = np.random.normal(size=weights[0].shape)/(GRID_H*GRID_W)
new_bias = np.random.normal(size=weights[1].shape)/(GRID_H*GRID_W)
layer.set_weights([new_kernel, new_bias])
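The reshape and transpose in the loader are easy to get wrong, so here is a standalone sketch of that one step on fake data. It assumes, as the loader code implies, that the flat weights in the file are ordered as (out_channels, in_channels, rows, cols), while Keras expects (rows, cols, in_channels, out_channels). The shape `keras_shape` is an arbitrary example, not a layer of the network above.

```python
import numpy as np

keras_shape = (3, 3, 16, 32)   # (rows, cols, in_channels, out_channels), example values
flat = np.arange(np.prod(keras_shape), dtype='float32')   # stands in for the bytes read from the file

# same two steps as in the loader
kernel = flat.reshape(list(reversed(keras_shape)))   # -> (32, 16, 3, 3)
kernel = kernel.transpose([2, 3, 1, 0])              # -> (3, 3, 16, 32)
print(kernel.shape)

# check one entry: kernel[a, b, c, d] comes from position ((d*16 + c)*3 + a)*3 + b of the flat array
a, b, c, d = 1, 2, 5, 7
print(kernel[a, b, c, d] == flat[((d * 16 + c) * 3 + a) * 3 + b])  # True
```

A plain `flat.reshape(keras_shape)` would give the right shape but the wrong values in almost every position, which typically shows up only as a network that predicts nothing useful.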
The loss function uses several tricks:
Warm-Up Stage: While the number of training batches seen is smaller than a threshold (WARM_UP_BATCHES), the true X and Y of an anchor whose $\hat{C}=0$ are set to the center of its grid cell, and the true W and H are set to the W and H of the anchor. After that, they are 0.
No-Object Anchor: {Best_IOU<0.6} and {$\hat{C}=0$}. If the best IOU between the predicted box of an anchor and any true box is smaller than 0.6, the anchor is considered a no-object anchor (the fourth term in the loss function).
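The no-object rule can be tried out on plain numbers before reading the TensorFlow version. The following NumPy sketch computes the IOU of two boxes given as (x_center, y_center, w, h) and applies the 0.6 rule from above. The function names `iou_xywh` and `is_no_object_anchor` are made up for this illustration and do not appear in keras-yolo2.

```python
import numpy as np

def iou_xywh(box_a, box_b):
    """IOU of two boxes given as (x_center, y_center, w, h)."""
    a = np.asarray(box_a, dtype='float64')
    b = np.asarray(box_b, dtype='float64')
    a_min, a_max = a[:2] - a[2:] / 2., a[:2] + a[2:] / 2.
    b_min, b_max = b[:2] - b[2:] / 2., b[:2] + b[2:] / 2.
    # overlap along x and y, clipped at 0 when the boxes do not intersect
    inter_wh = np.maximum(np.minimum(a_max, b_max) - np.maximum(a_min, b_min), 0.)
    inter = inter_wh[0] * inter_wh[1]
    union = a[2] * a[3] + b[2] * b[3] - inter
    return float(inter / union) if union > 0 else 0.0

def is_no_object_anchor(pred_box, true_boxes, has_object, threshold=0.6):
    """True if the anchor holds no object and its prediction overlaps no true box well."""
    best_iou = max((iou_xywh(pred_box, t) for t in true_boxes), default=0.0)
    return bool(best_iou < threshold and not has_object)

# two 2x2 boxes shifted by one unit: intersection 2, union 6
print(iou_xywh((1, 1, 2, 2), (2, 1, 2, 2)))
print(is_no_object_anchor((1, 1, 2, 2), [(2, 1, 2, 2)], has_object=False))
```

Anchors that hold no object but still overlap a true box by 0.6 or more are left out of the confidence loss, so the network is not punished for a reasonable detection made by a neighbouring anchor.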
# the loss is written with raw tensorflow 1.x ops and is very fragile
# if you want to run another model for license-plate detection, factory reset the runtime
def custom_loss(y_true, y_pred):
    # size of y: [N, 13, 13, 5, 4+1+NUM_class]
    # note: true_boxes is the second model input defined together with the network
    seen = tf.Variable(0.)
    total_recall = tf.Variable(0.)
    # ---- calculate meshgrid ----
    mask_shape = tf.shape(y_true)[:4]
    # mask_shape = [N_batch, 13, 13, 5]
    # X, Y meshgrid, coordinate of each grid cell
    cell_x = tf.cast(tf.reshape(tf.tile(tf.range(GRID_W), [GRID_H]), (1, GRID_H, GRID_W, 1, 1)), 'float32')
    cell_y = tf.compat.v1.transpose(cell_x, (0,2,1,3,4))
    cell_grid = tf.compat.v1.tile(tf.concat([cell_x,cell_y], -1), [BATCH_SIZE, 1, 1, BOX, 1])
    coord_mask = tf.zeros(mask_shape)
    conf_mask = tf.zeros(mask_shape)
    class_mask = tf.zeros(mask_shape)
    # ---- get slices of predicted [X Y], predicted [W H], confidence, and class ----
    ### adjust x and y: offset inside the cell plus the coordinate of the cell
    pred_box_xy = tf.math.sigmoid(y_pred[..., :2]) + cell_grid
    ### adjust w and h: scale the anchor sizes to get the w and h of the box
    pred_box_wh = tf.math.exp(y_pred[..., 2:4]) * np.reshape(ANCHORS, [1,1,1,BOX,2])
    ### adjust confidence, whether there is an object (1) or not (0) for each box of each grid cell
    pred_box_conf = tf.math.sigmoid(y_pred[..., 4])
    ### adjust class probabilities, what kind of object it is
    pred_box_class = y_pred[..., 5:]
    # ---- adjust ground truth, do the same thing on our labels ----
    ### adjust x and y
    true_box_xy = y_true[..., 0:2] # relative position to the containing cell
    ### adjust w and h
    true_box_wh = y_true[..., 2:4] # number of cells across, horizontally and vertically
    # ---- get the min and max corners of each box to calculate IOU ----
    true_wh_half = true_box_wh / 2.
    true_mins = true_box_xy - true_wh_half # min corner of a box
    true_maxes = true_box_xy + true_wh_half # max corner of a box
    pred_wh_half = pred_box_wh / 2.
    pred_mins = pred_box_xy - pred_wh_half
    pred_maxes = pred_box_xy + pred_wh_half
    # calculate intersection over union
    intersect_mins = tf.math.maximum(pred_mins, true_mins)
    intersect_maxes = tf.math.minimum(pred_maxes, true_maxes)
    intersect_wh = tf.math.maximum(intersect_maxes - intersect_mins, 0.)
    intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]
    true_areas = true_box_wh[..., 0] * true_box_wh[..., 1]
    pred_areas = pred_box_wh[..., 0] * pred_box_wh[..., 1]
    union_areas = pred_areas + true_areas - intersect_areas
    iou_scores = tf.math.truediv(intersect_areas, union_areas)
    # when there is an object (1), true_box_conf = iou_scores, else true_box_conf = 0
    true_box_conf = iou_scores * y_true[..., 4]
    ### adjust class probabilities
    true_box_class = tf.math.argmax(y_true[..., 5:], -1)
    # ---- determine the masks ----
    ### coordinate mask: simply the position of the ground truth boxes (the predictors)
    coord_mask = tf.expand_dims(y_true[..., 4], axis=-1) * COORD_SCALE
    ### confidence mask: penalize predictors + penalize boxes with low IOU
    # penalize the confidence of the boxes whose best IOU with any ground truth box is < 0.6
    true_xy = true_boxes[..., 0:2] # size [N, 1, 1, 1, TRUE_BOX_BUFFER, 2]
    true_wh = true_boxes[..., 2:4] # size [N, 1, 1, 1, TRUE_BOX_BUFFER, 2]
    true_wh_half = true_wh / 2.
    true_mins = true_xy - true_wh_half
    true_maxes = true_xy + true_wh_half
    pred_xy = tf.expand_dims(pred_box_xy, 4) # size [N, H, W, B, 1, 2]
    pred_wh = tf.expand_dims(pred_box_wh, 4) # size [N, H, W, B, 1, 2]
    pred_wh_half = pred_wh / 2.
    pred_mins = pred_xy - pred_wh_half # size [N, H, W, B, 1, 2]
    pred_maxes = pred_xy + pred_wh_half
    intersect_mins = tf.math.maximum(pred_mins, true_mins)
    intersect_maxes = tf.math.minimum(pred_maxes, true_maxes)
    intersect_wh = tf.math.maximum(intersect_maxes - intersect_mins, 0.) # must have some overlap, otherwise 0
    intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1] # size [N, H, W, B, TRUE_BOX_BUFFER]
    true_areas = true_wh[..., 0] * true_wh[..., 1]
    pred_areas = pred_wh[..., 0] * pred_wh[..., 1]
    union_areas = pred_areas + true_areas - intersect_areas
    iou_scores = tf.math.truediv(intersect_areas, union_areas) # size [N, H, W, B, TRUE_BOX_BUFFER]
    best_ious = tf.math.reduce_max(iou_scores, axis=4) # size [N, H, W, B]
    conf_mask = conf_mask + tf.cast(best_ious < 0.6, 'float32') * (1 - y_true[..., 4]) * NO_OBJECT_SCALE
    # penalize the confidence of the boxes which are responsible for the corresponding ground truth box
    conf_mask = conf_mask + y_true[..., 4] * OBJECT_SCALE
    ### class mask: simply the position of the ground truth boxes (the predictors)
    class_mask = y_true[..., 4] * tf.gather(CLASS_WEIGHTS, true_box_class) * CLASS_SCALE
    # ---- warm-up training ----
    no_boxes_mask = tf.cast(coord_mask < COORD_SCALE/2., 'float32')
    seen = tf.assign_add(seen, 1.)
    true_box_xy, true_box_wh, coord_mask = tf.cond(tf.math.less(seen, WARM_UP_BATCHES),
                          lambda: [true_box_xy + (0.5 + cell_grid) * no_boxes_mask,
                                   true_box_wh + tf.ones_like(true_box_wh) * np.reshape(ANCHORS, [1,1,1,BOX,2]) * no_boxes_mask,
                                   tf.ones_like(coord_mask)],
                          lambda: [true_box_xy,
                                   true_box_wh,
                                   coord_mask])
    # ---- finalize the loss ----
    nb_coord_box = tf.math.reduce_sum(tf.compat.v1.to_float(coord_mask > 0.0))
    nb_conf_box = tf.math.reduce_sum(tf.compat.v1.to_float(conf_mask > 0.0))
    nb_class_box = tf.math.reduce_sum(tf.compat.v1.to_float(class_mask > 0.0))
    loss_xy = tf.math.reduce_sum(tf.math.square(true_box_xy-pred_box_xy) * coord_mask) / (nb_coord_box + 1e-6) / 2.
    loss_wh = tf.math.reduce_sum(tf.math.square(true_box_wh-pred_box_wh) * coord_mask) / (nb_coord_box + 1e-6) / 2.
    loss_conf = tf.math.reduce_sum(tf.math.square(true_box_conf-pred_box_conf) * conf_mask) / (nb_conf_box + 1e-6) / 2.
    loss_class = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=true_box_class, logits=pred_box_class)
    loss_class = tf.math.reduce_sum(loss_class * class_mask) / (nb_class_box + 1e-6)
    loss = loss_xy + loss_wh + loss_conf + loss_class
    nb_true_box = tf.math.reduce_sum(y_true[..., 4])
    nb_pred_box = tf.math.reduce_sum(tf.cast(true_box_conf > 0.5, 'float32') * tf.cast(pred_box_conf > 0.3, 'float32'))
    # ---- debugging code ----
    current_recall = nb_pred_box/(nb_true_box + 1e-6)
    total_recall = tf.assign_add(total_recall, current_recall)
    loss = tf.compat.v1.Print(loss, [tf.zeros((1))], message='Dummy Line \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [loss_xy], message='Loss XY \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [loss_wh], message='Loss WH \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [loss_conf], message='Loss Conf \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [loss_class], message='Loss Class \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [loss], message='Total Loss \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [current_recall], message='Current Recall \t', summarize=1000)
    loss = tf.compat.v1.Print(loss, [total_recall/seen], message='Average Recall \t', summarize=1000)
    return loss
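The first lines of the loss turn raw network outputs into box coordinates. As a sanity check, the same arithmetic can be done by hand for a single box of a single grid cell. The sketch below is plain NumPy and assumes the vector layout used in the loss above: [X, Y, W, H, confidence, class scores...]. The function `decode_one_box` is a hypothetical helper written for this example.

```python
import numpy as np

def sigmoid(x):
    return 1. / (1. + np.exp(-x))

def decode_one_box(raw, col, row, anchor_w, anchor_h):
    """Decode one raw prediction vector into grid-unit box values (illustrative only)."""
    x = sigmoid(raw[0]) + col        # center x: offset inside the cell plus the cell column
    y = sigmoid(raw[1]) + row        # center y: offset inside the cell plus the cell row
    w = np.exp(raw[2]) * anchor_w    # width: the anchor width scaled by exp(.)
    h = np.exp(raw[3]) * anchor_h    # height: the anchor height scaled by exp(.)
    conf = sigmoid(raw[4])           # object confidence in (0, 1)
    return x, y, w, h, conf

# an all-zero prediction in cell (col=3, row=4) with the second anchor
raw = np.zeros(5 + 3)
print(decode_one_box(raw, col=3, row=4, anchor_w=1.87446, anchor_h=2.06253))
```

An all-zero output therefore means "a box of exactly the anchor size, centered in its cell, with confidence 0.5", which is the same target the warm-up stage uses for anchors without an object.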
generator_config = {
    'IMAGE_H'         : IMAGE_H,
    'IMAGE_W'         : IMAGE_W,
    'GRID_H'          : GRID_H,
    'GRID_W'          : GRID_W,
    'BOX'             : BOX,
    'LABELS'          : LABELS,
    'CLASS'           : len(LABELS),
    'ANCHORS'         : ANCHORS,
    'BATCH_SIZE'      : BATCH_SIZE,
    'TRUE_BOX_BUFFER' : TRUE_BOX_BUFFER,
}
# scale pixel values from [0, 255] to [0, 1]
def normalize(image):
    return image / 255.
Load all images and create the training set and the validation set
# parse the annotations and build the batch generators
train_imgs, seen_train_labels = parse_annotation(train_annot_folder, train_image_folder, labels=LABELS)
train_batch = BatchGenerator(train_imgs, generator_config, norm=normalize)
valid_imgs, seen_valid_labels = parse_annotation(valid_annot_folder, valid_image_folder, labels=LABELS)
valid_batch = BatchGenerator(valid_imgs, generator_config, norm=normalize, jitter=False)
Set several callbacks to check the validation loss after each epoch, store the best weights, and stop the training early when the loss no longer improves.
Finally, we have the YOLO network after 26 epochs of training.
# callbacks
early_stop = EarlyStopping(monitor='val_loss',
                           min_delta=0.001,
                           patience=3,
                           mode='min',
                           verbose=1)
checkpoint = ModelCheckpoint('weights.h5',
                             monitor='val_loss',
                             verbose=1,
                             save_best_only=True,
                             mode='min',
                             period=1)
# defined for optional logging, not passed to fit_generator below
tensorboard = TensorBoard(log_dir='./logs',
                          histogram_freq=0,
                          write_graph=True,
                          write_images=False)
optimizer = Adam(lr=0.5e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
#optimizer = SGD(lr=1e-4, decay=0.0005, momentum=0.9)
#optimizer = RMSprop(lr=1e-4, rho=0.9, epsilon=1e-08, decay=0.0)
model.compile(loss=custom_loss, optimizer=optimizer)
model.fit_generator(generator        = train_batch,
                    steps_per_epoch  = len(train_batch),
                    epochs           = 100,
                    verbose          = 1,
                    validation_data  = valid_batch,
                    validation_steps = len(valid_batch),
                    callbacks        = [early_stop, checkpoint],
                    max_queue_size   = 3)
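To see why training can stop long before the 100 requested epochs, here is a simplified pure-Python model of the patience logic with the same settings (min_delta=0.001, patience=3, mode 'min'). It is a sketch of the idea, not the Keras source code, and the list of validation losses is invented for the example.

```python
def early_stop_epoch(val_losses, min_delta=0.001, patience=3):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:   # counts as an improvement only if it beats best by min_delta
            best = loss
            wait = 0
        else:
            wait += 1                 # one more epoch without a real improvement
            if wait >= patience:
                return epoch
    return None

# invented validation losses: the improvement at epoch 5 is smaller than min_delta
losses = [1.0, 0.8, 0.7, 0.7, 0.6995, 0.71, 0.5]
print(early_stop_epoch(losses))  # 6
```

Note that the last value (0.5) is never reached: with a small patience, a short plateau is enough to end the run, so a larger patience may be worth trying if the validation loss is noisy.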
Here, we use the trained weights to detect cells in an image
model.load_weights('weights.h5')
image = cv2.imread('/content/BCCD_Dataset/BCCD/JPEGImages/BloodImage_00001.jpg')
#image = cv2.imread('/content/license-plate-dataset/dataset/valid/images/dayride_type1_001.mp4#t=1135.jpg')
dummy_array = np.zeros((1,1,1,1,TRUE_BOX_BUFFER,4))
plt.figure(figsize=(10,10), dpi=100)
# read image
input_image = cv2.resize(image, (416,416))
input_image = input_image / 255.
input_image = input_image[:,:,::-1]
input_image = np.expand_dims(input_image, 0)
# do prediction and decode output to be predicted bounding boxes
netout = model.predict([input_image, dummy_array])
boxes = decode_netout(netout[0],
obj_threshold=0.2,
nms_threshold=NMS_THRESHOLD,
anchors=ANCHORS,
nb_class=CLASS)
plt.subplot(121)
plt.imshow(image[:,:,::-1]);
plt.title('raw image')
plt.subplot(122)
image = draw_boxes(image, boxes, labels=LABELS)
plt.imshow(image[:,:,::-1])
plt.title('labeled image')
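The nms_threshold argument controls non-maximum suppression, which removes duplicate boxes around the same object. The sketch below shows a generic greedy version of the idea on boxes given as (xmin, ymin, xmax, ymax). It is an illustration only and may differ from what `decode_netout` in keras-yolo2 actually does (for example, that function may work per class); the names `iou_corners` and `greedy_nms` are made up here.

```python
def iou_corners(a, b):
    """IOU of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, nms_threshold=0.3):
    """Keep the best-scoring box, drop boxes that overlap a kept box too much, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou_corners(boxes[i], boxes[j]) < nms_threshold for j in keep):
            keep.append(i)
    return keep

# two heavily overlapping boxes and one separate box
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
```

A lower threshold removes more overlapping boxes, which helps with duplicate detections but can also merge cells that really do touch each other, as red blood cells often do.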
We think that small fossils in a rock look similar to blood cells, so we try to apply the trained YOLO network to detect fossils.
image = cv2.imread('2.png')
#image = cv2.imread('/content/license-plate-dataset/dataset/valid/images/dayride_type1_001.mp4#t=1135.jpg')
dummy_array = np.zeros((1,1,1,1,TRUE_BOX_BUFFER,4))
plt.figure(figsize=(10,10), dpi=100)
input_image = cv2.resize(image, (416,416))
input_image = input_image / 255.
input_image = input_image[:,:,::-1]
input_image = np.expand_dims(input_image, 0)
netout = model.predict([input_image, dummy_array])
boxes = decode_netout(netout[0],
obj_threshold=0.2,
nms_threshold=NMS_THRESHOLD,
anchors=ANCHORS,
nb_class=CLASS)
plt.subplot(121)
plt.imshow(image[:,:,::-1]);
plt.title('raw image')
plt.subplot(122)
image = draw_boxes(image, boxes, labels=LABELS)
plt.imshow(image[:,:,::-1])
plt.title('labeled image')
Thanks to the keras-yolo2 project on GitHub.