Training ImageNet with VGG

Date: 2020-02-04  Tags: using, vgg, training, imagenet

These are some earlier notes of mine that I am posting here.

Preparing the data

See the official website for the download location:

ImageNet official website

Training set: ILSVRC2012_img_train.tar

Validation set: ILSVRC2012_img_val.tar

Extracting the data

# tar -C needs the target directories to exist
mkdir -p train val
sudo tar -xvf ILSVRC2012_img_train.tar -C ./train
sudo tar -xvf ILSVRC2012_img_val.tar -C ./val

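For the val dataset, extraction yields all of the validation images directly: 50,000 images, about 6.3 GB.

For the train dataset, extraction yields 1,000 tar files, one for each of the 1,000 classes, 138 GB in total. Each of these 1,000 sub-tars has to be extracted again. The extraction script unzip.sh is as follows: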

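dir=/home/satisfie/imagenet/train   # satisfie is my username
cd "$dir" || exit 1                 # work inside the directory holding the per-class tars
for x in *.tar
do
    filename=$(basename "$x" .tar)   # strip the suffix (note the space before .tar)
    mkdir -p "$filename"
    tar -xvf "$x" -C "./$filename"
done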
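
Once the script has finished, it can be worth checking that every class was extracted. The following is a minimal Python sketch, not part of the original notes: the function name count_images is hypothetical, and it assumes the extracted files use the .JPEG extension.

```python
import os
import tempfile


def count_images(train_root):
    """Map each class directory under train_root to its number of .JPEG files."""
    counts = {}
    for entry in sorted(os.listdir(train_root)):
        class_dir = os.path.join(train_root, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore the leftover per-class .tar files
        counts[entry] = sum(
            1 for name in os.listdir(class_dir) if name.upper().endswith(".JPEG")
        )
    return counts


if __name__ == "__main__":
    # Demo on a throwaway directory with two fake classes, not the real dataset.
    with tempfile.TemporaryDirectory() as root:
        for class_id, n_images in [("n01440764", 3), ("n01443537", 2)]:
            os.makedirs(os.path.join(root, class_id))
            for i in range(n_images):
                open(os.path.join(root, class_id, f"{class_id}_{i}.JPEG"), "w").close()
        counts = count_images(root)
        print(len(counts), "classes,", sum(counts.values()), "images")
```

Pointed at the real training directory, the number of keys should be 1,000; any class with a count of 0 would indicate a sub-tar that failed to extract.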

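With an i7 6700K and my 500 GB SSD, extraction is very fast. At this point the raw data is ready, located in:

/home/satisfie/imagenet/train: contains 1,000 folders, each holding the JPEG images of one class
/home/satisfie/imagenet/val: contains the 50,000 validation images

Next, download the labels and the other auxiliary data.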

Downloading the auxiliary data

From the Caffe root directory, run ./data/ilsvrc12/get_ilsvrc_aux.sh to download the auxiliary data, which includes:

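det_synset_words.txt
synset_words.txt: the folder name and the real-world object name of each of the 1,000 classes, e.g. "n01440764 tench, Tinca tinca". During training, all names on one line count as a single class.
synsets.txt: the folder names of the 1,000 classes, e.g. "n01440764"
train.txt: the file name and label of every image in the 1,000 classes, e.g. "n01440764/n01440764_10026.JPEG 0"; 1,281,167 images in total
val.txt: same format, 50,000 images in total, e.g. "ILSVRC2012_val_00000001.JPEG 65"
test.txt: same format, for the test set; 100,000 images in total
imagenet_mean.binaryproto: the mean of the training images, stored per channel
imagenet.bet.pickle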
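
To make the two text formats above concrete, here is a small Python sketch that parses them. The function names parse_image_list and parse_synset_words are hypothetical helpers written for illustration, and the sketch assumes one entry per line with a single space before the label or description.

```python
def parse_image_list(lines):
    """Parse 'relative/path.JPEG label' lines into (path, int_label) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, label = line.rsplit(" ", 1)  # the label is the last field
        pairs.append((path, int(label)))
    return pairs


def parse_synset_words(lines):
    """Parse 'synset_id description' lines into {synset_id: description}."""
    names = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        synset_id, description = line.split(" ", 1)  # the ID is the first field
        names[synset_id] = description
    return names


if __name__ == "__main__":
    print(parse_image_list(["n01440764/n01440764_10026.JPEG 0"]))
    print(parse_synset_words(["n01440764 tench, Tinca tinca"]))
```

The same parse_image_list call works for train.txt and val.txt, since both use the "name label" layout.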

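Training the model

Preparing the training data

Converting the images to an LMDB database takes a lot of disk space and does not support operations such as shuffling, so the original images are read directly here through a data layer of type ImageData; see the prototxt below for details.

train_new.txt and val_new.txt are copies of train.txt and val.txt with the absolute path prepended to every image name, which is what allows the images to be found.
The sed command is enough for this; for val.txt, for example: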

sed 's/^/\/home\/satisfie\/imagenet\/val\/&/g' val.txt >val_new.txt
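
For readers who prefer not to escape slashes in sed, the same prefixing can be sketched in Python. This is an illustrative equivalent, not the author's method; prefix_paths is a hypothetical helper name, and the paths are the ones used in these notes.

```python
import posixpath


def prefix_paths(lines, root):
    """Prepend root to the image path that starts each 'path label' line."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line:
            # join() inserts exactly one "/" between root and the relative path
            out.append(posixpath.join(root, line))
    return out


if __name__ == "__main__":
    sample = ["ILSVRC2012_val_00000001.JPEG 65"]
    for line in prefix_paths(sample, "/home/satisfie/imagenet/val"):
        print(line)
```

Applied to train.txt with the root /home/satisfie/imagenet/train, the class folder stays part of the path, e.g. /home/satisfie/imagenet/train/n01440764/n01440764_10026.JPEG 0.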

VGG_train_val.prototxt

name: "VGG_ILSVRC_16_layers"
layer {
  name: "data"
  type: "ImageData"
  include {
    phase: TRAIN
  }
  transform_param {
    #crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
    mirror: true
  }
  image_data_param {
    source: "/home/satisfie/imagenet/train_new.txt"
    batch_size: 8
    new_height: 224
    new_width: 224
  }
  top: "data"
  top: "label"
}
layer {
  name: "data"
  type: "ImageData"
  include {
    phase: TEST
  }
  transform_param {
    #crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
    mirror: false
  }
  image_data_param {
    source: "/home/satisfie/imagenet/val_new.txt"
    batch_size: 4
    new_height: 224
    new_width: 224
  }
  top: "data"
  top: "label"
}
layer {
  bottom: "data"
  top: "conv1_1"
  name: "conv1_1"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv1_1"
  top: "conv1_1"
  name: "relu1_1"
  type: "ReLU"
}
layer {
  bottom: "conv1_1"
  top: "conv1_2"
  name: "conv1_2"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv1_2"
  top: "conv1_2"
  name: "relu1_2"
  type: "ReLU"
}
layer {
  bottom: "conv1_2"
  top: "pool1"
  name: "pool1"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool1"
  top: "conv2_1"
  name: "conv2_1"
  type: "Convolution"
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
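  # lr_mult: 0 keeps the weights and biases of conv2_1 fixed during training;
  # every other learnable layer in this file uses lr_mult 1 (weights) and 2 (biases).
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }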
}
layer {
  bottom: "conv2_1"
  top: "conv2_1"
  name: "relu2_1"
  type: "ReLU"
}
layer {
  bottom: "conv2_1"
  top: "conv2_2"
  name: "conv2_2"
  type: "Convolution"
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv2_2"
  top: "conv2_2"
  name: "relu2_2"
  type: "ReLU"
}
layer {
  bottom: "conv2_2"
  top: "pool2"
  name: "pool2"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool2"
  top: "conv3_1"
  name: "conv3_1"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv3_1"
  top: "conv3_1"
  name: "relu3_1"
  type: "ReLU"
}
layer {
  bottom: "conv3_1"
  top: "conv3_2"
  name: "conv3_2"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv3_2"
  top: "conv3_2"
  name: "relu3_2"
  type: "ReLU"
}
layer {
  bottom: "conv3_2"
  top: "conv3_3"
  name: "conv3_3"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv3_3"
  top: "conv3_3"
  name: "relu3_3"
  type: "ReLU"
}
layer {
  bottom: "conv3_3"
  top: "pool3"
  name: "pool3"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool3"
  top: "conv4_1"
  name: "conv4_1"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv4_1"
  top: "conv4_1"
  name: "relu4_1"
  type: "ReLU"
}
layer {
  bottom: "conv4_1"
  top: "conv4_2"
  name: "conv4_2"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv4_2"
  top: "conv4_2"
  name: "relu4_2"
  type: "ReLU"
}
layer {
  bottom: "conv4_2"
  top: "conv4_3"
  name: "conv4_3"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv4_3"
  top: "conv4_3"
  name: "relu4_3"
  type: "ReLU"
}
layer {
  bottom: "conv4_3"
  top: "pool4"
  name: "pool4"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool4"
  top: "conv5_1"
  name: "conv5_1"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv5_1"
  top: "conv5_1"
  name: "relu5_1"
  type: "ReLU"
}
layer {
  bottom: "conv5_1"
  top: "conv5_2"
  name: "conv5_2"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv5_2"
  top: "conv5_2"
  name: "relu5_2"
  type: "ReLU"
}
layer {
  bottom: "conv5_2"
  top: "conv5_3"
  name: "conv5_3"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv5_3"
  top: "conv5_3"
  name: "relu5_3"
  type: "ReLU"
}
layer {
  bottom: "conv5_3"
  top: "pool5"
  name: "pool5"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool5"
  top: "fc6"
  name: "fc6"
  type: "InnerProduct"
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "relu6"
  type: "ReLU"
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "drop6"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  bottom: "fc6"
  top: "fc7"
  name: "fc7"
  type: "InnerProduct"
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "fc7"
  top: "fc7"
  name: "relu7"
  type: "ReLU"
}
layer {
  bottom: "fc7"
  top: "fc7"
  name: "drop7"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  bottom: "fc7"
  top: "fc8"
  type: "InnerProduct"
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss/loss"
}
layer {
  name: "accuracy/top1"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy@1"
  include: { phase: TEST }
  accuracy_param {
    top_k: 1
  }
}
layer {
  name: "accuracy/top5"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy@5"
  include: { phase: TEST }
  accuracy_param {
    top_k: 5
  }
}
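
The size of this network explains much of its memory appetite. As a rough sketch, the Python below walks the layer list of the prototxt above (3x3 convolutions with pad 1, 2x2 max pooling with stride 2, then three fully connected layers) and computes the feature-map size entering fc6 and the total number of learnable parameters. The names VGG16_CONV, VGG16_FC and vgg16_summary are made up for this illustration.

```python
# Output channels of each 3x3 convolution in order; "P" marks a 2x2/stride-2 max pool.
VGG16_CONV = [64, 64, "P", 128, 128, "P", 256, 256, 256, "P",
              512, 512, 512, "P", 512, 512, 512, "P"]
VGG16_FC = [4096, 4096, 1000]  # fc6, fc7, fc8


def vgg16_summary(input_size=224, in_channels=3):
    """Return (spatial size after pool5, fc6 input length, parameter count)."""
    size, channels, params = input_size, in_channels, 0
    for item in VGG16_CONV:
        if item == "P":
            size //= 2  # pooling halves height and width
        else:
            # 3x3 kernel, pad 1: spatial size unchanged; weights plus biases
            params += 3 * 3 * channels * item + item
            channels = item
    features = channels * size * size
    fc_in = features
    for out in VGG16_FC:
        params += fc_in * out + out
        fc_in = out
    return size, features, params


if __name__ == "__main__":
    print(vgg16_summary())  # (7, 25088, 138357544)
```

Roughly 103 million of the 138 million parameters sit in fc6 alone, because it connects 25,088 inputs to 4,096 outputs.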

solver.prototxt

net: "models/vgg/train_val.prototxt"
test_iter: 10000
test_interval: 40000
test_initialization: false
display: 200
base_lr: 0.0001
lr_policy: "step"
stepsize: 320000
gamma: 0.96
max_iter: 10000000
momentum: 0.9
weight_decay: 0.0005
snapshot: 800000
snapshot_prefix: "models/vgg/vgg"
solver_mode: GPU
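
It helps to translate these iteration counts into epochs and images. The Python sketch below does the arithmetic, assuming the batch sizes from the data layers (8 for TRAIN, 4 for TEST), the image counts from train.txt and val.txt, and Caffe's "step" policy, which multiplies the learning rate by gamma once every stepsize iterations. The helper name step_lr is hypothetical.

```python
import math

TRAIN_IMAGES = 1281167   # entries in train.txt
VAL_IMAGES = 50000       # entries in val.txt
TRAIN_BATCH = 8          # batch_size of the TRAIN data layer
TEST_BATCH = 4           # batch_size of the TEST data layer
BASE_LR, GAMMA, STEPSIZE = 0.0001, 0.96, 320000
TEST_ITER = 10000


def step_lr(iteration, base_lr=BASE_LR, gamma=GAMMA, stepsize=STEPSIZE):
    """Learning rate under the "step" policy: base_lr * gamma^floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


iters_per_epoch = math.ceil(TRAIN_IMAGES / TRAIN_BATCH)   # iterations for one pass
epochs_per_step = STEPSIZE / iters_per_epoch              # epochs between LR drops
val_images_per_test = TEST_ITER * TEST_BATCH              # images seen per test pass
test_iter_full_val = math.ceil(VAL_IMAGES / TEST_BATCH)   # test_iter covering all of val

if __name__ == "__main__":
    print("iterations per epoch:", iters_per_epoch)
    print("epochs per LR step: %.2f" % epochs_per_step)
    print("validation images per test pass:", val_images_per_test)
    print("test_iter needed for the full validation set:", test_iter_full_val)
    print("learning rate after the first step:", step_lr(STEPSIZE))
```

With these settings the learning rate drops by 4% about every two epochs, and each test pass evaluates 40,000 of the 50,000 validation images; a test_iter of 12,500 would cover the whole set at batch size 4.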

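Fine-tuning

The model is too large: I tried it, and with the 4 GB of memory on a GTX 980 the batch size can only be set as small as 8 or 16.
A model this large really calls for parallel training on a server, so instead I fine-tune directly from the existing pretrained model.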

VGG_ILSVRC_16_layers_deploy.prototxt

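#!/usr/bin/env sh
# Fine-tune VGG-16 from the pretrained weights; run from the Caffe root directory.
set -e

TOOLS=./build/tools

# glog writes its log files to GLOG_log_dir, so make sure the directory exists.
mkdir -p models/vgg/Log

GLOG_logtostderr=0 GLOG_log_dir=models/vgg/Log/ \
$TOOLS/caffe train \
    --solver=models/vgg/solver.prototxt \
    --weights=models/vgg/VGG_ILSVRC_16_layers.caffemodel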