使用VGG訓練Imagenet

使用VGG訓練Imagenet

之前的筆記,放上來吧~~~php

準備數據

具體官網地址,請點擊這裏git

ImageNet官網github

訓練數據集:ILSVRC2012_img_train.tarweb

驗證數據集:ILSVRC2012_img_val.tar數據庫

數據解壓

sudo tar –xvf ILSVRC2012_img_train.tar -C ./train
sudo tar –xvf ILSVRC2012_img_val.tar -C ./valbash

對於val數據集,解壓之後是全部的驗證集圖片,共50000張,大約6.3G。服務器

對於train數據集,解壓後是1000個tar文件,每一個tar文件表示1000類裏的一個類,共138G,對於1000個子tar,須要再次解壓,解壓腳本unzip.sh以下ide

dir=/home/satisfie/imagenet/train #satisfie 是個人用戶名
for x in `ls *.tar`
do
    filename=`basename $x .tar`   #注意空格
    mkdir $filename
    tar -xvf $x -C ./$filename
done

i7 6700K配合個人500G固態硬盤解壓超快,到這原始數據就準備好了,分別放在svg

  • /home/satisfie/imagenet/train:裏面有1000個文件夾,每一個文件夾下爲JPG圖片測試

  • /home/satisfie/imagenet/val :裏面有驗證集的50000張圖片

接下來下載標籤等其餘說明數據~~~

下載其餘數據

進入大caffe根目錄,執行/data/ilsvrc12/get_ilsvrc_aux.sh下載其餘數據,包括

  • det_synset_words.txt
  • synset_words.txt— 1000個類別的文件夾名稱及真是物體的名稱,好比 「n01440764 tench Tinca tinca」,在訓練中,這些都當作一個類別。
  • synsets.txt — 1000個類別的文件夾名稱,好比」n01440764」…
  • train.txt — 1000個類別每張圖片的名字及其標籤,好比 「n01440764/n01440764_10026.JPEG 0」 共有1281167張圖片
  • val.txt — 同上,總共有50000張。好比「ILSVRC2012_val_00000001.JPEG 65」
  • test.txt — 同上,爲測試集合,總有100000張
  • imagenet_mean.binaryproto — 模型圖片的各個通道均值
  • imagenet.bet.pickle

模型的訓練

訓練數據準備

因爲轉化爲lmdb數據庫格式須要耗費較大的空間,且不支持shuffle等操做,因此這裏直接讀取原圖片,使用的類型是ImageData,具體看下面的prototxt

其中的train_new.txt中對每張圖片的加上了絕對值路徑,這樣才能被讀取。
使用sed命令便可,

sed 's/^/\/home\/satisfie\/imagenet\/val\/&/g' val.txt >val_new.txt

VGG_train_val.prototxt

name: "VGG_ILSVRC_16_layers"
layer {
  name: "data"
  type: "ImageData"
  include {
    phase: TRAIN
  }
 transform_param {
    #crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
    mirror: true
 }
 image_data_param {
    source: "/home/satisfie/imagenet/train_new.txt"
    batch_size: 8
    new_height: 224
    new_width: 224
  }
  top: "data"
  top: "label"
}
layer {
  name: "data"
  type: "ImageData"
  include {
    phase: TEST
  }
 transform_param {
    #crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
    mirror: false
 }
 image_data_param {
    source: "/home/satisfie/imagenet/val_new.txt"
    batch_size: 4
    new_height: 224
    new_width: 224
  }
  top: "data"
  top: "label"
}
layer {
  bottom: "data"
  top: "conv1_1"
  name: "conv1_1"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv1_1"
  top: "conv1_1"
  name: "relu1_1"
  type: "ReLU"
}
layer {
  bottom: "conv1_1"
  top: "conv1_2"
  name: "conv1_2"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv1_2"
  top: "conv1_2"
  name: "relu1_2"
  type: "ReLU"
}
layer {
  bottom: "conv1_2"
  top: "pool1"
  name: "pool1"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool1"
  top: "conv2_1"
  name: "conv2_1"
  type: "Convolution"
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
layer {
  bottom: "conv2_1"
  top: "conv2_1"
  name: "relu2_1"
  type: "ReLU"
}
layer {
  bottom: "conv2_1"
  top: "conv2_2"
  name: "conv2_2"
  type: "Convolution"
  convolution_param {
    num_output: 128
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv2_2"
  top: "conv2_2"
  name: "relu2_2"
  type: "ReLU"
}
layer {
  bottom: "conv2_2"
  top: "pool2"
  name: "pool2"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool2"
  top: "conv3_1"
  name: "conv3_1"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv3_1"
  top: "conv3_1"
  name: "relu3_1"
  type: "ReLU"
}
layer {
  bottom: "conv3_1"
  top: "conv3_2"
  name: "conv3_2"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv3_2"
  top: "conv3_2"
  name: "relu3_2"
  type: "ReLU"
}
layer {
  bottom: "conv3_2"
  top: "conv3_3"
  name: "conv3_3"
  type: "Convolution"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv3_3"
  top: "conv3_3"
  name: "relu3_3"
  type: "ReLU"
}
layer {
  bottom: "conv3_3"
  top: "pool3"
  name: "pool3"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool3"
  top: "conv4_1"
  name: "conv4_1"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv4_1"
  top: "conv4_1"
  name: "relu4_1"
  type: "ReLU"
}
layer {
  bottom: "conv4_1"
  top: "conv4_2"
  name: "conv4_2"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv4_2"
  top: "conv4_2"
  name: "relu4_2"
  type: "ReLU"
}
layer {
  bottom: "conv4_2"
  top: "conv4_3"
  name: "conv4_3"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv4_3"
  top: "conv4_3"
  name: "relu4_3"
  type: "ReLU"
}
layer {
  bottom: "conv4_3"
  top: "pool4"
  name: "pool4"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool4"
  top: "conv5_1"
  name: "conv5_1"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv5_1"
  top: "conv5_1"
  name: "relu5_1"
  type: "ReLU"
}
layer {
  bottom: "conv5_1"
  top: "conv5_2"
  name: "conv5_2"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv5_2"
  top: "conv5_2"
  name: "relu5_2"
  type: "ReLU"
}
layer {
  bottom: "conv5_2"
  top: "conv5_3"
  name: "conv5_3"
  type: "Convolution"
  convolution_param {
    num_output: 512
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "conv5_3"
  top: "conv5_3"
  name: "relu5_3"
  type: "ReLU"
}
layer {
  bottom: "conv5_3"
  top: "pool5"
  name: "pool5"
  type: "Pooling"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  bottom: "pool5"
  top: "fc6"
  name: "fc6"
  type: "InnerProduct"
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "relu6"
  type: "ReLU"
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "drop6"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  bottom: "fc6"
  top: "fc7"
  name: "fc7"
  type: "InnerProduct"
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  bottom: "fc7"
  top: "fc7"
  name: "relu7"
  type: "ReLU"
}
layer {
  bottom: "fc7"
  top: "fc7"
  name: "drop7"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  bottom: "fc7"
  top: "fc8"
  type: "InnerProduct"
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  param {
    lr_mult: 1
    decay_mult :1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss/loss"
}
layer {
  name: "accuracy/top1"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy@1"
  include: { phase: TEST }
  accuracy_param {
    top_k: 1
  }
}
layer {
  name: "accuracy/top5"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy@5"
  include: { phase: TEST }
  accuracy_param {
    top_k: 5
  }
}

solver.prototxt

net: "models/vgg/train_val.prototxt"
test_iter: 10000
test_interval: 40000
test_initialization: false
display: 200
base_lr: 0.0001
lr_policy: "step"
stepsize: 320000
gamma: 0.96
max_iter: 10000000
momentum: 0.9
weight_decay: 0.0005
snapshot: 800000
snapshot_prefix: "models/vgg/vgg"
solver_mode: GPU

finetuning

模型太大,試了下,在GTX980的4G顯存下,batchsize只能設置爲8或者16這麼小。。。
大模型仍是得服務器並行,直接在原有的模型上finetuning

VGG_ILSVRC_16_layers_deploy.prototxt

#!/usr/bin/env sh
set -e

TOOLS=./build/tools

GLOG_logtostderr=0 GLOG_log_dir=models/vgg/Log/ \
$TOOLS/caffe train \
    --solver=models/vgg/solver.prototxt \
    --weights models/vgg/VGG_ILSVRC_16_layers.caffemodel