2016-03-23

Learning Caffe (6): Creating Your Own Model

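I recently read the paper "Supervised Learning of Semantics-Preserving Hashing via Deep Neural Networks for Large-Scale Image Search". Two versions of the open-source code are available: https://github.com/kevinlin311tw/caffe-cvprw15 and https://github.com/kevinlin311tw/Caffe-DeepBinaryCode. The main difference is that the latter uses a more elaborate objective function rather than simply adding a sigmoid layer. The method makes only a small change to AlexNet: it inserts a latent layer between fc7 and fc8 that hashes the 4096-dimensional feature normally used for retrieval into a 128-bit binary feature, and it achieves good results. Borrowing from this code, I experimented with how to create my own model and add my own layers.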

Pretrained model

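Unlike the paper, which uses AlexNet, my experiment uses the more complex GoogLeNet. The first step is to have a trained model, that is, a caffemodel file together with the prototxt files used to train and test it, and then fine-tune on top of it.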

Preparing the training data

This is no different from ordinary classification training: prepare the data in LMDB format along with a mean file.

Modifying train.prototxt

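The original layer sequence is pool5/7x7_s1 --> loss3/classifier --> SoftmaxWithLoss & Accuracy. After the change it becomes pool5/7x7_s1 --> latent_SSDH --> latent_SSDH_encode --> loss3/classifier_change --> SoftmaxWithLoss & Accuracy. Concretely, I add an InnerProduct layer whose number of outputs equals the number of hash bits, followed by a Sigmoid layer that squashes its activations into (0, 1) so that they approximate binary values. Its output feeds the original loss3/classifier layer, which is fine-tuned with adjusted training parameters. The added code is: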

######### added by yx, placed after layer "pool5/drop_7x7_s1"
layer {
  name: "latent_SSDH"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "latent_SSDH"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 128
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "latent_SSDH_encode"
  bottom: "latent_SSDH"
  top: "latent_SSDH_encode"
  type: "Sigmoid"
}
##########################

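Also rename the loss3/classifier layer to loss3/classifier_change, change its bottom to latent_SSDH_encode and its top to loss3/classifier_change, and multiply its lr_mult values by 10. Caffe copies pretrained weights by layer name, so the renamed layer is initialized from scratch, which is necessary here because its input size has changed.

Finally, change the bottom of the SoftmaxWithLoss and Accuracy layers to the new name, loss3/classifier_change.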
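
To make the data flow through the new layers concrete, here is a minimal NumPy sketch of the forward pass pool5/7x7_s1 --> latent_SSDH --> latent_SSDH_encode --> loss3/classifier_change. It is only an illustration, not Caffe code: the 1024-dimensional input matches GoogLeNet's pool5 output, while num_classes, the batch size, the random input, and the classifier initialization are assumptions made up for this example.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function, as applied by Caffe's Sigmoid layer."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

feat_dim = 1024    # size of GoogLeNet's pool5/7x7_s1 output
hash_bits = 128    # num_output of latent_SSDH
num_classes = 10   # hypothetical; use the class count of your dataset
batch = 4          # hypothetical batch size

# Random stand-in for the pool5/7x7_s1 activations.
pool5 = rng.standard_normal((batch, feat_dim))

# latent_SSDH: InnerProduct with gaussian weights (std 0.005) and bias 1.
w_latent = rng.normal(0.0, 0.005, size=(hash_bits, feat_dim))
b_latent = np.ones(hash_bits)
latent = pool5 @ w_latent.T + b_latent

# latent_SSDH_encode: Sigmoid squashes every unit into (0, 1).
encoded = sigmoid(latent)

# loss3/classifier_change: now maps 128 inputs to num_classes scores.
# The std of 0.01 is an assumption for this sketch.
w_cls = rng.normal(0.0, 0.01, size=(num_classes, hash_bits))
b_cls = np.zeros(num_classes)
logits = encoded @ w_cls.T + b_cls

print(latent.shape, encoded.shape, logits.shape)
```

The classifier now sees a 128-dimensional input instead of the 1024-dimensional pool5 feature, which is why its pretrained weights cannot be reused.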

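A note on weight_filler initialization, which usually uses one of two types, gaussian or xavier:
gaussian draws each weight from a normal distribution with the given std (0.005 in the layer above).
xavier initializes the parameters of a convolutional or fully connected layer by sampling uniformly from [-scale, +scale],
where scale = sqrt(3 / n). Depending on the implementation, n is either (num_in + num_out) / 2 (as in "Understanding the difficulty of training deep feedforward neural networks") or n = num_in (Caffe's default, variance_norm FAN_IN).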
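
The xavier rule can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not Caffe's filler code; the function name xavier_uniform and its mode argument are hypothetical names chosen for this example, and mode simply selects which count is used for n.

```python
import numpy as np

def xavier_uniform(num_out, num_in, mode="average", rng=None):
    """Sample a (num_out, num_in) weight matrix uniformly from [-scale, +scale].

    scale = sqrt(3 / n), where n depends on `mode` (hypothetical names):
      "fan_in"  -> n = num_in
      "fan_out" -> n = num_out
      "average" -> n = (num_in + num_out) / 2
    """
    rng = np.random.default_rng() if rng is None else rng
    if mode == "fan_in":
        n = num_in
    elif mode == "fan_out":
        n = num_out
    elif mode == "average":
        n = (num_in + num_out) / 2.0
    else:
        raise ValueError("unknown mode: %r" % mode)
    scale = np.sqrt(3.0 / n)
    return rng.uniform(-scale, scale, size=(num_out, num_in))

# Example: a 1024 -> 128 fully connected layer, like latent_SSDH.
w = xavier_uniform(128, 1024, mode="average", rng=np.random.default_rng(0))
print(w.shape, np.abs(w).max(), w.var())
```

A uniform distribution on [-scale, +scale] has variance scale^2 / 3, so this choice of scale gives the weights a variance of 1 / n.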

Modifying solver.prototxt

Comment out test_initialization.
Change weight_decay to 0.0005.
Change gamma to 0.1.
Reduce base_lr by a factor of 10, to 0.001.
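
To see how base_lr, gamma, and the per-layer lr_mult interact, here is a small sketch. It assumes lr_policy: "step", which the settings above do not state, and the stepsize of 10000 is a hypothetical value chosen only for illustration. Under the step policy Caffe computes the rate as base_lr * gamma ^ floor(iter / stepsize), and each parameter's local rate is the global rate times its lr_mult.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Global learning rate under Caffe's "step" policy."""
    return base_lr * gamma ** (iteration // stepsize)

def effective_lr(global_lr, lr_mult):
    """Local learning rate of one parameter blob."""
    return global_lr * lr_mult

base_lr = 0.001    # from solver.prototxt above
gamma = 0.1        # from solver.prototxt above
stepsize = 10000   # hypothetical; not given in the post

for iteration in (0, 9999, 10000, 20000):
    lr = step_lr(base_lr, gamma, stepsize, iteration)
    # latent_SSDH uses lr_mult 1 for weights and 2 for biases.
    print(iteration, lr, effective_lr(lr, 1), effective_lr(lr, 2))
```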

Modifying deploy.prototxt

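The original deploy file was built for classification, whereas here I want to extract hash features. So comment out the last two layers, loss3/classifier_change and Softmax, and add the following code: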

############## added by yx #########
layer {
  name: "latent_SSDH"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "latent_SSDH"
  inner_product_param {
    num_output: 128
  }
}
layer {
  name: "latent_SSDH_encode"
  type: "Sigmoid"
  bottom: "latent_SSDH"
  top: "latent_SSDH_encode"
}
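
The deploy network outputs real values in (0, 1) from latent_SSDH_encode, so one more step is needed to get binary codes. Below is a minimal NumPy sketch of that step and of ranking by Hamming distance. The threshold of 0.5 is an assumption, and the random activations only stand in for real network outputs; binarize and hamming_distance are hypothetical helper names, not part of Caffe.

```python
import numpy as np

def binarize(activations, threshold=0.5):
    """Turn sigmoid activations into 0/1 codes (threshold is an assumption)."""
    return (np.asarray(activations) >= threshold).astype(np.uint8)

def hamming_distance(query_code, db_codes):
    """Number of differing bits between one query code and every database code."""
    return np.count_nonzero(db_codes != query_code, axis=1)

rng = np.random.default_rng(0)

# Stand-in for latent_SSDH_encode outputs of 5 database images (128 bits each).
db_activations = rng.random((5, 128))
db_codes = binarize(db_activations)

# Use the code of database image 2 as the query.
query_code = db_codes[2].copy()
distances = hamming_distance(query_code, db_codes)

# Smaller distance means more similar; a stable sort keeps ties in index order.
ranking = np.argsort(distances, kind="stable")
print(distances, ranking)
```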

Training command

nohup ../../build/tools/caffe train --solver=solver.prototxt -weights sku30450_googlenet_quick_iter_500000.caffemodel -gpu 1 &

Yang Xian's (杨现) personal blog
