My environment is Python 3.6.2 + OpenCV 3.4.5.
Below is the code that computes a crowd density map for crowd counting.
# coding: utf-8
from __future__ import print_function

import time

import cv2
import matplotlib.pyplot as plt
import numpy as np
from cv2 import dnn

# Directory that holds the .prototxt and .caffemodel files
cm_path = 'C:\\Users\\admin\\Desktop\\'

if __name__ == "__main__":
    fn = r'C:\Users\admin\Desktop\ShanghaiTech_Crowd_Counting_Dataset\part_B_final\test_data\images\IMG_191.jpg'
    im_ori = cv2.imread(fn)

    # Show the input image (OpenCV loads BGR, matplotlib expects RGB)
    plt.figure(1)
    plt.imshow(cv2.cvtColor(im_ori, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.show()

    # Build the input blob: scale 1, size 1024x768, zero mean, swap R and B
    blob = dnn.blobFromImage(im_ori, 1, (1024, 768), (0, 0, 0), True)
    print("Input:", blob.shape, blob.dtype)

    net = dnn.readNetFromCaffe(cm_path + 'B_testdemo.prototxt',
                               cm_path + 'B2_iter_93000.caffemodel')

    # Time a single forward pass
    t = time.time()
    net.setInput(blob)
    density = net.forward()
    elapsed = time.time() - t
    print('inference image: %.4f seconds.' % elapsed)

    # Rescale the raw network output by 1/1000, as in the original demo
    density = density / 1000.0
    print("Output:", density.shape, density.dtype)

    # The head count is the sum over the density map
    person_num = np.sum(density)
    print("number: ", person_num)

    # Show the density map
    plt.figure(2)
    plt.imshow(density[0, 0, :, :])
    plt.axis('off')
    plt.show()
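dnn.blobFromImage(input_img, scalefactor, (width, height), mean, swapRB)
mean and scalefactor normalize the image: the mean is subtracted first, then the result is multiplied by a scale factor. images -= mean; images *= scalefactor
swapRB chooses whether to swap the R and B color channels. OpenCV reads images in BGR order by default, while models are often trained on RGB input, so this is usually set to True to swap the R and B channels.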
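To make the preprocessing concrete, here is a minimal NumPy sketch of the arithmetic described above. It is not OpenCV's implementation: the helper name `blob_from_image_sketch` is hypothetical, it skips the resize and crop steps (the image is assumed to already have the target size), and it assumes the mean is given in the channel order that results after the swap.

```python
import numpy as np


def blob_from_image_sketch(img_bgr, scalefactor=1.0, mean=(0.0, 0.0, 0.0), swap_rb=False):
    """Hypothetical helper that mimics the arithmetic of blobFromImage (no resize, no crop)."""
    img = img_bgr.astype(np.float32)
    if swap_rb:
        img = img[:, :, ::-1]  # reverse the channel axis: BGR -> RGB
    # Subtract the per-channel mean first, then multiply by the scale factor
    img = (img - np.asarray(mean, dtype=np.float32)) * scalefactor
    # H x W x C -> C x H x W, then add a leading batch axis to get N x C x H x W
    blob = img.transpose(2, 0, 1)[np.newaxis, ...]
    return np.ascontiguousarray(blob)


# Synthetic 768x1024 BGR image, same size as the blob used in the demo
demo_img = np.zeros((768, 1024, 3), dtype=np.uint8)
demo_blob = blob_from_image_sketch(demo_img, 1.0, (0, 0, 0), swap_rb=True)
print(demo_blob.shape, demo_blob.dtype)
```

For real use, stick with dnn.blobFromImage, which also handles the resizing.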
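dnn.readNetFromCaffe(modelTxt, caffe_modelBin)
The two arguments are the network-structure .prototxt file and the model weights file.
The program output is shown below. You can see that the network's input layout is N*C*H*W (Number*Channels*Height*Width).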
C:\Users\admin\Desktop\Crowd-Counting-master>python opencv_caffe_crowd_density_map.py
Input: (1, 3, 768, 1024) float32
inference image: 0.2005 seconds.
Output: (1, 1, 192, 256) float32
number: 285.41965
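The head count is simply the sum over the density map. Here is a small sketch of that step on a synthetic array with the same output shape as above. The function name `count_from_density` is hypothetical, and the division by 1000 just mirrors the demo script; why this model needs that factor is not stated in the source.

```python
import numpy as np


def count_from_density(raw_output, scale=1000.0):
    """Hypothetical helper: rescale an N x C x H x W output and sum it into a count."""
    density = np.asarray(raw_output, dtype=np.float32) / scale
    # Return the 2-D map of the first image and the total count
    return density[0, 0], float(density.sum())


# Synthetic stand-in for net.forward(): two blobs of "mass" on an empty map
raw = np.zeros((1, 1, 192, 256), dtype=np.float32)
raw[0, 0, 10, 20] = 1500.0
raw[0, 0, 100, 200] = 500.0

density_map, count = count_from_density(raw)
print(density_map.shape, count)
```

The 2-D map is what gets passed to plt.imshow, and the count is what the script prints as "number".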
Running the .pb model with TensorFlow, 100 forward passes take 0.691 s.
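import time

import cv2
import numpy as np
import tensorflow as tf

# Assumes `sess` is a tf.Session whose graph was loaded from the .pb file, and that
# jpg_path, width and height are defined elsewhere (not shown in this snippet).
init = tf.global_variables_initializer()
sess.run(init)

# Look up the input and output tensors by name
input_x = sess.graph.get_tensor_by_name("input_x:0")
out_softmax = sess.graph.get_tensor_by_name("predictions/Reshape_1:0")

# Preprocess: BGR -> RGB, resize, float32, add batch axis, scale to [0, 1]
img = cv2.imread(jpg_path)
img_ori = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
test_img = cv2.resize(img_ori, (width, height))
test_img = np.asarray(test_img, np.float32)
test_img = test_img[np.newaxis, :] / 255.

# Time a single forward pass
time_start = time.time()
img_out_softmax = sess.run(out_softmax, feed_dict={input_x: test_img})
time_end = time.time()
print('run time: ', time_end - time_start, 's')
print("pred:", img_out_softmax)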
run time: 0.04388284683227539 s
pred: [[9.8955745e-01 1.0416225e-02 2.6317310e-05]]
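The snippet above times a single call, while the 100-run figure needs a loop. One way to measure it is sketched below. The helper `time_forward` and its warm-up pass are my own assumptions, since the source does not say how the 100 runs were timed.

```python
import time


def time_forward(run_once, n_runs=100, warmup=1):
    """Hypothetical helper: total wall-clock seconds for n_runs calls of run_once()."""
    # Warm-up calls are excluded from the measurement (first calls are often slower)
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    return time.perf_counter() - start


# Stand-in workload so the sketch runs without a model
calls = []
total = time_forward(lambda: calls.append(1), n_runs=100, warmup=1)
print('100 runs: %.6f s' % total)
```

With a real session this would be called as `time_forward(lambda: sess.run(out_softmax, feed_dict={input_x: test_img}))`.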
// example
String weights = "nn.pb";
dnn::Net net = cv::dnn::readNetFromTensorflow(weights);
Mat img = imread(files[i], 1);
Mat inputBlob = dnn::blobFromImage(img, 0.00390625f, Size(256, 256), Scalar(), false, false);
net.setInput(inputBlob, "data");  // set the network input, "data" is the name of the input layer
Mat pred = net.forward("fc2/prob");
Calling a TensorFlow .pb model from OpenCV works in much the same way. Taking image classification as an example:
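import time

import cv2

# jpg_path, pb_file_path, width and height are assumed to be defined elsewhere
img = cv2.imread(jpg_path)
net = cv2.dnn.readNetFromTensorflow(pb_file_path)

# Scale to [0, 1], resize, zero mean, swap BGR -> RGB, no crop
net.setInput(cv2.dnn.blobFromImage(img, 1/255.0, (width, height), (0, 0, 0),
                                   swapRB=True, crop=False))

# Time a single forward pass
time_start = time.time()
pred = net.forward()
time_end = time.time()
print('run time: ', time_end - time_start, 's')
print("pred:", pred)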
This is a three-class classifier. 100 runs take 0.279 s, so the compute time drops by roughly half.
run time: 0.023903846740722656 s
pred: [[9.8955745e-01 1.0416246e-02 2.6317308e-05]]
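The two backends give almost the same probabilities. As a quick sanity check, the sketch below compares the two printed predictions, copied from the outputs in this post. The variable names and the 1e-6 tolerance are my own choices.

```python
import numpy as np

# Predictions copied from the TensorFlow run and the OpenCV dnn run
pred_tf = np.array([[9.8955745e-01, 1.0416225e-02, 2.6317310e-05]], dtype=np.float32)
pred_cv = np.array([[9.8955745e-01, 1.0416246e-02, 2.6317308e-05]], dtype=np.float32)

# Both backends should pick the same class ...
same_class = int(np.argmax(pred_tf)) == int(np.argmax(pred_cv))
# ... and the probabilities should agree up to float32 rounding
max_abs_diff = float(np.max(np.abs(pred_tf - pred_cv)))
close = bool(np.allclose(pred_tf, pred_cv, atol=1e-6))

print("same class:", same_class)
print("max abs diff:", max_abs_diff)
print("allclose:", close)
```

A difference this small suggests the preprocessing in the two pipelines matches.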
[References]
[1] Using the DNN module in OpenCV-Python to call a CaffeModel for image recognition
[2] https://github.com/linzhirui1992/Crowd-Counting
[3] https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API