Coraje y Corazón
This blog records the progress of my Master's project: "Efficient Human Re-Identification in Video Sequences".
Current rank-1 accuracy: 94.23%.
Kaggle is a brave new world for me. Sometimes I really miss the programming/project competitions I took part in at my first university; I should have discovered this website earlier. Anyway, it's time to start a new journey.
As a "hello world"-level competition, Data Science London + Scikit-learn is a binary classification task, which let me practice PCA & SVM, as well as GMM and Q-Q plots. Here are my notes from this simple competition.
Final result:
Source Code: here
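For flavor, here is the kind of pipeline this task calls for (a sketch I wrote for illustration, not the actual submitted solution; the data below is a random stand-in and n_components=12 is an arbitrary choice):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Random stand-in for the competition's 40-feature training data
X = np.random.randn(1000, 40)
y = np.random.randint(0, 2, size=1000)

# Reduce dimensionality with PCA, then classify with an RBF-kernel SVM
clf = make_pipeline(PCA(n_components=12), SVC(kernel='rbf'))
clf.fit(X, y)
print(clf.score(X, y))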
These days I have been trying to integrate my re-id network, and I found the official Keras example of Siamese CNNs on the MNIST dataset.
Actually, it is a very good example of how to combine multiple networks, but there are two annoying bugs.
LinHungShi pointed out this bug in this issue. Therefore, the correct contrastive loss function should be:

def contrastive_loss(y_true, y_pred):
    '''Contrastive loss from Hadsell-et-al.'06
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
    '''
    margin = 1
    return K.mean((1 - y_true) * K.square(y_pred) +
                  y_true * K.square(K.maximum(margin - y_pred, 0)))
The original accuracy function in this example is:

def compute_accuracy(predictions, labels):
    '''Compute classification accuracy with a fixed threshold on distances.
    '''
    return labels[predictions.ravel() < 0.5].mean()
Actually, this code cannot calculate the accuracy: the output is just the mean of the labels selected by the thresholded predictions used as an index. For example:

In [1]: import numpy as np

In [2]: def compute_accuracy(predictions, labels):
   ...:     return labels[predictions.ravel() < 0.5].mean()
   ...: a=np.array([1,0,0,1])
   ...: b=np.array([0,1,1,0])
   ...: print(compute_accuracy(a,b))
   ...:
1.0

In [3]: print(a.ravel()<0.5)
[False  True  True False]

In [5]: print(b[a.ravel()<0.5])
[1 1]

If we treat a as the predicted labels, the accuracy should be 0 because the two arrays a and b are completely different, yet the function returns 1.0: it merely calculates the mean of b[a.ravel()<0.5].
I saw a fix that computes the accuracy correctly, but it was later reverted and I don't know why.
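For reference, a minimal sketch of a corrected version (assuming the example's convention that a label of 1 marks a similar pair and that small distances mean "similar"):

import numpy as np

def compute_accuracy(predictions, labels):
    # Predict "same" when the distance is below the threshold,
    # then compare the prediction against the ground-truth label
    pred = (predictions.ravel() < 0.5).astype(int)
    return np.mean(pred == labels)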
Anyway, this sample is a good reference for how to define a custom loss function and how to combine multiple networks into one model.
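As a reminder of how the pieces fit together, here is a condensed sketch of the example's structure (shapes and the base network are simplified by me; contrastive_loss is the function defined above):

from keras import backend as K
from keras.layers import Input, Dense, Lambda
from keras.models import Model, Sequential

def euclidean_distance(vects):
    x, y = vects
    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True),
                            K.epsilon()))

# One shared base network: the same weights process both inputs
base_network = Sequential([Dense(128, activation='relu', input_shape=(784,))])

input_a = Input(shape=(784,))
input_b = Input(shape=(784,))
processed_a = base_network(input_a)
processed_b = base_network(input_b)

# The model's output is the distance between the two embeddings
distance = Lambda(euclidean_distance,
                  output_shape=lambda shapes: (shapes[0][0], 1))(
                  [processed_a, processed_b])

model = Model([input_a, input_b], distance)
model.compile(loss=contrastive_loss, optimizer='rmsprop')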
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
path='/Users/typewind/Desktop/xpy/test.jpg'
img = mpimg.imread(path)
plt.imshow(img)

The output looks like this:
The axes do not look good, so just turn them off with one line:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
path='/Users/typewind/Desktop/xpy/test.jpg'
img = mpimg.imread(path)
# Turn off the axis
plt.axis("off")
plt.imshow(img)
Use the open() function to read the file; each hex digit is read as one character. Here is an example of how to read the first 8 hex digits and convert them into a decimal integer:

def read_8_bit_data(file):
    with open(file) as f:
        # 32-bit signed integer = 8 hex digits
        data_8_bit_hex = f.read(8)
        data_8_bit = int(data_8_bit_hex, 16)
    return data_8_bit
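If the file is raw binary rather than hex text, a similar sketch (my own variant, not from the original post) reads the bytes directly:

import struct

def read_int32(file):
    # Read the first 4 bytes and interpret them as a big-endian
    # 32-bit signed integer (the convention used by MNIST IDX headers)
    with open(file, 'rb') as f:
        return struct.unpack('>i', f.read(4))[0]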
The basic installation tutorial for TF Object Detection uses apt-get to install packages. I tried to replace it with brew, but some packages are missing. Fortunately, we can use apt-get after installing Fink.
Fink provides a helper script that installs everything automatically. The installation took nearly half an hour on my computer; here is the whole process:
Run the script:

~/Desktop/Fink.sh

When the first stage finishes, run:

~/Desktop/Fink.sh

again to continue the Fink installation. The configuration step then starts:

- I'll ask you some questions and update the configuration file in
'/sw/etc/fink.conf'.
Here you just need to press "Enter" again and again to leave everything at its default, then have a coffee and wait a few minutes :)
After the installation is done, remember to restart the terminal and run fink update-all to activate Fink. Now apt-get is available on macOS :D
The basic installation of TensorFlow does not contain the models, so we should run:

# From /tensorflow
git clone https://github.com/tensorflow/models
Then compile the protobuf libraries by running:

# From tensorflow/models
protoc object_detection/protos/*.proto --python_out=.

If protobuf is not installed, run

brew install protobuf

to install it first.
Then add the libraries to the Python path:

# From tensorflow/models/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

From this line, it is obvious that Object Detection uses TF-slim to extract the features.
To test whether the model has been installed successfully, run:

python object_detection/builders/model_builder_test.py

in your environment. If you use VirtualEnv, remember to run this first:

source ~/tensorflow/bin/activate
I'm following the CS20SI course and found something that is perhaps useless, but interesting.
The definition of tf.ones_like is:

ones_like(
    tensor,
    dtype=None,
    name=None,
    optimize=True
)
For example, given a tensor a:

a=tf.constant([[1,2,3],[4,5,6]])
tf.ones_like returns a tensor with the same shape as a but with all values set to 1:

b=tf.ones_like(a, tf.int16)
The output of b should be:

[[1 1 1]
 [1 1 1]]
By accident, I found that we can use numbers instead of the data type and get the same output, for example:

>>> b=tf.ones_like(a,1)
By checking the core code of TensorFlow, the proto enum of data types starts as shown below:

DT_FLOAT = 1;
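For reference, the surrounding entries in tensorflow/core/framework/types.proto are (from the TF 1.x source; worth double-checking against your version):

DT_FLOAT = 1;
DT_DOUBLE = 2;
DT_INT32 = 3;
DT_UINT8 = 4;
DT_INT16 = 5;
DT_INT8 = 6;
DT_STRING = 7;
DT_COMPLEX64 = 8;
DT_INT64 = 9;
DT_BOOL = 10;
// ...
DT_COMPLEX128 = 18;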
That's why we can pass certain constant numbers as the dtype. Notice that the definition of tf.ones_like's dtype is:
dtype: A type for the returned Tensor. Must be float32, float64, int8, int16, int32, int64, uint8, complex64, complex128 or bool.
Thus, the available numbers are 1, 2, 3, 4, 5, 6, 8, 9, 10 and 18 (7 is DT_STRING, which is not allowed).
The dtype string can be replaced by its enum value directly for convenience, but the code becomes harder to understand.
Instance transfer means reusing the same instance, but with a different weight in the target dataset.
For example, if an instance appears in both the source set and the target set, we can calculate the similarity between its source label and its target label and assign it a new weight in the target set.[1]
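As a toy sketch of this idea (entirely my own illustration; the 0.5 down-weighting factor is an arbitrary choice):

import numpy as np

def reweight_shared_instances(source_labels, target_labels):
    # Instances whose source and target labels agree keep full weight;
    # disagreeing instances are down-weighted in the target set
    agree = np.asarray(source_labels) == np.asarray(target_labels)
    return np.where(agree, 1.0, 0.5)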
The general idea of feature representation transfer is to reduce the difference between the source and target feature spaces. This can be done by mapping (copying :P) the feature space from the source data to the target data.
Because a DNN/CNN has a modular structure, we obtain a feature representation for each layer by pre-training, then use the high-level features to classify the target. So we can transfer/reuse the low-level features from the source dataset (e.g. ImageNet) and then tune the hyper-parameters of our own network, as sketched below.
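A minimal Keras sketch of this recipe (my own illustration; the choice of VGG16, the frozen/trainable split, and the 10-class head are all assumptions, not a prescribed setup):

from keras.applications import VGG16
from keras.layers import Dense, Flatten
from keras.models import Model

# Low-level features pre-trained on the source dataset (ImageNet)
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False   # freeze the transferred layers

# New high-level layers trained on the target dataset
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
out = Dense(10, activation='softmax')(x)  # assumed 10 target classes

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy')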
The key to parameter transfer is to find the common prior knowledge between the source and target data. It can be applied to SVM, K-means, etc.
Relational knowledge transfer is used to transfer relational knowledge between graphs or networks.
The rank-r matching rate is the percentage of the p probe images whose correct match is found within the top r ranked candidates from the gallery.
The rank-1 matching rate is equal to the correct matching/recognition rate.
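A minimal numpy sketch of this metric (my own illustration; dist is an assumed p-by-g matrix of distances between probe and gallery features):

import numpy as np

def rank_r_rate(dist, probe_ids, gallery_ids, r):
    gallery_ids = np.asarray(gallery_ids)
    # Rank the gallery for each probe by ascending distance,
    # then check whether the true identity is in the top r
    order = np.argsort(dist, axis=1)
    hits = sum(probe_ids[i] in gallery_ids[order[i, :r]]
               for i in range(len(probe_ids)))
    return hits / len(probe_ids)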