A Physicist who programs

Monday, June 12, 2017

Connect C++ to Python

Background

To support features such as function overloading and templates, C++ uses name mangling. However, the C++ standard does not specify how mangling is done, so every compiler vendor is free to do it in its own way. This makes it hard to connect C++ with other languages.
With this in mind, I always thought the only way to connect C++ with other languages was through a C interface. That was until I saw Boost.Python.
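To make the "C interface" route concrete, here is a minimal sketch of how a C++ class can be hidden behind unmangled C functions. The class `Greeter` and the functions `greeter_create`, `greeter_name_length` and `greeter_destroy` are hypothetical names made up for this illustration; they do not come from any real library.

```cpp
#include <cassert>
#include <string>
#include <utility>

// An ordinary C++ class. Its member functions get compiler-specific mangled names.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    int NameLength() const { return static_cast<int>(name_.size()); }
private:
    std::string name_;
};

// The C interface. extern "C" turns off name mangling for these functions,
// so they are exported under exactly the names written here.
extern "C" {
    // Creates a Greeter and hands it out as an opaque handle.
    void* greeter_create(const char* name) {
        assert(name != nullptr);
        return new Greeter(name);
    }

    // Forwards the call to the C++ object behind the handle.
    int greeter_name_length(const void* handle) {
        assert(handle != nullptr);
        return static_cast<const Greeter*>(handle)->NameLength();
    }

    // Releases the object created by greeter_create.
    void greeter_destroy(void* handle) {
        delete static_cast<Greeter*>(handle);
    }
}
```

Built as a shared library, these three symbols can be looked up by name from another language's foreign-function loader. The price is that every class needs such hand-written wrappers, and C++ exceptions must not escape through them.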

Boost.Python

This library provides a convenient way to build such a connection. With it, you don't even need to change your C++ code at all (it is non-intrusive; see its Quick Start for more detail). However, more advanced features such as polymorphism and templates take some extra effort to make work...

Saturday, April 1, 2017

Why is the traditional classifier better than the neural network?

Overview

This article gives my interpretation of why a traditional classifier can surpass the (fully-connected) neural network when the input features are given.

Introduction

To recognize certain objects in an image, a lot of research uses a Convolutional Neural Network (CNN) to extract features and then feeds those features to a traditional classifier to predict the result. This still holds even when the data sets are large enough (i.e. the effect of overfitting can be excluded).
This is hard to believe, though, since the CNN is trained for exactly this purpose and the last layer (such as the fully-connected layer or the softmax layer) is optimized together with the others as a whole. Namely, during training, every layer adjusts its parameters so that the network gives the best result. The earlier layers therefore learn to extract the features that work best with the last layer, which performs the classification, and the last layer is tuned specifically to accept those features and classify them.
However, plenty of papers use a traditional classifier to give the final results. Even the SVM (which is regarded as the simplest classifier, and whose basic version can only separate data linearly) is considered better than the fully-connected layer. Besides overfitting (too few data compared to the number of parameters to fit), the following is the interpretation I propose.

Interpretation

Suppose we use a 2D feature space to describe a system:
1. If the distribution of data looks like Fig. 1(a), it is very simple to separate it into two classes.
(All of the following graphs were computed with libsvm.)

Fig. 1 (a)

(b)

2. If the distribution of data looks like Fig. 2(a), it still makes sense.

Fig. 2 (a)

(b)

For example, suppose the horizontal axis represents the BMI of students and the vertical axis their sports performance. This distribution can also be perfectly separated by an SVM if a suitable transformation is applied to the axes. For example, if you transform the BMI into a score that represents the health of the human body, the graph will look like Fig. 1 (which means a "health score" is better suited to describing such a system).

3. If the distribution of data looks like Fig. 3(a), some red dots that are surrounded by blue dots may be noise (i.e. data that is mislabeled). You want your algorithm to separate the data and give a result like Fig. 3(b). However, if your algorithm makes a fuss about such noise, the result will look like Fig. 3(c), which drags down the performance.

Fig. 3 (a)

(b)

(c)

4. If the distribution of data looks like Fig. 4(a), this feature space may make no sense, in my opinion. It probably means the feature space is not adequate to describe the system. If your algorithm tries hard to separate the data into many small groups, it is probably doing it wrong.

Fig. 4 (a)

(b)

In this case, you should probably abandon one of the features (or even both of them). Or find new features that describe the system well by placing some restrictions on your model (such as regularization). This may also explain why clustering can't give a better result than SVM: although clustering has quite a few parameters (compared to the input data), trying to separate every data point well is meaningless.

Now, let us go back to a high-dimensional feature space. If your algorithm has many parameters and can fit many circumstances (such as the fully-connected neural network), it may get into trouble whenever a data point is noise or some of the features describe the system poorly. In contrast, an algorithm that can ignore some noise and give less weight to features that do not help classify objects may give a better result.
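The BMI example from item 2 can be sketched in a few lines. This is only an illustration under my own assumptions: the "health score" is taken to be the negative distance from an assumed ideal BMI of 22, the classifier is a plain threshold rather than a real SVM, and `Sample`, `HealthScore`, `ClassifyByThreshold` and `CountCorrect` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One student: BMI plus a class label (+1 or -1).
struct Sample {
    double bmi;
    int label;
};

// Hypothetical transformation: the closer the BMI is to the assumed ideal
// value, the higher (closer to zero) the score.
double HealthScore(double bmi, double idealBmi = 22.0) {
    return -std::fabs(bmi - idealBmi);
}

// A one-dimensional linear classifier: a single threshold on the feature.
int ClassifyByThreshold(double feature, double threshold) {
    return feature >= threshold ? +1 : -1;
}

// Counts how many samples the threshold classifier labels correctly,
// using either the raw BMI or the transformed health score as the feature.
int CountCorrect(const std::vector<Sample>& samples, double threshold,
                 bool useHealthScore) {
    assert(!samples.empty());
    int correct = 0;
    for (const Sample& s : samples) {
        const double feature = useHealthScore ? HealthScore(s.bmi) : s.bmi;
        if (ClassifyByThreshold(feature, threshold) == s.label) {
            ++correct;
        }
    }
    return correct;
}
```

If the positive class sits in a middle band of BMI values, no single threshold on the raw BMI can get every sample right, while a single threshold on the health score can. That is the sense in which the transformed axis describes the system better.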

Multi-tasks in Convolutional Neural Network (CNN) - the reuse of the extracted features

Overview

This article discusses one approach to performing multiple tasks in computer vision: extracting the features from a CNN and sending those features to traditional classification methods (such as SVM, Joint Bayesian, etc.). It also reports the range of accuracy, so that readers can judge whether this approach satisfies their own purposes.

Motivation

In some circumstances, we want more than one task to be executed. For example, suppose we want to develop an app that recommends different products to different consumer groups by examining age, gender, whether the person wears glasses, etc., and judges their response to the advertisement by examining their facial expressions. It seems that multiple CNNs would have to be executed at the same time.
However, passing an image through even one CNN is time-consuming, not to mention that there are so many tasks to perform. There should be other alternatives.

Introduction

As shown below, a Convolutional Neural Network (CNN) is usually composed of several convolution layers at the front of the network and fully-connected layers at its end.

The earlier part of the network is believed to perform feature extraction (i.e. to extract the characteristics of the image), while the later layers, especially the fully-connected layers, are believed to classify the image using the previously derived features.
Given these characteristics of a CNN, it is straightforward to replace the classification part (i.e. the fully-connected layer) with a traditional classifier (such as SVM), for its efficiency and even its accuracy (see this article).
In this article, we approach multi-tasking by training a CNN for one task and performing the other tasks by extracting features from that CNN and feeding them to a traditional classifier. We believe that the features used by one task may be close to the features needed by another if the two tasks are similar.

CNN + traditional classifiers

In our case, we have at least three tasks to perform: to infer whether the user is smiling, whether the user is wearing glasses, and the user's age. We start from the DEX model (which is designed for age estimation), take features from the first fully-connected layer (so that the number of features is large enough for further processing), and finally feed those features to a traditional classifier (SVM in our case).
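The idea of sharing one feature vector between several classifiers can be sketched as follows. This is not our actual pipeline: it assumes a linear kernel, skips feature extraction and training entirely, and the names `TaskHead`, `LinearSvmDecision` and `PredictPositive` are made up for the illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Decision value of a linear SVM: f(x) = w . x + b.
double LinearSvmDecision(const std::vector<double>& features,
                         const std::vector<double>& weights, double bias) {
    assert(features.size() == weights.size());
    return std::inner_product(features.begin(), features.end(),
                              weights.begin(), bias);
}

// One trained classifier per task (e.g. one for smile, one for glasses).
struct TaskHead {
    std::vector<double> weights;
    double bias;
};

// Binary prediction for one task on features that were extracted once.
bool PredictPositive(const std::vector<double>& features, const TaskHead& head) {
    return LinearSvmDecision(features, head.weights, head.bias) > 0.0;
}
```

The image goes through the CNN only once, and each additional task costs just one dot product over the shared features. A task with more than two classes would need several such heads (for example one per class) or a multi-class SVM.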

Result

Age

Since we do not re-train the CNN, the error for age should remain the same (i.e. from its paper: MAE = 3.221).
-------------------------------------------------------------------------------------------------------------------------------------------------------------

At the beginning of this project, our team had a heated discussion about whether the Smile task would be more accurate than the Glasses task. One side said: since glasses are more apparent than any other texture on a human face, the Glasses task should get the better result. The other side said: since the DEX model is trained to estimate human age, it might tend to extract only the features related to age (e.g. wrinkles), and these features are more suitable for judging a smile.

Smile

In this task, we output only two results: Smile or Not-Smile. We collected about 3000 training samples for each class to train the SVM.
The accuracy is around 88%, which could be better if we trained a model specifically for detecting smiles. However, this accuracy is sufficient for our purpose: some mistakes are tolerable if it means users do not have to wait.

Glasses

In this task, we output three results: No Glasses, Glasses, and Sun Glasses. We also collected around 3000 training samples for each class.
The accuracy is only around 84%. Looks like the Smile task wins! As a brief after-the-fact interpretation, this might be because the DEX model is trained to predict age. The later parts of the network might therefore tend to ignore some features that belong to the glasses, since wearing glasses does not change a person's age.
Furthermore, the Glasses result may get better if we extract features from earlier layers. However, the features of the earlier layers are much larger (10 times as many features as we used), which would drag down the efficiency.

Conclusions

It is possible to reuse features extracted by a CNN that was trained for another purpose. The accuracy is around 80%~90%, depending on how close the current task is to the original task (for which the CNN was designed).

Tuesday, March 21, 2017

Garbled Chinese text in Qt

Problem

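If you type Chinese directly in Qt, you will often run into garbled text. I tried quite a few methods found online and they all failed, so this post summarizes how to fix the garbling.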

Solution

Scenario 1

You want to write Chinese in the source code (*.cpp), but it shows up garbled when it is displayed in the UI.
For example:

QString newInfo("你好");
this->displayPanel->myQLabel->setText(newInfo);

For the solution, see this article.

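Scenario 2

Suppose there is a precompiled library (say it takes an employee ID and returns the employee's name) whose output type is std::string, encoded in BIG-5 (everything below assumes a Windows operating system). To convert that output to a QString, do the following: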

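#include <string>
#include <QString>

namespace QtExtensions
{
    // Converts a BIG-5 encoded std::string into a QString.
    // fromLocal8Bit() decodes with the system's local 8-bit encoding, so this
    // relies on that encoding being BIG-5 (e.g. Traditional Chinese Windows).
    QString ConvertToUTF8(const std::string& big5FormatString_)
    {
        return QString::fromLocal8Bit(big5FormatString_.c_str());
    }
}//End of namespace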

And to use it:

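// Provided by the precompiled library; returns the name encoded in BIG-5.
std::string GetStaffName(const std::string& staffID_);

int main()
{
    std::string staffNameInBig5Format = ::GetStaffName("01234567");
    QString staffNameInUTF8 = QtExtensions::ConvertToUTF8(staffNameInBig5Format);
    // staffInfoWindow is the application's window object, created elsewhere.
    staffInfoWindow.SetStaffName(staffNameInUTF8);
}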

Can't create a 64-bit virtual machine on VirtualBox

Problem

I have an exported 64-bit Ubuntu 14.10 image that runs on VirtualBox. After I upgraded my host OS (from Win7 to Win10), that exported Ubuntu crashed whenever I tried to boot it.
Furthermore, when I wanted to create a new virtual machine, there were no "64-bit" options. I know this issue! Normally, it can be solved by enabling the virtualization settings in the BIOS. However, I found that the virtualization settings were already enabled...

Solution

After several searches, I found this article. The cause of the problem is that only one virtual machine manager (hypervisor) can be active at a time, but we have two: one is VirtualBox, the other is Hyper-V, which is used by Microsoft (I still wonder whether it has to do with the virtual machine of my HoloLens?). Although I couldn't change the settings the way that article did (my OS is Win10), the comments on the article provide a way to turn off Hyper-V:

bcdedit /set hypervisorlaunchtype off

After typing the above command in the command line and rebooting, VirtualBox could create the 64-bit OS and run it successfully. However, I found that I had to type the command every time I rebooted my computer. Finally, I solved this by removing the "Android-" something that relates to Visual Studio...

No picture on display (black screen) after Updating Win10

Problem

After the first update of Win10 (Professional, maybe), I got no picture on the display (even after waiting for a long time), although the computer still seemed to be running. By forcing the computer to reboot (usually I needed to reboot many times), it would recover to its previous state (i.e. I could use my computer again, but the update did not seem to be applied at all...). This bothered me for a long time...

Solution

Recently, I couldn't stand it anymore and began to face the problem. Finally, I found that it might be caused by a driver issue with my graphics card (the Win7 driver not being compatible with Win10?). Anyway, after removing the graphics card and rebooting, I saw the update procedure run. When it finished, I put my graphics card back, and everything works fine.

The History of my Computer

I started from Win7 and installed Win10 Home (bought from a vendor). After that, I bought Win10 Professional online (to enable development for HoloLens).

My Hardware

CPU: Intel(R) i7-4790K
Motherboard: GIGABYTE Z97MX-Gaming 5
Graphics Card: MSI GTX780Ti

Monday, November 7, 2016

Taiwan ancient buildings recognition by Convolutional Neural Network

Overview

In this article, AlexNet is applied to recognize 4 different types of ancient buildings in Taiwan.

Motivation

I'm new to Deep Learning, so I wanted to do some exercises to warm up. And since it is interesting and meaningful to combine technology with culture, I applied AlexNet to recognize 4 different types of ancient buildings in Taiwan.

Introduction

The ancient buildings in Taiwan can be categorized into 4 types:

1. The traditional Chinese buildings: Many Taiwanese are descendants of immigrants who came from China about a hundred years ago. Therefore, many old buildings are ancient Chinese buildings. The following figure shows a traditional residence.

2. The traditional Japanese buildings: Taiwan was under Japanese colonial rule for about half a century. Therefore, many traditional Japanese buildings remain. The following figure shows the Japanese temple located in Taoyuan.

3. The baroque-style buildings: During the Japanese colonial period, several government buildings were designed in the Japanese Baroque style. Buildings of this kind were popular in Japan during the Meiji period. Thus, this category is labeled as Japanese modern in the present work. The most famous building of this style might be the Presidential Palace of Taiwan, shown as follows.

4. The modern Taiwan buildings: The word "modern" here refers to buildings constructed during the Early Revival Period. I'm not quite a fan of these kinds of buildings. However, they are still in everyone's childhood memories. Thus, these buildings are also recognized by the network.

Data Sets

Collecting data is very time-consuming. And since there are only a limited number of historic buildings in Taiwan, and not every building of these kinds has been posted to the internet, I quickly ran out of data to train the net. Therefore, some buildings located in China and Japan are also included.
In the end, I spent a whole day collecting all of the following data! However, the training set still seems too small to train AlexNet...
The collected data are listed below (25% of these data are used for validation):
Traditional Chinese Buildings: 159
Traditional Japanese Buildings: 152
Modern Japanese Buildings: 169
Modern Taiwan Buildings: 119
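
As a quick check of how small the data set is, the arithmetic behind the 25% split can be written out. This is a sketch only: rounding the validation count down is my assumption here, and `TotalImages` and `ValidationCount` are hypothetical helper names.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Sums the per-class image counts.
int TotalImages(const std::vector<int>& perClass) {
    return std::accumulate(perClass.begin(), perClass.end(), 0);
}

// Number of images held out for validation, rounded down (an assumption).
int ValidationCount(int total, int percent) {
    assert(total >= 0 && percent >= 0 && percent <= 100);
    return total * percent / 100;
}
```

With the four class counts listed above, the total is 599 images, so roughly 150 are held out for validation and about 450 remain for training.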

Result

The training result is shown as follows. After 30 epochs, the final accuracy (validation set) is 66%.

Several results in the test set are shown below:

Although the training data are quite insufficient, the result seems acceptable: about 2/3 of the buildings are classified correctly. It might perform better if more data are included. Also, AlexNet seems to be capable of solving problems like this. Therefore, if one wants to improve the predictions, it is more reasonable to collect more data than to design a more complex neural network.
