Recently I have spent a lot of time working on the Kaggle digit recognizer competition and finally reached an accuracy above 0.99. I am quite happy with it and would like to share how I did it. Basically, I used TensorFlow to build a neural network with these ‘highlights’:
- three hidden layers, with dropout between layers, but no convolution
- a training set 25 times larger than the original – generated by nudging the original training images up, down, left, and right by 1 pixel
- an exponentially decaying learning rate
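To make the dropout and learning-rate points concrete, here is a minimal NumPy sketch of a hidden layer with (inverted) dropout and of the exponential-decay schedule. In TensorFlow 1.x these correspond to `tf.nn.dropout` and `tf.train.exponential_decay`; the layer sizes and rates below are illustrative, not the ones from my actual code.

```python
import numpy as np

rng = np.random.RandomState(0)

def hidden_layer(x, w, b, keep_prob, training=True):
    """Fully connected layer with ReLU and inverted dropout.
    During training each unit is kept with probability `keep_prob`
    and scaled by 1/keep_prob so the expected activation is unchanged."""
    h = np.maximum(x @ w + b, 0.0)  # ReLU
    if training:
        mask = rng.binomial(1, keep_prob, size=h.shape)
        h = h * mask / keep_prob
    return h

def exponential_decay(base_lr, step, decay_steps, decay_rate):
    """Learning rate after `step` steps, multiplied by `decay_rate`
    once every `decay_steps` steps -- the formula behind
    tf.train.exponential_decay."""
    return base_lr * decay_rate ** (step / decay_steps)
```

At prediction time you would call the layer with `training=False` (or `keep_prob=1.0`), which disables dropout entirely.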
You can find the code here.
Unlike some other scripts on Kaggle.com, such as this one and this one, my neural network does not use convolution, mainly because I do not have a GPU and do not want to pay for AWS… Still, I think my network did just as good a job as those two.
I learned a lot about machine learning through this entry-level Kaggle competition. The biggest lesson is that picking the right model matters more than fine-tuning the parameters of a model. As with any other task, finding the right tool is always the first step. Before I decided to use a neural network, I had already tried several other models, including logistic regression, SVM, and k-nearest neighbors, but the accuracy never went above 0.97, no matter how hard I tried to fine-tune the model parameters.
The second biggest lesson is that a larger training set really does improve a neural network’s accuracy. I adopted the ‘nudging images’ idea from an example on scikit-learn.org. I really like this idea and will probably keep using it in other projects.
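The nudging trick can be sketched in a few lines of NumPy. Note that a single pass over the four one-pixel shifts yields a 5× set (original plus four copies); reaching the 25× figure presumably involves combining shifts, and the exact scheme is in my code rather than shown here. The 28×28 image shape is MNIST's, and the function name is mine.

```python
import numpy as np

def nudge_images(images, labels):
    """Expand a data set of (n, 28, 28) images by shifting each image
    one pixel up, down, left, and right, filling exposed edges with
    zeros (np.roll is avoided because it wraps pixels around)."""
    shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # (dy, dx)
    expanded = [images]
    for dy, dx in shifts:
        shifted = np.zeros_like(images)
        src_y = slice(max(0, -dy), 28 - max(0, dy))
        dst_y = slice(max(0, dy), 28 - max(0, -dy))
        src_x = slice(max(0, -dx), 28 - max(0, dx))
        dst_x = slice(max(0, dx), 28 - max(0, -dx))
        shifted[:, dst_y, dst_x] = images[:, src_y, src_x]
        expanded.append(shifted)
    # each shifted copy keeps the label of its source image
    return np.concatenate(expanded), np.tile(labels, len(shifts) + 1)
```

The scikit-learn example achieves the same effect with `scipy.ndimage.shift`, which also fills the vacated edge with a constant value.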
Last but not least, I realized that it is not good to work entirely on your own. I need to read about what other people have done, talk to different people (even machine learning laymen), and listen to others’ criticism whenever they are kind enough to offer it. So, what is your criticism of this neural network?