Challenge: Large number of output classes leads to memory issues.
A large number of output classes (more than 20,000) leads to large matrices, which in turn leads to memory issues during learning (16 GB on my laptop is not enough).
Solution: Split classification into subtasks.
Train one neural network (NN) for country prediction and, for every country, a separate NN for exact coin prediction.
Prediction works in the following way:
- the country is predicted from the input coin photo
- the matrix for the predicted country is loaded, with output labels for that country only
- exact coin classification is done using the country-specific matrix
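The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `predict_coin`, `load_country_model`, and the toy models are hypothetical names, and a "model" is assumed to be any callable that maps an image to a list of class scores.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]                    # assumption: a flat list of pixel values
Model = Callable[[Image], List[float]]     # assumption: image -> one score per class


def argmax(scores: List[float]) -> int:
    """Index of the highest score (first one wins on ties)."""
    return max(range(len(scores)), key=scores.__getitem__)


def predict_coin(
    image: Image,
    country_model: Model,
    country_labels: List[str],
    load_country_model: Callable[[str], Tuple[Model, List[str]]],
) -> Tuple[str, str]:
    # Stage 1: predict the country from the photo.
    country = country_labels[argmax(country_model(image))]
    # Stage 2: load only that country's model and output labels,
    # so the full 20,000-class matrix is never held in memory.
    coin_model, coin_labels = load_country_model(country)
    coin = coin_labels[argmax(coin_model(image))]
    return country, coin


# Toy stand-ins so the sketch runs end to end (not real networks).
def toy_country_model(image: Image) -> List[float]:
    mean = sum(image) / len(image)
    return [mean, 1.0 - mean]


def toy_loader(country: str) -> Tuple[Model, List[str]]:
    labels = {"A": ["A-coin-1", "A-coin-2"], "B": ["B-coin-1"]}[country]
    # This fake model always gives the last label the highest score.
    model = lambda image: [float(i == len(labels) - 1) for i in range(len(labels))]
    return model, labels


country, coin = predict_coin([0.9, 0.8], toy_country_model, ["A", "B"], toy_loader)
print(country, coin)
```

In a real setup, `load_country_model` would read the country-specific weights from disk on demand, which is what keeps the memory footprint bounded by the largest single country.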
There are about 500 countries/regions in the database, and each of them has fewer than 1,000 coins assigned, so implementing NNs with 1,000 output labels is feasible within the current 'memory/computation budget'.
Drawback: Many NNs are created (approximately 500) instead of one large network.
Challenge: Not much labeled data.
Only 10% of coins have 5 or more photos. 70% of coins have only one photo.
A convolutional front end is used, together with data augmentation: for every photo, new images are created with tiny rotations and brightness adjustments.
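The augmentation step could look something like the sketch below. It is only an illustration under stated assumptions: an image is a 2D list of grayscale values in [0, 1], the angle and brightness ranges are made-up example values, and the function names are hypothetical. A real pipeline would more likely use an image library.

```python
import math
from typing import List

Image = List[List[float]]  # assumption: grayscale pixels in [0, 1]


def adjust_brightness(image: Image, delta: float) -> Image:
    """Add delta to every pixel, clamping the result to [0, 1]."""
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]


def rotate(image: Image, degrees: float, fill: float = 0.0) -> Image:
    """Rotate around the image centre using nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Inverse mapping: find the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            sx = int(round(cos_a * dx + sin_a * dy + cx))
            sy = int(round(-sin_a * dx + cos_a * dy + cy))
            inside = 0 <= sy < h and 0 <= sx < w
            row.append(image[sy][sx] if inside else fill)
        out.append(row)
    return out


def augment(image: Image,
            angles=(-5.0, 0.0, 5.0),
            deltas=(-0.1, 0.0, 0.1)) -> List[Image]:
    """Return rotated/brightened variants, skipping the unchanged original."""
    variants = []
    for angle in angles:
        for delta in deltas:
            if angle == 0.0 and delta == 0.0:
                continue
            variants.append(adjust_brightness(rotate(image, angle), delta))
    return variants


photo = [[0.2, 0.4, 0.6], [0.1, 0.5, 0.9], [0.3, 0.3, 0.3]]
print(len(augment(photo)), "augmented variants")
```

With the default example ranges, each photo yields eight extra training images, which helps most for the coins that have only a single photo.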
Challenge: gut feeling vs empirical tests.
There are a lot of meta parameters in a neural network (patch size, number of features, ...). Initially they were set by feel, on the assumption that those values would be the best ones.
Although performance was not ideal, I followed this 'gut feeling' for a few months.
After tuning these parameters on a cross-validation set, performance became significantly better (20% better prediction accuracy and 3 times faster training).
Tune NN parameters using empirical tests on a cross-validation set.
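One simple way to run such empirical tests is a grid search scored on the cross-validation set. The sketch below assumes a hypothetical `train_and_score(params)` function that trains an NN with the given parameters and returns its accuracy on the cross-validation set; the `toy_score` stand-in and the grid values are invented purely so the example runs.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple


def grid_search(
    param_grid: Dict[str, List[int]],
    train_and_score: Callable[[Dict[str, int]], float],
) -> Tuple[Dict[str, int], float]:
    """Try every parameter combination and keep the best validation score."""
    best_params, best_score = None, float("-inf")
    for values in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), values))
        score = train_and_score(params)  # accuracy on the cross-validation set
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy stand-in for real training: it peaks at patch_size=5, num_features=32.
# These numbers are illustrative only, not measurements from the project.
def toy_score(params: Dict[str, int]) -> float:
    return 1.0 - (abs(params["patch_size"] - 5) / 10
                  + abs(params["num_features"] - 32) / 64)


grid = {"patch_size": [3, 5, 7], "num_features": [16, 32, 64]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

The important part is that the score comes from data the network was not trained on, so the choice reflects measured performance rather than intuition.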
Some notes for the project
Number of output classes (coins)
more than 20,000. Exact number check here
Processing takes about 10 seconds per photo. The functionality runs as a background job/service.
Accuracy is about 40% for the top prediction and 75% for the top 3 predictions. The "top 3 predictions" mode is good enough to run the NN in pre-approval mode (the NN makes a prediction, which is then approved by a human).
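For reference, top-1 and top-3 accuracy can be computed as in the sketch below. The function name and the tiny score table are hypothetical and only show how the two numbers are defined; they are not the project's data.

```python
from typing import List


def top_k_accuracy(score_rows: List[List[float]],
                   true_labels: List[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Made-up scores for 4 photos over 4 classes, with their true class indices.
scores = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.2, 0.3, 0.4],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.5, 0.3, 0.1],
]
labels = [0, 1, 3, 2]

print("top-1:", top_k_accuracy(scores, labels, 1))
print("top-3:", top_k_accuracy(scores, labels, 3))
```

In pre-approval mode, the human reviewer would be shown the three highest-ranked coins and pick the correct one.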