Debugging Neural Networks: A Checklist
You’ve framed your problem, prepared your datasets, designed your models and revved up your GPUs. With bated breath, you start training your neural network, hoping to return in a few days to great results.
When you do return, though, you find yourself faced with a very different picture. Your network seems to do no better than random guessing. Or, if it’s a classification model, it has curiously learned to assign every input to a single dominant category. You scratch your head wondering what went wrong, and hit a wall. What’s more, since you’re programming at a high level of abstraction, you have no intuitive sense of what’s going on inside your matrices and activation functions.
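Before digging deeper, it’s worth confirming that this is what’s actually happening. A minimal sketch of that sanity check, assuming NumPy and illustrative label/prediction arrays (the data here is made up for demonstration): compare your model’s accuracy to a majority-class baseline, and check whether its predictions have collapsed to one class.

```python
import numpy as np

def majority_baseline_accuracy(labels):
    """Accuracy achieved by always predicting the most frequent class."""
    counts = np.bincount(np.asarray(labels))
    return counts.max() / counts.sum()

def collapsed_to_one_class(predictions):
    """True if the model emits the same class for every input."""
    return len(np.unique(predictions)) == 1

# Hypothetical labels and model predictions, for illustration only:
labels = np.array([0, 0, 0, 0, 1, 1, 0, 0, 1, 0])   # 70% class 0
preds  = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])   # model always says "0"

baseline = majority_baseline_accuracy(labels)
accuracy = (preds == labels).mean()

print(collapsed_to_one_class(preds))   # True  -> red flag
print(accuracy <= baseline)            # True  -> no better than the trivial baseline
```

If accuracy sits at or below the baseline while predictions are stuck on one class, the model hasn’t learned anything beyond the class prior, and the checklist below applies.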
This isn’t a problem faced only by beginners. It happens even to the most experienced among us, especially as the complexity of the model, the dataset, or the underlying problem grows. So if you find yourself in this situation, don’t fret. To help, we’ve put together a short checklist that might show you a way out of this hole. It was written specifically in the context of image classification, but the advice is general enough to apply to most types of networks.
Model Architecture & Initialization
If, despite all of these steps, your network still doesn’t look like it’s headed in the right direction, then you ought to look for more fundamental errors in your code or in how your problem is framed. On the other hand, if you’ve now resolved your problems, congratulations! Start tuning your hyper-parameters and tweaking your model, and you’ll be on your way.
Written by Govind Chandrasekhar and the Semantics3 Team in Bengaluru, Singapore, and San Francisco
Published on October 9, 2016