
Note on using Apex with multiple models and one optimizer.

This post is just a quick note on how to use NVIDIA's Apex for PyTorch with multiple models that share a single optimizer.

I am currently working with a classifier that uses a pre-trained backbone feature extractor (which needs to be fine-tuned as well). I could encapsulate them into one PyTorch nn.Module, but for my own reasons I want to make ad-hoc modifications to the features in between. In my training loop, it looks like this:
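
A minimal sketch of such a loop step, assuming a toy backbone, a toy classifier head, and a placeholder feature modification. The names `backbone`, `classifier`, `modify_features`, and `forward_step`, along with the layer sizes, are hypothetical and only serve to illustrate the structure:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the pre-trained backbone and the classifier head.
backbone = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
classifier = nn.Linear(8, 3)
criterion = nn.CrossEntropyLoss()


def modify_features(features):
    # Placeholder for the ad-hoc modification (assumed here: L2 normalization).
    return nn.functional.normalize(features, dim=1)


def forward_step(inputs, targets):
    features = backbone(inputs)           # 1. extract features
    features = modify_features(features)  # 2. modify them outside any nn.Module
    logits = classifier(features)         # 3. classify the modified features
    return criterion(logits, targets)


# Random data in place of a real data loader.
inputs = torch.randn(4, 16)
targets = torch.randint(0, 3, (4,))

loss = forward_step(inputs, targets)
loss.backward()  # gradients flow into both the backbone and the classifier
```

Because the modification sits between two separate modules, neither module's `parameters()` alone covers everything that needs to be trained.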

So the problem is how to set up one optimizer for these two separate nn.Modules, and how to initialize this two-models, one-optimizer combination with Apex.

The solution is quite simple:
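
A minimal sketch of one way to do it, under stated assumptions: chain the parameters of both modules into a single optimizer, then pass the models as a list to `amp.initialize`, which returns them as a list in the same order. The toy modules, the SGD settings, and `opt_level="O1"` are illustrative choices, not necessarily the ones used in my setup. The `HAVE_APEX` fallback is also an assumption, added only so the snippet still runs in plain fp32 on a machine without Apex or CUDA:

```python
import itertools

import torch
import torch.nn as nn

try:
    from apex import amp  # NVIDIA Apex mixed-precision utilities
    HAVE_APEX = torch.cuda.is_available()  # amp needs CUDA tensors
except ImportError:
    amp = None
    HAVE_APEX = False

device = "cuda" if HAVE_APEX else "cpu"

# Hypothetical stand-ins for the backbone and the classifier head.
backbone = nn.Sequential(nn.Linear(16, 8), nn.ReLU()).to(device)
classifier = nn.Linear(8, 3).to(device)

# One optimizer over the parameters of both modules.
optimizer = torch.optim.SGD(
    itertools.chain(backbone.parameters(), classifier.parameters()),
    lr=0.01,
    momentum=0.9,
)

if HAVE_APEX:
    # Passing a list of models returns a list of patched models.
    [backbone, classifier], optimizer = amp.initialize(
        [backbone, classifier], optimizer, opt_level="O1"
    )

# One training step on random data.
inputs = torch.randn(4, 16, device=device)
targets = torch.randint(0, 3, (4,), device=device)

optimizer.zero_grad()
loss = nn.functional.cross_entropy(classifier(backbone(inputs)), targets)
if HAVE_APEX:
    # Scale the loss so fp16 gradients do not underflow.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
else:
    loss.backward()
optimizer.step()  # updates both modules at once
```

If the backbone should be fine-tuned with a smaller learning rate than the classifier, the optimizer can instead take a list of parameter groups, e.g. `[{"params": backbone.parameters(), "lr": 1e-4}, {"params": classifier.parameters()}]`; it is still a single optimizer from Apex's point of view.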