STR interview question

If the gradients in a deep learning model are very small, what could you do to address that?