#pondernet #deepmind #machinelearning Humans don’t spend the same amount of mental effort on all problems equally. Instead, we respond quickly to easy tasks, and we take our time to deliberate hard tasks. DeepMind’s PonderNet attempts to achieve the same by dynamically deciding how many computation steps to allocate to any single input sample. This is done via a recurrent architecture and a trainable function that computes a halting probability. The resulting model performs well in dynamic computation tasks and is surprisingly robust to different hyperparameter settings. OUTLINE: 0:00 - Intro & Overview 2:30 - Problem Statement 8:00 - Probabilistic formulation of dynamic halting 14:40 - Training via unrolling 22:30 - Loss function and regularization of the halting distribution 27:35 - Experimental Results 37:10 - Sensitivity to hyperparameter choice 41:15 - Discussion, Conclusion, Broader Impact Paper: Abstract: In standard neural networks the amount of computation used gr
Hide player controls
Hide resume playing