Active Data Selection Scheme for Deep Neural Network

Sangwoong Yoon, Cheolho Han, Hanock Kwak, Munbo Shim, and Byoung-Tak Zhang


A machine learning model can incrementally improve its performance by querying new, informative training examples. Such active sample selection enables efficient learning from fewer labeled examples, and it is practically significant because acquiring data labels is costly in most real-world settings. However, the active learning approach has not been applied to deep neural networks, a powerful class of models whose variants have set new state-of-the-art results on a number of machine learning tasks. Here we present an active sample selection scheme for deep neural networks, which selects new training examples with the aim of reducing the expected variance of the predicted label. Demonstrated on synthetic data as well as on organic material structure-energy data, the proposed method shows moderate performance on the initial, limited dataset, yet becomes incrementally stronger as it collects informative data.
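
As a rough illustration of variance-driven sample selection, the Python sketch below picks, from a pool of unlabeled candidates, the input on which an ensemble of models disagrees most. It is not the scheme of this paper: a bootstrap ensemble of small polynomial regressors stands in for a deep neural network, and maximum predictive variance is used as a simple proxy for expected variance reduction. All names (`fit_ensemble`, `predictive_variance`, `select_query`) and parameter values are hypothetical.

```python
import numpy as np


def features(x, degree):
    """Polynomial features [x^degree, ..., x, 1] for a 1-D input array."""
    return np.vander(x, degree + 1)


def fit_ensemble(x, y, n_members=20, degree=2, seed=0):
    """Fit one least-squares polynomial per bootstrap resample of (x, y).

    Assumption: the bootstrap ensemble is only a stand-in for the model
    uncertainty of a neural network.
    """
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(x), size=len(x))  # sample with replacement
        w, *_ = np.linalg.lstsq(features(x[idx], degree), y[idx], rcond=None)
        members.append(w)
    return np.stack(members)  # shape: (n_members, degree + 1)


def predictive_variance(members, candidates):
    """Variance of the ensemble predictions at each candidate input."""
    degree = members.shape[1] - 1
    preds = features(candidates, degree) @ members.T  # (n_candidates, n_members)
    return preds.var(axis=1)


def select_query(members, candidates):
    """Index of the candidate with the largest predictive variance."""
    return int(np.argmax(predictive_variance(members, candidates)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_labeled = rng.uniform(0.0, 1.0, size=15)
    y_labeled = np.sin(3.0 * x_labeled) + 0.1 * rng.normal(size=15)
    pool = np.linspace(0.0, 3.0, 31)  # unlabeled candidate inputs

    ensemble = fit_ensemble(x_labeled, y_labeled)
    k = select_query(ensemble, pool)
    print("query next:", pool[k])
```

In an active learning loop, the selected candidate would be labeled, appended to the training set, and the ensemble refit before the next query.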