Positive-Unlabelled (PU) learning is the machine learning setting in which
only a set of positive instances is labelled, while the rest of the data set
is unlabelled. The unlabelled instances may be either unspecified positive
samples or true negative samples. Over the years, many solutions have been
proposed to deal with PU learning. Some techniques consider the unlabelled
samples as negative ones, reducing the problem to a binary classification with
a noisy negative set, while others aim to detect sets of possible negative
examples to later apply a supervised machine learning strategy (two-step
techniques). The approach proposed in this work falls in the latter category
and works in a semi-supervised fashion: motivated and inspired by previous
works, a Markov diffusion process with restart is used to assign pseudo-labels
to unlabelled instances. A machine learning model is then trained on the
newly assigned classes. The principal aim of the algorithm is to identify a
set of instances that is likely to contain positive instances that were
originally unlabelled.
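
The diffusion step described above can be illustrated with a generic random walk with restart on a similarity graph. The sketch below is not the authors' implementation: the Gaussian-kernel graph, the function names rwr_scores and pseudo_label, and the parameters restart, sigma and neg_fraction are all assumptions chosen for illustration. The walk restarts from the labelled positives, so unlabelled instances that end up with a low stationary score are taken as likely negatives.

```python
import numpy as np


def rwr_scores(X, positive_idx, restart=0.3, sigma=1.0, tol=1e-10, max_iter=1000):
    """Random walk with restart from the labelled positives (illustrative)."""
    X = np.asarray(X, dtype=float)
    # Assumed graph: Gaussian kernel on pairwise squared Euclidean distances.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    # Column-stochastic transition matrix: each column sums to 1.
    P = W / W.sum(axis=0, keepdims=True)
    # Restart distribution: uniform over the labelled positives.
    p0 = np.zeros(len(X))
    p0[positive_idx] = 1.0 / len(positive_idx)
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (P @ p) + restart * p0
        converged = np.abs(p_next - p).sum() < tol
        p = p_next
        if converged:
            break
    return p


def pseudo_label(scores, positive_idx, neg_fraction=0.5):
    """Label 1 = known positive, 0 = pseudo-negative, -1 = left unlabelled."""
    n = len(scores)
    labels = np.full(n, -1)
    labels[positive_idx] = 1
    unlabelled = np.setdiff1d(np.arange(n), positive_idx)
    # Unlabelled instances with the lowest diffusion scores become negatives.
    order = unlabelled[np.argsort(scores[unlabelled])]
    n_neg = int(neg_fraction * len(unlabelled))
    labels[order[:n_neg]] = 0
    return labels


# Toy data: 20 positives near (0, 0), 20 negatives near (6, 6);
# only the first 5 positives carry a label.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(6.0, 0.5, size=(20, 2))])
known_pos = np.arange(5)
scores = rwr_scores(X, known_pos)
labels = pseudo_label(scores, known_pos)
```

Instances labelled 1 or 0 can then be passed to any supervised classifier, while those left at -1 with high scores are candidates for originally unlabelled positives.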
Publication details
2021, Computer Science and Machine Learning, Pages -
Adaptive Positive-Unlabelled Learning via Markov Diffusion (02a Chapter or Article)
Stolfi Paola, Mastropietro Andrea, Pasculli Giuseppe, Tieri Paolo, Vergni Davide
Research group: Algorithms and Data Science