Other forms of learning
Metric Learning
The most frequently used metrics of similarity between two feature vectors are Euclidean distance and cosine similarity. Such a choice of metric seems logical but arbitrary. With metric learning, we can learn from data a metric that works better for our dataset.
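As a minimal sketch, the two standard metrics can be compared with a learned one. A common parameterization (a Mahalanobis-style metric) measures distance as the Euclidean norm after a learned linear transform W; here W is a hypothetical, hand-picked matrix standing in for one learned from data.

```python
import numpy as np

def euclidean(x, z):
    # Standard Euclidean distance between two feature vectors.
    return np.linalg.norm(x - z)

def cosine_similarity(x, z):
    # Cosine of the angle between two feature vectors.
    return (x @ z) / (np.linalg.norm(x) * np.linalg.norm(z))

def learned_metric(x, z, W):
    # Mahalanobis-style metric: d(x, z) = ||W (x - z)||.
    # Equivalently d^2 = (x - z)^T A (x - z) with A = W^T W,
    # which is positive semidefinite, so this is a valid (pseudo)metric.
    return np.linalg.norm(W @ (x - z))

x = np.array([1.0, 0.0])
z = np.array([0.0, 1.0])
W = np.array([[2.0, 0.0],
              [0.0, 0.5]])  # hypothetical learned weights
```

Learning W typically means optimizing it so that pairs labeled "similar" end up close and pairs labeled "dissimilar" end up far apart; that optimization is omitted here.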
Learning to Rank
Learning to rank is a supervised learning problem. A typical application is ranking search results: given a query, order the retrieved documents by relevance.
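One common family of solutions is the pairwise approach: instead of predicting an absolute relevance score, the model is trained so that a relevant document scores higher than an irrelevant one. A minimal sketch with synthetic data (the feature distributions and the logistic pairwise loss here are illustrative assumptions, not a specific published algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2 features per document; relevant documents are assumed to
# cluster around +1, irrelevant ones around -1.
X_rel = rng.normal(loc=1.0, size=(50, 2))   # relevant documents
X_irr = rng.normal(loc=-1.0, size=(50, 2))  # irrelevant documents

# Linear scoring model s(x) = w @ x, trained with a pairwise logistic
# loss log(1 + exp(-(s_rel - s_irr))) by gradient descent.
w = np.zeros(2)
lr = 0.1
for _ in range(200):
    diff = X_rel - X_irr                  # paired differences
    margins = diff @ w                    # s_rel - s_irr for each pair
    grad = -(diff * (1.0 / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad

# After training, relevant documents should usually outscore irrelevant ones.
```

Once w is learned, documents for a new query are simply sorted by their scores.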
Learning to recommend
Learning to recommend is an approach to building recommender systems. In this setting, there is a user who consumes some content. We have the history of consumption, and we want to suggest to this user new content that the user would like. Approaches used in recommendation:
- Content-based filtering
- Collaborative filtering
Content-based filtering is based on learning what users like from the descriptions of the content they consume. The content-based approach has many limitations. For example, the user can be trapped in the so-called filter bubble: the system will always suggest to that user information that looks very similar to what the user has already consumed.
Collaborative filtering has a significant advantage over content-based filtering: the recommendations to one user are computed based on what other users consume or rate. For instance, if two users gave high ratings to the same ten movies, then it's more likely that user 1 will appreciate new movies recommended based on the tastes of user 2, and vice versa. The drawback of this approach is that the content of the recommended items is ignored.
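A minimal sketch of the collaborative-filtering idea, using cosine similarity between users' rating vectors (the tiny rating matrix and the "0 means not rated" convention are assumptions of this example):

```python
import numpy as np

# Rows are users, columns are items; 0 means "not rated" in this sketch.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def user_similarity(R):
    # Cosine similarity between every pair of users' rating vectors.
    normalized = R / np.linalg.norm(R, axis=1, keepdims=True)
    return normalized @ normalized.T

S = user_similarity(R)

# Predict user 0's rating of item 2 as a similarity-weighted average
# of the other users' ratings of that item.
others = [1, 2]
pred = sum(S[0, u] * R[u, 2] for u in others) / sum(S[0, u] for u in others)
```

Note that only the rating matrix is used: nothing about the items' content enters the computation, which is exactly the drawback mentioned above.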
Most real-world recommender systems use a hybrid approach: they combine recommendations obtained from the content-based and collaborative-filtering models.
Two effective collaborative-filtering learning algorithms are factorization machines and denoising autoencoders.
Factorization machines
The factorization machine model is defined as follows:

f(x) = b + sum_{i=1}^{D} w_i x_i + sum_{i=1}^{D} sum_{j=i+1}^{D} (v_i · v_j) x_i x_j,

where b and the w_i are scalar parameters, and each v_i is a k-dimensional vector of factors; k is a hyperparameter, usually much smaller than the number of features D. The pairwise term lets the model capture interactions between features even when a particular pair never co-occurs in the training data.
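The definition above translates directly into code. A minimal sketch of the prediction function, using the standard algebraic identity that reduces the pairwise sum from O(D^2 k) to O(Dk):

```python
import numpy as np

def fm_predict(x, b, w, V):
    """Factorization machine prediction.

    x: feature vector, shape (D,)
    b: scalar bias
    w: linear weights, shape (D,)
    V: factor matrix, shape (D, k); row i is the factor vector v_i
    """
    linear = b + w @ x
    # Identity: sum_{i<j} (v_i . v_j) x_i x_j
    #         = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]
    xv = x @ V                                   # shape (k,)
    pairwise = 0.5 * np.sum(xv**2 - (x**2) @ (V**2))
    return linear + pairwise
```

Training consists of fitting b, w, and V by minimizing, for example, squared error on observed ratings with gradient descent; only the prediction step is shown here.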
Denoising Autoencoders
A denoising autoencoder is a neural network that reconstructs its input from the bottleneck layer. The fact that the input is corrupted by noise while the output shouldn't be makes denoising autoencoders an ideal tool for building a recommender model.
To prepare the training set for the denoising autoencoder, remove the blue and green features from the training set. Because some examples now become duplicates, keep only the unique ones.
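The corruption step can be sketched as follows: each training example is a user's rating vector, and noise is injected by randomly zeroing some known ratings while the reconstruction target stays uncorrupted (the toy rating matrix and the drop probability are assumptions of this sketch).

```python
import numpy as np

rng = np.random.default_rng(42)

# Each training example is one user's full rating vector (toy data).
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
], dtype=float)

def corrupt(batch, drop_prob=0.25, rng=rng):
    # Randomly zero out entries; the autoencoder is then trained to
    # reconstruct the original, uncorrupted vector from this noisy input.
    mask = rng.random(batch.shape) >= drop_prob
    return batch * mask

noisy_input = corrupt(ratings)   # what the network sees
target = ratings                 # what it must reconstruct
```

At prediction time, a user's actual rating vector is fed in, and the reconstructed values for unrated items serve as predicted ratings.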
Self-supervised learning: Word Embeddings
Word embeddings are feature vectors that represent words. Similar words have similar word vectors. These embeddings are learned from data. There are many algorithms to learn word embeddings, for example:
- word2vec, in particular its skip-gram version
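In the skip-gram formulation, the training data is built by pairing each word with the words surrounding it within a context window; the model is then trained to predict the context words from the center word. A minimal sketch of the pair-generation step (the sentence and window size are illustrative):

```python
def skipgram_pairs(tokens, window=2):
    # For each position i, pair the center word with every word within
    # `window` positions to its left or right.
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs("the cat sat on the mat".split(), window=1)
```

These (center, context) pairs are the supervision signal; since they come for free from raw text, no manual labeling is needed, which is why this is called self-supervised learning.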