Siamese LSTM for sentence similarity

quora-question-pairs

Detection of repeated questions

This is a second attempt at the Quora Question Pairs Kaggle challenge, which I worked on a few years ago using classical features.
In this iteration I first try word2vec embeddings, then BERT embeddings, and finally embeddings trained jointly with the model.
The final model is a Siamese LSTM that classifies a pair of sentences as either the same question or different questions.
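
For readers new to the idea, here is a minimal sketch of a Siamese LSTM, not the exact model in this repo. It assumes PyTorch, embeddings trained with the model, and a Manhattan-distance similarity head (as in the MaLSTM variant). The class name `SiameseLSTM`, the layer sizes, the 0.5 threshold, and the toy token ids are all hypothetical choices for illustration. The key point is that both questions pass through the same embedding and LSTM weights, so similar questions end up with nearby encodings.

```python
import torch
import torch.nn as nn


class SiameseLSTM(nn.Module):
    """Encodes both questions with shared weights and scores their similarity."""

    def __init__(self, vocab_size: int, embed_dim: int = 32, hidden_dim: int = 16):
        super().__init__()
        # Assumption: token id 0 is reserved for padding.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        # Note: padded positions are not masked in this sketch.
        _, (h_n, _) = self.lstm(embedded)      # h_n: (1, batch, hidden_dim)
        return h_n[-1]                         # (batch, hidden_dim)

    def forward(self, q1: torch.Tensor, q2: torch.Tensor) -> torch.Tensor:
        # The same encoder (same weights) is applied to both questions.
        h1, h2 = self.encode(q1), self.encode(q2)
        # Manhattan similarity: exp(-L1 distance), a value in (0, 1].
        return torch.exp(-torch.sum(torch.abs(h1 - h2), dim=1))


torch.manual_seed(0)
model = SiameseLSTM(vocab_size=100)

# Two toy question pairs as padded token ids; the first pair is identical.
q1 = torch.tensor([[5, 8, 2, 0], [7, 3, 9, 4]])
q2 = torch.tensor([[5, 8, 2, 0], [1, 6, 2, 0]])

scores = model(q1, q2)        # one similarity score per pair, shape (2,)
is_duplicate = scores > 0.5   # hypothetical decision threshold
```

Because the encoder is shared, an identical pair always gets a similarity of 1, whatever the weights are. In training, the score would be fitted against the 0/1 duplicate labels, for example with a mean squared error or binary cross-entropy loss.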

Link to code

Preprocessing

Embeddings

Model architecture

Results