Revisit Semantic Representation and Tree Search for Similar Question Retrieval
Abstract
This paper studies the performance of BERT on a short-sentence ranking task. We fine-tune BERT on the training data and use it to produce semantic vectors for the test data. We build a tree over all the sentence embeddings using k-means and, given a sentence as a query, search the tree with beam search. We run our experiments on the Quora Question Pairs dataset, which we process for sentence ranking. Experimental results show that our method is comparable to the strong baseline. In our experiments, the tree speeds up retrieval by 500%-1000% without losing too much ranking accuracy.
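The tree search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `build_tree`, `beam_search`), the branching factor, the leaf size, the squared Euclidean distance, and the random vectors standing in for BERT sentence embeddings are all assumptions made for illustration.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    k = min(k, len(x))
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid.
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(0)
    return centroids, labels

def build_tree(x, ids, k=4, leaf_size=8, seed=0):
    """Recursively split the embeddings; leaves store original row indices."""
    if len(ids) <= leaf_size:
        return {"ids": ids}
    centroids, labels = kmeans(x[ids], k, seed=seed)
    groups = [(centroids[j], ids[labels == j])
              for j in range(len(centroids)) if np.any(labels == j)]
    if len(groups) < 2:  # degenerate split (e.g. duplicate points): stop here
        return {"ids": ids}
    return {"children": [(c, build_tree(x, g, k, leaf_size, seed + 1))
                         for c, g in groups]}

def beam_search(tree, x, query, beam_width=2, top_k=5):
    """Descend level by level, keeping the beam_width closest child nodes."""
    frontier, leaves = [tree], []
    while frontier:
        candidates = []
        for node in frontier:
            if "ids" in node:
                leaves.append(node["ids"])
            else:
                for c, child in node["children"]:
                    candidates.append((float(((query - c) ** 2).sum()), child))
        candidates.sort(key=lambda t: t[0])
        frontier = [child for _, child in candidates[:beam_width]]
    # Rank only the sentences found in the visited leaves.
    cand_ids = np.concatenate(leaves)
    d = ((x[cand_ids] - query) ** 2).sum(1)
    return cand_ids[np.argsort(d)[:top_k]]

# Usage: random vectors stand in for sentence embeddings (an assumption).
embeddings = np.random.default_rng(0).normal(size=(200, 16))
tree = build_tree(embeddings, np.arange(len(embeddings)))
nearest = beam_search(tree, embeddings, embeddings[17], beam_width=2, top_k=5)
```

A small `beam_width` compares the query against only a few leaves, which is where a speedup over exhaustive ranking would come from; a larger beam visits more of the tree and trades speed for accuracy.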
