Twitter has become one of the main information-sharing platforms for millions of users worldwide. Numerous tweets are created daily, many with highly time-sensitive content such as breaking news, new multimedia content, or personal updates. Consequently, accurately recommending relevant tweets to users in a timely manner is an important and challenging problem. The ACM 2020 RecSys Challenge aims to benchmark leading recommendation models for this task. The challenge is based on a large and recent dataset of over 200M tweet engagements released by Twitter, with content in over 50 languages. In this work, we present our approach, which leverages recent advances in deep language modeling and attention architectures to combine information from extracted features, user engagement history, and target tweet content. We first fine-tune the leading multilingual language models M-BERT and XLM-R on Twitter data. Embeddings from these models are then used to extract tweet and user history representations. Finally, we combine all components and train them jointly to maximize engagement prediction accuracy. Our approach achieves highly competitive performance, placing 2nd on the final private leaderboard.