Mathieu Ravaut

Machine Learning Scientist | PhD Candidate

Nanyang Technological University

Hello!

I am currently a final-year PhD candidate in Natural Language Processing (NLP) at Nanyang Technological University in Singapore, in the NTU-NLP group. I am advised by Dr. Shafiq Joty and Dr. Aixin Sun. Through my SINGA scholarship, I am also attached to the A*STAR I2R ALI Lab, where I am supervised by Dr. Nancy Chen.

My PhD focuses on abstractive text summarization. Specifically, I am interested in how to build better summarization systems by going beyond the standard framework of MLE with token-level cross-entropy loss and teacher forcing. For instance, I have explored in depth how to learn summarization at the level of the entire summary (see my papers SummaReranker, SummaFusion, and SummScore on this topic). I am currently focusing on analyzing large language models (LLMs), including their behavior on summarization tasks. See my recent work analyzing the lost-in-the-middle phenomenon for LLMs in the task of summarization. I am also working on core issues in LLMs, including scaling context length, evaluating open-source LLMs, and contamination. Our work LOCOST, which uses state-space models to scale the context length of the encoder, won the Best Paper Award at the EACL 2024 conference (congrats Florian!).

In Spring 2023, I interned at Huawei Noah’s Ark Lab in Singapore under the supervision of Dr. Zhang Hao, where I worked on improving conversational recommender systems with language models. In Spring 2024, I interned at Lingjun Investment, a Chinese hedge fund, where I worked on index re-balancing strategies.

I am extremely passionate about machine learning, having worked in this area since 2016. Before my PhD, I obtained an MSc in Applied Computing (MScAC) from the University of Toronto Department of Computer Science in 2018, and a B.Eng and an M.Eng from Ecole Centrale Paris (a leading French engineering school now known as CentraleSupelec and part of the newly formed Paris-Saclay University) in 2015 and 2018, respectively. I also worked full-time from 2018 to 2020 as a Machine Learning Research Scientist at Layer 6 AI, TD Bank’s AI lab in Toronto, where my research in machine learning applied to healthcare (diabetes) was featured in Nature. During my M.Eng, I interned at Thales and at the A*STAR I2R Visual Computing Lab in Singapore, where I worked on video classification. In my spare time, I like participating in machine learning competitions such as those hosted by Kaggle.

Research Community Service:
Reviewer (journals): IEEE/ACM TASLP (2021-)
Reviewer (conferences): ACL Rolling Review (2023-), CoNLL 2023, SIGDIAL 2023, AACL 2023

Interests
  • Machine Learning
  • Natural Language Processing
  • Recommender Systems
  • Machine Learning for Healthcare
Education
  • PhD in Computer Science, 2021-2024

    Nanyang Technological University (Singapore)

  • MSc in Applied Computing, 2017-2018

    University of Toronto (Canada)

  • Master of Engineering, 2015-2018

    Ecole Centrale Paris (France)

  • Bachelor of Engineering, 2014-2015

    Ecole Centrale Paris (France)

  • Prepa MPSI/MP*, 2011-2014

    Lycée Montaigne (France)

Experience

Quantitative Research Intern
Lingjun Investment
February 2024 – May 2024 Singapore
Deep learning methods for index re-balancing strategies on mid-frequency Chinese equities.
 
Research Intern
Huawei Noah’s Ark Lab
January 2023 – July 2023 Singapore
Search & Recommendation Team, supervised by Dr. Hao Zhang. Research in conversational recommender systems.
 
Teaching Assistant
Nanyang Technological University
January 2021 – November 2022 Singapore
TA for the graduate-level Deep Learning for NLP course and various 1st-year CS courses: Introduction to Computational Thinking and Programming, Data Structures and Algorithms.
 
Machine Learning Research Scientist
Layer 6 AI
May 2018 – July 2020 Toronto, Canada
  • Applied research in machine learning for healthcare, resulting in publications in Nature, JAMA and BMJ journals.
  • Member of a team placing 2nd (out of 70+) at ACM RecSys Challenge 2020.
  • Insurance claim fraud detection with NLP.
 
Teaching Assistant
University of Toronto
January 2018 – April 2018 Toronto, Canada
TA for the 1st-year course Introduction to Statistical Thinking and Programming.
 
Research Intern
A*STAR I2R
February 2017 – July 2017 Singapore
Visual Computing Lab, supervised by Dr. Vijay Chandrasekhar. Research in computer vision, resulting in a workshop paper at CVPR 2017.
 
Research Intern
Thales Solutions Asia
August 2016 – February 2017 Singapore
Research & Technology department, supervised by Dr. Antoine Fagette. Applied research in computer vision, leading to a paper published at IEEE Oceans 2017.

Journal Publications

(2022). Predicting Hospitalisations Related to Ambulatory Care Sensitive Conditions with Machine Learning for Population Health Planning: Derivation and Validation Cohort Study. In BMJ.

(2021). Development and Validation of a Machine Learning Model Using Administrative Health Data to Predict Onset of Type 2 Diabetes. In JAMA.

(2021). Predicting Adverse Outcomes Due to Diabetes Complications with Machine Learning Using Administrative Health Data. In Nature.

Awards

EACL 2024 Best Paper Award
Our paper LOCOST on state-space models was named the best paper of the conference.
Singapore Data Science Consortium PhD Fellowship 2023
10k SGD award to support my PhD research.
Singapore International Graduate Award (SINGA)
Tuition fee waiver and monthly stipend to support my PhD research.
MITACS Accelerate Program
30k CAD award to support my MSc internship research at Layer 6 AI.
Singapore International Pre-Graduate Award (SIPGA)
Monthly stipend to support my graduate internship at A*STAR I2R.

Competitions

Optiver - Trading at the Close (Solo)
Ranked 33rd/4436, top 1% & Silver Medal (username=Ravox).
The AutoCast Competition - Forecasting the Future with AI (warm-up phase) (Solo)
Ranked 6th/109 (username=matravox).
CommonLit Readability Prize (Solo)
Ranked 29th/3633, top 1% & Silver Medal (username=Ravox).
Jane Street Market Prediction (Solo)
Ranked 291st/4245, top 7% & Bronze Medal (username=Ravox).
Riiid Answer Correctness Prediction (Solo)
Ranked 84th/3395, top 3% & Silver Medal (username=Ravox).
Mechanisms of Action (MoA) Prediction (Solo)
Ranked 40th/4373, top 1% & Silver Medal (username=Ravox). See my solution write-up here.
MIND News Recommendation Competition (Solo)
Ranked 4th/30+ (3rd prize) (username=Ravox).
RecSys Challenge 2020 (Team)
Ranked 2nd/30+ (username=learner). See our solution write-up here.
Two Sigma Using News to Predict Stock Movements (Solo)
Ranked 232nd/2927, top 8% & Bronze Medal (username=LeComteDeBronze).
Google Cloud and YouTube-8M Video Understanding Challenge (Team)
Ranked 22nd/655, top 4% & Silver Medal (username=DL2.0). See our approach explained here.