Thanh-Thiên Lê

If you have to refer to my name in one word, it's Thiên.
thienlt7 [at] gmail [dot] com

Profile
NLP Research Resident at VinAI Research, advised by Assoc. Prof. Thien Huu Nguyen.
I work on LLMs (inference, truthfulness, trustworthiness, safety), Information Extraction, and Continual Learning.
Education
B.S. in Computer Science, GPA: 4/4 (rounded to the nearest integer 🚀)
Oct 2023

Hanoi University of Science and Technology, Hanoi, Vietnam

Notable courses taken:

  • Linear Algebra
  • Probability and Statistics
  • Machine Learning and Data Mining
  • Information Retrieval
  • Deep Learning and Its Applications
  • Data Science
  • Fundamentals of Optimization
  • Programming Techniques
  • Computer Architecture
  • Evolutionary Computation
  • Object Oriented Programming
  • Data Structures and Algorithms
  • Discrete Mathematics
Experience

Research Resident

VinAI Research, Hanoi, Vietnam

Feb 2023 - Present
  • Conducted research in Large Language Models
  • Conducted research in Continual Information Extraction
  • Participated in weekly meetings of the NLP reading group
  • Attended courses relevant to research interests
Publications
Asterisk (*) denotes equal contribution.

Realistic Evaluation of Toxicity in Large Language Models

Tinh Son Luong*, Thanh-Thien Le*, Linh Ngo Van, Thien Huu Nguyen

ACL 2024  |  Code
We introduce a realistic dataset for safety evaluation of LLMs that is more effective at exposing toxicity in LLMs than existing state-of-the-art datasets.

ToVo: Toxicity Taxonomy via Voting

Tinh Son Luong*, Thanh-Thien Le*, Thang Viet Doan*, Linh Ngo Van, Thien Huu Nguyen, Diep Nguyen

NAACL 2025  |  Model
We introduce a method for effortless, automatic creation of customized moderation tools, and an accompanying dataset named ToVo.

Continual Relation Extraction via Sequential Multi-Task Learning

Thanh-Thien Le*, Manh Nguyen*, Tung Thanh Nguyen*, Linh Ngo Van, Thien Huu Nguyen

AAAI 2024  |  Code
We propose a novel multi-task learning framework to effectively handle catastrophic forgetting in Continual Relation Extraction.

SharpSeq: Empowering Continual Event Detection through Sharpness-Aware Sequential-task Learning

Thanh-Thien Le*, Viet Dao*, Linh Van Nguyen*, Thi-Nhung Nguyen, Linh Ngo Van, Thien Huu Nguyen

NAACL 2024  |  Code
We design a Sharpness-Aware framework to tackle the Sequential-Task challenge of Continual Event Detection.

PhoWhisper: Automatic Speech Recognition for Vietnamese

Thanh-Thien Le, Linh The Nguyen, Dat Quoc Nguyen

Tiny Paper @ ICLR 2024  |  Model
We fine-tune the Whisper model on an 844-hour dataset that encompasses diverse Vietnamese accents.

Lifelong Event Detection via Optimal Transport

Viet Dao*, Van-Cuong Pham*, Quyen Tran*, Thanh-Thien Le, Linh Ngo Van, Thien Huu Nguyen

EMNLP 2024  |  Preprint  |  Code
We propose aligning the classifier module to the pre-trained language modeling head via optimal transport to enhance Continual Event Detection.

Few-Shot, No Problem: Descriptive Continual Relation Extraction

Thanh Nguyen*, Anh Duc Le*, Quyen Tran*, Thanh-Thien Le*, Linh Ngo Van, Thien Huu Nguyen

AAAI 2025  |  Code
We design a retrieval-based paradigm, leveraging LLM-generated descriptions to address Few-shot Continual Relation Extraction, boosting performance and reducing forgetting.

Adaptive Prompting for Continual Relation Extraction: A Within-Task Variance Perspective

Minh Le*, Tien-Ngoc Luu*, An Nguyen The*, Thanh-Thien Le, Thien Trang Nguyen Vu, Thanh Tung Nguyen Van, Linh Ngo Van, Thien Huu Nguyen

AAAI 2025  |  Code
We improve prompt-based Continual Relation Extraction by incorporating a task-specific prompt pool that captures the variations within each task while maximizing cross-task variances.
Languages
  • Programming Languages: Python, C/C++
  • Libraries/Modules: PyTorch, NumPy/SciPy, pandas, matplotlib, seaborn, cvxpy
  • Language Proficiency: English (fluent), Vietnamese (native)