About Me
I am a Computer Science graduate of the National University of Singapore (class of 2023). The skills section lists the skills I know and my confidence level in each.
I am enthusiastic about developing machine learning applications in different domains. I have completed several machine learning projects in Human Activity Recognition, Image Classification, and Planning using Reinforcement Learning. I also have a background in full-stack application development and system testing for large-scale systems.
I am interested in discussing machine learning techniques and cutting-edge application development techniques.
Work Experience
Professional Experience
Data Scientist
OCBC
2025 - Present
- Central AI service
Research Engineer
Institute for Infocomm Research, Agency for Science, Technology and Research
2023 May - 2025 Oct
- Customize a distributed training pipeline built on the Mosaic, PyTorch, and Transformers libraries, scaling training to 320 H100 GPUs to train the MERaLiON 2 model, and fine-tune Whisper for the Singapore local context.
- Structure 200k hours of open-source speech data into a uniform Hugging Face Datasets format to support training of large speech foundation models.
- Research methodologies for improving the multilingual consistency of large language models using instruction fine-tuning.
- Build a PyTorch model training pipeline to improve speech evaluation on unseen data by combining an objective metric with human labels.
Data Science and Machine Learning Intern
Agoda
2022 Aug - 2023 May
- Design a system to support online reinforcement learning
- Implement a general reinforcement learning framework with different exploration algorithms
- Adapt different use cases to the framework
Research Intern
NUS Ubicomp Lab
2020 Dec - 2022 May
- Develop an Android application to support data collection from Esense earbuds (accelerometer, gyroscope)
- Collect face-touching activity data from 30 participants using the earbud sensors
- Implement 1D-CNN and LSTM models for activity recognition (classification)
Software Engineer Intern
PayPal
2021 Jan - 2021 Jul
- Develop an internal issue tracker using ReactJS, Spring Boot, and MongoDB
- Design an API for the website to fetch issue information from JIRA
- Support automatic emailing of issue summaries and analysis
- Improve regression testing by creating a Java aspect to pre-create data and fetch it directly at test time
- Reduce regression test time by 40%
Publications
My academic contributions in machine learning, natural language processing, and audio processing.
MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore
Y He, Z Liu, Lin Geyu, S Sun, B Wang, W Zhang, X Zou, NF Chen, AT Aw
ACL 2025 - System Demonstrations
AudioBench: A Universal Benchmark for Audio Large Language Models
B Wang, X Zou, Lin Geyu, S Sun, Z Liu, W Zhang, Z Liu, AT Aw, NF Chen
NAACL 2025
Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems
Z Liu, SX Yin, Lin Geyu, NF Chen
EMNLP 2024 - Main
Resilience of Large Language Models for Noisy Instructions
B Wang, C Wei, Z Liu, Lin Geyu, NF Chen
EMNLP 2024 - Findings
CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment
Lin Geyu, B Wang, Z Liu, NF Chen
SUMEval-2 at COLING 2025
CRAFT: Extracting and Tuning Cultural Instructions from the Wild
B Wang, Lin Geyu, Z Liu, C Wei, NF Chen
C3NLP at ACL 2024
MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders
W Zhang, S Sun, B Wang, X Zou, Z Liu, Y He, Lin Geyu, NF Chen, AT Aw
ICASSP 2025
Projects
Here are some projects I completed during my studies and work.
MERaLiON
Singapore AudioLLM foundation model
ESense Log
Android BLE data collection app for Esense earbuds
IntelliJournal
Intelligent journaling application
Esense Pipeline
ML pipeline for earbuds sensor data processing
Lung Function Decline Prediction
Image recognition for medical prediction
Reinforcement Learning Framework
General RL framework with exploration algorithms
Learning in progress...
More surprises coming...