Hello, I am Lin Geyu

"Stay hungry, stay foolish" -- Steve Jobs

About Me

Lin Geyu
Python 90%
PyTorch 90%
Machine Learning 90%
Large Scale Model Training 80%
Java 70%
C++ 70%
TensorFlow 70%
SQL 70%
Spark 60%
ReactJS 50%
Spring Boot 50%

I studied Computer Science at the National University of Singapore and graduated in 2023. The skills section lists the skills I know and my confidence level in each.

I am enthusiastic about developing machine learning applications in different domains. I have completed several machine learning projects in human activity recognition, image classification, and planning with reinforcement learning. I also have a background in full-stack application development and in system testing for large-scale systems.

I am interested in discussing machine learning techniques and cutting-edge application development techniques.

Work Experience

Professional Experience

OCBC

Data Scientist

2025 - Present

  • Central AI service
Institute for Infocomm Research, Agency for Science, Technology and Research

Research Engineer

2023 May - 2025 Oct

  • Customize a distributed training pipeline built on the Mosaic, PyTorch, and Transformers libraries, scaling training to 320 H100 GPUs to train the MERaLiON 2 model; also fine-tune Whisper for the Singapore local context.
  • Structure 200k hours of open-source speech data into a uniform Hugging Face Datasets format to support training of a large speech foundation model.
  • Research methods for improving the multilingual consistency of large language models using instruction fine-tuning.
  • Build a PyTorch model training pipeline to improve speech evaluation on unseen data by combining an objective metric with human labels.
Agoda

Data Science and Machine Learning Intern

2022 Aug - 2023 May

  • Design a system to support online reinforcement learning
  • Implement a general reinforcement learning framework with different exploration algorithms
  • Adapt different use cases to the framework
NUS Ubicomp Lab

Research Intern

2020 Dec - 2022 May

  • Develop an Android application to support data collection from eSense earbuds (accelerometer, gyroscope)
  • Collect face-touching activity data from 30 participants using the earbud sensors
  • Implement 1D CNN and LSTM models for activity recognition (classification)
PayPal

Software Engineer Intern

2021 Jan - 2021 July

  • Develop an internal issue tracker using ReactJS, Spring Boot, and MongoDB
  • Design an API for the website to fetch issue information from JIRA
  • Support automatic emailing of issue summaries and analysis
  • Improve regression testing by creating a Java aspect that pre-creates data and fetches it directly at test time
  • Reduce regression test time by 40%

Publications

My academic contributions in machine learning, natural language processing, and audio processing.

2025 Conference

MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore

Y He, Z Liu, Lin Geyu, S Sun, B Wang, W Zhang, X Zou, NF Chen, AT Aw

ACL 2025 - System Demonstrations

2025 Conference

AudioBench: A Universal Benchmark for Audio Large Language Models

B Wang, X Zou, Lin Geyu, S Sun, Z Liu, W Zhang, Z Liu, AT Aw, NF Chen

NAACL 2025

2024 Conference

Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems

Z Liu, SX Yin, Lin Geyu, NF Chen

EMNLP 2024 - Main

2024 Conference

Resilience of Large Language Models for Noisy Instructions

B Wang, C Wei, Z Liu, Lin Geyu, NF Chen

EMNLP 2024 - Findings

2025 Workshop
Best Paper Award

CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

Lin Geyu, B Wang, Z Liu, NF Chen

SUMEval-2 at COLING 2025

2024 Workshop
Best Paper Award

CRAFT: Extracting and Tuning Cultural Instructions from the Wild

B Wang, Lin Geyu, Z Liu, C Wei, NF Chen

C3NLP at ACL 2024

2025 Conference

MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders

W Zhang, S Sun, B Wang, X Zou, Z Liu, Y He, Lin Geyu, NF Chen, AT Aw

ICASSP 2025

Get In Touch

Whether you want to get in touch, talk about a project collaboration, or just say hi, I'd love to hear from you.

Let's Connect

Simply fill in the form to send me an email. I'll get back to you as soon as possible.