EDUCATION
Southern University of Science and Technology (SUSTech)
Shenzhen, China
B.Sc. in Data Science and Big Data Technology, GPA: 3.88/4.0, Major GPA: 3.95/4.0
Aug. 2021 - Jun. 2025 (Expected)
Awards: 2nd Class Scholarship of Zhicheng College (Top 10%, 2022 & 2023), Elite Student in Dept. of STAT & DS
Relevant Courses: Computer Vision, Data Structures & Algorithm Analysis, Data Science Practice, Big Data Analysis Software & Application, Computer Programming, Multivariate Statistical Analysis, Mathematical Analysis
Stony Brook University
Stony Brook, NY, USA
Exchange Student, GPA: 4.0/4.0
Aug. 2023 - Dec. 2023
Relevant Courses: Data Analysis, Data Mining, Foundations of Machine Learning, Introduction to Visualization
RESEARCH & INTERNSHIP
Exploiting Offline Dataset to Improve Online Bandit Learning Efficiency with Relative Feedback
Shenzhen, China
Research Assistant, PI: Prof. Fang Kong, SUSTech
Sep. 2024 - Present
Conducted a comprehensive literature review of state-of-the-art bandit algorithms, focusing on estimating model parameters from offline stochastic data for online preference feedback by adapting the LinUCB algorithm.
Proposed several algorithmic approaches and conducted regret analysis, mathematically demonstrating that the offline dataset can effectively reduce regret within the proposed algorithm framework.
Delivered a seminar presentation on stochastic linear bandits based on Bandit Algorithms [Tor Lattimore et al., Chapter 19].
Computer Vision & Cloud Engineer Intern, R&D Department
Jun. 2024 - Aug. 2024
Scraped 20,000+ copyright-free videos from several websites (Pikwizard, Vecteezy, Mazwai, etc.) using Python and Beautiful Soup; designed a video filtering algorithm to select videos featuring flat surfaces and camera motion from the dataset, for product demonstrations in various environments.
Used OpenCV to compute optical flow and basic video characteristics, and applied SegFormer (a semantic segmentation model) to detect specific objects in the videos.
Generated captions using VideoLLaMA2 and classified videos into predetermined categories.
Developed a cloud scheduling system to deploy the company's 3D Modelling and Digital Human solutions on Huawei Cloud. Reduced back-end operating expenses by deploying GPU instances strategically rather than leasing them long-term.
Nature Index Periodicals: A Comparative Analysis of Impact and Innovation in Publications
Shenzhen, China
Research Assistant, PI: Prof. Yifang Ma
Jul. 2022 - Aug. 2023
Collected information on 21,848 journals from over 10 million papers in the Web of Science and a private database, featuring paper-level indices such as conventionality, hit5, hotspot, and novelty.
Plotted kernel density estimates (KDE) for over 1.5 million papers in NI and non-NI journals with similar journal impact factors in the same field, finding that NI journals do not exhibit an advantage in novelty.
Analyzed the affiliations and publications in NI and non-NI journals, concluding that NI journals tend to favor works from institutions in the U.S. and Europe, which suggests either a limitation in the Nature Index's ability to differentiate highly influential journals or an inherent bias in the selection practices of these publishers.
COMPETITIONS
Optimization of the CLIP Model (ViT-B-32) with Few-shot Learning and Zero-shot Learning
Shenzhen, China
Core Member, The 2024 Jittor Artificial Intelligence Challenge
Apr. 2024 - Aug. 2024
Based on hierarchical clustering, selected four photos from each dataset (Animal, Caltech-101, Food-101, and Tsinghua Dogs) for few-shot training, with Hamming distance as the metric and image vectors encoded by the original model.
Refined the prompt from "a photo of xxx" into various forms by merging self-written and GPT-4-written captions, and selected the final prompt based on accuracy gains on the training datasets.
Developed an adapter layer for the image encoder using the boosting method with cross-entropy loss; trained multiple adapters with different structures and parameters, and integrated their predictions via ensemble learning. Improved classification accuracy from 62% to 69.15%, ranking in the top 0.6% of all participants.
Momentum: Does It Truly Exist in Tennis?
Shenzhen, China
Team Leader, 2024 COMAP Mathematical Contest in Modeling (MCM)
Feb. 2024
Analyzed momentum shifts between players and their impact on game outcomes based on the 2023 Wimbledon Championships - Men's doubles dataset.
Quantified momentum through a self-designed time series model and developed a logistic regression model with adjusted inputs. Applied Non-Maximum Suppression to identify momentum changes and used the Anderson-Darling test to compare the observed momentum effects with those simulated under random conditions. Achieved 79% accuracy in predicting tennis game winners.
Predicting Wordle Game Results Based on Random Forest
Shenzhen, China
Core Member, 2023 COMAP Mathematical Contest in Modeling (MCM)
Feb. 2023
Based on the dataset provided by Twitter in 2022, predicted the daily number of Wordle players using an ARIMA model, determined the distribution of the number of attempts for any given word using random forest regression, and assessed the difficulty of guessing a particular word using K-means clustering, achieving 73.6% accuracy.
Constructed a word-guessing machine using the Monte Carlo algorithm to maximize the probability of correct guesses, showing that the real number of guesses is at least 2.75 times more than the number shared on Twitter.
Smart Racing Car Design and Manufacturing
Shenzhen, China
Team Leader, Advisors: Prof. Yuan Chen & Prof. Din Zhou
Aug. 2022
Designed a vehicle that precisely fits the race circuit, employed Arduino to develop a remote-control system, and deployed an alcohol explosive device to accelerate the car.
Won the championship in the 2022 Da Vinci Summer Camp in System Design and Intelligence Manufacture.
ACADEMIC PROJECTS
Kleinberg's Small World Phenomenon: A Modification
Jun. 2024
Modified the long-range connections in Kleinberg's model from the paper "The Small-World Phenomenon: An Algorithmic Perspective" to explore how changes in assumptions can affect the social network.
Showed, through theoretical proof and simulation experiments on a 20,000 x 20,000 network model, that a power-law distribution with α = 1 optimizes the network under the modified conditions.
Shenzhen Metro's Operation Schedule Optimization
May 2024
Customized the train stop schedule during the morning peak based on passenger flow data of Metro Line 5 from Aug 31, 2018.
Improved three different greedy algorithms with local search, saving passengers 10% to 15% of their average commuting time during peak congestion periods.
Visualized the dynamic passenger flow of each station and the operation of the entire subway system using HTML.
Stock Price Trend Prediction Based on Dimension Reduction Techniques and Cluster Analysis
Apr. 2024
Analyzed the operational conditions of 1,200 listed companies from three industries (pharmaceutical, chemical, and machinery) for the fourth quarter of 2023 using financial statements scraped from BaoStock.
Developed a company evaluation and classification system through factor analysis and K-means clustering, and built a stock recommendation system based on Ridge regression. The system predicted a profit of 2.36 RMB/stock relative to random stock purchases.
A Face Detection System Based on DeepFace
Jun. 2023
Wrapped the DeepFace model and built a user interface for a face recognition system using Python.
Implemented five main functions: face attribute analysis (emotion, race, age, gender), face tracking, face detection, face recognition, and face verification, achieving 76.1% accuracy for face recognition and classification.
Klotski: Number Puzzle Game Solver Development
Dec. 2022
Developed a solver for Klotski that determines whether any given initial configuration has a solution.
The solver, based on a self-designed greedy algorithm and a min-priority queue, prioritized moves that brought the board closer to its final configuration.
The solver found solutions for a 6x6 Klotski with 1x1, 2x1, and 2x2 blocks within five minutes, whereas BFS and DFS algorithms could only solve smaller puzzles.
Bilibili User Study Data Analysis
Nov. 2022
Scraped data for 9,000+ vloggers from 2019 and 2022 using Google's web scraper.
Performed exploratory data analysis (EDA) in Python on vloggers' characteristics and videos to identify changes in patterns and platform usage.
Provided tailored suggestions for new and experienced vloggers on creating videos and attracting fans.
SKILLS & MISCELLANEOUS
Technical: Python, R, Java, Hadoop & Spark, HTML/CSS/JavaScript, MATLAB
Languages: Mandarin (native), English (proficient), Spanish (basic)
Social Work:
• Assistant Instructor for SUSTech Youth Rock Climbing Course
Jan. 2024
• Student Assistant in Student Affairs Center, SUSTech
Jun. 2022 - Aug. 2022
• Contributed over 80 hours of volunteer service across diverse activities and organizations
Interests: Rock Climbing (SUSTech’s Rock Climbing Team), Badminton (Zhicheng College’s Badminton Team), Hiking.