Qijia He

(+86) 17752877602 | heqj2021@mail.sustech.edu.cn | 1088 Xueyuan Blvd, Shenzhen



EDUCATION

Southern University of Science and Technology (SUSTech)
Shenzhen, China
B.Sc. in Data Science and Big Data Technology, GPA: 3.88/4.0, Major GPA: 3.95/4.0
Aug. 2021 - Jun. 2025 (Expected)
  • Awards: 2nd Class Scholarship of Zhicheng College (Top 10%, 2022 & 2023), Elite Student in Dept. of STAT & DS
  • Relevant Courses: Computer Vision, Data Structures & Algorithm Analysis, Data Science Practice, Big Data Analysis Software & Application, Computer Programming, Multivariate Statistical Analysis, Mathematical Analysis

Stony Brook University
Stony Brook, NY, USA
Exchange Student, GPA: 4.0/4.0
Aug. 2023 - Dec. 2023
  • Relevant Courses: Data Analysis, Data Mining, Foundations of Machine Learning, Introduction to Visualization

RESEARCH & INTERNSHIP

Exploiting Offline Dataset to Improve Online Bandit Learning Efficiency with Relative Feedback
Shenzhen, China
Research Assistant, PI: Prof. Fang Kong, SUSTech
Sep. 2024 - Present
  • Conducted a comprehensive literature review of state-of-the-art bandit algorithms, focusing on estimating model parameters from offline stochastic data for online preference feedback by adapting the LinUCB algorithm.
  • Proposed several algorithmic approaches and conducted regret analysis, mathematically demonstrating that the offline dataset can effectively reduce regret within the proposed algorithm framework.
  • Delivered a seminar presentation on stochastic linear bandits based on Bandit Algorithms [Tor Lattimore et al., Chapter 19].

TAPALL
Shenzhen, China
Computer Vision & Cloud Engineer Intern, R&D Department
Jun. 2024 - Aug. 2024
  • Scraped 20,000+ copyright-free videos from several websites (Pikwizard, Vecteezy, Mazwai, etc.) using Python and Beautiful Soup; designed a video filtering algorithm to select videos featuring flat surfaces and a moving camera for product demonstrations in various environments.
  • Used OpenCV to compute basic optical flow and video characteristics, and applied SegFormer (a semantic segmentation method) to detect specific objects in the videos.
  • Generated captions using VideoLLaMA2 and classified videos into predetermined categories.
  • Developed a cloud scheduling system to deploy the company's 3D Modelling and Digital Human solutions on Huawei Cloud. Reduced back-end operating expenses by scheduling GPU deployment strategically instead of relying on long-term leasing.

Nature Index Periodicals: A Comparative Analysis of Impact and Innovation in Publications
Shenzhen, China
Research Assistant, PI: Prof. Yifang Ma
Jul. 2022 - Aug. 2023
  • Collected information on 21,848 journals from over 10 million papers in the Web of Science and a private database, including paper-level indices such as conventionality, hit5, hotspot, and novelty.
  • Visualized KDE plots for over 1.5 million papers in NI and non-NI journals with similar journal impact factors in the same field, finding that NI journals do not exhibit an advantage in novelty.
  • Analyzed the affiliations and publications in NI and non-NI journals, concluding that NI journals tend to favor works from institutions in the U.S. and Europe, which suggests either a limitation in the Nature Index's ability to differentiate highly influential journals or an inherent bias in the selection practices of these publishers.

COMPETITIONS

Optimization of the CLIP Model (ViT-B-32) with Few-shot Learning and Zero-shot Learning
Shenzhen, China
Core Member, The 2024 Jittor Artificial Intelligence Challenge
Apr. 2024 - Aug. 2024
  • Based on hierarchical clustering, selected four photos from each dataset (Animal, Caltech-101, Food-101, and Tsinghua Dogs) for few-shot training, with Hamming distance as the metric and image vectors encoded by the original model.
  • Fine-tuned the prompt from "a photo of xxx" to various forms by merging self-written and GPT-4-written captions, and selected the final prompt based on increased accuracy in training datasets.
  • Developed an adapter layer for the image encoder using the boosting method with cross-entropy loss; trained multiple adapters with different structures and parameters, and combined their predictions via ensemble learning. Improved classification accuracy from 62% to 69.15%, ranking in the top 0.6% of all participants.

Momentum: Does It Truly Exist in Tennis?
Shenzhen, China
Team Leader, 2024 COMAP Mathematical Contest in Modeling (MCM)
Feb. 2024
  • Analyzed shifting momentum between players and its impact on game outcomes based on the 2023 Wimbledon Championships - Men's doubles dataset.
  • Quantified momentum through a self-designed time series model and developed a logistic regression model with adjusted inputs. Applied Non-Maximum Suppression to identify momentum shifts and used the Anderson-Darling test to compare observed momentum effects with those simulated under random conditions. Achieved 79% accuracy in predicting tennis game winners.

Predicting Wordle Game Result Based on Random Forest
Shenzhen, China
Core Member, 2023 COMAP Mathematical Contest in Modeling (MCM)
Feb. 2023
  • Using the dataset provided by Twitter in 2022, predicted the daily number of Wordle players with an ARIMA model, determined the distribution of the number of attempts for any given word with random forest regression, and assessed the difficulty of guessing a particular word with K-means clustering, achieving 73.6% accuracy.
  • Constructed a word-guessing machine using the Monte Carlo algorithm to maximize the probability of correct guesses, showing that the actual number of guesses is at least 2.75 times the number shared on Twitter.

Smart Racing Car Design and Manufacturing
Shenzhen, China
Team Leader, Advisor: Prof. Yuan Chen & Prof. Din Zhou
Aug. 2022
  • Designed a vehicle that precisely fits the race circuit, used Arduino to develop a remote-control system, and deployed an alcohol-fueled explosive device to accelerate the car.
  • Won the championship in the 2022 Da Vinci Summer Camp in System Design and Intelligence Manufacture.

ACADEMIC PROJECTS

Kleinberg's Small World Phenomenon: A Modification
Jun. 2024
  • Modified the long-range connections in Kleinberg's model from the paper "The Small-World Phenomenon: An Algorithmic Perspective" to explore how changes in assumptions affect the social network.
  • Showed, through theoretical proof and simulation experiments on a 20,000x20,000 network model, that a power-law distribution with α=1 optimizes the network under the modified conditions.

Shenzhen Metro's Operation Schedule Optimization
May 2024
  • Customized the train stop schedule during the morning peak based on passenger flow data of Metro Line 5 from Aug 31, 2018.
  • Refined three greedy algorithms with local search, saving passengers 10% to 15% of their average commuting time during peak congestion periods.
  • Visualized the dynamic passenger flow of each station and the operation of the entire subway system using HTML.

Stock Price Trend Prediction Based on Dimension Reduction Techniques and Cluster Analysis
Apr. 2024
  • Analyzed the operational conditions of 1,200 listed companies from three industries (pharmaceutical, chemical, and machinery) for the fourth quarter of 2023 using financial statements scraped from BaoStock.
  • Developed a company evaluation and classification system using factor analysis and K-means clustering, and built a stock recommendation system based on Ridge regression that predicted a profit of 2.36 RMB/stock relative to random stock purchases.

A Face Detection System Based on DeepFace
Jun. 2023
  • Wrapped the DeepFace model and created a user interface for a face recognition system using Python.
  • Implemented five main functions: face attribute analysis (emotion, race, age, gender), face tracking, face detection, face recognition, and face verification, achieving 76.1% accuracy for face recognition and classification.

Klotski: Number Puzzle Game Solver Development
Dec. 2022
  • Developed a solver for Klotski that determines whether a solution exists for any given initial configuration.
  • Built the solver on a self-designed greedy algorithm with a min-priority queue, prioritizing moves that bring the board closer to its final structure.
  • Solved 6x6 Klotski puzzles with 1x1, 2x1, and 2x2 blocks within five minutes, whereas BFS and DFS algorithms were limited to smaller puzzles.

Bilibili User Study Data Analysis
Nov. 2022
  • Scraped data for 9,000+ vloggers from 2019 and 2022 using Google's web scraper.
  • Conducted exploratory data analysis (EDA) in Python on vloggers' characteristics and videos to identify changes in patterns and platform usage.
  • Provided suggestions for new and experienced vloggers on creating videos and attracting fans accordingly.

SKILLS & MISCELLANEOUS

Technical: Python, R, Java, Hadoop & Spark, HTML/CSS/JavaScript, MATLAB
Languages: Mandarin (native), English (proficient), Spanish (basic)
Social Work:
  • Assistant Instructor for SUSTech Youth Rock Climbing Course
    Jan. 2024
  • Student Assistant in Student Affairs Center, SUSTech
    Jun. 2022 - Aug. 2022
  • Contributed over 80 hours of volunteer service across diverse activities and organizations
Interests: Rock Climbing (SUSTech’s Rock Climbing Team), Badminton (Zhicheng College’s Badminton Team), Hiking