Research | Junfei Zhan's Website

Research

Trains but Doesn't Learn: A Post-Training Delivery Benchmark for LLM Agents as Forward-Deployed Engineers

A governed delivery-plane benchmark that asks not whether an LLM agent can raise a metric, but whether it can be trusted to deliver post-training as a service. Frontier agents run ten governed stages on real H200 and A40 hardware across 8B–70B bases; the risk lives where failure is silent — in judgment and governance, not the arithmetic a configurator already solves.

Jun 17, 2026

Bridging Optimal Control And Reinforcement Learning For Node-Level Vaccine Allocation: A Regime-Based Comparative Analysis

Reinforcement Learning

Master's thesis. A scalable framework for per-individual vaccine allocation on heterogeneous contact networks, comparing group-level optimal control with hub-aware heuristics against end-to-end reinforcement learning on a stochastic SEPAILHRVD simulator.

Apr 23, 2026

Seeing is Free, Speaking is Not: Uncovering the True Energy Bottleneck in Edge VLM Inference

Vision-Language Models

Conducted the first systematic energy profiling of on-device VLM inference, revealing that autoregressive decoding—not visual token processing—dominates energy consumption (86–97%), overturning conventional assumptions about visual token reduction as the primary efficiency strategy.

Mar 27, 2026

Stochastic Power Modeling and Constrained MDP Optimization for On-Device SLM Inference

Small Language Models

Proposed a unified stochastic framework combining HSMM-based power modeling and constrained MDP optimization to enable sustainable deployment of small language models (SLMs) on edge devices.

Sep 22, 2025

PRISM: Privacy-Aware Routing for Adaptive Cloud–Edge LLM Inference with Semantic Sketch Collaboration

Designed a privacy-aware routing framework that dynamically selects execution paths across cloud and edge for LLM inference, combining adaptive local differential privacy (LDP) and semantic sketching.

Jul 30, 2025

RL-Enhanced Disturbance-Aware MPC for Robust UAV Trajectory Tracking

Developed a hybrid control framework that integrates reinforcement learning and a sliding mode observer into MPC for disturbance-aware UAV tracking.

May 7, 2025

Can Large Language Models Credibly Stand in for Humans in Game-Theoretic Experiments?

Evaluated LLM alignment with human behavior across strategic social games and proposed PRIME-Router to enhance role consistency and adaptability.

Apr 17, 2025

Minimizing Maximum Age of Service in Virtualized Green IoT Networks

Developed optimization and control strategies to reduce service latency in renewable-powered IoT networks.

Dec 7, 2024

Task Offloading and Approximate Computing in Solar Powered IoT Networks

Proposed a novel control strategy based on MILP and a Digital Twin for optimizing energy use in approximate IoT task execution.

Jan 7, 2024