<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Research | Junfei Zhan's Website</title><link>https://junfei-z.github.io/research/</link><atom:link href="https://junfei-z.github.io/research/index.xml" rel="self" type="application/rss+xml"/><description>Research</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><image><url>https://junfei-z.github.io/media/icon_hu70bcee51a3cd7a7338014254a2e0c844_1401285_512x512_fill_lanczos_center_3.png</url><title>Research</title><link>https://junfei-z.github.io/research/</link></image><item><title>Scalable Node-Level Vaccine Allocation on Contact Networks: Bridging Optimal Control and Reinforcement Learning</title><link>https://junfei-z.github.io/research/scalable-node-level-vaccine-allocation-on-contact-networks/</link><pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/scalable-node-level-vaccine-allocation-on-contact-networks/</guid><description>&lt;a href="https://junfei-z.github.io/vaccine_rl/" target="_blank">
&lt;img src="https://img.shields.io/badge/Interactive%20Demo-Open-2563eb?logo=googlechrome&amp;logoColor=white" alt="Demo">
&lt;/a>
&lt;p>📄 &lt;em>Master&amp;rsquo;s Thesis, University of Pennsylvania (2026). Advisor: Prof. Saswati Sarkar.&lt;/em>&lt;/p>
&lt;p>In the first weeks of a pandemic, vaccines must be allocated across a large, heterogeneous population under a tight daily dose budget and over a horizon of weeks to months. A deployable policy must name specific individuals — not group-level proportions — and cope with three structural difficulties: sequential decisions over a long horizon with a delayed reward signal, a combinatorial daily action space of size $\binom{N}{K}$, and individual network position that matters as much as demographic group.&lt;/p>
&lt;h2 id="interactive-demo">Interactive Demo&lt;/h2>
&lt;p>The companion demo walks through the thesis visually:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Three-group population model&lt;/strong> — baseline (X), high-risk elderly (Y), and high-contact hubs (Z), each with group-specific symptomatic, hospitalisation, and case-fatality rates.&lt;/li>
&lt;li>&lt;strong>10-compartment SEPAILHRVD disease model&lt;/strong> — susceptible, latent, pre-symptomatic, asymptomatic, symptomatic, late-stage, hospitalised, recovered, vaccinated, and dead.&lt;/li>
&lt;li>&lt;strong>Barabási–Albert network construction&lt;/strong> — watch preferential attachment grow a scale-free contact graph and the characteristic power-law degree tail emerge.&lt;/li>
&lt;li>&lt;strong>Stochastic simulator&lt;/strong> — seed infections in any group mix and watch an unvaccinated outbreak unfold day by day, reporting cumulative deaths as the no-intervention baseline.&lt;/li>
&lt;li>&lt;strong>Method comparison&lt;/strong> &lt;em>(coming soon)&lt;/em> — OC-Random, OC-high, Naive RL, and Node RL on identical seeds.&lt;/li>
&lt;/ol>
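&lt;p>As a companion to step 3 above, the preferential-attachment rule is simple to sketch. The snippet below is a minimal stand-alone illustration, not the demo&amp;rsquo;s implementation; the network size &lt;code>n&lt;/code> and attachment count &lt;code>m&lt;/code> are illustrative.&lt;/p>

```python
import random
from collections import Counter

def barabasi_albert(n, m, seed=0):
    """Grow a scale-free graph: each new node attaches to m existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed graph: a complete core on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Each endpoint appears once per incident edge, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) != m:            # collect m distinct neighbours
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = barabasi_albert(n=500, m=2)
deg = Counter(v for e in edges for v in e)
# Scale-free signature: a few hubs sit far above the typical degree.
print(max(deg.values()), sorted(deg.values())[len(deg) // 2])
```

&lt;p>Sampling uniformly from the edge-endpoint list is what makes attachment probability proportional to degree — exactly the mechanism that produces the power-law degree tail shown in the demo.&lt;/p>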
&lt;p>👉 &lt;a href="https://junfei-z.github.io/vaccine_rl/">&lt;strong>Open the interactive demo&lt;/strong>&lt;/a>&lt;/p>
&lt;h2 id="contributions">Contributions&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>C1 — Stochastic node-level simulator&lt;/strong>: a high-fidelity environment integrating an explicit Barabási–Albert contact network with a 10-compartment SEPAILHRVD model, capturing intrinsic stochasticity of infection events and individual-level risk heterogeneity.&lt;/li>
&lt;li>&lt;strong>C2 — OC-high&lt;/strong>: augments principled group-level optimal control with a high-degree-first intra-group heuristic, bridging aggregate policy and individual action.&lt;/li>
&lt;li>&lt;strong>C3 — Node RL&lt;/strong>: an end-to-end actor–critic with a shared-parameter scoring MLP and Gumbel-Top-$K$ reparameterised sampling, yielding $O(K)$ gradient variance versus $\Theta(N)$ for independent Bernoulli baselines.&lt;/li>
&lt;li>&lt;strong>C4 — Regime map&lt;/strong>: systematic benchmarking across population size, horizon, and initial-infection ratio identifying when each method is preferable — and when the additional compute of node-level RL is justified.&lt;/li>
&lt;/ul>
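&lt;p>The Gumbel-Top-$K$ step in C3 is easy to sketch: perturb each node&amp;rsquo;s score with i.i.d. Gumbel noise and keep the $K$ largest, which draws $K$ distinct nodes with probabilities induced by the scores. The snippet below is an illustrative sketch with made-up scores and $K$, not the thesis code.&lt;/p>

```python
import math
import random

def gumbel_top_k(scores, k, rng=random.Random(0)):
    """Sample k distinct indices without replacement: perturb each score
    with standard Gumbel noise and keep the k largest perturbed scores."""
    keys = []
    for i, s in enumerate(scores):
        u = max(rng.random(), 1e-12)        # guard against log(0)
        g = -math.log(-math.log(u))         # standard Gumbel(0, 1) draw
        keys.append((s + g, i))
    keys.sort(reverse=True)
    return [i for _, i in keys[:k]]

# Toy example: 10 nodes with log-scores favouring later indices.
scores = [0.5 * i for i in range(10)]
chosen = gumbel_top_k(scores, k=3)
print(chosen)   # 3 distinct node indices, biased toward high scores
```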
&lt;h2 id="headline-findings">Headline Findings&lt;/h2>
&lt;ul>
&lt;li>OC-high matches or beats Node RL in most regimes at roughly &lt;strong>two orders of magnitude&lt;/strong> less preparation cost.&lt;/li>
&lt;li>Node RL&amp;rsquo;s advantage is real but &lt;strong>confined&lt;/strong> to short horizons and hub-heavy initial infections, where the mean-field assumption underlying OC-high structurally breaks down.&lt;/li>
&lt;li>The intra-group high-degree heuristic alone accounts for a &lt;strong>5–10% reduction in deaths&lt;/strong> on average, comparable to the contribution of the group-level OC rates themselves.&lt;/li>
&lt;/ul></description></item><item><title>Seeing is Free, Speaking is Not: Uncovering the True Energy Bottleneck in Edge VLM Inference</title><link>https://junfei-z.github.io/research/seeing-is-free-speaking-is-not/</link><pubDate>Fri, 27 Mar 2026 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/seeing-is-free-speaking-is-not/</guid><description>&lt;p>Vision-Language Models (VLMs) are the perceptual backbone of embodied AI, but their energy footprint on edge hardware remains poorly understood. Existing efficiency efforts focus predominantly on reducing visual tokens, implicitly treating visual processing as the dominant energy cost. We overturn this implicit assumption through the &lt;strong>first systematic energy profiling&lt;/strong> of on-device VLM inference, spanning five models across three architecture families, four input resolutions, and two hardware platforms (NVIDIA RTX 3070 and Jetson Orin NX).&lt;/p>
&lt;h2 id="key-findings">Key Findings&lt;/h2>
&lt;p>Our analysis yields three core findings:&lt;/p>
&lt;h3 id="1-power-is-a-model-fingerprint">1. Power is a Model Fingerprint&lt;/h3>
&lt;p>Average inference power is a &lt;strong>model-intrinsic constant&lt;/strong>, invariant to input resolution, image complexity, and prompt type, with less than 5% variation across all conditions. This means that all energy variation across inputs must arise from variation in &lt;strong>inference time&lt;/strong>, not from variation in power draw.&lt;/p>
&lt;h3 id="2-decode-dominates-energy">2. Decode Dominates Energy&lt;/h3>
&lt;p>Autoregressive decoding accounts for &lt;strong>86 to 97% of total energy&lt;/strong>. Each output token costs &lt;strong>11 to 39x more&lt;/strong> wall-clock time than each input token, because prefill is compute-bound while autoregressive decode is memory-bandwidth-bound. Output token count is therefore the dominant driver of both latency and energy.&lt;/p>
&lt;h3 id="3-the-visual-token-pruning-illusion">3. The Visual Token Pruning Illusion&lt;/h3>
&lt;p>Even removing &lt;strong>all visual tokens&lt;/strong> saves at most &lt;strong>10% of total energy&lt;/strong> for fixed-token models. In contrast, reducing output length by 50% saves up to &lt;strong>97%&lt;/strong>. These findings expose a fundamental limitation of visual token pruning: it targets prefill, which accounts for only a small fraction of total energy.&lt;/p>
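&lt;p>The pruning bound follows directly from the energy decomposition: pruning can only shrink prefill, so its savings are capped by the prefill share of energy times the visual fraction of prefill work. A back-of-envelope check, where the prefill share and visual-token fraction are illustrative assumptions rather than measured values:&lt;/p>

```python
def max_pruning_savings(prefill_share, visual_token_fraction):
    """Upper bound on total energy saved by visual token pruning: pruning
    only shrinks prefill, so savings are capped by the prefill share of
    energy times the fraction of prefill work that is visual tokens."""
    return prefill_share * visual_token_fraction

# Illustrative numbers: decode takes 86-97% of energy, so prefill is at
# most about 14%; assume visual tokens are about 70% of the prefill work.
bound = max_pruning_savings(prefill_share=0.14, visual_token_fraction=0.7)
print(round(bound, 3))   # -> 0.098, i.e. under 10% even when pruning all
```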
&lt;h2 id="contributions">Contributions&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Energy decomposition&lt;/strong> into prefill vs. decode phases, showing decode dominance across all configurations&lt;/li>
&lt;li>&lt;strong>Theoretical upper bound&lt;/strong> on energy savings from visual token pruning&lt;/li>
&lt;li>&lt;strong>Cross-model energy predictor&lt;/strong> — a linear model with five features (model size, input token count, output token count, and interaction terms) that explains &lt;strong>98.6% of energy variance&lt;/strong> without per-model calibration (MAPE = 10.3%)&lt;/li>
&lt;li>&lt;strong>Deployment guidelines&lt;/strong>: budget output not input; match token strategy to deployment scenario; anticipate content-driven energy variation&lt;/li>
&lt;/ul>
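&lt;p>The cross-model predictor amounts to ordinary least squares over five features. The sketch below illustrates only the fitting recipe: the feature list is paraphrased from the bullet above, and the data and coefficients are synthetic, not the measured ones.&lt;/p>

```python
import numpy as np

def energy_features(model_size_b, n_in, n_out):
    """Five-feature row in the spirit of the predictor described above:
    model size, input tokens, output tokens, and two interaction terms
    (the exact feature definitions here are assumptions)."""
    return [model_size_b, n_in, n_out,
            model_size_b * n_out, model_size_b * n_in]

# Synthetic ground truth: illustrative coefficients, not measured ones.
true_w = np.array([2.0, 0.01, 0.5, 0.3, 0.02])
rng = np.random.default_rng(0)
X = np.array([energy_features(s, i, o)
              for s, i, o in zip(rng.uniform(1, 8, 200),
                                 rng.integers(50, 1500, 200),
                                 rng.integers(10, 400, 200))])
y = X @ true_w + rng.normal(0, 0.1, 200)   # noisy energy readings (J)

w, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares fit
print(np.round(w, 2))   # recovers the synthetic coefficients closely
```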
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>The true energy bottleneck in edge VLM inference is not &lt;em>seeing&lt;/em> but &lt;em>speaking&lt;/em>: not what the model sees, but how much it says. Our energy decomposition framework provides actionable guidelines for energy-aware VLM deployment on resource-constrained edge devices.&lt;/p>
&lt;p>[ACM MM 2026 Submission] — In Review&lt;/p></description></item><item><title>Stochastic Power Modeling and Constrained MDP Optimization for On-Device SLM Inference</title><link>https://junfei-z.github.io/research/power_modeling/</link><pubDate>Mon, 22 Sep 2025 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/power_modeling/</guid><description>&lt;p>📄 [ICASSP 2026 Submission] — In Review&lt;/p>
&lt;p>This research introduces a &lt;strong>stochastic and interpretable framework&lt;/strong> for sustainable &lt;strong>on-device inference of small language models (SLMs)&lt;/strong> under strict energy and hardware constraints. By capturing fine-grained CPU/GPU power dynamics and optimizing inference scheduling with constrained MDPs, the work provides a principled foundation for &lt;strong>adaptive, resource-aware AI at the edge&lt;/strong>.&lt;/p>
&lt;h2 id="problem-and-motivation">Problem and Motivation&lt;/h2>
&lt;p>Running SLMs locally on smartphones, laptops, or IoT nodes promises &lt;strong>low-latency and privacy-preserving AI services&lt;/strong>, but these devices face &lt;strong>finite battery budgets&lt;/strong> and &lt;strong>strict power caps&lt;/strong>. Traditional energy models fail to capture the stochastic, phase-wise CPU/GPU behaviors of SLM inference, making them unsuitable for &lt;strong>multi-task adaptive deployment&lt;/strong>.&lt;/p>
&lt;h2 id="technical-contributions">Technical Contributions&lt;/h2>
&lt;h3 id="1-hsmm-based-energy-modeling">1. HSMM-Based Energy Modeling&lt;/h3>
&lt;ul>
&lt;li>Conducted fine-grained power measurements of &lt;strong>Gemma2-2B&lt;/strong> and &lt;strong>Qwen3-4B&lt;/strong> on MT-Bench.&lt;/li>
&lt;li>Modeled CPU and GPU traces separately with &lt;strong>Hidden Semi-Markov Models (HSMMs)&lt;/strong>:
&lt;ul>
&lt;li>GPU: ramp-up, plateau, decay phases.&lt;/li>
&lt;li>CPU: low-load and high-load bursts.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Achieved &lt;strong>higher fidelity than HMM and TCN baselines&lt;/strong> in predicting power fluctuations.&lt;/li>
&lt;/ul>
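&lt;p>The generative side of this model is easy to illustrate: unlike an HMM, a hidden semi-Markov model draws an explicit duration for each phase before emitting observations. The sketch below samples a toy GPU power trace through the ramp-up, plateau, and decay phases; all power levels and durations are illustrative, not the measured values.&lt;/p>

```python
import random

# Illustrative GPU phases: (name, mean power W, power std W, mean duration).
PHASES = [("ramp-up", 18.0, 2.0, 5),
          ("plateau", 45.0, 3.0, 40),
          ("decay", 12.0, 2.0, 8)]

def sample_gpu_trace(rng=random.Random(0)):
    """Draw one power trace from a 3-phase hidden semi-Markov model:
    each phase first draws an explicit duration (the 'semi-Markov' part,
    unlike an HMM's implicit geometric durations), then emits Gaussian
    power samples for that many steps."""
    trace, labels = [], []
    for name, mu, sigma, mean_dur in PHASES:
        dur = 1 + int(rng.expovariate(1.0 / mean_dur))  # explicit duration
        for _ in range(dur):
            trace.append(rng.gauss(mu, sigma))
            labels.append(name)
    return trace, labels

trace, labels = sample_gpu_trace()
print(len(trace), labels[0], labels[-1])
```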
&lt;h3 id="2-constrained-mdp-formulation">2. Constrained MDP Formulation&lt;/h3>
&lt;ul>
&lt;li>Defined a &lt;strong>CMDP&lt;/strong> where each inference task selects an SLM configuration (model + quantization).&lt;/li>
&lt;li>State: remaining energy budget.&lt;/li>
&lt;li>Actions: candidate SLM setups.&lt;/li>
&lt;li>Reward: &lt;strong>LLM-as-a-Judge quality scores&lt;/strong>.&lt;/li>
&lt;li>Constraints: &lt;strong>finite energy budget&lt;/strong> and &lt;strong>instantaneous device-level power cap&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="3-policy-optimization-with-q-learning">3. Policy Optimization with Q-Learning&lt;/h3>
&lt;ul>
&lt;li>Constructed cost–reward pairs for six candidate actions.&lt;/li>
&lt;li>Solved CMDP with tabular Q-learning:
&lt;ul>
&lt;li>Improved average reward from &lt;strong>~9 to ~15&lt;/strong> over 300 episodes.&lt;/li>
&lt;li>Maintained energy usage within &lt;strong>85–90% of budget&lt;/strong>.&lt;/li>
&lt;li>Guaranteed no violation of power caps.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
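&lt;p>The tabular setup above can be sketched end to end: the state is the discretized remaining energy budget, the six actions carry illustrative (cost, quality, peak-power) tuples rather than the paper&amp;rsquo;s measurements, and both CMDP constraints are enforced by masking infeasible actions — one simple way to respect them exactly.&lt;/p>

```python
import random

# Six illustrative SLM configurations: (energy cost J, quality, peak power W).
ACTIONS = [(5, 3, 10), (8, 5, 14), (12, 7, 18),
           (15, 8, 22), (20, 9, 26), (30, 10, 35)]
POWER_CAP = 30      # instantaneous device-level power cap (W)
BUDGET = 100        # per-episode energy budget (J)
N_TASKS = 10

def feasible(energy_left, a):
    """Mask actions that break either CMDP constraint."""
    cost, _, power = ACTIONS[a]
    if power > POWER_CAP:           # hard power cap: action never allowed
        return False
    return not cost > energy_left   # cannot overspend the energy budget

def run_q_learning(episodes=300, alpha=0.1, gamma=0.95, eps=0.1,
                   rng=random.Random(0)):
    Q = {}
    for _ in range(episodes):
        energy = BUDGET
        for _task in range(N_TASKS):
            s = energy // 5         # discretized remaining-energy state
            acts = [a for a in range(len(ACTIONS)) if feasible(energy, a)]
            if not acts:
                break
            if rng.random() > eps:  # greedy with probability 1 - eps
                a = max(acts, key=lambda b: Q.get((s, b), 0.0))
            else:
                a = rng.choice(acts)
            cost, reward, _ = ACTIONS[a]
            energy -= cost
            s2 = energy // 5
            best_next = max(Q.get((s2, b), 0.0) for b in range(len(ACTIONS)))
            old = Q.get((s, a), 0.0)
            Q[(s, a)] = old + alpha * (reward + gamma * best_next - old)
    return Q

Q = run_q_learning()
# Greedy rollout under the learned policy.
energy, total_reward = BUDGET, 0
for _ in range(N_TASKS):
    s = energy // 5
    acts = [a for a in range(len(ACTIONS)) if feasible(energy, a)]
    if not acts:
        break
    a = max(acts, key=lambda b: Q.get((s, b), 0.0))
    energy -= ACTIONS[a][0]
    total_reward += ACTIONS[a][1]
print(total_reward, BUDGET - energy)   # quality earned, energy spent
```

&lt;p>Because the cap is checked at action-selection time, the learned policy can never violate it, mirroring the no-violation guarantee reported above.&lt;/p>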
&lt;h2 id="results-and-insights">Results and Insights&lt;/h2>
&lt;ul>
&lt;li>HSMMs effectively capture &lt;strong>piecewise-stationary phases&lt;/strong> in edge inference.&lt;/li>
&lt;li>CMDP optimization reveals clear &lt;strong>energy–quality trade-offs&lt;/strong>.&lt;/li>
&lt;li>Learned policies significantly improve cumulative inference quality while &lt;strong>respecting real-world constraints&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>This study establishes the first &lt;strong>unified mathematical framework&lt;/strong> linking SLM parameters, stochastic energy consumption, and inference quality. By integrating HSMM-based cost modeling with CMDP optimization, it enables &lt;strong>sustainable, adaptive deployment&lt;/strong> of SLMs in edge and IoT environments, paving the way for future extensions with deep RL and collaborative multi-device scheduling.&lt;/p></description></item><item><title>PRISM: Privacy-Aware Routing for Adaptive Cloud–Edge LLM Inference with Semantic Sketch Collaboration</title><link>https://junfei-z.github.io/research/prism/</link><pubDate>Wed, 30 Jul 2025 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/prism/</guid><description>&lt;a href="https://junfei-z.github.io/prism_full.pdf" target="_blank">
&lt;img src="https://img.shields.io/badge/View%20Full%20Paper-PDF-red?logo=adobeacrobatreader&amp;logoColor=white" alt="PDF">
&lt;/a>
&lt;p>📄 [Accepted at 2026 AAAI Conference on Artificial Intelligence] — To appear&lt;/p>
&lt;p>This project introduces &lt;strong>PRISM&lt;/strong>, a context-aware cloud–edge inference framework that balances privacy, utility, and efficiency for &lt;strong>Large Language Model (LLM)&lt;/strong> services. It addresses the key limitations of uniform privacy mechanisms by adapting protection based on &lt;strong>semantic sensitivity&lt;/strong> of user inputs.&lt;/p>
&lt;h2 id="objectives">Objectives&lt;/h2>
&lt;p>The primary goal is to enable &lt;strong>privacy-preserving LLM inference&lt;/strong> in real-world deployments, where sensitive user prompts are routed intelligently between edge devices and the cloud. PRISM is designed to:&lt;/p>
&lt;ul>
&lt;li>Avoid unnecessary noise for benign inputs&lt;/li>
&lt;li>Preserve semantic coherence in sensitive prompts&lt;/li>
&lt;li>Reduce latency and energy consumption without compromising utility&lt;/li>
&lt;/ul>
&lt;h2 id="key-contributions">Key Contributions&lt;/h2>
&lt;h3 id="semantic-sensitive-execution-routing">Semantic-Sensitive Execution Routing&lt;/h3>
&lt;ul>
&lt;li>A &lt;strong>soft gating controller&lt;/strong> on the edge scores entity-level risk using contextual features (e.g., named entities, first-person references)&lt;/li>
&lt;li>Routes prompts to one of three execution paths:
&lt;ul>
&lt;li>&lt;strong>Edge-only&lt;/strong> for high-risk prompts&lt;/li>
&lt;li>&lt;strong>Cloud-only&lt;/strong> for low-risk prompts&lt;/li>
&lt;li>&lt;strong>Cloud–Edge Collaboration&lt;/strong> for mid-sensitivity prompts&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
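&lt;p>A toy version of the gating rule makes the three-way split concrete. The cue list, weights, and thresholds below are hypothetical placeholders, not PRISM&amp;rsquo;s learned controller:&lt;/p>

```python
def sensitivity_score(prompt):
    """Toy stand-in for the soft gating controller: score a prompt with
    cheap contextual cues (cue list and weights are hypothetical)."""
    cues = {"my": 0.2, "i": 0.1, "diagnosis": 0.5, "account": 0.35, "ssn": 0.8}
    return min(1.0, sum(cues.get(w, 0.0) for w in prompt.lower().split()))

def route(prompt, low=0.2, high=0.6):
    """Map the risk score to one of the three execution paths
    (the thresholds are illustrative placeholders)."""
    s = sensitivity_score(prompt)
    if s > high:
        return "edge-only"      # high risk: the prompt never leaves the device
    if s > low:
        return "cloud-edge"     # mid risk: obfuscate, then collaborate
    return "cloud-only"         # low risk: full cloud quality

print(route("what is the capital of France"))               # -> cloud-only
print(route("my diagnosis came back, explain my results"))  # -> edge-only
```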
&lt;h3 id="adaptive-two-layer-local-differential-privacy-ldp">Adaptive Two-Layer Local Differential Privacy (LDP)&lt;/h3>
&lt;ul>
&lt;li>Each sensitive entity is obfuscated through:
&lt;ul>
&lt;li>Category-level perturbation (e.g., masking &amp;ldquo;Diagnosis&amp;rdquo;)&lt;/li>
&lt;li>Value-level perturbation (e.g., replacing &amp;ldquo;HIV&amp;rdquo; with &amp;ldquo;Flu&amp;rdquo;)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Privacy budget allocation is guided by a sensitivity weight model ensuring &lt;strong>fine-grained protection without semantic collapse&lt;/strong>&lt;/li>
&lt;/ul>
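&lt;p>The value-level layer can be illustrated with classical k-ary randomized response, a standard mechanism satisfying epsilon-LDP; PRISM&amp;rsquo;s actual perturbation and budget split may differ. The vocabulary and epsilon below are illustrative, and the category layer would apply the same idea over category labels:&lt;/p>

```python
import math
import random

def k_randomized_response(value, domain, epsilon, rng=random.Random(0)):
    """k-ary randomized response: keep the true value with probability
    e^eps / (e^eps + k - 1), otherwise report a uniformly random other
    value. This satisfies epsilon-LDP over the given domain."""
    k = len(domain)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() > p_keep:
        return rng.choice([v for v in domain if v != value])
    return value

diagnoses = ["HIV", "Flu", "Diabetes", "Asthma"]
# Two-layer idea: a higher sensitivity weight spends more of the budget on
# the category layer, leaving a smaller epsilon (stronger noise) here.
print(k_randomized_response("HIV", diagnoses, epsilon=0.5))
```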
&lt;h3 id="semantic-sketch-collaboration-protocol">Semantic Sketch Collaboration Protocol&lt;/h3>
&lt;ul>
&lt;li>Noisy prompts are processed in the cloud to generate &lt;strong>semantic sketches&lt;/strong> (e.g., high-level abstract responses)&lt;/li>
&lt;li>The edge-side &lt;strong>Small Language Model (SLM)&lt;/strong> refines these sketches using the original context&lt;/li>
&lt;li>Enables &lt;strong>high-utility responses under strong privacy constraints&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h2 id="results--insights">Results &amp;amp; Insights&lt;/h2>
&lt;ul>
&lt;li>PRISM achieves &lt;strong>up to 3× lower latency&lt;/strong> and &lt;strong>2.5× lower energy consumption&lt;/strong> than baselines like Uniform and Selective LDP&lt;/li>
&lt;li>Delivers &lt;strong>higher LLM-Judge scores (up to 7.2)&lt;/strong> under strong privacy budgets&lt;/li>
&lt;li>Outperforms state-of-the-art methods (e.g., Split-and-Denoise, DP-Forward) in terms of both utility and efficiency&lt;/li>
&lt;li>Robust across &lt;strong>8 different model combinations&lt;/strong> (e.g., GPT-4o + StableLM)&lt;/li>
&lt;/ul>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Method&lt;/th>
&lt;th>Completion Time (s)&lt;/th>
&lt;th>Energy Cost (J)&lt;/th>
&lt;th>Inference Quality&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>PRISM&lt;/td>
&lt;td>7.92&lt;/td>
&lt;td>687.2&lt;/td>
&lt;td>6.88&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Uniform LDP&lt;/td>
&lt;td>20.56&lt;/td>
&lt;td>1707.6&lt;/td>
&lt;td>5.72&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Selective LDP&lt;/td>
&lt;td>21.22&lt;/td>
&lt;td>1770.8&lt;/td>
&lt;td>5.94&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Edge-Only&lt;/td>
&lt;td>17.84&lt;/td>
&lt;td>1573.9&lt;/td>
&lt;td>5.09&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Cloud-Only&lt;/td>
&lt;td>&lt;strong>5.13&lt;/strong>&lt;/td>
&lt;td>&lt;strong>296.3&lt;/strong>&lt;/td>
&lt;td>&lt;strong>8.14&lt;/strong>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="broader-impact">Broader Impact&lt;/h2>
&lt;p>PRISM enables &lt;strong>selective privacy-preserving inference&lt;/strong> for sensitive domains such as &lt;strong>medical, financial, and personal assistants&lt;/strong>, paving the way for:&lt;/p>
&lt;ul>
&lt;li>Deploying LLMs responsibly in &lt;strong>privacy-critical environments&lt;/strong>&lt;/li>
&lt;li>Reducing energy costs in &lt;strong>cloud-edge infrastructure&lt;/strong>&lt;/li>
&lt;li>Bridging the tradeoff between &lt;strong>privacy and inference quality&lt;/strong>&lt;/li>
&lt;/ul></description></item><item><title>RL-Enhanced Disturbance-Aware MPC for Robust UAV Trajectory Tracking</title><link>https://junfei-z.github.io/research/rl-enhanced-disturbance-aware-mpc-for-robust-uav-trajectory-tracking/</link><pubDate>Wed, 07 May 2025 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/rl-enhanced-disturbance-aware-mpc-for-robust-uav-trajectory-tracking/</guid><description>&lt;a href="https://junfei-z.github.io/uav_control.pdf" target="_blank">
&lt;img src="https://img.shields.io/badge/View%20Full%20Paper-PDF-red?logo=adobeacrobatreader&amp;logoColor=white" alt="PDF">
&lt;/a>
&lt;p>📄 [Accepted at IEEE SMC 2025] — To appear&lt;/p>
&lt;p>This research introduces &lt;strong>ROAM&lt;/strong>, a novel RL-enhanced, disturbance-aware MPC framework for &lt;strong>precise UAV trajectory tracking&lt;/strong> in uncertain and dynamic environments. The method combines the predictive strengths of MPC with the fast response of reinforcement learning (RL) and the robustness of an adaptive sliding mode observer (SMO).&lt;/p>
&lt;h2 id="problem-and-motivation">Problem and Motivation&lt;/h2>
&lt;p>Traditional UAV controllers using MPC struggle under &lt;strong>model mismatch&lt;/strong>, &lt;strong>wind disturbances&lt;/strong>, and &lt;strong>computational delays&lt;/strong>, resulting in residual tracking errors and slow convergence. This work addresses those challenges via two innovations:&lt;/p>
&lt;ul>
&lt;li>An &lt;strong>offline-trained RL warm-start policy&lt;/strong> to accelerate MPC convergence&lt;/li>
&lt;li>An &lt;strong>Adaptive Super-Twisting Sliding Mode Observer (AST-SMO)&lt;/strong> to estimate and reject real-time disturbances&lt;/li>
&lt;/ul>
&lt;h2 id="technical-contributions">Technical Contributions&lt;/h2>
&lt;h3 id="1-rl-based-warm-start">1. RL-Based Warm Start&lt;/h3>
&lt;ul>
&lt;li>A &lt;strong>direction-conditioned policy&lt;/strong> is trained via imitation learning on expert MPC trajectories.&lt;/li>
&lt;li>During real-time control, it provides &lt;strong>trajectory-consistent initial guesses&lt;/strong> to the MPC solver, reducing early-stage tracking error by &lt;strong>16.9%&lt;/strong> and computation time by &lt;strong>38.7%&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="2-ast-smo-for-disturbance-estimation">2. AST-SMO for Disturbance Estimation&lt;/h3>
&lt;ul>
&lt;li>The SMO estimates external disturbances in real time, replacing the discontinuous sign function with a smooth hyperbolic-tangent switching term to avoid chattering.&lt;/li>
&lt;li>An adaptive gain tuning mechanism adjusts sensitivity dynamically for better convergence.&lt;/li>
&lt;/ul>
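&lt;p>A minimal version of the observer can be written for a scalar toy plant $\dot{x} = u + d$: a square-root error correction plus an integral term that converges to the disturbance, with $\tanh(\cdot)$ standing in for the discontinuous sign function. The gains, step size, and constant disturbance below are illustrative, and the paper&amp;rsquo;s adaptive gain law is omitted.&lt;/p>

```python
import math

def run_observer(d=1.0, k1=2.0, k2=3.0, eps=0.01, dt=0.001, T=5.0):
    """Super-twisting disturbance observer for the toy plant x_dot = u + d.
    The integral state z converges to the unknown disturbance d; tanh(e/eps)
    replaces sign(e) to avoid chattering. Gains are fixed here, whereas the
    paper additionally adapts them online."""
    x = x_hat = z = 0.0
    u = 0.0                                  # control input held at zero
    for _ in range(int(T / dt)):
        e = x - x_hat                        # estimation error
        s = math.tanh(e / eps)               # smoothed sign of the error
        x += (u + d) * dt                    # true plant (d is unknown)
        x_hat += (u + k1 * math.sqrt(abs(e)) * s + z) * dt
        z += k2 * s * dt                     # disturbance estimate update
    return z

print(round(run_observer(), 2))   # the estimate settles near d = 1.0
```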
&lt;h3 id="3-disturbance-aware-mpc">3. Disturbance-Aware MPC&lt;/h3>
&lt;ul>
&lt;li>MPC is reformulated to incorporate real-time estimates from AST-SMO:
\[
x_{k+1} = Ax_k + Bu_k + E(\hat{d}_k)
\]&lt;/li>
&lt;li>Objective: minimize both tracking error and control effort, while maintaining system constraints.&lt;/li>
&lt;/ul>
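&lt;p>The role of the $E(\hat{d}_k)$ term is easy to verify numerically on a toy double integrator: a prediction that includes the estimated disturbance tracks the true plant far more closely than the nominal model. The matrices and numbers below are illustrative, not the 12-DOF quadrotor model from the paper.&lt;/p>

```python
import numpy as np

dt = 0.05
A = np.array([[1.0, dt], [0.0, 1.0]])   # double-integrator dynamics
B = np.array([[0.0], [dt]])             # control input channel
E = np.array([[0.0], [dt]])             # disturbance enters like an input

x = np.array([[0.0], [0.0]])
u = np.array([[1.0]])
d_true = 0.5        # actual external disturbance (e.g. wind)
d_hat = 0.48        # observer estimate (illustrative, slightly imperfect)

x_true = A @ x + B @ u + E * d_true     # what the plant actually does
x_nominal = A @ x + B @ u               # disturbance-unaware prediction
x_aware = A @ x + B @ u + E * d_hat     # disturbance-aware prediction

# The disturbance-aware model tracks the true plant far more closely,
# which lets the MPC pre-compensate instead of reacting after the fact.
print(float(np.linalg.norm(x_true - x_nominal)),
      float(np.linalg.norm(x_true - x_aware)))
```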
&lt;h2 id="simulation-results">Simulation Results&lt;/h2>
&lt;ul>
&lt;li>Evaluated on a 12-DOF quadrotor model under sinusoidal and noisy disturbances.&lt;/li>
&lt;li>ROAM achieved:
&lt;ul>
&lt;li>16.9% improvement in early-stage tracking accuracy&lt;/li>
&lt;li>38.7% reduction in computation time&lt;/li>
&lt;li>Superior trajectory adherence under heavy external disturbances compared to classical MPC&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>ROAM demonstrates that &lt;strong>deep integration of RL, observers, and MPC&lt;/strong> yields a control system with faster convergence, better stability, and higher resilience. Its lightweight and modular design makes it highly suitable for &lt;strong>real-time deployment&lt;/strong> on embedded UAV platforms.&lt;/p>
</description></item><item><title>Can Large Language Models Credibly Stand in for Humans in Game-Theoretic Experiments?</title><link>https://junfei-z.github.io/research/can-large-language-models-credibly-stand-in-for-humans-in-game-theoretic-experiments/</link><pubDate>Thu, 17 Apr 2025 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/can-large-language-models-credibly-stand-in-for-humans-in-game-theoretic-experiments/</guid><description>&lt;p>This work investigates the feasibility of using Large Language Models (LLMs) as &lt;strong>proxies for human participants&lt;/strong> in behavioral game-theoretic experiments. We evaluated four LLMs (GPT-4o, Llama‑3.3‑70B‑Instruct, Llama‑3.3‑8B‑Instruct, and DeepSeek-R1) across &lt;strong>three canonical games&lt;/strong>: the &lt;strong>Prisoner’s Dilemma&lt;/strong>, the &lt;strong>Ultimatum Game&lt;/strong>, and the &lt;strong>Public Goods Game&lt;/strong>.&lt;/p>
&lt;h2 id="research-objectives">Research Objectives&lt;/h2>
&lt;ul>
&lt;li>Evaluate &lt;strong>behavioral alignment&lt;/strong>, &lt;strong>persona consistency&lt;/strong>, and &lt;strong>strategic adaptability&lt;/strong> of LLMs vs. human norms.&lt;/li>
&lt;li>Design a &lt;strong>modular, multi-agent framework (PRIME-Router)&lt;/strong> for improved consistency and adaptability.&lt;/li>
&lt;li>Benchmark LLM behavior using &lt;strong>MBTI-based persona prompts&lt;/strong>: Diplomat, Analyst, Sentinel, Explorer.&lt;/li>
&lt;/ul>
&lt;h2 id="core-contributions">Core Contributions&lt;/h2>
&lt;h3 id="1-behavioral-assessment-in-canonical-games">1. Behavioral Assessment in Canonical Games&lt;/h3>
&lt;p>LLMs were benchmarked against human behavior using three new metrics:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>BAM (Behavioral Alignment Measure)&lt;/strong>: similarity to human action distributions&lt;/li>
&lt;li>&lt;strong>PCI (Persona Consistency Index)&lt;/strong>: adherence to prompted social roles&lt;/li>
&lt;li>&lt;strong>ASP (Adaptive Strategic Profile)&lt;/strong>: responsiveness to evolving game contexts&lt;/li>
&lt;/ul>
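&lt;p>One plausible instantiation of BAM, shown purely for illustration, measures similarity as one minus the total variation distance between the LLM&amp;rsquo;s and humans&amp;rsquo; action distributions; the paper&amp;rsquo;s exact formula may differ.&lt;/p>

```python
def bam(llm_dist, human_dist):
    """Behavioral Alignment Measure, sketched as 1 minus the total
    variation distance between action distributions. This is one
    plausible instantiation; the paper's exact formula may differ."""
    actions = set(llm_dist) | set(human_dist)
    tv = 0.5 * sum(abs(llm_dist.get(a, 0.0) - human_dist.get(a, 0.0))
                   for a in actions)
    return 1.0 - tv

# Ultimatum Game responder facing a 30% offer (numbers are illustrative).
human = {"accept": 0.7, "reject": 0.3}   # human baseline rates
llm = {"accept": 0.9, "reject": 0.1}     # an LLM's observed rates
print(bam(llm, human))   # close to 1 means closely aligned behaviour
```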
&lt;p>Key findings:&lt;/p>
&lt;ul>
&lt;li>Most LLMs showed &lt;strong>high initial BAM&lt;/strong> but struggled with &lt;strong>adaptive consistency&lt;/strong> in repeated games.&lt;/li>
&lt;li>GPT-4o and Llama-3.3-70B demonstrated &lt;strong>excellent persona consistency&lt;/strong> in one-shot games.&lt;/li>
&lt;/ul>
&lt;h3 id="2-prime-router-framework">2. PRIME-Router Framework&lt;/h3>
&lt;p>To overcome adaptation and consistency limitations, we proposed &lt;strong>PRIME-Router&lt;/strong>, a modular MoE-style architecture that:&lt;/p>
&lt;ul>
&lt;li>Spawns &lt;strong>specialized subroles&lt;/strong> (e.g., Empathy Enforcer, Strategic Planner)&lt;/li>
&lt;li>Assigns the &lt;strong>most suitable LLM&lt;/strong> to each subrole based on empirical performance&lt;/li>
&lt;li>Aggregates multi-agent outputs via &lt;strong>collaboration patterns&lt;/strong> (e.g., star, debate, chain)&lt;/li>
&lt;/ul>
&lt;p>PRIME-Router improves:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>PCI&lt;/strong> by up to &lt;strong>0.23&lt;/strong>&lt;/li>
&lt;li>&lt;strong>ASP&lt;/strong> by up to &lt;strong>0.32&lt;/strong> across repeated games.&lt;/li>
&lt;/ul>
&lt;h3 id="3-implications-and-outlook">3. Implications and Outlook&lt;/h3>
&lt;ul>
&lt;li>LLMs can &lt;strong>simulate human-like behavior credibly&lt;/strong>, but &lt;strong>strategic depth&lt;/strong> and &lt;strong>long-horizon persona fidelity&lt;/strong> remain challenges.&lt;/li>
&lt;li>PRIME-Router paves the way for &lt;strong>cost-effective AI agents&lt;/strong> in &lt;strong>social science experimentation&lt;/strong>, &lt;strong>policy modeling&lt;/strong>, and &lt;strong>online platform simulation&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>Our study highlights the promise and limitations of LLMs in behavioral game simulations. Structured multi-agent design like PRIME-Router significantly enhances realism, offering a new paradigm for &lt;strong>AI-driven human modeling&lt;/strong> in experimental social science.&lt;/p>
&lt;p>📄 [AAAI 2026 Submission] — In Review&lt;/p></description></item><item><title>Minimizing Maximum Age of Service in Virtualized Green IoT Networks</title><link>https://junfei-z.github.io/research/minimizing-maximum-age-of-service-in-virtualized-green-iot-networks/</link><pubDate>Sat, 07 Dec 2024 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/minimizing-maximum-age-of-service-in-virtualized-green-iot-networks/</guid><description>&lt;p>This project addresses the challenge of embedding and scheduling applications in solar-powered green IoT networks, with the goal of minimizing the &lt;strong>maximum Age of Service (AoS)&lt;/strong> — a freshness metric indicating the delay between data generation and service completion.&lt;/p>
&lt;h2 id="objectives">Objectives&lt;/h2>
&lt;p>The research focuses on virtualized, computation-enabled IoT infrastructures powered by &lt;strong>renewable energy&lt;/strong> (solar). The applications are modeled as &lt;strong>Directed Acyclic Graphs (DAGs)&lt;/strong> with &lt;strong>Virtual Network Functions (VNFs)&lt;/strong> that must be executed under fluctuating energy and computational constraints.&lt;/p>
&lt;h2 id="key-contributions">Key Contributions&lt;/h2>
&lt;h3 id="mixed-integer-linear-programming-milp-formulation">Mixed Integer Linear Programming (MILP) Formulation&lt;/h3>
&lt;ul>
&lt;li>Proposed the &lt;strong>first MILP model&lt;/strong> to jointly optimize:
&lt;ul>
&lt;li>Device selection and sampling time&lt;/li>
&lt;li>DAG request embedding decision&lt;/li>
&lt;li>Energy consumption at devices, gateways, and servers&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Objective: minimize the &lt;strong>maximum AoS&lt;/strong> across all DAG requests.&lt;/li>
&lt;/ul>
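&lt;p>Min-max objectives of this kind are typically handled with an auxiliary bound variable; a sketch of the standard linearization (notation illustrative, not the thesis's exact formulation):&lt;/p>

```latex
\min_{x,\,T} \; T
\quad \text{s.t.} \quad
\mathrm{AoS}_r(x) \le T \;\; \forall r \in \mathcal{R},
\qquad x \in \mathcal{X}
```

&lt;p>where $x$ collects the device-selection, embedding, and scheduling variables, $\mathcal{R}$ is the set of DAG requests, and $\mathcal{X}$ the feasible region defined by the energy and capacity constraints.&lt;/p>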
&lt;h3 id="heuristic-and-predictive-control-solutions">Heuristic and Predictive Control Solutions&lt;/h3>
&lt;ul>
&lt;li>Developed &lt;strong>GreedyOL&lt;/strong>, a fast heuristic that embeds DAGs based on current AoS.&lt;/li>
&lt;li>Proposed &lt;strong>RHCOP&lt;/strong>, a &lt;strong>Receding Horizon Control Optimization&lt;/strong> framework:
&lt;ul>
&lt;li>Utilizes &lt;strong>Gaussian Mixture Models (GMMs)&lt;/strong> to predict solar energy arrivals and wireless channel gains.&lt;/li>
&lt;li>Enables real-time scheduling using only causal (non-future) information.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
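&lt;p>As a rough illustration of the GMM-based forecasting step, the sketch below draws a predicted solar energy-arrival trace from a two-component mixture; the mixture parameters are placeholders, not values fitted in the paper:&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-component GMM over hourly solar energy arrivals (Joules);
# the weights/means/stds here are made-up placeholders.
weights = np.array([0.6, 0.4])   # component probabilities
means   = np.array([5.0, 20.0])  # low- vs. high-irradiance regimes
stds    = np.array([2.0, 5.0])

def sample_solar(n_slots):
    """Draw a predicted energy-arrival trace of length n_slots from the GMM."""
    comps = rng.choice(len(weights), size=n_slots, p=weights)
    draws = rng.normal(means[comps], stds[comps])
    return np.clip(draws, 0.0, None)  # energy arrivals cannot be negative

forecast = sample_solar(24)  # one day of hourly predictions
```

&lt;p>In the paper the mixture would be fitted to historical traces; here only the sampling side is shown.&lt;/p>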
&lt;h3 id="results--insights">Results &amp;amp; Insights&lt;/h3>
&lt;ul>
&lt;li>RHCOP attains a min-max AoS within &lt;strong>1.07×&lt;/strong> of the MILP optimum, and GreedyOL within &lt;strong>1.13×&lt;/strong>.&lt;/li>
&lt;li>More gateways and servers reduce AoS due to enhanced redundancy and flexibility.&lt;/li>
&lt;li>Equal numbers of &lt;strong>VNF-Cs&lt;/strong> (collection) and &lt;strong>VNF-Ps&lt;/strong> (processing) yield optimal freshness.&lt;/li>
&lt;/ul>
&lt;h2 id="broader-impact">Broader Impact&lt;/h2>
&lt;p>The proposed system lays groundwork for &lt;strong>energy-aware, delay-sensitive IoT applications&lt;/strong>, especially in &lt;strong>remote or energy-constrained environments&lt;/strong>. The results provide insights into the tradeoffs between &lt;strong>computation freshness&lt;/strong>, &lt;strong>resource allocation&lt;/strong>, and &lt;strong>green network deployment&lt;/strong> strategies.&lt;/p>
&lt;p>📄 [IEEE Transactions on Services Computing Submission] — Coming Soon&lt;/p></description></item><item><title>Task Offloading and Approximate Computing in Solar Powered IoT Networks</title><link>https://junfei-z.github.io/research/task-offloading-and-approximate-computing-in-solar-powered-iot-networks/</link><pubDate>Sun, 07 Jan 2024 00:00:00 +0000</pubDate><guid>https://junfei-z.github.io/research/task-offloading-and-approximate-computing-in-solar-powered-iot-networks/</guid><description>&lt;p>This research proposes a novel framework for minimizing the &lt;strong>total energy consumption&lt;/strong> of solar-powered IoT networks through &lt;strong>task offloading and approximate computing&lt;/strong>. Devices can choose between local execution (exact or approximate) or offloading tasks to a solar-powered edge server.&lt;/p>
&lt;h2 id="core-objectives">Core Objectives&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Reduce energy usage&lt;/strong> by allowing approximate task execution when bounded errors are tolerable.&lt;/li>
&lt;li>&lt;strong>Leverage digital twins (DTs)&lt;/strong> to estimate future energy availability and channel conditions.&lt;/li>
&lt;li>&lt;strong>Optimize offloading decisions&lt;/strong> and resource allocation across time slots and channels.&lt;/li>
&lt;/ul>
&lt;h2 id="technical-highlights">Technical Highlights&lt;/h2>
&lt;h3 id="milp-formulation">MILP Formulation&lt;/h3>
&lt;ul>
&lt;li>Designed the &lt;strong>first MILP&lt;/strong> to jointly optimize:
&lt;ul>
&lt;li>Task offloading decisions&lt;/li>
&lt;li>Approximate vs. exact execution&lt;/li>
&lt;li>Channel allocation&lt;/li>
&lt;li>Virtual machine (VM) assignment&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Captures constraints on energy arrivals, CPU cycles, approximation error bounds, and VM capacity.&lt;/li>
&lt;/ul>
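&lt;p>To make the decision space concrete, here is a minimal per-task sketch of the three execution modes (local exact, local approximate, offload) compared by energy cost; the per-unit costs are hypothetical, and unlike the MILP this greedy rule ignores the joint channel/VM-capacity constraints:&lt;/p>

```python
from dataclasses import dataclass

@dataclass
class Task:
    cycles: int          # CPU cycles for exact execution
    bits: int            # input size to upload if offloaded
    approx_ratio: float  # fraction of cycles kept under approximation (0..1]

# Illustrative per-unit energy costs; placeholders, not the paper's parameters.
E_CYCLE_LOCAL = 1e-9   # J per CPU cycle on the device
E_BIT_TX      = 5e-8   # J per bit transmitted to the edge server

def cheapest_mode(task, error_tolerable):
    """Pick the lowest-energy mode for one task in isolation. The full MILP
    instead optimizes all tasks jointly under energy-arrival, channel, and
    VM-capacity constraints."""
    options = {
        "local_exact": task.cycles * E_CYCLE_LOCAL,
        # server-side energy comes from solar and is not charged to the device
        "offload": task.bits * E_BIT_TX,
    }
    if error_tolerable:
        options["local_approx"] = task.cycles * task.approx_ratio * E_CYCLE_LOCAL
    return min(options, key=options.get)

mode = cheapest_mode(Task(cycles=10**7, bits=10**5, approx_ratio=0.4), True)
```

&lt;p>With these placeholder costs, approximation wins when errors are tolerable, and offloading wins otherwise.&lt;/p>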
&lt;h3 id="dt-assisted-receding-horizon-control-dt-rhc">DT-Assisted Receding Horizon Control (DT-RHC)&lt;/h3>
&lt;ul>
&lt;li>Introduced a &lt;strong>DT-based control algorithm&lt;/strong> using:
&lt;ul>
&lt;li>&lt;strong>Gaussian Mixture Models (GMMs)&lt;/strong> to predict energy and channel gain&lt;/li>
&lt;li>Sliding-window MILP optimization for dynamic scheduling&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Achieves energy usage within &lt;strong>1.62×&lt;/strong> of the MILP optimum while requiring only &lt;strong>causal (past) data&lt;/strong>.&lt;/li>
&lt;/ul>
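&lt;p>The sliding-window loop behind DT-RHC (and RHCOP above) follows a common receding-horizon pattern; a minimal sketch, with toy stand-ins for the forecaster and the windowed MILP solver:&lt;/p>

```python
def receding_horizon(horizon, window, predict, solve_window, apply_first):
    """Generic receding-horizon loop: at each slot, forecast `window` slots
    ahead, solve the small optimization over that window, commit only the
    first decision, then slide the window forward."""
    committed = []
    for t in range(horizon):
        forecast = predict(t, window)        # e.g. GMM energy/channel forecast
        plan = solve_window(t, forecast)     # e.g. windowed MILP (stubbed here)
        committed.append(apply_first(plan))  # execute only the first-slot action
    return committed

# Toy stand-ins: pick the index of the highest-energy slot in each window.
energy = [3, 1, 4, 1, 5, 9, 2, 6]
predict = lambda t, w: energy[t:t + w]
solve_window = lambda t, f: max(range(len(f)), key=f.__getitem__)
apply_first = lambda plan: plan

decisions = receding_horizon(horizon=5, window=3, predict=predict,
                             solve_window=solve_window, apply_first=apply_first)
```

&lt;p>Committing only the first decision each slot is what keeps the controller causal: future windows are re-solved once fresh measurements arrive.&lt;/p>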
&lt;h3 id="results--evaluation">Results &amp;amp; Evaluation&lt;/h3>
&lt;ul>
&lt;li>DT-RHC significantly outperforms random strategies across evaluation settings such as:&lt;/li>
&lt;ul>
&lt;li>Energy consumption vs. number of devices&lt;/li>
&lt;li>Impact of approximation ratios&lt;/li>
&lt;li>Task completion within extended time horizons&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Simulations conducted in Python + Gurobi over 100×100 m² deployments using realistic solar input and wireless models.&lt;/li>
&lt;/ul>
&lt;h2 id="conclusion">Conclusion&lt;/h2>
&lt;p>This study demonstrates the viability of integrating &lt;strong>approximate computing and intelligent offloading&lt;/strong> in &lt;strong>renewable-powered IoT environments&lt;/strong>. It provides a robust foundation for future &lt;strong>distributed optimization and adaptive energy-aware network control&lt;/strong>.&lt;/p>
&lt;p>&lt;a href="https://doi.org/10.1109/LNET.2023.3328893">IEEE Paper DOI: 10.1109/LNET.2023.3328893&lt;/a>&lt;/p></description></item></channel></rss>