desc: "Working on test-time training methods for scientific discovery under uncertainty, with a focus on adaptive compute allocation, agentic planning, and stratified scaling search for test-time reasoning in large language models and diffusion language models.",
"Co-developed a modular multi-agent ecosystem where autonomous agents evolve their capabilities, reputation, and social connections over time through Bayesian reputation updates, dynamic team formation, and social graph evolution. The framework enables emergent specialization and self-organizing collaboration for complex task execution (NeurIPS 2026, in progress).",
+ ],
+ },
+ {
+ name: "Test-Time Compute and Reasoning in Large Language Models",
+ bullets: [
+ "Currently working on adaptive test-time compute strategies for improving reasoning accuracy in LLMs, focusing on dynamic control of inference depth, tool usage, and verification under strict compute constraints. The work studies principled trade-offs between accuracy, latency, and reliability via adaptive compute allocation.",
+ ],
+ },
+ {
+ name: "Bayesian Preference Alignment for Mathematical Reasoning",
+ bullets: [
+ "Developed active learning frameworks for Bayesian General Preference Models and Continuous-Utility Direct Preference Optimization (CU-DPO) to align small language models for mathematical reasoning tasks, enabling sample-efficient preference learning with calibrated uncertainty (ICML 2026 and CoLM 2026).",
+ ],
+ },
+ ],
  },
  {
- title: "Evolving Agentic Systems",
- desc: "Developing Internet of Evolving Agents frameworks for self-evolving multi-agent systems with dynamic reputation modeling and social graph-based coordination mechanisms.",
- venues: "NeurIPS'26, Ongoing",
+ org: "Intel Corporation, Ph.D. Researcher",
+ dateRange: "September 2024 – December 2024",
+ advisor: "Dr. John M. Cioffi",
+ projects: [
+ {
+ name: "Neural Gaussian Radio Fields for Environment Perception",
+ bullets: [
+ "Worked on 3D computer vision-based channel estimation for next-generation wireless networks.",
+ "Implemented a CUDA-based differentiable real-time pipeline with 1 ms inference time, leading to a KDD 2026 submission.",
+ ],
+ },
+ ],
  },
  {
- title: "Reinforcement Learning for LLMs",
- desc: "Research on preference optimization, active learning, and alignment methods for large language model reasoning systems. Current work also explores reinforcement learning approaches for reward decomposition to mitigate sycophancy and improve alignment.",
"Developed a constraint-graph representation of SDPs and a GNN encoder (Graph Attention) with sequence prediction to learn rank trajectories directly from problem structure.",
+ "Integrated learned rank schedules into low-rank solvers to remove hand-tuned rank heuristics and reduce trial-and-error, yielding up to 3× speedups on large-scale benchmarks.",