You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: index.md
+4-9Lines changed: 4 additions & 9 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -22,13 +22,7 @@ layout: default
22
22
23
23
I was honoured to receive the [Peter Gavin Hall Institute of Mathematical Statistics (IMS) Early Career Prize](https://imstat.org/ims-awards/peter-gavin-hall-ims-early-career-prize/), [IMS Tweedie Award](https://imstat.org/2024/03/05/chengchun-shi-receives-2024-ims-tweedie-new-researcher-award/) and the [Royal Statistical Society (RSS) Research Prize](https://rss.org.uk/news-publication/news-publications/2021/general-news/announcing-our-honours-recipients-for-2021/).
24
24
25
-
I am looking for PhD students interested in reinforcement learning and LLMs; see
26
-
27
-
* my recent [paper](https://arxiv.org/pdf/2506.01183) and [slides](https://callmespring.github.io/slides/DRPO.pdf) on LLMs;
28
-
* my lecture [slides](https://github.com/callmespring/RL-short-course) on RL and my survey [paper](https://arxiv.org/abs/2502.16195) on statistical inference in RL (its Chinese [version](https://mp.weixin.qq.com/s/_uPxaxhYuG0D4AiillMJug));
* my [slides](https://github.com/callmespring/RL-short-course/blob/main/Lecture%205/OPEslides.pdf) on off-policy evaluation and our review [paper](https://arxiv.org/pdf/2212.06355.pdf).
25
+
I am looking for PhD students interested in reinforcement learning (see my lecture [slides](https://github.com/callmespring/RL-short-course)) and LLMs.
32
26
33
27
My email <c.shi7@lse.ac.uk>. My [GitHub](https://github.com/callmespring).
34
28
@@ -49,8 +43,8 @@ My email <c.shi7@lse.ac.uk>. My [GitHub](https://github.com/callmespring).
49
43
50
44
My research is motivated from the following applications:
51
45
***LLMs** (see our recent papers [DRPO](https://arxiv.org/pdf/2506.01183), [VRPO](https://arxiv.org/pdf/2504.03784), [slides](https://callmespring.github.io/slides/DRPO.pdf) on fine-tuning);
52
-
***Ridesharing** (simulated environments for [Order Dispatch](https://github.com/callmespring/MDPOD) and [Spatio-temporal Policy Evaluation](https://github.com/RunzheStat/CausalMARL); see also our tutorial ([Youtube](https://www.youtube.com/watch?v=LwShOYaRFqM&list=PLA_E7IjY9cw4aC4T8pnV3vl9wSA1461KV), [Bilibili](https://www.bilibili.com/video/BV1ZS9NYpEHg/?spm_id_from=333.788.recommend_more_video.-1&vd_source=0ff25cf8645aa63231bec2428b94bf6f
53
-
)) and my [talk](https://www.bilibili.com/video/BV1yo4y1j7FU/?spm_id_from=333.337.search-card.all.click&vd_source=0ff25cf8645aa63231bec2428b94bf6f));
46
+
***Ridesharing** (see our AAAI tutorial ([Youtube](https://www.youtube.com/watch?v=LwShOYaRFqM&list=PLA_E7IjY9cw4aC4T8pnV3vl9wSA1461KV), [Bilibili](https://www.bilibili.com/video/BV1ZS9NYpEHg/?spm_id_from=333.788.recommend_more_video.-1&vd_source=0ff25cf8645aa63231bec2428b94bf6f
47
+
)) and my [talk](https://www.bilibili.com/video/BV1yo4y1j7FU/?spm_id_from=333.337.search-card.all.click&vd_source=0ff25cf8645aa63231bec2428b94bf6f); see also some simulated environments for [Order Dispatch](https://github.com/callmespring/MDPOD) and [Spatio-temporal Policy Evaluation](https://github.com/RunzheStat/CausalMARL));
54
48
***Video-sharing** (see our KDD [paper](https://dl.acm.org/doi/pdf/10.1145/3580305.3599809) for details about our proposal successfully deployed in a widely used mobile app with millions of daily active users)
55
49
***Mobile health** (some simulated environments for [Diabetes](https://github.com/RunzheStat/TestMDP) and [Intern Health](https://github.com/limengbinggz/cusum-rl));
56
50
***Neuroscience** (see our [paper](https://www.biorxiv.org/content/10.1101/2023.06.19.545524v1.full.pdf) on using RL for modelling human decision making)
@@ -60,6 +54,7 @@ Some of my recent **talks** and **slides** on statistical inference, RL, causal
60
54
***StatRL**[Bilibili](https://www.bilibili.com/video/BV1ZP4y1r7DC/?spm_id_from=333.337.search-card.all.click&vd_source=0ff25cf8645aa63231bec2428b94bf6f), [Youtube](https://www.youtube.com/watch?v=-SW9PevZThs&t=982s); Chinese versions: [Bilibili](https://www.bilibili.com/video/BV1kP411f7dA/?spm_id_from=333.337.search-card.all.click), [Youtube](https://www.youtube.com/watch?v=7NWBLuok8nk&t=3048s); [slides](https://callmespring.github.io/slides/StatRL.pdf); the accompanying [paper](https://arxiv.org/abs/2502.16195) and its Chinese [version](https://mp.weixin.qq.com/s/_uPxaxhYuG0D4AiillMJug)
61
55
***CausalRL**[Youtube](https://www.youtube.com/watch?v=Zor1CmRyycw&t=397s), [slides](https://callmespring.github.io/slides/CausalRL.pdf); the accompanying [paper](https://arxiv.org/pdf/2002.01711)
62
56
***ARMAdesign**[2-hour slides](https://callmespring.github.io/slides/ABtesting.pdf), [1-hour slides](https://callmespring.github.io/slides/ARMAdesign.pdf), [30-minute slides](https://callmespring.github.io/slides/design30m.pdf); the accompanying [paper](https://arxiv.org/pdf/2408.05342v3)
57
+
***OPE** my [slides](https://github.com/callmespring/RL-short-course/blob/main/Lecture%205/OPEslides.pdf) and our review [paper](https://arxiv.org/pdf/2212.06355.pdf);
63
58
***Pessimistic Data Integration**[slides](https://callmespring.github.io/slides/DataIntegration.pdf); the accompanying [paper](https://arxiv.org/pdf/2406.00317)
0 commit comments