Peking University College of Engineering "Data and Operations Research Intelligence" Graduate Summer School

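As artificial intelligence and data science reshape research worldwide, are you looking to work at the frontier of interdisciplinary research, talk face to face with leading scholars, and explore optimization, control, and learning in depth? From July 14 to 25, 2025, the College of Engineering at Peking University will host the "Data and Operations Research Intelligence" Graduate Summer School on the university's Yanyuan campus. The school is a platform for broadening your academic horizons and strengthening your research skills, and we warmly invite outstanding graduate students and postdoctoral researchers in related fields from across China to join us.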

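Program Overview: Focusing on the Frontier, Training Interdisciplinary Researchers

The summer school's vision is to "serve discipline development, bring together global resources, build academic bridges, and promote the exchange of talent." Its stated goal is to "focus on the foundations and frontier developments of optimization, control, and learning, and to train interdisciplinary researchers empowered by data science and artificial intelligence." Three advanced graduate courses are planned: Foundations of Reinforcement Learning, Learning-Based Safe Control under Uncertainty, and Optimization Theory and Algorithms (最优化理论与算法). Each course runs 15 to 20 class hours, and about 100 students will be admitted nationwide. The courses combine foundational knowledge, frontier research, and project-based study, with an emphasis on solid theoretical grounding while keeping pace with current research.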

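Faculty: Leading Scholars Setting the Academic Direction

Enrique Mallada: Associate Professor, Johns Hopkins University

Dr. Enrique Mallada joined Johns Hopkins University in 2016 and has been an Associate Professor in the Department of Electrical and Computer Engineering since 2022. From 2014 to 2016 he was a postdoctoral researcher at the California Institute of Technology. He received his bachelor's degree in telecommunications engineering from Universidad ORT Uruguay in 2005 and his Ph.D. in electrical and computer engineering, with a minor in applied mathematics, from Cornell University in 2014. His honors include the 2021 Johns Hopkins Alumni Association Teaching Award, the 2018 NSF CAREER Award, and the 2014 Cornell University doctoral dissertation award. His research focuses on control and dynamical systems, machine learning, and optimization, and their applications to safety-critical networks and systems.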

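Qing Ling (凌青): Professor, Sun Yat-sen University

Professor Qing Ling received his bachelor's and Ph.D. degrees from the Department of Automation at the University of Science and Technology of China in 2001 and 2006, respectively. From 2006 to 2009 he was a postdoctoral researcher in the Department of Electrical Engineering and Computer Science at Michigan Technological University. From 2009 to 2017 he was on the faculty of the Department of Automation at the University of Science and Technology of China, during which time he was a visiting scholar at the University of Pennsylvania and Microsoft Research Asia. Since 2017 he has been a professor and doctoral supervisor in the School of Computer Science at Sun Yat-sen University. He has twice received the IEEE Signal Processing Society Young Author Best Paper Award and has also received a second-class Guangdong Province Science and Technology Progress Award, among other honors. A graduate student he supervised was selected for the Chinese Institute of Electronics Outstanding Master's Thesis Incentive Program, and he serves in senior roles for several international journals and conferences.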

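Yulong Gao (高瑜隆): Lecturer (Assistant Professor), Imperial College London

Dr. Yulong Gao received his bachelor's degree in automation and his master's degree in control science and engineering from Beijing Institute of Technology in 2013 and 2016, respectively, and a joint Ph.D. in electrical engineering from KTH Royal Institute of Technology, Sweden, and Nanyang Technological University, Singapore, in 2021. He was a researcher at KTH Royal Institute of Technology from 2021 to 2022 and a postdoctoral researcher at the University of Oxford from 2022 to 2023. Since 2024 he has been a Lecturer (Assistant Professor) in the Department of Electrical and Electronic Engineering at Imperial College London. His research interests include formal verification and control, machine learning, and their applications to safety-critical systems.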

Curriculum: Three Courses Building a Solid Knowledge Base

Course 1: Foundations of Reinforcement Learning

Course Description

The course provides a rigorous treatment of reinforcement learning, building on the mathematical foundations laid by optimal control, dynamic programming, and machine learning. Topics include model-based methods such as deterministic and stochastic dynamic programming, as well as the model-free methods broadly identified as reinforcement learning. In particular, we will cover on- and off-policy tabular methods such as Monte Carlo and temporal-difference learning, as well as approximate solution methods, including on- and off-policy approximation, policy gradient methods, and actor-critic algorithms. Frontiers and applications will be reviewed at the end of the course.

Instructor

Enrique Mallada

Course Content

Lecture 1: What is Reinforcement Learning?
- Agent-environment loop
- MDPs: states, actions, transitions, rewards
- Episodic vs continuing tasks
- Finite vs infinite horizon
- Policies: deterministic, stochastic, Markov
- Return: discounted, average
Lecture 2: Value Functions and Optimality
- State value, action value, Bellman expectation equations
- Optimality and Bellman optimality equations
- Greedy policies (conceptual)
Lecture 3: Dynamic Programming
- Value iteration
- Policy iteration
- Generalized policy iteration (GPI)
- Convergence
- Contraction mappings
- Planning in known MDPs
Lecture 4: Multi-Armed Bandits
- K-armed bandits
- Regret and exploration
- Epsilon-greedy
- Optimistic initialization
- UCB
- Gradient bandits
Lecture 5: Monte Carlo Methods
- First-visit vs every-visit
- Monte Carlo prediction and control
- Exploring starts
- Epsilon-soft policies
Lecture 6: Temporal Difference Learning
- TD(0)
- SARSA
- Q-learning
- Online learning
- Bootstrapping
- n-step returns (brief)
- Comparison: MC vs TD vs DP
Lecture 7: Function Approximation
- Curse of dimensionality
- Linear value function approximation
- Semi-gradient TD
- Divergence issues (high-level)
Lecture 8: Policy Gradient Methods
- REINFORCE algorithm
- Policy gradient theorem
- Baselines
- Variance reduction
- Gaussian policies
Lecture 9: Actor-Critic Overview + Deep RL Teasers
- Actor-Critic idea
- A3C-style learning (conceptual only)
- Why Deep RL is hard
- Replay buffers
- Target networks (intro only)
Lecture 10: Frontiers and Applications
- Transfer RL
- Offline RL
- Safe RL
- Human feedback (RLHF)
- Real-world use cases: DQN for Atari, AlphaGo/AlphaZero, ChatGPT fine-tuning

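Course 2: Optimization Theory and Algorithms (最优化理论与算法)

Course Description

This course introduces convex optimization problems and their duality theory; first-order unconstrained, second-order unconstrained, and constrained optimization algorithms for solving convex problems; and applications of these algorithms in machine learning and other fields.

Instructor

Qing Ling (凌青)

Course Content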

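Lecture 1:
- Convex sets and convex functions
- Convex optimization problems
- Application example: least-squares-type problems
Lecture 2:
- Dual problems
- KKT conditions
- Application example: the water-filling algorithm
Lecture 3:
- General form of iterative algorithms and step-size rules
- Iteration scheme of gradient descent
- Sublinear convergence rate of gradient descent
Lecture 4:
- Linear convergence rate of gradient descent
- Variants of gradient descent
- Cyclic coordinate descent
- Application example: the K-Means algorithm
Lecture 5:
- Iteration scheme of the subgradient method
- Convergence rate of the subgradient method
- Iteration scheme of the proximal gradient method
- Application example: the iterative soft-thresholding algorithm
Lecture 6:
- Iteration scheme of the stochastic gradient method
- Gradient noise in the stochastic gradient method
- Acceleration algorithms and step-size rules in deep learning
Lecture 7:
- Iteration scheme of Newton's method
- Convergence properties of Newton's method
- Iteration scheme of quasi-Newton methods
Lecture 8:
- Iteration scheme of the constraint-satisfying Newton's method
- Iteration scheme of the Lagrange multiplier method
- Iteration scheme of the augmented Lagrangian method
Lecture 9:
- Iteration scheme of the alternating direction method of multipliers (ADMM)
- Application example: distributed optimization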

Course 3: Learning-Based Safe Control under Uncertainty

Course Description

This course aims to introduce the principles, methods, and challenges of ensuring safety in autonomous systems. We will cover theoretical foundations, algorithmic techniques, and practical applications by blending advanced control theory, formal verification, and data-driven theory.

Instructor

Yulong Gao (高瑜隆)

Course Content

Lecture 1: Introduction and Preliminaries
- What is safe control?
- Motivation for learning-based safe control
- Autonomous system modelling
- Safety definitions
- Reachability analysis (definition, set representation, and computational algorithms)
Lecture 2: Learning-based Safe Predictive Control
- Deterministic model predictive control
- Robust tube model predictive control
- Learning-based tube model predictive control
Lecture 3: Learning-based Safe Motion Planning
- Basic safe motion planning
- Safe planning by learning obstacle uncertainty
- Environment-aware safe planning
Lecture 4: Learning-based Safe Control with Complex Specifications (I)
- Basics of formal methods
- Linear temporal logic
- Temporal logic trees
- Temporal logic tree-based model checking
- Temporal logic tree-based control synthesis
Lecture 5: Learning-based Safe Control with Complex Specifications (II)
- Integration with learning-based adaptive task planner
- Application to shared autonomy
- Extension to signal temporal logic
- Emerging research directions: online learning with safety guarantees, human-in-the-loop safety

Schedule (July 2025)

Time	Jul 14	Jul 15	Jul 16	Jul 17	Jul 18	Jul 19	Jul 20	Jul 21	Jul 22	Jul 23	Jul 24	Jul 25
Morning	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	Rest	Rest	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada	9:00-11:00 Enrique Mallada
Afternoon	13:00-17:00 Qing Ling	13:00-17:00 Qing Ling	13:00-17:00 Qing Ling	13:00-17:00 Qing Ling	13:00-15:00 Qing Ling	Rest	Rest	13:00-16:00 Yulong Gao	13:00-16:00 Yulong Gao	13:00-16:00 Yulong Gao	13:00-16:00 Yulong Gao	13:00-16:00 Yulong Gao

Note: July 19 and 20 are rest days; no classes are scheduled.

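Call for Applications

Application Requirements

Outstanding graduate students and postdoctoral researchers in related fields at universities across China are invited to apply to the "Data and Operations Research Intelligence" Graduate Summer School. About 100 students will be admitted.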

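1. Application Materials

Curriculum vitae: List in detail your educational background, research experience, completed courses, publications (if any), and anything else that reflects your academic potential and interests. PDF format is recommended, with the file named "[Name]-[Affiliation]-数据与运筹智能暑校申请.pdf".
Supervisor's recommendation letter: The letter must be written by your current primary supervisor. It must open with an explicit statement that the supervisor agrees to your attending the summer school, and may then briefly describe your research background, academic ability, motivation for attending, and fit with the program. It must include the recommender's affiliation, title, contact information, and signature. PDF format is recommended, with the file named "[Name]-[Affiliation]-导师推荐信.pdf".
Optional supplementary materials: Full text or abstracts of representative papers, project reports, and similar documents, merged into a single PDF file of no more than 10 MB.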

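2. How to Apply

Send the materials above as email attachments to pku_gss_apply@163.com, with the subject line in the format: [Name]-[Affiliation]-[Mobile number]-数据与运筹智能暑期学校申请

3. Application Deadline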

Friday, June 20, 2025, 23:59 (Beijing time)

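Submit your complete application materials before the deadline; late applications will not be considered.

Inquiries: qinruqing@pku.edu.cn

4. Notification of Admission Results

By Monday, June 30, 2025, 23:59 (Beijing time)

5. Notes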

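Some courses will be taught mainly in English, so basic English reading, listening, and speaking skills are required;
The summer school charges no tuition;
Students who complete the summer school will receive a certificate of completion from the Peking University Graduate School;
The summer school does not provide accommodation, so students must arrange their own nearby;
The summer school will arrange campus access and issue temporary campus cards, which students must top up themselves for on-campus dining and purchases;
After receiving the admission notice by email, please confirm within three days whether you will attend; otherwise the admission offer will lapse.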

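This summer, join us at Peking University to explore the possibilities of data and operations research intelligence under the guidance of leading scholars, and lay a solid foundation for your academic and professional future.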