Researcher at Mohamed bin Zayed University of Artificial Intelligence
I am a researcher at MBZUAI (ranked in the global top 10 for AI) and the part-time CTO of Spatialtemporal AI, and I have served as a technical lead for multiple unicorns and startups (homepage: Rongtao-Xu.github.io). Previously, I was an Assistant Professor at the Institute of Automation, Chinese Academy of Sciences (CASIA), and a Visiting Scholar at Southern University of Science and Technology (SUSTech). At Galbot, I co-led the development of NaVid, the world’s first navigation foundation model. At the Momenta–CAS joint laboratory, I led research on autonomous driving perception algorithms. At Spatialtemporal AI, as Co-founder and CTO, I led the development of the manipulation foundation models A1 and A0. I also co-organized the CVPR 2026 Embodied Intelligence Challenge (ManipArena) in collaboration with x2robot.
I received my Ph.D. from the State Key Laboratory of Multimodal Artificial Intelligence (formerly the National Lab of Pattern Recognition) at the Institute of Automation, Chinese Academy of Sciences. During my Ph.D., I was awarded the CAS President’s Award (top 0.7%), the National Scholarship, Beijing Outstanding Graduate, and CAS Outstanding Graduate, and I received two Best Paper Award nominations (each top 0.3%) at IEEE flagship conferences. In 2019, I obtained dual Bachelor’s degrees in Mathematics and Computer Science from Huazhong University of Science and Technology (HUST), where I received the university’s first-ever top prize in the National Undergraduate Mathematical Contest in Modeling (1st out of 36,375 teams).
My research focuses on embodied intelligence and robot foundation models. I have published over 80 papers in top-tier conferences and journals (including RSS, IJCAI, IROS, CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, EMNLP, MICCAI, TPAMI, TIP, TNNLS, TII, TIM, TMM, TCSVT, ISPRS), nearly 40 of them as first or corresponding author. These include 3 ESI Highly Cited Papers, 1 IEEE Transactions cover paper, and 8 oral presentations, and my publications have received over 2,000 citations on Google Scholar. I hold more than 10 invention patents, and my research has been deployed in real-world systems such as the YOLO series, as well as in products from Spatialtemporal AI, Galbot, Yijiahe, Huawei, and Momenta.
Students and interns I have mentored have gone on to receive top industry honors such as the Huawei “Genius Youth,” Ant Star, and Xiaomi Star programs, or have been admitted to leading universities worldwide, including CMU, Stanford, UCSD, Cambridge, HKU, Tsinghua, and Peking University.
I am a member of IEEE, the China Society of Image and Graphics, the China Society for Graphics, and the Chinese Society for Stereology. I serve as a reviewer for leading international journals and conferences, including IEEE TPAMI, TIP, TNNLS, TMM, TCSVT, TII, CVPR, NeurIPS, AAAI, and MICCAI. I also co-organize the Embodied Intelligence International Challenge at CVPR 2025 and CVPR 2026.