Uljad Berdica

I am a 2022 Rhodes Scholar and PhD student in the CDT-AIMS program at the University of Oxford, focusing on Reinforcement Learning, World Models, Exploration and Agentic LLMs. I work on meaningful AI that learns from experience, under the supervision of Prof. Jakob Foerster and Prof. Perla Maiolino from the Oxford Robotics Institute.

I am also an SR at Google DeepMind, working on Game Theory, Reinforcement Learning and LLMs.

My most recent papers cover methods to make LLMs more effective and diverse, a unification of Offline RL algorithms that allows new ones to be discovered automatically, and bioimpedance measurement in Electronics. I am currently interested in Backprop-Free methods and automating research with AI. I previously interned at J.P. Morgan’s AI Research group.

I have served as a reviewer for IEEE Robotics journals, for conferences such as NeurIPS, ICML, ICLR, AAAI and IEEE RoboSoft, and for numerous workshops. Since moving out of my home country at 17 on a scholarship, I have lived in the USA, UAE, China, and France as part of my studies before moving to the UK for my PhD. I love doing stand-up comedy.

Education

PhD in AI and Machine Learning, 2022-2026

University of Oxford

BSc in Electrical Engineering, 2018-2022

New York University (NYU)

Research Interests

Reinforcement Learning (Offline and Online)

World Models

LLMs for Reasoning and Exploring

Control Theory

Robot Perception and Planning

News

Jun 07, 2026	Evolving Many Worlds accepted to ALife 2026 as Oral
May 15, 2026	Recognized as an ICML 2026 Top Reviewer
May 06, 2026	Two papers accepted to RLC 2026 in Montreal
May 01, 2026	Two papers accepted to ICML 2026 in Korea
Apr 20, 2026	Released the PBT-NCA framework for Evolving Many Worlds
Dec 01, 2025	Attended NeurIPS 2025 — Oral & Poster live

Selected Publications

Evolving Many Worlds: Towards Open-Ended Discovery in Petri Dish NCA via Population-Based Training

Uljad Berdica^*, Jakob Foerster, Frank Hutter, and 1 more author

Artificial Life Conference (ALife) - Main Track Oral, 2026

When Do We Need LLMs? A Diagnostic for Language-Driven Bandits

Uljad Berdica, Fernando Acero, Anton Ipsen, and 3 more authors

The Reinforcement Learning Conference (RLC) - Main Track, 2026

A Clean Slate for Offline Reinforcement Learning

Matthew T. Jackson^*, Uljad Berdica^*, Jarek Liesen^*, and 2 more authors

NeurIPS 2025 - Main Track Oral, 2025

Intent Factored Generation: Unleashing the Diversity in Your Language Model

Eltayeb Ahmed^*, Uljad Berdica^*, Martha Elliott, and 2 more authors

ICML 2025 - Exploration in AI Today Workshop, 2025

Asynchronous Quadrature-phase Undersampling Technique for Wide-frequency Impedance Measurement

Soon-Jae Kweon, Uljad Berdica, Hyunwoo Park, and 3 more authors

IEEE Transactions on Instrumentation and Measurement, 2025

If you wish, please reach out via Email or LinkedIn.