Nils Blank
I am a PhD student in the Intuitive Robots Lab (IRL) at the Karlsruhe Institute of Technology (KIT), Germany.
My research focuses on Imitation Learning and Foundation Models for Human-Robot Interaction. I am supervised by Rudolf Lioutikov.
I obtained my Master's degree in Computer Science at KIT.
During my studies, I interned at SAP SE and IONOS.
Email / Google Scholar / GitHub / LinkedIn
Research
My research focuses on Foundation Models and their applications in robotics. In particular, I explore how to deploy foundation models
robustly and reliably in challenging robotic scenarios. Furthermore, my research addresses goal-driven explainability and how foundation models can be leveraged for improved human-robot interaction.
SIR: Structured Image Representations for Explainable Robot Learning
Paul Mattes, Jan Schwab, Jens Oliver Bosch, Maximilian Xiling Li, Nils Blank, Minh-Trung Tang, Rudolf Lioutikov
CVPR 2025, Poster
Project Page / Code / arXiv
We introduce SIR, a novel approach for learning robot policies with explicit, interpretable structure. Instead of relying on opaque visual embeddings, our method constructs a fully connected scene graph from 2D or 3D image features and learns to sparsify it end-to-end, producing a minimal, task-relevant subgraph used for action generation. This design makes the policies intrinsically explainable. Experiments on RoboCasa show that our sparse graph policies outperform image-based baselines (19.5% vs. 14.81% success rate) and are significantly more robust to visual distractors. Furthermore, analyzing the learned subgraphs enables introspection, revealing dataset biases such as spurious correlations and positional biases.
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
Hongyi Zhou, Weiran Liao, Xi Huang, Yucheng Tang, Fabian Otto, Xiaogang Jia, Xinkai Jiang, Simon Hilber, Ge Li, Qian Wang, Ömer Erdinç Yağmurlu, Nils Blank, Moritz Reuss, Rudolf Lioutikov
NeurIPS 2025, Poster
Project Page / Code / arXiv
We present the B-spline Encoded Action Sequence Tokenizer (BEAST), a novel action tokenizer that encodes action sequences into compact discrete or continuous tokens using B-splines. In contrast to existing action tokenizers based on vector quantization or byte pair encoding, BEAST requires no separate tokenizer training and consistently produces tokens of uniform length, enabling fast action sequence generation via parallel decoding. Leveraging our B-spline formulation, BEAST inherently generates smooth trajectories without discontinuities between adjacent segments.
Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models
Nils Blank, Moritz Reuss, Marcel Rühle, Ömer Erdinç Yağmurlu, Fabian Wenzel, Oier Mees, Rudolf Lioutikov
CoRL 2024
Paper Link
We introduce a novel approach to automatically label uncurated, long-horizon robot teleoperation data at scale in a zero-shot manner, without any human intervention.
We use a combination of pre-trained vision-language foundation models to detect objects in a scene, propose possible tasks, and segment tasks from large datasets of unlabeled interaction data, and then train language-conditioned policies on the relabeled datasets.
Our initial experiments show that our method enables training language-conditioned policies on unlabeled and unstructured datasets that match policies trained with oracle human annotations.