Autonomous robotic nanofabrication with reinforcement learning

DOI: 10.1126/sciadv.abb6987 (2020)

Executive Summary

This study presents the first demonstration of reinforcement learning (RL) applied to autonomous robotic nanofabrication, addressing the long-standing challenge of manipulating individual molecules without human intervention. The authors target a PTCDA (3,4,9,10-perylene-tetracarboxylic dianhydride) monolayer on an Ag(111) surface, with the RL agent tasked with autonomously removing single molecules using a scanning probe microscope (SPM), a textbook example of subtractive manufacturing at the nanoscale. The fundamental difficulty is threefold: the complete atomic-scale state of the environment is unobservable, the system is non-stationary due to unpredictable tip apex changes, and conventional approaches (human expertise or model-based simulation) fail at this scale.

The RL framework models the problem as a Markov Decision Process with a simplified 3D state space (Cartesian tip coordinates) and five discrete actions moving the tip in different directions. Two critical algorithmic innovations enable practical data efficiency: a model-based planning component (Dyna-style) that exploits the deterministic Cartesian state transitions to generate synthetic training experience, and a rupture avoidance mechanism using negative training temperature that propagates failure-state information far back through trajectories, enabling rapid avoidance of dangerous regions. Together, these modifications reduce agent failure rates from 70% to 11% in simulation and make real-world application feasible.

In physical experiments conducted at 5 K, the RL agent autonomously created 16 molecular vacancies in the PTCDA layer, each verified by STM imaging. Pre-trained agents (P-agents), initialized with weights from a previously successful run, outperformed randomly initialized agents (R-agents) by focusing exploration in the physically meaningful lower-left trajectory quadrant, which corresponds to peeling the molecule along its long axis. This is a universally valid policy that transfers across different tip configurations. The difficulty of the task scales inversely with tip-molecule bond strength, with weak tips requiring the agent to traverse a very narrow corridor in xy-space at critical heights.

The work demonstrates that RL can succeed in a real-world nanoscale robotic task characterized by partial observability, non-stationarity, and sparse feedback, conditions under which classical robotics approaches fail. The authors envision future extensions incorporating tunneling current and force gradient signals into the state representation, hybrid simulation-guided RL for more complex tasks, and integration with autonomous tip preparation. Ultimately, this approach opens a path toward the autonomous construction of arbitrary metastable supramolecular structures with functional properties inaccessible through self-assembly alone.
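The Dyna-style planning idea described in this summary can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the action displacements, the reward values, the learning rate, the discount factor, and all function names below are assumptions chosen for illustration. The sketch only shows how a deterministic tip-position model lets remembered outcomes be replayed as synthetic experience.

```python
import random
from collections import defaultdict

# Hypothetical action set: the source states only that there are five
# discrete tip moves. These grid displacements are an assumption.
ACTIONS = [(0, 0, 1), (1, 0, 1), (-1, 0, 1), (0, 1, 1), (0, -1, 1)]


def transition(state, action_index):
    """Deterministic position model: next position = position + displacement."""
    dx, dy, dz = ACTIONS[action_index]
    x, y, z = state
    return (x + dx, y + dy, z + dz)


def q_update(q, state, action_index, reward, next_state, done,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning backup (alpha and gamma are illustrative)."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action_index] += alpha * (target - q[state][action_index])


def dyna_planning(q, outcomes, n_updates, rng):
    """Replay remembered outcomes as synthetic experience.

    `outcomes` maps (state, action_index) to the (reward, done) pair seen
    in a real episode. The next state is not stored: it is recomputed from
    the deterministic position model.
    """
    keys = sorted(outcomes)
    for _ in range(n_updates):
        state, action_index = rng.choice(keys)
        reward, done = outcomes[(state, action_index)]
        next_state = transition(state, action_index)
        q_update(q, state, action_index, reward, next_state, done)


# Toy data: two remembered steps of one successful lift (assumed rewards).
q = defaultdict(lambda: [0.0] * len(ACTIONS))
outcomes = {
    ((0, 0, 0), 0): (0.0, False),  # intermediate step, no feedback
    ((0, 0, 1), 0): (1.0, True),   # final step, sparse success reward
}
dyna_planning(q, outcomes, n_updates=50, rng=random.Random(0))
```

In this toy run the success reward stored for the final step is propagated back to the starting state purely through replayed updates, without any further real episodes.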

Authors (6)

Philipp Leinen (First Author)

Peter Grünberg Institut (PGI-3), Forschungszentrum Jülich, 52425 Jülich, Germany; Jülich Aachen Research Alliance (JARA)-Fundamentals of Future Information Technology, 52425 Jülich, Germany; Experimentalphysik IV A, RWTH Aachen University, Otto-Blumenthal-Straße, 52074 Aachen, Germany

Malte Esders

Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany

Kristof T. Schütt

Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany

Abstract

The ability to handle single molecules as effectively as macroscopic building-blocks would enable the construction of complex supramolecular structures inaccessible to self-assembly. The fundamental challenges obstructing this goal are the uncontrolled variability and poor observability of atomic-scale conformations. Here, we present a strategy to work around both obstacles, and demonstrate autonomous robotic nanofabrication by manipulating single molecules. Our approach employs reinforcement learning (RL), which finds solution strategies even in the face of large uncertainty and sparse feedback. We demonstrate the potential of our RL approach by removing molecules autonomously with a scanning probe microscope from a supramolecular structure – an exemplary task of subtractive manufacturing at the nanoscale. Our RL agent reaches an excellent performance, enabling us to automate a task which previously had to be performed by a human. We anticipate that our work opens the way towards autonomous agents for the robotic construction of functional supramolecular structures with speed, precision and perseverance beyond our current capabilities.

Key Findings (24)

1. Reinforcement learning was demonstrated for the first time to automate a manipulation task at the nanoscale, specifically autonomous removal of PTCDA molecules from a self-assembled monolayer on Ag(111) using a scanning probe microscope.

2. The RL agent successfully performed subtractive nanofabrication, creating 16 vacancies in a PTCDA monolayer as demonstrated by STM imaging, without human intervention during the manipulation process.

3. The nanofabrication problem was formulated as a Markov Decision Process (MDP) with a 3-dimensional state space consisting only of the Cartesian coordinates (x, y, z) of the SPM tip apex, making the approach tractable despite the partial observability of the full atomic-scale environment.
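The MDP formulation in finding 3 can be made concrete with a toy environment. Everything below is a hypothetical stand-in, not the experimental setup: the class and action names, the integer grid units, the rupture rule, and the reward values are all assumptions. The point is only that the agent observes the tip coordinates and a sparse terminal reward, while whatever decides success or failure stays hidden inside the environment.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Five discrete tip moves; the displacement vectors are assumptions."""
    UP = (0, 0, 1)
    UP_PLUS_X = (1, 0, 1)
    UP_MINUS_X = (-1, 0, 1)
    UP_PLUS_Y = (0, 1, 1)
    UP_MINUS_Y = (0, -1, 1)


@dataclass(frozen=True)
class TipState:
    """Observed state: only the Cartesian tip coordinates (grid units)."""
    x: int
    y: int
    z: int


class ToyLiftEnv:
    """Toy lifting task with a hidden success/failure rule."""

    def __init__(self, target_height=3, max_lateral=2):
        self.target_height = target_height
        self.max_lateral = max_lateral
        self.state = TipState(0, 0, 0)

    def step(self, action):
        """Apply one move and return (state, reward, done)."""
        dx, dy, dz = action.value
        s = self.state
        self.state = TipState(s.x + dx, s.y + dy, s.z + dz)
        # Hypothetical rupture rule: drifting too far sideways breaks the
        # bond. The agent never observes this rule, only its outcome.
        ruptured = (abs(self.state.x) > self.max_lateral
                    or abs(self.state.y) > self.max_lateral)
        if ruptured:
            return self.state, -1.0, True   # failure, episode ends
        if self.state.z >= self.target_height:
            return self.state, 1.0, True    # molecule lifted, episode ends
        return self.state, 0.0, False       # no feedback on other steps


# Three straight upward moves reach the target height in this toy rule set.
env = ToyLiftEnv()
for _ in range(3):
    final_state, final_reward, final_done = env.step(Action.UP)
```

Keeping the observed state to three coordinates is what makes the position transition trivially predictable, even though the outcome of a trajectory is not.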

Discussion & Future Directions

The discussion analyzes the learning process by comparing randomly initialized agents (R-agents) with pre-trained agents (P-agents). P-agents perform better because they start from a transferable, universal policy: exploring the lower-left trajectory quadrant, which is consistent with the physical 'peeling' mechanism. The performance variability is linked to tip-dependent bond strength, with weaker tips requiring narrower successful trajectory corridors and thus more episodes. The authors identify limited observability as the most severe limitation of RL at the nanoscale, noting that partial observability and stochasticity increase the number of trials needed. Future directions include: (1) hybrid approaches combining atomistic simulation insight with RL for guided exploration; (2) incorporation of measurable quantities such as tunneling current and force gradient into the state representation for tasks with hysteretic behavior; and (3) combination of autonomous SPM-based nanofabrication with autonomous tip preparation methods. The authors conclude that autonomous robotic nanofabrication is viable and enables progress towards designing quantum matter beyond the constraints of crystal growth and self-assembly.

