Publications – xscave

Find our relevant publications below

Work Package 3: Differentiable Simulators

T3.1 - Multi-body dynamics and machine-terrain interaction simulation

Simulation of Heavy Vehicle Traversal on Deformable Terrain

Hortlund, E.

This study investigates how terrain deformability affects the performance of heavy forestry vehicles using simulation-based analysis. A Komatsu 895 forwarder was simulated in AGX Dynamics across three test scenarios: obstacle traversal, turning, and uneven terrain. Each scenario was evaluated on both rigid and deformable terrains with varying soil deformability, vehicle speed, and load. Results show that terrain deformability is the primary factor influencing vehicle behavior. Higher compression indices lead to increased fuel consumption, deeper rut formation, and less stable dynamic wheel-ground interactions.

Constraint-based terramechanics simulation for realtime and faster simulation

Ahlman B., Berglund T., Lundbäck M., Marklund H., Nordin P., Persson P., Rydman J., Wiberg V., and Servin M.

The work presents models and numerical methods for realtime and faster simulation of wheeled and tracked vehicles interacting with deformable terrain. Wheel–terrain interaction is formulated using kinematic constraints with forces and limits that represent wheel stresses and soil failure. When soil failure occurs, a three-dimensional soil displacement field is predicted and used to update soil distribution, local packing density, and the terrain surface heightmap. The constraint-based formulation allows stable coupling between terrain and vehicle multibody dynamics at large time steps. The model is tested by comparison with experiments and standard semi-empirical terramechanics models, and is applied to simulate a heavy forest machine traversing rough terrain to study the effects of deformable terrain on vehicle motion and dynamic load forces.

FORWARD: Dataset of a forwarder operating in rough terrain

Lundbäck M., Häggström C., Nyström M., Grönlund A., Fälldin A., Wallin E., and Servin M.

The FORWARD dataset comprises detailed sensor and log data from a Komatsu cut-to-length forwarder, collected during operations in challenging terrain as well as on gravel roads at two Swedish harvest sites. Sensor instrumentation includes RTK-GNSS for precise positioning, a 360-degree camera for visual documentation, operator seat vibration sensors, CAN-bus signal logging, and multiple IMUs mounted on the machine. The dataset provides time-stamped vehicle positions with centimeter-level accuracy and machine event logs sampled at 5 Hz, covering variables such as driving speed, fuel consumption, and crane activity. Additionally, high-resolution terrain data from helicopter-borne laser scanning (1500 points/m²) is included, along with StanForD production log files containing detailed event records for each work session.

Physics-informed data-driven modeling of rock motion dynamics in excavation using a high-fidelity simulator

Heravi, M., Molaei, A., & Ghabcheloo, R.

The paper presents a physics-informed, data-driven modeling approach to simulate rock motion dynamics during excavation processes. Using a high-fidelity numerical simulator to generate training data, the authors integrate physical laws with machine-learning models to accurately capture complex rock behavior, including collisions and interactions. The approach improves prediction accuracy, stability, and generalization compared to purely data-driven methods, while requiring less computational effort than full numerical simulations. Overall, the study demonstrates that combining physics constraints with data-driven models is an effective and efficient way to model excavation-induced rock dynamics for engineering applications.

T3.2 - A tiny differentiable vehicle-terrain physics engine

Bi-level trajectory optimization on uneven terrains with differentiable wheel-terrain interaction model

Manoharan A., Sharma A., Belsare H., Pal K., Krishna KM., Singh AK.

The paper introduces a bi-level trajectory optimization framework for wheeled robots navigating uneven terrain, using a differentiable wheel–terrain interaction model. The inner optimization level predicts the robot’s full 6-DoF motion by solving a nonlinear least-squares problem that captures stability and terrain effects, while the outer level optimizes the trajectory using gradients from this model. The results show that the method generates smooth, stable, and physically feasible trajectories that closely match high-fidelity physics simulations, while remaining computationally efficient for planning.

T3.4 - Distilling physics simulation into differentiable neural simulator

FusionForce: End-to-End Differentiable Neural-Symbolic Layer for Trajectory Prediction

Ruslan Agishev and Karel Zimmermann

The paper proposes FusionForce, a hybrid model that combines neural learning with symbolic physics reasoning in a single end-to-end differentiable architecture for trajectory prediction. It integrates a learnable component that predicts interaction forces (e.g., between robot and terrain) with a neural-symbolic physics layer that enforces classical mechanics laws during prediction, enabling the model to better generalize and reduce errors compared to purely data-driven approaches. The approach works with inputs such as camera images or lidar and offers fast trajectory prediction suitable for applications like model predictive control, learning, and SLAM, while mitigating sensitivity to out-of-distribution scenarios.

Transactions on Robotics

Martin Pecka, Karel Zimmermann, Bedrich Himmel, Valentýn Číhala, Ruslan Agishev

Under review.

T3.5 - Tools for minimizing the sim-to-real gap

Joint parameter and state estimation for regularized time-discrete multibody dynamics

Marklund H, Larson MG, and Servin M.

The paper develops a method for joint offline estimation of states and parameters in time-discrete multibody dynamic systems that include regularized and frictional kinematic constraints. Because some degrees of freedom are unobserved due to these constraints, the authors pose a nonlinear least-squares problem that solves simultaneously for the system’s state trajectory and the unknown parameters by minimizing inverse dynamics and observation errors. The problem is solved with a Levenberg–Marquardt algorithm, using derivatives from automatic differentiation together with custom differentiation rules to handle dry friction. The method is tested on synthetic data and on a real Furuta pendulum, where it shows fast convergence and good agreement with measured data across different method settings, although very stiff constraints can cause numerical difficulties.
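As a rough illustration of the Levenberg–Marquardt idea mentioned in this abstract, the sketch below jointly estimates an initial state and a drag parameter of a toy time-discrete system from observations. It is not the authors' method: the one-dimensional drag model, the function names (`simulate`, `residuals`, `jacobian`, `levenberg_marquardt`), the forward-shooting formulation, and the finite-difference Jacobian (in place of automatic differentiation) are all assumptions made for this example.

```python
def simulate(v0, c, h, n):
    """Roll out v_{k+1} = (1 - h*c) * v_k for n samples (toy linear-drag model)."""
    traj = [v0]
    for _ in range(n - 1):
        traj.append((1.0 - h * c) * traj[-1])
    return traj


def residuals(theta, obs, h):
    """Observation errors between the simulated and the observed trajectory."""
    v0, c = theta
    return [v - y for v, y in zip(simulate(v0, c, h, len(obs)), obs)]


def jacobian(theta, obs, h, eps=1e-6):
    """Forward finite-difference Jacobian; cols[i][k] approximates d r_k / d theta_i."""
    r0 = residuals(theta, obs, h)
    cols = []
    for i in range(len(theta)):
        bumped = list(theta)
        bumped[i] += eps
        ri = residuals(bumped, obs, h)
        cols.append([(a - b) / eps for a, b in zip(ri, r0)])
    return r0, cols


def levenberg_marquardt(theta, obs, h, lam=1e-2, iters=50):
    """Damped Gauss-Newton iterations for the two unknowns (v0, c)."""
    theta = list(theta)
    for _ in range(iters):
        r, (j0, j1) = jacobian(theta, obs, h)
        cost = sum(x * x for x in r)
        # Damped normal equations: (J^T J + lam * I) step = -J^T r
        a = sum(x * x for x in j0) + lam
        b = sum(x * y for x, y in zip(j0, j1))
        d = sum(y * y for y in j1) + lam
        g0 = -sum(x * y for x, y in zip(j0, r))
        g1 = -sum(x * y for x, y in zip(j1, r))
        det = a * d - b * b
        step = [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]
        trial = [theta[0] + step[0], theta[1] + step[1]]
        trial_cost = sum(x * x for x in residuals(trial, obs, h))
        if trial_cost < cost:
            theta, lam = trial, lam * 0.5   # accept the step, trust the model more
        else:
            lam *= 10.0                     # reject the step, increase damping
    return theta


# Noise-free synthetic observations generated with v0 = 2.0 and c = 0.5
observations = simulate(2.0, 0.5, 0.1, 40)
estimate = levenberg_marquardt([1.0, 0.1], observations, 0.1)
```

With noise-free data, `estimate` should end up close to the values used to generate the observations. The damping factor `lam` is what distinguishes Levenberg–Marquardt from plain Gauss-Newton: it shrinks the step whenever a trial step fails to reduce the cost.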

Work Package 4: Differentiable Structured Priors Driven Learning

T4.1 - Neural Model-Based Planners and Reinforcement Learning

Extracting Visual Plans from Unlabeled Videos via Symbolic Guidance

Yang, W., Tikna, A., Zhao, Y., Zhang, Y., Palopoli, L., Roveri, M., & Pajarinen, J.

The paper introduces Vis2Plan, a visual planning framework that learns to extract high-level symbolic plans from raw, unlabeled video play data using vision foundation models to identify task-relevant symbols. By constructing a symbolic transition graph from the extracted symbols, the method performs symbolic planning to generate a sequence of intermediate visual subgoals that are physically consistent and reachable, assembling them into a visual plan for a goal-conditioned low-level controller. This symbolic guidance enables efficient and interpretable long-horizon planning that outperforms diffusion-based visual planners in success rate and speed, delivering more reliable plans while avoiding hallucinations common in generative models.

T4.2 - Safe Imitation and Reinforcement Learning

Manual, Semi or Fully Autonomous Flipper Control? A Framework for Fair Comparison

Valentýn Číhala, Martin Pecka, Tomáš Svoboda, and Karel Zimmermann

The paper presents a comprehensive study comparing manual, semi-autonomous, and fully autonomous control methods for flipper-equipped skid-steer robots traversing uneven terrain. To enable a fair evaluation, the authors reimplemented several existing approaches and introduced a novel semi-autonomous control policy that provides a promising trade-off between traversal quality and operator effort. New metrics are proposed to quantify cognitive load and traversal quality, and results are visualized in a 2D Quality-Load space. Surprisingly, the study finds that fully manual control of all six degrees of freedom can still be highly effective when performed by an experienced operator, while the semi-autonomous policy bridges the gap between manual and autonomous methods, offering both strong performance and reduced cognitive load.

Physics-informed 3D scene understanding and its role in planning and control of mobile equipment

Fälldin A., Johansson A., Lundberg P., Bodin K., Lindmark D., Wiberg V., and Servin M.

The paper discusses the importance of physics-informed 3D scene understanding for robust planning and control of mobile equipment in unstructured, dynamic environments such as construction and forestry. It proposes representing the world state as a 3D scene with semantic and physical properties that support forward simulation, motion planning (including inverse dynamics or learned models), and synthetic sensor data generation for inference. By estimating physical states and material properties of objects and terrain, the approach enables better predictions of interactions between machines and their environment, improving autonomy and coordination in tasks like handling logs, rocks, and deformable terrain.

Combining foundation models and numerical solvers for physics-informed motion control

Anna Johansson

The thesis explores how foundation models trained on large amounts of robot trajectory data can be enhanced with numerical physics solvers to improve safety in motion control. Because foundation models alone lack guarantees about safe robot actions, the author proposes a Sparse Evaluation scheme that screens proposed actions with a numerical physics solver before they are executed, helping to detect potentially unsafe trajectories. A small proof of concept shows that this hybrid method can identify whether actions will behave as expected under physical dynamics, and the thesis also surveys existing foundation-model-based robot control approaches.

Reality to Simulation: A Scene Understanding Approach to 3D Log Pile Scene Reconstruction

Philiph Lundberg

The thesis presents a pipeline that reconstructs physically accurate 3D scenes of log piles from RGB-D sensor data, bridging perception and physics-based simulation. It uses a zero-shot 6D pose estimator (SAM-6D) to detect and estimate the poses of individual logs from images, infers underlying terrain where occlusions occur by interpolation, and then refines the scene through a heightfield optimization process driven by a physics simulator (AGX Dynamics) to ensure stability and reduce errors between the predicted and simulated configurations. Evaluated on synthetic and more complex scenes, the optimized reconstructions achieve low positional and angular errors, demonstrating the pipeline’s ability to produce realistic 3D reconstructions suitable for simulation tasks.

T4.3 - Scalable Algorithms for Control Barrier Functions

Minimally Conservative Controlled-Invariant Set Synthesis Using Control Barrier Certificates

Toulkani, Naeim Ebrahimi, and Reza Ghabcheloo

The work addresses the problem of finding controlled-invariant safe sets for nonlinear control-affine systems with state and control constraints in safety-critical applications. Traditional approaches often produce overly conservative sets; this paper instead formulates Control Barrier Certificates (CBCs) as Sum-of-Squares (SOS) constraints that can be solved via SOS programming. A key contribution is an iterative algorithm that progressively enlarges the safe set by maximizing its boundary expansion at each step, eliminating the need for predefined shapes or class-functions. Theoretical results guarantee that the safe set grows with each iteration, and numerical simulations in 2D and 3D systems show that the method yields larger controlled-invariant sets than state-of-the-art techniques based on Control Barrier Functions.

Work Package 5

T5.2 - Risk-aware decision making

Trajectory Optimization Under Stochastic Dynamics Leveraging Maximum Mean Discrepancy

Sharma B., Singh AK.

The paper addresses the problem of risk-aware trajectory optimization for systems with stochastic dynamics, where estimating collision risk via many simulated rollouts is computationally expensive. To improve sample efficiency, the authors propose an approach that distills statistical information from a large set of rollouts into a much smaller subset and introduces a novel risk surrogate based on Maximum Mean Discrepancy (MMD) using distribution embeddings in a Reproducing Kernel Hilbert Space. Benchmarking shows that this MMD-based method yields safer trajectories in low-sample regimes and performs better than baseline techniques that use Conditional Value-at-Risk for risk estimation.
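For readers unfamiliar with Maximum Mean Discrepancy, the following minimal sketch computes the standard biased estimate of the squared MMD between two sample sets using a Gaussian (RBF) kernel. It illustrates only the basic quantity, not the risk surrogate or the rollout-reduction procedure of the paper; the function names and the `bandwidth` parameter are assumptions chosen for this example.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Two small 2D sample sets that are clearly separated
samples_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
samples_b = [(3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]
discrepancy = mmd_squared(samples_a, samples_b)
```

The estimate is zero when both sets are identical and grows as the two empirical distributions move apart, which is why it can serve as a measure of how well a small subset of rollouts represents a larger one.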

T5.3 - Online Adaptation of RL Algorithms

Manipulate-to-Navigate: Reinforcement Learning with Visual Affordances and Manipulability Priors

Zhang, Y., Pajarinen, J.

The paper tackles the challenge of enabling mobile robots to interact with their environment to clear obstacles before navigating, a situation in which navigation and manipulation methods applied separately often fail. The authors propose a reinforcement learning framework that uses visual affordance maps to identify promising manipulation actions and manipulability priors to bias the robot toward body configurations that make manipulation easier, reducing unnecessary exploration. They design two simulation tasks using a Boston Dynamics Spot robot, one focusing on choosing effective hand positions and the other on moving a door to open a path, and show that the learned policies allow the robot to successfully perform manipulation followed by navigation, including transfer of the learned policy to a real robot in at least one task.

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the HADEA. Neither the European Union nor the granting authority can be held responsible for them.