Jekyll2023-10-31T19:30:24+00:00https://andreascorsoglio.com/feed.xmlAndrea ScorsoglioAndrea Scorsoglio in a nutshell.Andrea ScorsoglioOrbit determination via physics informed neural networks2023-08-08T00:00:00+00:002023-08-08T00:00:00+00:00https://andreascorsoglio.com/physics%20informed%20neural%20networks/PIOD<figure class=" ">
<figcaption>Estimated trajectories
</figcaption>
</figure>
<div style="text-align: justify;">
<font size="3">
This paper introduces a method for solving orbit determination problems named Physics Informed Orbit Determination (PIOD). We use a particular kind of single-layer, feed-forward neural network with random input weights and biases, called an Extreme Learning Machine (ELM), to estimate the spacecraft’s state. The least-squares estimate is used as the baseline for the loss function, to which a regularizing term based on the differential equations modeling the dynamics of the problem is added. This ensures that the learned relationship between input and output is compliant with the physics of the problem while also fitting the observation data. The method works with range/range-rate or angular observations, in either Keplerian or non-Keplerian dynamics. The method is tested on synthetically generated data, with and without perturbations, and the results are comparable with the batch least-squares solution.
However, the particular structure of PIOD, based on ELMs, allows it to provide an accurate estimate without needing an initial guess. The only parameters to initialize are the input weights and biases, which, per ELM theory, are initialized at random. This essentially removes the dependence of the solution on the initial guess, in stark contrast with batch least squares, whose convergence is sensitive to the initial guess, especially when estimating the state of satellites in cislunar space, a regime that is becoming increasingly critical for the future of space exploration. Overall, this paper shows that physics informed neural networks are a powerful tool for OD, performing on par with or better than existing least-squares techniques without requiring any initial guess. Moreover, the flexibility of PINNs allows the method to be extended with a more refined parameter-selection algorithm, as well as with functionality to account for maneuvering targets and perturbations.
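As a concrete (toy) illustration of the idea, the sketch below fits an ELM trajectory estimate to noisy observations while penalizing the residual of a simple dynamical model, with the output weights obtained in a single least-squares solve. The harmonic-oscillator dynamics, network sizes, and regularization weight are illustrative placeholders, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the orbital dynamics: x'' = -omega^2 x (harmonic oscillator).
omega = 2.0

# Noisy "observations" of the true trajectory x(t) = cos(omega t).
t_obs = np.linspace(0.0, 3.0, 25)
x_obs = np.cos(omega * t_obs) + 0.01 * rng.standard_normal(t_obs.size)

# ELM hidden layer: input weights and biases are random and never trained.
n_hidden = 60
w = rng.uniform(-3.0, 3.0, n_hidden)
b = rng.uniform(-3.0, 3.0, n_hidden)

def features(t):
    """Hidden activations and their second time derivative (tanh units)."""
    h = np.tanh(np.outer(t, w) + b)   # (n_t, n_hidden)
    d1 = (1.0 - h**2) * w             # dh/dt
    d2 = -2.0 * h * d1 * w            # d2h/dt2
    return h, d2

# Data block: fit the observations.
H_obs, _ = features(t_obs)

# Physics block: dynamics residual x'' + omega^2 x = 0 at collocation points.
t_col = np.linspace(0.0, 3.0, 80)
H_col, H_col_dd = features(t_col)
lam = 1.0  # weight of the physics regularizer
A = np.vstack([H_obs, lam * (H_col_dd + omega**2 * H_col)])
y = np.concatenate([x_obs, np.zeros(t_col.size)])

# Single-step training: least squares for the output weights, no initial guess.
beta = np.linalg.lstsq(A, y, rcond=None)[0]

# The learned trajectory can be evaluated anywhere without propagation.
t_test = np.linspace(0.0, 3.0, 200)
x_hat = features(t_test)[0] @ beta
err = np.max(np.abs(x_hat - np.cos(omega * t_test)))
print(f"max trajectory error: {err:.2e}")
```

The only free choice is the random hidden layer; everything trainable is solved in closed form, which is what removes the initial-guess dependence described above.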
<!-- include figure -->
<figure class="">
<img src="/assets/images/Journal/PIOD/PIOD_range.webp" alt="Orbit determination via physics informed neural networks" /><figcaption>
PIOD with range/range-rate observations
</figcaption></figure>
<figure class="">
<img src="/assets/images/Journal/PIOD/PIOD_angles.webp" alt="Orbit determination via physics informed neural networks" /><figcaption>
PIOD with angle-only observations
</figcaption></figure>
Reference:
Scorsoglio, A., Ghilardi, L., & Furfaro, R. (2023). A Physic-Informed Neural Network Approach to Orbit Determination. The Journal of the Astronautical Sciences, 70(4), 1-30. <a href="https://doi.org/10.1007/s40295-023-00392-w">DOI</a>
</font>
</div>VisualEnv: visual Gym environments with Blender2021-12-01T00:00:00+00:002021-12-01T00:00:00+00:00https://andreascorsoglio.com/computer%20vision/visualenv<font size="3">
<div style="text-align: justify;">VisualEnv is a tool for creating visual environments for reinforcement learning. It integrates an open-source modelling and rendering package, Blender, with the python module widely used to build simulation environments, OpenAI Gym. VisualEnv allows the user to create custom environments with photorealistic rendering capabilities and full integration with python. The framework is described and tested on a series of example problems that showcase its features for training reinforcement learning agents.</div>
<p><br /></p>
<p float="center">
<img src="/assets/images/Journal/VisualEnv/test_Vcart.gif" width="220" />
<img src="/assets/images/Journal/VisualEnv/test_Hover2D.gif" width="220" />
<img src="/assets/images/Journal/VisualEnv/test_Goal.gif" width="220" />
</p>
<p><br /></p>
<div style="text-align: justify;">The tool is built around two open-source pieces of software: Blender and OpenAI Gym. Blender is a modelling, animation, and rendering package typically used in production-quality computer graphics workflows. It offers a wide range of tools for modelling, sculpting, shading, texturing, lighting, and rendering, and it interfaces easily with python through an API. OpenAI Gym, on the other hand, is a powerful environment builder written in python and widely used to train RL agents. VisualEnv harnesses both to create a standalone package for generating custom visual environments in python, which can then be used to train RL algorithms or perform other real-time simulation tasks. The user has control over every aspect of the rendering and animation of the environment: the environment and the actors can be created using a plethora of photorealistic shaders and materials, while the environment dynamics are controlled by the gym environment, which interfaces with Blender through the API, moving and modifying the actors in the scene. The rendered observations can then be used for a variety of applications. The power of the framework resides in its ease of implementation and its suitability for online applications.</div>
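A minimal sketch of the pattern the framework follows, with dynamics stepped in python behind a Gym-style interface and rendering delegated to Blender. The `CartEnv` class and the commented `bpy` calls are illustrative, not VisualEnv's actual API.

```python
class CartEnv:
    """Gym-style environment in the VisualEnv pattern: python owns the
    dynamics; Blender (via its bpy API) only moves objects in the scene
    and renders the observation.  Names here are illustrative."""

    def __init__(self, dt=0.05):
        self.dt = dt
        self.x = 0.0
        self.v = 0.0

    def reset(self):
        self.x, self.v = 0.0, 0.0
        return self._observe()

    def step(self, action):
        # Simple double-integrator dynamics advanced in pure python.
        self.v += float(action) * self.dt
        self.x += self.v * self.dt
        reward = -abs(self.x - 1.0)          # drive the cart to x = 1
        done = abs(self.x - 1.0) < 1e-2
        return self._observe(), reward, done, {}

    def _observe(self):
        # Inside Blender this is where the scene would be updated and
        # rendered, e.g.:
        #   bpy.data.objects["cart"].location.x = self.x
        #   bpy.ops.render.render(write_still=True)
        # Outside Blender we fall back to the raw state.
        return (self.x, self.v)

env = CartEnv()
obs = env.reset()
for _ in range(200):
    x, v = obs
    action = 2.0 * (1.0 - x) - 2.0 * v       # simple PD controller
    obs, reward, done, info = env.step(action)
    if done:
        break
print("final x:", round(obs[0], 3))
```

An RL agent would replace the hand-written PD controller, consuming the rendered frames instead of the raw state.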
<p><br /></p>
Reference:
Scorsoglio, A., & Furfaro, R. (2021). VisualEnv: visual Gym environments with Blender. arXiv preprint arXiv:2111.08096. <a href="https://arxiv.org/pdf/2111.08096.pdf">PDF</a>
</font>Orbit determination via physics informed neural networks in cislunar environment2021-09-12T00:00:00+00:002021-09-12T00:00:00+00:00https://andreascorsoglio.com/physics%20informed%20neural%20networks/PINN-OD-cislunar<figure class=" ">
<a href="/assets/images/Conf/BigSky/halo.jpg">
<img src="/assets/images/Conf/BigSky/halo.jpg" alt="placeholder image 1" />
</a>
<figcaption>Estimated trajectories
</figcaption>
</figure>
<div style="text-align: justify;">
<font size="3">
The approach presented in this paper uses Physics Informed Neural Networks (PINNs) to estimate the state of a spacecraft in cislunar space given angle measurements. The novel concept introduced by PINNs is the ability to solve forward and inverse problems governed by parametric Differential Equations (DEs). This is done by approximating the solution of the DE with a neural network trained by embedding the dynamical model into the NN loss function, which acts as a regularizer that penalizes the network's training when the physics of the problem is violated. PINNs have been successfully used in a variety of works, spanning from computational gas dynamics to spacecraft guidance and control. When used to solve the OD problem, PINNs keep the solution physically sound while avoiding the need for proper initialization. The method presented can use angle measurements (e.g., right ascension, declination) and knowledge of the dynamical system to produce an accurate estimate of the state of the system. This is particularly useful for objects moving in cislunar space, where the CRTBP framework governs the dynamics. Moreover, the method can estimate the entire trajectory without propagation steps, which are computationally expensive and prone to errors.
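For reference, the dynamics entering the loss in the cislunar case are the CRTBP equations of motion in the rotating frame. A minimal sketch of the residual the regularizer penalizes; the mass parameter and function names are illustrative:

```python
import numpy as np

MU = 0.01215  # Earth-Moon mass parameter (approximate)

def crtbp_accel(pos, vel, mu=MU):
    """Acceleration in the CRTBP rotating frame (nondimensional units)."""
    x, y, z = pos
    vx, vy, _ = vel
    r1 = np.sqrt((x + mu)**2 + y**2 + z**2)      # distance to primary
    r2 = np.sqrt((x - 1 + mu)**2 + y**2 + z**2)  # distance to secondary
    ax = 2*vy + x - (1 - mu)*(x + mu)/r1**3 - mu*(x - 1 + mu)/r2**3
    ay = -2*vx + y - (1 - mu)*y/r1**3 - mu*y/r2**3
    az = -(1 - mu)*z/r1**3 - mu*z/r2**3
    return np.array([ax, ay, az])

def physics_residual(pos, vel, acc):
    """The PINN regularizer penalizes || x_ddot - f(x, x_dot) || along the
    network's estimated trajectory; shown here for a single sample."""
    return np.linalg.norm(acc - crtbp_accel(pos, vel))

# Sanity check: the L4 equilibrium, at rest, has zero dynamics residual.
L4 = np.array([0.5 - MU, np.sqrt(3)/2, 0.0])
res = physics_residual(L4, np.zeros(3), np.zeros(3))
print(f"residual at L4: {res:.2e}")
```

In training, this residual is evaluated at collocation points along the candidate trajectory and added to the data-fitting term of the loss.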
Reference:
Scorsoglio, A., Ghilardi, L., & Furfaro, R. (2023). A Physic-Informed Neural Network Approach to Orbit Determination. The Journal of the Astronautical Sciences, 70(4), 1-30. <a href="https://doi.org/10.1007/s40295-023-00392-w">DOI</a>
</font>
</div>
Image-Based Deep Reinforcement Meta-Learning for Autonomous Lunar Landing2021-07-28T00:00:00+00:002021-07-28T00:00:00+00:00https://andreascorsoglio.com/meta-reinforcement%20learning/Image-based-landing-paper<figure class=" ">
<a href="/assets/images/Journal/ImageBased/results.png">
<img src="/assets/images/Journal/ImageBased/results.png" alt="placeholder image 1" />
</a>
</figure>
<div style="text-align: justify;">
<font size="3">
Future exploration and human missions on large planetary bodies (e.g., Moon, Mars) will require advanced guidance, navigation, and control algorithms for the powered descent phase, which must be capable of unprecedented levels of autonomy. The advent of machine learning, and specifically reinforcement learning, has enabled new possibilities for closed-loop autonomous guidance and navigation. In this paper, image-based reinforcement meta-learning is applied to solve the lunar pinpoint powered descent and landing task with uncertain dynamic parameters and actuator failure. The agent, a deep neural network, takes real-time images and ranging observations acquired during the descent and maps them directly to a thrust command (i.e., a sensor-to-action policy). Training and validation of the algorithm and Monte Carlo simulations show that the resulting closed-loop guidance policy reaches errors on the order of meters in different scenarios, even when the environment is partially observed and the state of the spacecraft is not fully known.
Here the goal is to employ an RML approach to devise a deep network that maps, in a closed-loop fashion, sequences of images and radar altimetry data directly into thrust. The latter poses a completely new challenge: since the network input comprises sequences of 2D images as well as ranging observations, the closed-loop NN policy becomes much larger in parameter space. Importantly, a novel integrated environment for photo-realistic rendering, which interfaces in real time with the learning algorithm, is developed for fast training and validation. Specifically, we created a simulator that integrates dynamics and sensor-data acquisition seamlessly in a python environment and generates accurate images using lunar digital terrain models (DTMs) and a physically-based rendering engine. We leverage the python-based Blender platform together with dynamical models to create a powered descent and landing simulation environment that accounts for both sensing and dynamics.
The proposed RML policy integrates guidance and navigation in a single compact system capable of mapping sequences of observations to thrust commands. This is carried out adaptively within the defined distribution boundaries. Specifically, some parameters of the environment, such as the gravitational acceleration and the initial spacecraft mass, are randomly sampled from given distributions, which allows the meta-learner to learn over a wide distribution of instances of a stochastic environment. Moreover, random actuator failures are introduced during training, making the algorithm robust to such events as well. This approach yields a complete integration of guidance and navigation in an image-based closed-loop system that is robust to perturbations and un-modeled dynamics.
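The domain-randomization idea can be sketched in a few lines; the distributions and failure model below are illustrative placeholders, not the paper's actual values:

```python
import random

random.seed(0)

def sample_episode_params():
    """One randomized instance of the environment, in the spirit of the
    paper's meta-learning setup.  All ranges are illustrative."""
    return {
        # Gravity and initial mass drawn around nominal lunar values.
        "gravity": random.uniform(1.57, 1.67),           # m/s^2
        "initial_mass": random.uniform(1800.0, 2200.0),  # kg
        # With some probability one thruster is degraded for the episode.
        "failed_thruster": random.randrange(4) if random.random() < 0.2 else None,
        "thrust_scale_on_failure": random.uniform(0.0, 0.5),
    }

# Each training episode sees a different instance of the stochastic
# environment, which forces the policy to adapt rather than memorize.
episodes = [sample_episode_params() for _ in range(1000)]
failures = sum(p["failed_thruster"] is not None for p in episodes)
print(f"episodes with an actuator failure: {failures}/1000")
```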
<p><br /></p>
Reference:
Scorsoglio, A., D’Ambrosio, A., Ghilardi, L., Gaudet, B., Curti, F., & Furfaro, R. (2021). Image-Based Deep Reinforcement Meta-Learning for Autonomous Lunar Landing. Journal of Spacecraft and Rockets, 1-13. <a href="https://doi.org/10.2514/1.a35072">DOI</a>
</font>
</div>Orbit determination via physics informed neural networks2021-01-05T00:00:00+00:002021-01-05T00:00:00+00:00https://andreascorsoglio.com/physics%20informed%20neural%20networks/PINN-OD<figure class="third ">
<a href="/assets/images/Conf/Charlotte/LEO_traj.png">
<img src="/assets/images/Conf/Charlotte/LEO_traj.png" alt="placeholder image 1" />
</a>
<a href="/assets/images/Conf/Charlotte/GEO_traj.png">
<img src="/assets/images/Conf/Charlotte/GEO_traj.png" alt="placeholder image 2" />
</a>
<a href="/assets/images/Conf/Charlotte/MOL_traj.png">
<img src="/assets/images/Conf/Charlotte/MOL_traj.png" alt="placeholder image 3" />
</a>
<figcaption>Estimated trajectories
</figcaption>
</figure>
<div style="text-align: justify;">
<font size="3">
In this paper, a new method for solving orbit determination problems is introduced, named Physics Informed Least Squares (PILS). We use a particular kind of single-layer feed-forward neural network called an Extreme Learning Machine to enable higher accuracy and flexibility than classical least squares. The least-squares estimate is used as the baseline for the loss function, to which a regularizing term based on the differential equations modelling the dynamics of the problem is then added. This ensures that the learned relationship between input and output is compliant with the physics of the problem. The results are comparable with or better than the batch least-squares solution, with the advantage of not requiring an initial guess and being able to solve for the complete trajectory without any integration.
The goal of this work is to build upon the foundation of ML applied to the OD problem, using the recently developed Physics-Informed Extreme Learning Machines (PIELM) to regularize the learning algorithm and improve the dynamical soundness of the estimated solution. PIELMs are particular neural networks developed for solving forward and inverse problems governed by parametric Differential Equations (DEs). The training of these networks, as in any physics-informed method, is regulated by the physics of the problem. In particular, the latent solution of the parametric DE is approximated via a Neural Network (NN); the DE that models the physics of the problem is then embedded, in its implicit form, into the NN loss function. Thus, the DE drives the training of the network, acting as a regularizer that penalizes the network's training when the physics of the problem is violated. Usually these NNs are trained via gradient-based methods; when the method employs Extreme Learning Machines (ELMs), as in this case, the training can be carried out in one single step using minimum-norm least squares. With this work we harness the power and ease of implementation of neural networks to solve a set of OD problems based on least squares. This leads to an algorithm with performance similar to batch least squares, but with some advantages: specifically, it can estimate the complete state evolution without any integration step and without the need for an initial guess.
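The one-step training mentioned above boils down to a single minimum-norm least-squares solve for the output weights; a toy sketch, with an illustrative regression target and network size:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression target standing in for an observed quantity along the orbit.
t = np.linspace(0, 2 * np.pi, 40)
y = np.sin(t)

# Random hidden layer (never trained), with more units than samples.
n_hidden = 100
W = rng.uniform(-2, 2, n_hidden)
b = rng.uniform(-2, 2, n_hidden)
H = np.tanh(np.outer(t, W) + b)

# The entire "training" is one minimum-norm least-squares solve:
# no gradients, no iterations, no initial guess for the output weights.
beta = np.linalg.pinv(H) @ y
rms = np.sqrt(np.mean((H @ beta - y) ** 2))
print(f"fit rms: {rms:.2e}")
```

Contrast this with gradient-based PINN training, which is iterative and sensitive to initialization; here the pseudoinverse picks the minimum-norm solution among all exact fits.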
</font>
</div>
Safe Lunar landing via images: A Reinforcement Meta-Learning application to autonomous hazard avoidance and landing2020-08-12T00:00:00+00:002020-08-12T00:00:00+00:00https://andreascorsoglio.com/reinforcement%20learning/safe-lunar-landing<figure class="third ">
<a href="/assets/images/Conf/LakeTahoe/lake_tahoe_real.gif">
<img src="/assets/images/Conf/LakeTahoe/lake_tahoe_real.gif" alt="placeholder image 1" />
</a>
<a href="/assets/images/Conf/LakeTahoe/lake_tahoe_mask.gif">
<img src="/assets/images/Conf/LakeTahoe/lake_tahoe_mask.gif" alt="placeholder image 2" />
</a>
<a href="/assets/images/Conf/LakeTahoe/lake_tahoe_hazard.gif">
<img src="/assets/images/Conf/LakeTahoe/lake_tahoe_hazard.gif" alt="placeholder image 3" />
</a>
<figcaption>Ground view, segmentation mask and hazard map.
</figcaption>
</figure>
<font size="3">
<div style="text-align: justify;">
In this paper, we propose a new approach based on deep learning that integrates guidance and navigation functions, providing a complete solution to the lunar landing problem that couples image-based navigation with intelligent guidance. We exploit the latest advancements in computer vision and Convolutional Neural Networks (CNNs) to implement a hazard detection and avoidance algorithm that detects safe landing sites autonomously. Notably, the approach we propose is entirely based on machine learning algorithms (semantic segmentation), whereas the procedure used for the Chang'e-3 mission, while also based on optical images, performed detection via more "classical" edge-detection algorithms. The CNN for hazard detection and avoidance is trained in a supervised fashion and is then embedded in the guidance framework. To do this, we designed a simulation environment that integrates the dynamics of the system and simulates image acquisition from onboard cameras. This is achieved by interfacing the simulator in Python with a ray tracer (i.e., Blender) that generates accurate images using lunar Digital Terrain Models (DTMs) and a physically-based rendering engine. The control policy is trained to align the boresight direction of the ground-facing camera with a target on the ground using reinforcement learning, specifically an actor-critic algorithm called Proximal Policy Optimization (PPO). The hazard detection and avoidance subroutine is then used at test time to assign a safe area on the ground as a target for the policy, in order to perform a targeted soft landing.
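The hazard-map step can be illustrated with a toy stand-in: given a binary segmentation mask, score each pixel by the hazard density of its neighborhood and target the safest one. The windowing scheme below is a simplification, not the paper's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)

# Binary segmentation mask: 1 = hazardous pixel (rock/crater), 0 = safe ground.
# In the paper this comes from a CNN; here it is random for illustration.
mask = (rng.random((64, 64)) < 0.15).astype(float)

def hazard_map(mask, radius=3):
    """Fraction of hazardous pixels in a (2r+1)x(2r+1) window around each
    pixel -- a crude stand-in for the learned hazard map."""
    k = 2 * radius + 1
    padded = np.pad(mask, radius, mode="edge")
    out = np.zeros_like(mask)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

hmap = hazard_map(mask)

# The guidance target is the pixel whose neighborhood is safest.
target = np.unravel_index(np.argmin(hmap), hmap.shape)
print("safest landing pixel:", target, "hazard score:", round(float(hmap[target]), 3))
```

In the full pipeline the selected pixel is back-projected to a ground location, which then becomes the aim point for the PPO-trained policy.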
<p><br /></p>
Reference:
Scorsoglio, A., D’Ambrosio, A., Ghilardi, L., Furfaro, R., Gaudet, B., Linares, R., & Curti, F. (2020, August). Safe Lunar landing via images: A Reinforcement Meta-Learning application to autonomous hazard avoidance and landing. In Proceedings of the 2020 AAS/AIAA Astrodynamics Specialist Conference, Virtual (pp. 9-12).
<a href="https://www.researchgate.net/profile/Andrea-Scorsoglio/publication/343650361_Safe_lunar_landing_via_images_a_reinforcement_meta-learning_application_to_autonomous_hazard_avoidance_and_landing/links/60b94f15a6fdcc22ead3c19b/Safe-lunar-landing-via-images-a-reinforcement-meta-learning-application-to-autonomous-hazard-avoidance-and-landing.pdf">PDF</a>
</div>
</font>
Adaptive generalized ZEM-ZEV feedback guidance for planetary landing via a deep reinforcement learning approach2020-06-01T00:00:00+00:002020-06-01T00:00:00+00:00https://andreascorsoglio.com/reinforcement%20learning/zem-zev-landing<figure class="half ">
<a href="/assets/images/Journal/Adaptive/trajectory.png">
<img src="/assets/images/Journal/Adaptive/trajectory.png" alt="placeholder image 1" />
</a>
<a href="/assets/images/Journal/Adaptive/trajectory_3D.png">
<img src="/assets/images/Journal/Adaptive/trajectory_3D.png" alt="placeholder image 2" />
</a>
</figure>
<div style="text-align: justify;">
<font size="3">
Precision landing on large and small planetary bodies is a technology of utmost importance for future human and robotic exploration of the solar system. In this context, the Zero-Effort-Miss/Zero-Effort-Velocity (ZEM/ZEV) feedback guidance algorithm has been studied extensively and is still a field of active research. The algorithm, although powerful in terms of accuracy and ease of implementation, has some limitations. Therefore, in this paper we present an adaptive guidance algorithm based on classical ZEM/ZEV in which machine learning is used to overcome its limitations and create a closed-loop guidance algorithm that is sufficiently lightweight to be implemented on board spacecraft and flexible enough to adapt to the given constraint scenario. The adopted methodology is an actor-critic reinforcement learning algorithm that learns the parameters of the above-mentioned guidance architecture according to the given problem constraints.
We propose a ZEM/ZEV-based guidance algorithm for powered descent landing that can adaptively change both guidance gains and time-to-go to generate a class of closed-loop trajectories that 1) are quasi-optimal with respect to fuel efficiency and 2) satisfy flight constraints (e.g. thrust and glide-slope constraints). The proposed algorithm exploits recent advancements in deep reinforcement learning (e.g. deterministic policy gradient) and machine learning (e.g. Extreme Learning Machines, ELM).
The overall structure of the guidance algorithm is unchanged with respect to the classical ZEM/ZEV, but the optimal guidance gains are determined at each time step as a function of the state via a parametrized learned policy. This is achieved using a deep reinforcement learning method based on an actor-critic algorithm that learns the optimal policy parameters by minimizing a specific cost function. The policy is stochastic, but only its mean, expressed as a linear combination of radial basis functions, is updated by stochastic gradient descent. The variance of the policy is kept constant and is used to ensure exploration of the state space. The critic is an Extreme Learning Machine (ELM) that approximates the value function. The approximated value function is then used by the actor to update the policy. The power of the method resides in its capability of satisfying virtually any constraint, provided an adequate cost function is introduced, and in its model-free nature, which, given a sufficiently accurate dynamics simulator for the generation of sample trajectories, allows learning of the guidance law in any environment, regardless of its properties. This greatly expands the capabilities of classical ZEM/ZEV guidance, allowing for its use in a wide variety of environment and constraint combinations, giving results that are generally close to the constrained fuel-optimal off-line solution. Additionally, because the guidance structure is left virtually unchanged, we are able to ensure that the adaptive algorithm remains globally stable regardless of the gain adaptation.</div>
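As an illustration of the baseline law on which the adaptive scheme is built, the classical ZEM/ZEV acceleration command can be sketched in a few lines of NumPy (function and variable names are illustrative, not taken from the paper; k_r = 6 and k_v = -2 are the classical energy-optimal gains that the learned policy replaces with state-dependent values):

```python
import numpy as np

def zem_zev_acceleration(r, v, r_f, v_f, t_go, g, k_r=6.0, k_v=-2.0):
    """Classical ZEM/ZEV feedback acceleration command.

    ZEM: predicted position miss at t_go under gravity alone.
    ZEV: predicted velocity miss at t_go under gravity alone.
    k_r = 6, k_v = -2 are the classical energy-optimal gains.
    """
    zem = r_f - (r + v * t_go + 0.5 * g * t_go**2)
    zev = v_f - (v + g * t_go)
    return k_r * zem / t_go**2 + k_v * zev / t_go
```

Driving a simple point-mass simulation with this command reproduces the well-known behavior: the trajectory reaches the target state at the prescribed final time, which is the baseline the learned gains then improve upon under constraints.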
<p><br /></p>
Reference:
Furfaro, R., Scorsoglio, A., Linares, R., & Massari, M. (2020). Adaptive generalized ZEM-ZEV feedback guidance for planetary landing via a deep reinforcement learning approach. Acta Astronautica, 171, 156-171. <a href="http://dx.doi.org/10.1016/j.actaastro.2020.02.051">DOI</a>
</font>
</div>Andrea ScorsoglioPrecision landing on large and small planetary bodies is a technology of utmost importance for future human and robotic exploration of the solar system. In this context, the Zero-Effort-Miss/Zero-Effort-Velocity (ZEM/ZEV) feedback guidance algorithm has been studied extensively and is still a field of active research. The algorithm, although powerful in terms of accuracy and ease of implementation, has some limitations. Therefore, in this paper we present an adaptive guidance algorithm based on classical ZEM/ZEV in which machine learning is used to overcome its limitations and create a closed-loop guidance algorithm that is sufficiently lightweight to be implemented on board spacecraft and flexible enough to adapt to the given constraint scenario. The adopted methodology is an actor-critic reinforcement learning algorithm that learns the parameters of the above-mentioned guidance architecture according to the given problem constraints. We propose a ZEM/ZEV-based guidance algorithm for powered descent landing that can adaptively change both guidance gains and time-to-go to generate a class of closed-loop trajectories that 1) are quasi-optimal with respect to fuel efficiency and 2) satisfy flight constraints (e.g. thrust and glide-slope constraints). The proposed algorithm exploits recent advancements in deep reinforcement learning (e.g. deterministic policy gradient) and machine learning (e.g. Extreme Learning Machines, ELM). The overall structure of the guidance algorithm is unchanged with respect to the classical ZEM/ZEV, but the optimal guidance gains are determined at each time step as a function of the state via a parametrized learned policy. This is achieved using a deep reinforcement learning method based on an actor-critic algorithm that learns the optimal policy parameters by minimizing a specific cost function. 
The policy is stochastic, but only its mean, expressed as a linear combination of radial basis functions, is updated by stochastic gradient descent. The variance of the policy is kept constant and is used to ensure exploration of the state space. The critic is an Extreme Learning Machine (ELM) that approximates the value function. The approximated value function is then used by the actor to update the policy. The power of the method resides in its capability of satisfying virtually any constraint, provided an adequate cost function is introduced, and in its model-free nature, which, given a sufficiently accurate dynamics simulator for the generation of sample trajectories, allows learning of the guidance law in any environment, regardless of its properties. This greatly expands the capabilities of classical ZEM/ZEV guidance, allowing for its use in a wide variety of environment and constraint combinations, giving results that are generally close to the constrained fuel-optimal off-line solution. Additionally, because the guidance structure is left virtually unchanged, we are able to ensure that the adaptive algorithm remains globally stable regardless of the gain adaptation.</div> Reference: Furfaro, R., Scorsoglio, A., Linares, R., & Massari, M. (2020). Adaptive generalized ZEM-ZEV feedback guidance for planetary landing via a deep reinforcement learning approach. Acta Astronautica, 171, 156-171. DOIImage-based deep reinforcement learning for autonomous lunar landing2020-01-06T00:00:00+00:002020-01-06T00:00:00+00:00https://andreascorsoglio.com/reinforcement%20learning/image-based-landing<figure class="half ">
<a href="/assets/images/Conf/Orlando/Apollo16_lowCont.png">
<img src="/assets/images/Conf/Orlando/Apollo16_lowCont.png" alt="placeholder image 1" />
</a>
<a href="/assets/images/Conf/Orlando/Apollo16_reduced.png">
<img src="/assets/images/Conf/Orlando/Apollo16_reduced.png" alt="placeholder image 2" />
</a>
<figcaption>Digital terrain model and rendered view
</figcaption>
</figure>
<div style="text-align: justify;">
<font size="3">
Future missions to the Moon and Mars will require advanced guidance, navigation and control algorithms for the powered descent phase. These algorithms should be capable of reconstructing the state of the spacecraft using the inputs from an array of sensors and applying the required commands to ensure pinpoint landing accuracy, possibly in an optimal way. This has historically been solved using off-line architectures that rely on the computation of the optimal trajectory beforehand, which is then used to drive the controller. The advent of machine learning and artificial intelligence has opened new possibilities for closed-loop optimal guidance. Specifically, the use of reinforcement learning can lead to intelligent systems that learn from a simulated environment how to perform a certain task optimally. In this paper we present an adaptive landing algorithm that learns from experience how to derive the optimal thrust in a lunar pinpoint landing problem using images and altimeter data as input.
We propose a new approach based on meta reinforcement learning (meta-RL) that integrates guidance and navigation functions, providing a complete solution to the lunar landing problem that couples image-based navigation with intelligent guidance. More specifically, we design a simulation environment that is able to integrate the dynamics of the system and simulate image acquisition from on-board cameras. This is achieved by interfacing the simulator in Python with a ray tracer (i.e. Blender) that generates accurate images using lunar digital terrain models (DTM) and a physically based rendering engine. The images are then used to update a policy in real time using reinforcement learning. We take advantage of the latest advances in Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for image processing and Proximal Policy Optimization (PPO) to design our agent and learn the optimal policy for soft landing.
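For reference, the clipped surrogate objective at the heart of PPO can be sketched in a few lines of NumPy (a generic illustration of the algorithm; the CNN/RNN architecture and the hyperparameters used in the paper are not reproduced here):

```python
import numpy as np

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s); clipping the ratio to
    [1 - eps, 1 + eps] removes the incentive to move the policy
    too far from the one that collected the samples.
    """
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))
```

In practice the gradient of this objective with respect to the policy parameters drives the update, while the clipping keeps each update conservative enough for stable on-policy learning.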
<p><br /></p>
Reference:
Scorsoglio, A., Furfaro, R., Linares, R., & Gaudet, B. (2020). Image-based deep reinforcement learning for autonomous lunar landing. In AIAA Scitech 2020 Forum (p. 1910). <a href="http://dx.doi.org/10.2514/6.2020-1910">DOI</a>
</font>
</div>Andrea ScorsoglioDigital terrain model and rendered view Future missions to the Moon and Mars will require advanced guidance, navigation and control algorithms for the powered descent phase. These algorithms should be capable of reconstructing the state of the spacecraft using the inputs from an array of sensors and applying the required commands to ensure pinpoint landing accuracy, possibly in an optimal way. This has historically been solved using off-line architectures that rely on the computation of the optimal trajectory beforehand, which is then used to drive the controller. The advent of machine learning and artificial intelligence has opened new possibilities for closed-loop optimal guidance. Specifically, the use of reinforcement learning can lead to intelligent systems that learn from a simulated environment how to perform a certain task optimally. In this paper we present an adaptive landing algorithm that learns from experience how to derive the optimal thrust in a lunar pinpoint landing problem using images and altimeter data as input. We propose a new approach based on meta reinforcement learning (meta-RL) that integrates guidance and navigation functions, providing a complete solution to the lunar landing problem that couples image-based navigation with intelligent guidance. More specifically, we design a simulation environment that is able to integrate the dynamics of the system and simulate image acquisition from on-board cameras. This is achieved by interfacing the simulator in Python with a ray tracer (i.e. Blender) that generates accurate images using lunar digital terrain models (DTM) and a physically based rendering engine. The images are then used to update a policy in real time using reinforcement learning. 
We take advantage of the latest advances in Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) for image processing and Proximal Policy Optimization (PPO) to design our agent and learn the optimal policy for soft landing. Reference: Scorsoglio, A., Furfaro, R., Linares, R., & Gaudet, B. (2020). Image-based deep reinforcement learning for autonomous lunar landing. In AIAA Scitech 2020 Forum (p. 1910). DOIELM-based Actor-Critic Approach to Lyapunov Vector Fields Relative Motion Guidance in Near-Rectilinear Orbit2019-07-11T00:00:00+00:002019-07-11T00:00:00+00:00https://andreascorsoglio.com/reinforcement%20learning/lyapunov<figure class=" ">
<a href="/assets/images/Conf/Portland/field_2.png">
<img src="/assets/images/Conf/Portland/field_2.png" alt="placeholder image 1" />
</a>
<figcaption>Trajectory and Lyapunov vector field
</figcaption>
</figure>
<font size="3">
<div style="text-align: justify;">
In this paper, we present a new feedback guidance algorithm for autonomous docking maneuvers in the cislunar environment. In particular, we propose a closed-loop optimal guidance algorithm that is capable of taking path constraints and collision avoidance into account while operating on a Near Rectilinear Orbit (NRO) around the L2 Lagrangian point in the Earth-Moon system. The algorithm is based on Lyapunov vector field guidance, where the acceleration command is derived from a desired velocity vector field. We use reinforcement learning to learn the shape of the field as a function of the state of the system, allowing for increased flexibility in terms of constraint shapes and better performance in terms of fuel consumption with respect to classical Lyapunov vector field guidance.
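To illustrate the underlying idea, here is a minimal NumPy sketch of a fixed attractive velocity field and the velocity-tracking acceleration command derived from it (the field shape, gains, and function names are illustrative assumptions; in the paper the shape of the field is learned via reinforcement learning rather than fixed):

```python
import numpy as np

def desired_velocity(r, r_target, k=0.5, v_max=1.0):
    """Attractive vector field: the desired velocity points at the
    target, with magnitude k * distance saturated at v_max.
    (Illustrative shape only; the paper learns the field shape.)"""
    e = r_target - r
    d = np.linalg.norm(e)
    if d < 1e-12:
        return np.zeros_like(r)
    return min(k * d, v_max) * e / d

def guidance_acceleration(r, v, r_target, k_v=2.0):
    """Acceleration command that tracks the desired velocity field."""
    return k_v * (desired_velocity(r, r_target) - v)
```

With these illustrative gains the closed-loop linearized dynamics are critically damped, so a double-integrator test vehicle converges smoothly onto the field and then to the target.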
<figure class="">
<img src="/assets/images/Conf/Portland/field_3.png" alt="" /></figure>
<p><br /></p>
Reference:
Scorsoglio, A., & Furfaro, R. (2019). ELM-based Actor-Critic Approach to Lyapunov Vector Fields Relative Motion Guidance in Near-Rectilinear Orbit. In AAS/AIAA Astrodynamics Specialist Conference, 2019, Portland, ME. Advances in the Astronautical Sciences 171 (2019) (pp. 1-20). <a href="https://www.researchgate.net/profile/Andrea-Scorsoglio/publication/340682520_ELM-based_Actor-Critic_Approach_to_Lyapunov_Vector_Fields_Relative_Motion_Guidance_in_Near-Rectilinear_Orbits/links/5e98fa43a6fdcca789200dca/ELM-based-Actor-Critic-Approach-to-Lyapunov-Vector-Fields-Relative-Motion-Guidance-in-Near-Rectilinear-Orbits.pdf">PDF</a>
</font>
</div>
</font>Andrea ScorsoglioTrajectory and Lyapunov vector field In this paper, we present a new feedback guidance algorithm for autonomous docking maneuvers in the cislunar environment. In particular, we propose a closed-loop optimal guidance algorithm that is capable of taking path constraints and collision avoidance into account while being on a Near Rectilinear Orbit (NRO) around the L2 Lagrangian point in the Earth-Moon system. The algorithm is based on the Lyapunov vector field guidance where the acceleration command is derived from a desired velocity vector field. We use reinforcement learning to learn the shape of the field as a function of the state of the system, allowing for increased flexibility in terms of constraint shapes and better performance in terms of fuel consumption with respect to classical Lyapunov vector field guidance. Reference: Scorsoglio, A., & Furfaro, R. (2019). ELM-based Actor-Critic Approach to Lyapunov Vector Fields Relative Motion Guidance in Near-Rectilinear Orbit. In AAS/AIAA Astrodynamics Specialist Conference, 2019, Portland, ME. Advances in the Astronautical Sciences 171 (2019) (pp. 1-20). PDF </font>Actor-critic reinforcement learning approach to relative motion guidance in near-rectilinear orbit2019-01-11T00:00:00+00:002019-01-11T00:00:00+00:00https://andreascorsoglio.com/reinforcement%20learning/zem-zev-NRO<figure class="half ">
<a href="/assets/images/Conf/Hawaii/trajectoryLVLH_1.png">
<img src="/assets/images/Conf/Hawaii/trajectoryLVLH_1.png" alt="placeholder image 1" />
</a>
<a href="/assets/images/Conf/Hawaii/trajectoryLVLH_2.png">
<img src="/assets/images/Conf/Hawaii/trajectoryLVLH_2.png" alt="placeholder image 2" />
</a>
<figcaption>Trajectories
</figcaption>
</figure>
<div style="text-align: justify;">
<font size="3">
This paper aims at developing a new feedback guidance algorithm for docking maneuvers in the cislunar environment. In particular, the goal is to create an algorithm that is lightweight, closed-loop and capable of taking path constraints into account. The problem has been solved starting from the well-known Zero-Effort-Miss/Zero-Effort-Velocity (ZEM/ZEV) guidance, using machine learning to improve its capabilities and widen its field of application. The algorithm has been developed in the circular restricted three-body problem (CRTBP) framework for Near Rectilinear Orbits (NRO) in the Earth-Moon system, but the results can be easily generalized to many more guidance problems. The results are satisfactory and show that reinforcement learning can be effectively used to solve constrained relative spacecraft guidance problems.
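For context, the nondimensional planar CRTBP equations of motion in the rotating (synodic) frame can be sketched as follows (an illustrative helper, not code from the paper; the primaries sit at (-mu, 0) and (1 - mu, 0) in the usual nondimensional units):

```python
import numpy as np

def crtbp_accel(state, mu):
    """Planar CRTBP acceleration in the rotating (synodic) frame.

    state = [x, y, vx, vy]; mu = m2 / (m1 + m2) is the mass ratio.
    Includes the Coriolis, centrifugal, and gravitational terms.
    """
    x, y, vx, vy = state
    r1 = np.hypot(x + mu, y)          # distance to the larger primary
    r2 = np.hypot(x - 1.0 + mu, y)    # distance to the smaller primary
    ax = 2.0 * vy + x - (1.0 - mu) * (x + mu) / r1**3 - mu * (x - 1.0 + mu) / r2**3
    ay = -2.0 * vx + y - (1.0 - mu) * y / r1**3 - mu * y / r2**3
    return np.array([ax, ay])
```

A quick consistency check is that the triangular libration point L4, at (1/2 - mu, sqrt(3)/2) with zero synodic velocity, is an equilibrium of these equations.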
<p><br /></p>
Reference:
Scorsoglio, A., Furfaro, R., Linares, R., & Massari, M. (2019). Actor-critic reinforcement learning approach to relative motion guidance in near-rectilinear orbit. 29th AAS/AIAA Space Flight Mechanics Meeting, 2019, Ka’anapali, Maui, HI, Advances in the Astronautical Sciences, 168, 1737-1756. <a href="https://www.researchgate.net/profile/Richard-Linares/publication/331147324_Actor-Critic_Reinforcement_Learning_Approach_to_Relative_Motion_Guidance_in_Near-Rectilinear_Orbit/links/5c67667ba6fdcc404eb453bd/Actor-Critic-Reinforcement-Learning-Approach-to-Relative-Motion-Guidance-in-Near-Rectilinear-Orbit.pdf">PDF</a>
</font>
</div>Andrea ScorsoglioTrajectories This paper aims at developing a new feedback guidance algorithm for docking maneuvers in the cislunar environment. In particular, the goal is to create an algorithm that is lightweight, closed-loop and capable of taking path constraints into account. The problem has been solved starting from the well-known Zero-Effort-Miss/Zero-Effort-Velocity (ZEM/ZEV) guidance, using machine learning to improve its capabilities and widen its field of application. The algorithm has been developed in the circular restricted three-body problem (CRTBP) framework for Near Rectilinear Orbits (NRO) in the Earth-Moon system, but the results can be easily generalized to many more guidance problems. The results are satisfactory and show that reinforcement learning can be effectively used to solve constrained relative spacecraft guidance problems. Reference: Scorsoglio, A., Furfaro, R., Linares, R., & Massari, M. (2019). Actor-critic reinforcement learning approach to relative motion guidance in near-rectilinear orbit. 29th AAS/AIAA Space Flight Mechanics Meeting, 2019, Ka’anapali, Maui, HI, Advances in the Astronautical Sciences, 168, 1737-1756. PDF