A Reflective Learning and Association Control Framework based on Adaptive Dynamic Programming: Architecture and Applications in Robotics

About

Data efficiency and learning speed are two of the major bottlenecks in applying biologically-inspired control methods in many domains. The project's goal is to address these fundamental challenges by introducing a new adaptive dynamic programming-based learning control framework and integrating it into space robot navigation and scouting applications such as the Mars Rover. The scientific contribution of this project will promote interdisciplinary research in computational intelligence, machine learning, control, and robotics. In addition to space applications, the proposed structure can also be applied to robot-assisted pedestrian evacuation and cyber-physical power systems, and is expected to have an impact on general systems beyond this project period. This project will expand the principal investigator (PI)'s research capacity through an extended visit to, and collaboration with, NASA Ames Research Center located in San Jose, CA, and will transform the PI's career path from theoretical algorithm/architecture development towards a new direction in complex space applications. The collaboration fits well with NASA's mission to Mars and its technology roadmaps.

The proposed project will fundamentally advance the learning and association capabilities of biologically-inspired control methods. Three major contributions to the scientific field are expected. First, a new experience network is proposed and systematically integrated into a model-free adaptive dynamic programming-based learning control framework. The PI will design an experience replay tuple (i.e., a state-action-reward tuple) based on backward temporal difference information from historical data. This design avoids the model network/prediction step noted in the existing literature and can save significant computational resources. Second, instead of a uniform sampling method, the PI proposes a prioritized sampling method based on the Bellman estimation error. This new method is expected to enhance the controller's reflective learning performance by making use of both long-term and short-term memory. The stability and convergence properties will also be analyzed. Third, this project is closely tied to NASA's work on robotics and optimal control for the space program. The new learning control structure will be integrated into robot navigation, exploration, and scouting in unknown spaces. The PI and the collaborator will use both a virtual reality platform and a real Rover facility at NASA Ames to analyze the control performance of the proposed algorithm. The PI's outreach and dissemination plans will cultivate the scientific curiosity of K-12 students and motivate their interest in STEM programs. Moreover, the integration of the project's cutting-edge research results into the PI's new courses will aid retention of current STEM students. Specific plans include a workshop for a local middle school, a distance course for demographically diverse institutions, and the development of new courses.
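To make the prioritized sampling idea above more concrete, here is a minimal Python sketch of an experience buffer that samples stored tuples in proportion to the magnitude of their temporal-difference (Bellman) error rather than uniformly. It is an illustration under stated assumptions, not the project's actual design: the names `Experience`, `PrioritizedReplayBuffer`, and `td_error`, the inclusion of a `next_state` field, the one-step error formula, and the values of `gamma` and `epsilon` are all hypothetical choices made for this example.

```python
import random
from collections import namedtuple

# Hypothetical tuple layout; the abstract only names state, action, and reward.
Experience = namedtuple("Experience", ["state", "action", "reward", "next_state"])


def td_error(reward, value, next_value, gamma=0.95):
    """One-step temporal-difference error for a value estimate (assumed form)."""
    return reward + gamma * next_value - value


class PrioritizedReplayBuffer:
    """Stores experiences and samples them in proportion to |TD error|."""

    def __init__(self, capacity, epsilon=1e-3):
        self.capacity = capacity
        self.epsilon = epsilon  # keeps every priority strictly positive
        self.experiences = []
        self.priorities = []

    def add(self, experience, error):
        # Drop the oldest entry once the buffer is full.
        if len(self.experiences) >= self.capacity:
            self.experiences.pop(0)
            self.priorities.pop(0)
        self.experiences.append(experience)
        self.priorities.append(abs(error) + self.epsilon)

    def sample(self, batch_size, rng=random):
        # Weighted sampling with replacement: larger error, higher chance.
        return rng.choices(self.experiences, weights=self.priorities, k=batch_size)


if __name__ == "__main__":
    buffer = PrioritizedReplayBuffer(capacity=100)
    buffer.add(Experience(0, 0, 0.0, 1), td_error(0.0, 0.0, 0.0))
    buffer.add(Experience(1, 1, 5.0, 2), td_error(5.0, 0.0, 0.0))
    batch = buffer.sample(4)
    print([e.state for e in batch])  # the high-error experience tends to dominate
```

A uniform sampler would draw both stored experiences equally often; here the experience with the larger error is replayed more frequently, which is the intuition behind prioritizing by estimation error. Practical implementations usually replace the list-based storage with a more efficient structure and correct for the sampling bias, both of which are omitted from this sketch.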

Publications

Jiang, Chao and Ni, Zhen and Guo, Yi and He, Haibo. "Pedestrian Flow Optimization to Reduce the Risk of Crowd Disasters through Human-Robot Interaction," IEEE Transactions on Emerging Topics in Computational Intelligence, 2019.

Paul, Shuva and Ni, Zhen. "A Comparative Study of Smart Grid Security Based on Unsupervised Learning and Load Ranking," IEEE International Conference on Electro Information Technology, 2019.

Das, Avijit and Ni, Zhen. "A Case Study of Horizon Window in Receding Horizon Control for Renewable Energy Integration," IEEE International Conference on Electro Information Technology, 2019.

Paul, Shuva and Ni, Zhen. "Study of Learning of Power Grid Defense Strategy in Adversarial Stage Game," IEEE International Conference on Electro Information Technology, 2019.

Jiang, Chao and Ni, Zhen and Guo, Yi and He, Haibo. "Optimization of Merging Pedestrian Flows Based on Adaptive Dynamic Programming," 2019 American Control Conference (ACC), Philadelphia, PA, USA, July 10-12, 2019.

Wan, Zhiqiang and Jiang, Chao and Fahad, Muhammad and Ni, Zhen and Guo, Yi and He, Haibo. "Robot-Assisted Pedestrian Regulation Based on Deep Reinforcement Learning," IEEE Transactions on Cybernetics, 2019.