Internships and Deep Q networks

Fangyi Zhang is a third-year PhD researcher in the centre. His research focuses on learning visuo-motor policies through reinforcement learning and on transferring those policies from simulation to the real world, two sub-projects of the VA program.

Fangyi began his PhD research by evaluating the feasibility of learning vision-based planar reaching with a Deep Q Network (DQN). He showed that a DQN can learn reaching in simulation, but that the learned policies do not transfer directly to real robots with real cameras observing real scenes. He then proposed modular deep Q networks, which transfer policies from simulation to the real world at low cost using only a small number of labelled real images. Labelling real data is expensive, or even impractical, in many robotic applications, so to reduce the reliance on it further, Fangyi proposed an adversarial discriminative approach to transferring visuo-motor policies from simulated to real environments. This approach reduced the amount of labelled real data required by 50% for object reaching in clutter with a seven-DoF robotic arm (Baxter). These results have been presented at robotics and computer vision conferences such as ACRA and CVPR, and the ACRA 2017 paper on modular deep Q networks was a best paper finalist. Fangyi is currently investigating how to extend the adversarial discriminative transfer and modular approaches to more complicated robotic manipulation tasks while further reducing the need for labelled real data.

Apart from his PhD research, Fangyi has taken an active part in cross-node collaborations. In October 2017, he spent 10 days at the ANU node collaborating with fellow PhD researcher Zheyu Zhuang and CI Robert Mahony on a project that uses Lyapunov functions to bridge state-of-the-art deep learning techniques and current robotic control frameworks for better robustness (Zheyu’s PhD research). He also spent three months in 2016 at the Autonomy Robotics Cognition (ARC) lab at the University of Maryland in the USA, participating in a project to enable a Baxter robot to do housework in a kitchen scenario.