 
In this project, our goal is to train an agent to protect villagers and eliminate zombies using deep reinforcement learning, a technique that combines deep learning with reinforcement learning. Initially, the agent explores the environment by selecting actions at random; over time, it learns from its experience and increasingly chooses the actions it expects to be most effective for its objectives.
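The shift from random exploration to learned behavior described above is commonly implemented with an epsilon-greedy policy. The following is a minimal sketch of that idea, not the project's actual code: the function names `epsilon_at` and `select_action`, the linear decay schedule, and all default values are assumptions chosen for illustration.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.1, decay_steps=10_000):
    """Linearly anneal the exploration rate from eps_start to eps_end.

    All default values are illustrative assumptions, not project settings.
    """
    fraction = min(step / decay_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


def select_action(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon,
    otherwise the index of the highest estimated Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Early in training epsilon is high, so most actions are random;
# later it is low, so the agent mostly follows its learned estimates.
q_estimates = [0.1, 0.9, 0.3]  # hypothetical Q-values for three actions
early_action = select_action(q_estimates, epsilon_at(step=0))
late_action = select_action(q_estimates, epsilon_at(step=50_000))
```

With `epsilon` set to 0 the function always returns the greedy action, which is a convenient way to evaluate a trained agent without exploration noise.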
Some of the links below may not be specific to Malmo, but we use similar ideas here:
Fighting Zombies in Minecraft With Deep Reinforcement Learning - Hiroto Udagawa, Tarun Narasimhan, Shim-Young Lee
Beat Atari with Deep Reinforcement Learning! - Adrien Lucas Ecoffet
Improvements in Deep Q Learning: Dueling Double DQN, Prioritized Experience Replay, and fixed Q-targets - Thomas Simonini