Synthesising high-quality views of dynamic scenes via path tracing is prohibitively expensive. Although caching offline-quality global illumination in neural networks alleviates this issue, existing neural view synthesis methods are limited to mainly static scenes, have low inference performance or do not integrate well with existing rendering paradigms. We propose a novel neural method that is able to capture a dynamic light field, renders at real-time frame rates at 1920x1080 resolution and integrates seamlessly with Monte Carlo ray tracing frameworks. We demonstrate how a combination of spatial, temporal and a novel surface-space encoding are each effective at capturing different kinds of spatio-temporal signals. Together with a compact fully-fused neural network and architectural improvements, we achieve a twenty-fold increase in network inference speed compared to related methods at equal or better quality. Our approach is suitable for providing offline-quality real-time rendering in a variety of scenarios, such as free-viewpoint video, interactive multi-view rendering, or streaming rendering. Finally, our work can be integrated into other rendering paradigms, e.g., providing a dynamic background for interactive scenarios where the foreground is rendered with traditional methods.