Light-field imaging is appealing to the mobile devices market because of its capability for intuitive post-capture processing. Acquiring light field (LF) data with high angular, spatial and temporal resolution poses significant challenges, especially when space constraints rule out bulky optics. At the same time, stereo video capture, now available on many consumer devices, can be interpreted as a sparse LF capture. We explore the application of small-baseline stereo videos for reconstructing high-fidelity LF videos. We propose a self-supervised learning-based algorithm for LF video reconstruction from stereo video. The self-supervised LF video reconstruction is guided by the geometric information from the individual stereo pairs and the temporal information from the video sequence. LF estimation is further regularized by a low-rank constraint based on layered LF displays. The proposed self-supervised algorithm offers advantages such as post-training fine-tuning on test sequences and variable angular view interpolation and extrapolation. Quantitatively, the reconstructed LF videos show higher fidelity than previously proposed unsupervised approaches. We demonstrate our results via LF videos generated from publicly available stereo videos acquired from commercially available stereoscopic cameras. Finally, we demonstrate that our reconstructed LF videos allow applications such as post-capture focus control and region-of-interest (RoI) based focus tracking for videos.