Word spotting is widely used for the digitisation and transcription of historical handwritten documents. Deep learning based methods now dominate the state of the art in learning-based word spotting. However, deep learning architectures such as Convolutional Neural Networks (CNNs) require a large amount of training data, and their translation invariance discards the spatial relationships between features. Capsule Networks (CapsNets) have recently been introduced as a data-efficient alternative to CNNs. This work explores the applicability of CapsNets to segmentation-based word spotting and is, to the best of the authors' knowledge, the first such effort in the Handwritten Text Recognition (HTR) community. The effectiveness of CapsNets will be empirically evaluated on well-known historical handwritten datasets using standard evaluation measures. The impact of varying amounts of training data on recognition performance will be investigated, along with a comparison with state-of-the-art methods.