A serious challenge in finding influential actors in real-world social networks, for instance to enable efficient community-wide interventions, is that the structure of the underlying network is usually unknown. Current state-of-the-art methods rely on hand-crafted sampling algorithms: they sample nodes and their neighbours in a carefully constructed order and choose opinion leaders from this discovered subgraph to maximize influence spread in the (unknown) complete network. In this work, we propose a reinforcement learning framework that discovers effective network sampling heuristics by leveraging automatically learned node and graph representations that encode important structural properties of the network. At training time, the method identifies portions of the network such that nodes selected from this sampled subgraph can effectively influence nodes in the complete network. The output of training is a transferable, adaptive policy that identifies an effective sequence of nodes to query on unseen graphs. The success of this policy rests on a set of careful choices for embedding local and global information about the graph and for providing appropriate reward signals during training. We experiment with real-world social networks from four different domains and show that the policies learned by our RL agent provide a 7-23% improvement in influence spread over the current state-of-the-art method.
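To make the query-then-select pipeline concrete, the following is a minimal sketch of the discovery loop the abstract describes, under stated assumptions: the learned RL policy is replaced by a placeholder heuristic that scores frontier nodes by their currently observed degree, influence spread is estimated with a basic independent-cascade model (activation probability p is an arbitrary choice here), and the names `discover` and `independent_cascade` are illustrative rather than taken from the paper.

```python
# Sketch of sequential network discovery followed by influencer selection.
# The scoring policy below is a stand-in: the paper learns it with RL over
# node and graph embeddings; we use observed degree for illustration.
import random

def discover(graph, seeds, budget):
    """Grow an observed subgraph by querying one node per step.

    Querying a node reveals its neighbours in the otherwise unknown graph.
    """
    observed = {s: set() for s in seeds}
    queried, frontier = set(), set(seeds)
    for _ in range(budget):
        if not frontier:
            break
        # Placeholder policy: query the frontier node with the highest
        # degree in the observed subgraph so far.
        node = max(frontier, key=lambda n: len(observed.get(n, ())))
        frontier.discard(node)
        queried.add(node)
        for nbr in graph[node]:  # query() response: node's true neighbours
            observed.setdefault(node, set()).add(nbr)
            observed.setdefault(nbr, set()).add(node)
            if nbr not in queried:
                frontier.add(nbr)
    return observed

def independent_cascade(graph, seeds, p=0.1, trials=200):
    """Monte Carlo estimate of influence spread in the complete network."""
    total = 0
    for _ in range(trials):
        active, newly = set(seeds), list(seeds)
        while newly:
            node = newly.pop()
            for nbr in graph[node]:
                if nbr not in active and random.random() < p:
                    active.add(nbr)
                    newly.append(nbr)
        total += len(active)
    return total / trials

if __name__ == "__main__":
    random.seed(0)
    # Toy "unknown" network; in practice only query responses are visible.
    full = {i: set() for i in range(30)}
    for _ in range(60):
        u, v = random.sample(range(30), 2)
        full[u].add(v)
        full[v].add(u)
    sub = discover(full, seeds=[0, 1], budget=10)
    # Choose influencers from the discovered subgraph by observed degree.
    influencers = sorted(sub, key=lambda n: len(sub[n]), reverse=True)[:3]
    print("influencers:", influencers)
    print("estimated spread:", independent_cascade(full, influencers))
```

In the proposed framework, the per-step reward during training reflects how well nodes chosen from the discovered subgraph influence the complete network, so the learned policy replaces the hand-crafted scoring rule shown above.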