Modeling and predicting traffic systems is challenging because of the complex interactions within the system. Identifying significant regressors and using them to improve travel time predictions is therefore of interest. In previous studies, such regressors were identified offline and were static in nature. In this study, an iterative joint clustering and prediction approach is proposed to accurately predict spatiotemporal patterns in travel time. The clustering module is tied to the prediction module, and a prediction model is trained on each cluster. Clustering and prediction are then iterated together until a chosen metric is optimized. This orients the clusters towards prediction and allows models to be developed on subsets of travel time data with similar prediction complexity. The clusters created by the joint approach conformed to real-world traffic conditions: clusters of high travel time formed at busy intersections and bus stops along the study stretch, and clusters of low travel time formed in the suburban areas of the city. Further, a comparison of the developed framework with baseline methods showed a reduction in prediction error of at least 22.83%. This indicates that forming clusters that are sensitive to prediction quality, as the joint clustering and prediction framework does, improves the accuracy of travel time predictions. The study also proposes criteria for choosing the best predictions when cluster-based predictions are used.
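One way to read the iterative procedure is as a clusterwise regression loop. The following is a minimal sketch under stated assumptions, not the study's implementation: it assumes a linear model per cluster, a random initial cluster assignment, reassignment of each sample to the cluster whose model predicts it best, and mean absolute error as the metric being optimized. The function names and the synthetic two-regime data are hypothetical and serve only to illustrate the loop.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear fit with an intercept term."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predictions of a model returned by fit_linear."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef


def joint_cluster_predict(X, y, k=2, max_iter=50, seed=0):
    """Alternate between per-cluster model fitting and reassignment.

    Assumption: a sample belongs to the cluster whose model has the
    smallest absolute error on it. The loop stops when the mean
    absolute error no longer improves.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(X))  # random initial clusters
    best_err, best_labels, best_models = np.inf, labels, None
    for _ in range(max_iter):
        models = []
        for c in range(k):
            mask = labels == c
            if mask.sum() < X.shape[1] + 1:
                # Cluster too small to fit: fall back to all samples.
                mask = np.ones(len(X), dtype=bool)
            models.append(fit_linear(X[mask], y[mask]))
        # Absolute error of every model on every sample, shape (n, k).
        errs = np.column_stack(
            [np.abs(predict_linear(m, X) - y) for m in models]
        )
        new_labels = errs.argmin(axis=1)
        err = errs[np.arange(len(X)), new_labels].mean()
        if err >= best_err - 1e-9:
            break  # no further improvement in the chosen metric
        best_err, best_labels, best_models = err, new_labels, models
        labels = new_labels
    return best_labels, best_models, best_err


# Hypothetical data: two travel-time regimes sharing one feature.
rng = np.random.default_rng(42)
n = 200
x = rng.uniform(0, 10, size=2 * n)
regime = np.repeat([0, 1], n)
y = np.where(regime == 0, 1.0 * x + 2.0, 3.0 * x + 20.0)
y = y + rng.normal(0, 0.5, size=2 * n)
X = x.reshape(-1, 1)

labels, models, joint_mae = joint_cluster_predict(X, y, k=2)
global_mae = np.abs(predict_linear(fit_linear(X, y), X) - y).mean()
print(f"single global model MAE: {global_mae:.3f}")
print(f"joint clustering + prediction MAE: {joint_mae:.3f}")
```

Because this sketch assigns samples using the observed travel time, it cannot by itself decide which cluster model to apply to a new sample whose travel time is unknown; a separate selection rule is still needed for that step.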