Automating construction progress monitoring is essential for the timely completion of projects. Computer vision-based construction progress monitoring (CV-CPM) is a promising technology that takes 3D point clouds as input. Both heuristics-based and learning-based approaches have been explored for identifying building elements. However, prevailing supervised methods require project-specific manual labeling, which limits their ability to generalize across projects. This paper introduces a hybrid self-supervised learning architecture named ConPro-NET, which integrates heuristics with learning-based techniques for element identification from construction point clouds. The proposed approach first performs unsupervised segmentation using a region-growing-based method and then extracts features using contrastive learning. Contrastive learning matches object pairs to learn their features; these learned features are then refined and augmented with handcrafted features based on local geometric and visual properties to form a hybrid feature vector. The model achieves an overall classification accuracy of 80.86% on the S3DIS dataset and 80.95% on a case study dataset, covering six object classes.
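
As a rough illustration of the contrastive and hybrid-feature steps described above, the following minimal Python sketch shows one common way such a pipeline could be expressed. It is not the ConPro-NET implementation: the InfoNCE-style loss, the temperature value, the function names (`info_nce_loss`, `hybrid_feature`), and the example feature values are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed choice, not from the paper).

    The loss is small when the anchor embedding is close to its matched
    (positive) segment and far from the unmatched (negative) segments.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def hybrid_feature(learned, handcrafted):
    """Concatenate a normalized learned embedding with handcrafted features."""
    return l2_normalize(learned) + list(handcrafted)


# Hypothetical embeddings for two segments of the same class and one of another.
matched_loss = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
mismatched_loss = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])

# Hypothetical handcrafted descriptors (e.g. a planarity score and a mean color value).
feature_vector = hybrid_feature([3.0, 4.0], [0.5, 0.2])
```

In this sketch the matched pair yields a much lower loss than the mismatched pair, which is the behavior a contrastive objective is meant to reward. The hybrid vector simply appends the handcrafted descriptors to the learned embedding; a downstream classifier would then operate on the concatenated vector.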