Abstract: Provenance analysis is a pivotal technique in geological studies that elucidates the origins and transport pathways of materials by examining the geochemical and mineralogical attributes of sediments, soils, and rocks. This approach holds significant scientific value for deciphering geological history, paleoclimate variations, contemporary surface processes, and environmental evolution, and for reconstructing crustal thickness. Conventional provenance methods, such as heavy mineral assemblage analysis, zircon U-Pb dating, and isotope tracing, have succeeded in delineating paleogeographic configurations and tectonic activity, but they are often technically intricate, costly, and time-consuming, and they demand sophisticated data analysis skills. With the rapid advancement of big data and machine learning (ML) technologies, ML has emerged as a promising tool for provenance analysis, offering robust data processing and the ability to unravel complex nonlinear relationships. This study first synthesizes and systematically evaluates recent advances in ML applications for provenance analysis, supported by empirical case studies. We then compare ML methodologies with conventional approaches, elucidating their respective strengths and limitations, and outline future research directions for ML-driven provenance investigations. The study shows that ML can enhance the precision and efficiency of provenance analysis while providing innovative data-driven strategies for earth science research.