Abstract: To meet the engineering supervision industry's demand for digital and intelligent transformation, and to address the limitations of foundation models in professional semantic understanding and engineering-scene adaptation, this study systematically investigates high-quality dataset construction and Retrieval-Augmented Generation (RAG) techniques. Within the framework of the Guidelines for High-Quality Dataset Development, it proposes a methodological pathway for constructing and applying high-quality datasets tailored to engineering supervision scenarios. Based on the JKinco Zhuyan Engineering Supervision Large Model Evaluation Benchmark, a six-dimensional evaluation framework covering faithfulness, answer correctness, answer relevance, context precision, context recall, and noise sensitivity was established to compare a baseline foundation model with its RAG-enhanced counterpart. The baseline model achieved an overall score of 0.52 and performed relatively weakly on every evaluation dimension. With the RAG architecture integrated, the overall score rose to 0.73, a relative improvement of approximately 40.3%, with substantial gains on all evaluation metrics. These findings indicate that incorporating external high-quality datasets through the RAG framework effectively mitigates the deficiencies of foundation models in engineering supervision knowledge and complex-scenario reasoning. The study provides a feasible technical pathway for developing domain-specific large models in the engineering supervision sector and offers practical insights for advancing the industry’s digital and intelligent transformation during China’s Fifteenth Five-Year Plan period.