Abstract
There is a strong need for synthetic yet realistic distribution system test data sets that are as diverse, large, and complex to solve as real systems. Such data sets can facilitate the development of advanced algorithms and the assessment of emerging distributed energy resources while avoiding the need for, and the restrictions associated with, proprietary, critical-infrastructure, or private data. However, synthetic data sets are useful only if they can be shown to be realistic enough to look and behave like actual systems. This paper presents a comprehensive framework for validating synthetic distribution data sets using a three-pronged approach of statistical, operational, and expert validation. It also presents a set of statistical and operational metric targets for achieving realistic data sets, based on a detailed characterization of over ten thousand real U.S. utility systems, and demonstrates their use in validating synthetic data sets developed by the authors for three areas: Santa Fe, New Mexico; Greensboro, North Carolina; and the San Francisco Bay Area, California.
Validation of synthetic U.S. power distribution system data sets