Synthetic Data

Synthetic data is on the rise and promises to help solve one of the major problems created by the proliferation of data: protecting sensitive data points. Synthetic data is data produced by applying algorithms or generation software to real data. It can be designed for several purposes, such as developing machine learning systems, building models, and data research. With the growth of edge computing (advanced data management on end-user equipment), data can be distributed with less risk of exposing sensitive information. Synthetic data is also a key element in Gartner’s latest “Top Predictions for IT Organizations and Users for 2022 and Beyond”:

“By 2025, synthetic data will reduce personal customer data collection, a change that will enable organizations to avoid 70% of privacy violation sanctions. Gartner defines synthetic data as data that is ‘generated by applying a sampling technique to real-world data or by creating simulation scenarios where models and processes interact to create completely new data not directly taken from the real world.’ This approach lets organizations create models without the need for collecting so much customer data. For CIOs it will enable a lower cost of data and a faster time to AI. Organizations can develop a synthetic data competency as part of the initiative.”
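
To make the "sampling technique applied to real-world data" part of that definition more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how Gartner or any particular tool generates synthetic data: the records, the column meanings (age, annual spend) and the function names `fit_marginals` and `sample_synthetic` are all hypothetical, and each column is modelled as an independent normal distribution, which ignores correlations between columns and gives no formal privacy guarantee.

```python
import random
import statistics


def fit_marginals(rows):
    """Estimate (mean, standard deviation) for each numeric column."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def sample_synthetic(params, n, seed=None):
    """Draw n synthetic rows, sampling each column independently
    from a normal distribution with the fitted parameters."""
    rng = random.Random(seed)
    return [
        tuple(rng.gauss(mu, sigma) for mu, sigma in params)
        for _ in range(n)
    ]


# Hypothetical "real" customer records: (age, annual_spend).
real_rows = [
    (34, 1200.0),
    (45, 2300.0),
    (29, 800.0),
    (52, 3100.0),
    (41, 1900.0),
]

# Fit simple per-column statistics on the real data, then sample new rows.
params = fit_marginals(real_rows)
synthetic_rows = sample_synthetic(params, n=1000, seed=0)
```

The synthetic rows follow roughly the same per-column averages and spread as the real ones, but none of them is copied from an actual customer. A production approach would typically also need to preserve relationships between columns and to check that individuals cannot be re-identified.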
