Do you know what “simulated annealing” is? If not, check out this cool project by Justin Matejka and George Fitzmaurice. The full title of their excellent work is “Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing,” and here is an excerpt:
The key insight behind our approach is that while it is relatively difficult to generate a dataset from scratch with particular statistical properties, it is relatively easy to take an existing dataset, modify it slightly, and maintain those statistical properties. We do this by choosing a point at random, moving it a little bit, then checking that the statistical properties of the set haven’t strayed outside of the acceptable bounds (in this particular case, we are ensuring that the means, standard deviations, and correlations remain the same to two decimal places.)
Fig 3. Making a number of small changes to a dataset on the left, while maintaining the same overall statistical properties (to two decimal places), shown on the right.
Fig 4. Transforming a random cloud of points into a circle, while maintaining the same statistical properties.
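For readers who want to see the idea in code, here is a minimal Python sketch of the loop the excerpt describes: pick a random point, nudge it, and keep the nudge only if the means, standard deviations, and correlation still match to two decimal places. This is not the authors' implementation. The function names, the linear cooling schedule, the step size, and the choice of target circle (centered on the means, with a radius derived from the standard deviations so that it is compatible with the fixed statistics) are all assumptions made for illustration.

```python
import math
import random


def summary(points):
    """Return (mean_x, mean_y, sd_x, sd_y, pearson_r) for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in points) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in points) / (n - 1))
    cov = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my, sx, sy, cov / (sx * sy))


def same_stats(a, b, decimals=2):
    """True if every statistic agrees after rounding (an assumed tolerance rule)."""
    return all(round(u, decimals) == round(v, decimals) for u, v in zip(a, b))


def circle_distance(point, center, radius):
    """How far a point lies from the circumference of the target circle."""
    return abs(math.hypot(point[0] - center[0], point[1] - center[1]) - radius)


def anneal(points, steps=20000, step_size=0.1, t_start=0.4, t_end=0.01, seed=0):
    """Nudge points toward a circle while the summary statistics stay fixed."""
    rng = random.Random(seed)
    target = summary(points)
    # Assumed target shape: a circle whose center and radius suit the fixed stats.
    center = (target[0], target[1])
    radius = math.hypot(target[2], target[3])
    current = list(points)
    for i in range(steps):
        # Linear cooling: bad moves become less likely to be accepted over time.
        temperature = t_start + (t_end - t_start) * i / steps
        idx = rng.randrange(len(current))
        old = current[idx]
        new = (old[0] + rng.gauss(0, step_size), old[1] + rng.gauss(0, step_size))
        improves = circle_distance(new, center, radius) < circle_distance(old, center, radius)
        if not (improves or rng.random() < temperature):
            continue
        current[idx] = new
        # Undo the move if any statistic no longer matches to two decimals.
        if not same_stats(summary(current), target):
            current[idx] = old
    return current
```

To try it, build a random cloud such as `[(random.gauss(50, 15), random.gauss(50, 15)) for _ in range(100)]`, pass it to `anneal`, and compare `summary` before and after. The occasional acceptance of a move that makes the shape worse is what makes this annealing rather than plain hill climbing, since it lets the points escape arrangements where every small improvement would break the statistics.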
Hat tip: Taha Yasseri, via Twitter.