A dataset demonstrating the utility of visualization. The 12 datasets are equal in standard summary measures (mean, standard deviation, and Pearson's correlation), yet each forms a different shape when plotted.
Format
A data frame with 182 rows and 24 variables:
bullseye_x: x-values for the bullseye dataset
bullseye_y: y-values for the bullseye dataset
circle_x: x-values for the circle dataset
circle_y: y-values for the circle dataset
dots_x: x-values for the dots dataset
dots_y: y-values for the dots dataset
h_lines_x: x-values for the h_lines dataset
h_lines_y: y-values for the h_lines dataset
high_lines_x: x-values for the high_lines dataset
high_lines_y: y-values for the high_lines dataset
slant_x: x-values for the slant dataset
slant_y: y-values for the slant dataset
slant_down_x: x-values for the slant_down dataset
slant_down_y: y-values for the slant_down dataset
slant_up_x: x-values for the slant_up dataset
slant_up_y: y-values for the slant_up dataset
star_x: x-values for the star dataset
star_y: y-values for the star dataset
v_lines_x: x-values for the v_lines dataset
v_lines_y: y-values for the v_lines dataset
wide_lines_x: x-values for the wide_lines dataset
wide_lines_y: y-values for the wide_lines dataset
x_shape_x: x-values for the x_shape dataset
x_shape_y: y-values for the x_shape dataset
References
Matejka, J., & Fitzmaurice, G. (2017). Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. CHI 2017 Conference proceedings: ACM SIGCHI Conference on Human Factors in Computing Systems. Retrieved from https://www.autodeskresearch.com/publications/samestats.
Examples
# save current graphical settings
state <- par("mar", "mfrow")
# plot each dataset in its own panel (columns come in x/y pairs)
par(mfrow = c(4, 3), mar = c(1, 3, 3, 1))
nms <- names(twelve_from_slant_wide)
for (i in seq(1, 23, by = 2)) {
  # drop the trailing "_x" to get the dataset name
  nm <- substr(nms[i], 1, nchar(nms[i]) - 2)
  plot(twelve_from_slant_wide[[nms[i]]],
       twelve_from_slant_wide[[nms[i + 1]]],
       xlab = "", ylab = "", main = nm)
}
# reset settings
par(state)
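
The description says the datasets agree on mean, standard deviation, and Pearson's correlation. One way to check this numerically is sketched below. The helper `summarise_pairs` is hypothetical (it is not part of the package) and assumes only the `<dataset>_x` / `<dataset>_y` column naming listed under Format.

```r
# Hypothetical helper (an illustration, not a package function):
# summarise each x/y pair in a wide data frame whose columns are
# named <dataset>_x and <dataset>_y.
summarise_pairs <- function(wide) {
  # dataset names are the x-column names without the "_x" suffix
  x_cols <- grep("_x$", names(wide), value = TRUE)
  stems <- sub("_x$", "", x_cols)
  rows <- lapply(stems, function(stem) {
    x <- wide[[paste0(stem, "_x")]]
    y <- wide[[paste0(stem, "_y")]]
    data.frame(
      dataset = stem,
      mean_x = mean(x), mean_y = mean(y),
      sd_x = sd(x), sd_y = sd(y),
      cor_xy = cor(x, y)  # cor() computes Pearson's correlation by default
    )
  })
  do.call(rbind, rows)
}

# Only runs if the data frame is already loaded, as in the example above
if (exists("twelve_from_slant_wide")) {
  print(summarise_pairs(twelve_from_slant_wide), digits = 3)
}
```

The result has one row per dataset, so the columns can be compared by eye or with `range()` to see how closely the statistics agree.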