Tidy Data in Action
Hadley Wickham wrote a famous paper (for a certain definition of famous) about the importance of tidy data when doing data analysis. I want to talk a bit about that, using an example from a StackOverflow post, with a solution using pandas. The principles of tidy data aren’t language specific.

A tidy dataset must satisfy three criteria (page 4 in Wickham’s paper):

1. Each variable forms a column.
2. Each observation forms a row.
3. Each type of observational unit forms a table.

In this StackOverflow post, the asker had some data on NBA games, and wanted to know the number of days since a team last played. Here’s the example data: ...
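
As a rough illustration of what the three criteria buy you, here is a minimal sketch with a small made-up table. It is not the data from the post: the team labels, dates, and column names are all invented for illustration. It assumes a "wide" layout with one row per game and one column for each team, reshapes it with `melt` so that each row is one team playing one game, and then takes a per-team difference of dates.

```python
import pandas as pd

# Hypothetical wide table: one row per game, with a column per team.
# Team labels, dates, and column names are invented for illustration.
games = pd.DataFrame({
    "game_id": [1, 2, 3],
    "date": pd.to_datetime(["2015-01-01", "2015-01-03", "2015-01-04"]),
    "away_team": ["A", "B", "A"],
    "home_team": ["B", "C", "C"],
})

# Tidy form: one row per (team, game) observation.
tidy = (
    games.melt(
        id_vars=["game_id", "date"],
        value_vars=["away_team", "home_team"],
        var_name="home_or_away",
        value_name="team",
    )
    .sort_values(["team", "date"])
    .reset_index(drop=True)
)

# Days since each team's previous game; NaN for a team's first game.
tidy["days_since_last"] = tidy.groupby("team")["date"].diff().dt.days

print(tidy)
```

Once each row is a single team-game observation, "days since last played" is just a grouped diff. In the wide layout the same team can show up in either team column, which is what makes the question awkward to answer directly.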