University of Maryland students and students participating in Data Challenge competitions need a straightforward tutorial on how to set up, download, import, analyze, and visualize data. Knowing where to start, and getting a push in the right direction, makes working with data much easier. Dr. Bonsignore, an iSchool professor and researcher, has observed that students at Data Challenges and students interested in data analysis often don’t know where to start, but produce insightful analysis once they do. A tutorial that shortens this startup time would benefit these students.
This project will produce a tutorial guidebook on how to download and use the “500 Cities” data provided by the CDC. The guidebook will include examples of queries, statistical models, and tests that can be applied to this data, with accompanying visualizations, to inspire future users and explorers of the data. Accompanying tutorial videos will show the process of downloading, reading in, and analyzing the data while explaining the “how and why” of each step. The following steps will be explained in appropriate detail: ingesting the data; tidying the data (if needed, with an explanation of why it is or isn’t needed); exploring the data by making scatterplots; testing correlations; performing hypothesis tests and other appropriate tests; creating accompanying visualizations and models; and any other tasks we find appropriate and viable within the project’s time constraints.
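As a rough illustration of the ingest, tidy, and correlate steps described above, here is a minimal Python sketch that uses only the standard library. It is not the guidebook's actual method: the column names (`city`, `obesity_pct`, `inactivity_pct`), the helper functions, and every value in `SAMPLE_CSV` are invented placeholders, not real 500 Cities fields or CDC figures. A real workflow would read the CSV file downloaded from the CDC instead of the inline string.

```python
import csv
import io
import math

# Hypothetical stand-in for a downloaded CSV file.
# Column names and values are invented placeholders, NOT CDC data.
SAMPLE_CSV = """city,obesity_pct,inactivity_pct
A,20.0,15.0
B,25.0,22.0
C,30.0,
D,35.0,28.0
E,40.0,35.0
"""


def load_rows(text):
    """Ingest: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def tidy(rows, columns):
    """Tidy: drop rows with a blank value in any of `columns`, convert the rest to float."""
    clean = []
    for row in rows:
        if all(row[c].strip() for c in columns):
            clean.append({**row, **{c: float(row[c]) for c in columns}})
    return clean


def pearson_r(xs, ys):
    """Explore: Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


measures = ["obesity_pct", "inactivity_pct"]
rows = tidy(load_rows(SAMPLE_CSV), measures)
xs = [row["obesity_pct"] for row in rows]
ys = [row["inactivity_pct"] for row in rows]
r = pearson_r(xs, ys)
print(f"{len(rows)} complete rows, Pearson r = {r:.3f}")
```

Row C is dropped during tidying because one of its measures is blank, which is the kind of decision the guidebook would need to explain and justify. The same `xs` and `ys` lists could then be passed to a plotting library to draw a scatterplot.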
Made for the INST490 Capstone Project at the University of Maryland.