Gold in Them Tha-R Hills: A Review of R Packages for Exploratory Data Analysis
Kota Minegishi(a) and Taro Mieno(b)
(a) University of Minnesota, Twin Cities; (b) University of Nebraska-Lincoln
JEL Codes: A2, Q1, Y1
Keywords: Exploratory data analysis, data science, data visualization, R programming
Publish Date: June 25, 2020
Volume 2, Issue 3
Abstract
As data accumulate at an accelerating pace across the economy, the workforce increasingly needs data literacy and the practical skills to put data to use. Applied economics programs have an important role to play in training students in these areas. Teaching the tools of data exploration and visualization, also known as exploratory data analysis (EDA), would be a timely addition to existing curricula. It would also present a new opportunity to engage students through hands-on exercises with real-world data, in ways that differ from exercises in statistics. In this article, we review recent developments in the EDA toolkit for the free statistical computing software R, focusing on the tidyverse package. Our contributions are threefold: we present this new generation of tools with a focus on its syntax structure; our examples show how public data from the U.S. Census of Agriculture can be used for data exploration; and we highlight the practical value of EDA in handling data, uncovering insights, and communicating key aspects of the data.
Articles in this issue
How Do Students Allocate Their Time? An Application of Prospect Theory to Trade-offs between Time Spent to Improve GPA Versus Time Spent on Other Activities
Brian K. Coffey, Andrew Barkley, Glynn T. Tonsor and Jesse B. Tack
Convenient Economics: The Incorporation and Implications of Convenience in Market Equilibrium Analysis
George Davis
Making Business Statistics Come Alive: Incorporating Field Trial Data from a Cookstove Study into the Classroom
Andrew M. Simons
Interacting with Agricultural Policy 280 Characters at a Time: Twitter in the Classroom
Julianne Treme
Gold in Them Tha-R Hills: A Review of R Packages for Exploratory Data Analysis
Kota Minegishi and Taro Mieno
Enhancing Student Engagement in a Changing Academic Environment-Tested Innovations for Traditional Classes and Online Teaching
Kristin Kiesel, Na Zuo, Zoë T. Plakias, Luis M. Peña-Lévano, Andrew Barkley, Katherine Lacy, Erik