7 Rules for Spreadsheets and Data Preparation for Analysis and Machine Learning

With the hype around deep learning neural nets and machine learning algorithms, it’s easy to forget that most of the work in data science involves accessing and preparing data for analysis. Indeed, not all data is Kaggle-ready. The reality is that data is often far from perfect. Do your consultant (and budget) a favor and follow […]

Auditing your R magrittr data pipeline

Originally published July 22, 2016 (GitHub: https://github.com/jabus/givR). How do you audit the objects in your R data analysis pipeline? Given a dataframe object and a series of piped operations, (how) can you observe the intermediate objects that result from each operation in the pipeline? First, prepare your analysis environment by loading some useful libraries: […]