Data Unit Testing: EuroPython Tutorial

Posted on Fri 14 July 2017 in trainings

I gave a long and opinionated tutorial at EuroPython 2017 on how we should approach unit testing and validation in data science. The GitHub repository for the course (which is part of my O'Reilly Live Online training) is https://github.com/kjam/data-cleaning-101. I will continue editing and …


Continue reading

Practical Data Cleaning with Python Resources

Posted on Wed 03 May 2017 in trainings

Practical Data Cleaning Resources

(O'Reilly Live Online Training)

This week I will be giving my first O'Reilly Live Online Training via the Safari platform. I'm pretty excited to share some of my favorite data cleaning libraries and tips for validating and testing your data workflows.

This post hopes to be …


Continue reading

New O'Reilly Video Training: Data Pipelines with Python

Posted on Tue 13 December 2016 in trainings

I'm really excited to announce a new Python video course with O'Reilly on data pipelines. If you are interested in learning some of the popular options available for workflow automation and management in Python, take a look!

In the course, I cover:


Continue reading

Data Wrangling with Python Course

Posted on Mon 29 February 2016 in trainings

I'll be in New York on July 13th and 14th, teaching how to "big data" with Python. We'll cover Pandas, Hadoop, PySpark and more on automating, acquiring and managing your data.

Next Course: New York City, July 13-14

Tickets are available on Eventbrite with a special Early Bird and Student …


Continue reading