Let's Get Together: More Details on Me, You and My Dream Gig Hello! We may not know each other, but here you are on my website -- perhaps because you saw a post or someone shared a link. I'm resourceful, determined, intelligent and looking for new challenges. Welcome! Here's more about me, in case it is
Joining Dropout Labs! After months of searching, lots of fun (and some less fun) interviews and hours of self-reflection, I am excited to announce I am the new Head of Product at Dropout Labs! 🎉 The interview and decision process was quite iterative and disruptive! I am somewhat
Adversarial Learning for Good: My Talk at #34c3 on Deep Learning Blindspots When I was first introduced to the idea of adversarial learning for security purposes by Clarence Chio's 2016 DEF CON talk and his related open-source library deep-pwning, I immediately started wondering about applications of the field to both make robust and well-tested models, but
Towards Interpretable Reliable Models I presented a keynote at PyData Warsaw on moving toward interpretable reliable models. The talk was inspired by some of the work I admire in the field as well as a fear that if we do not address interpretable models as a community, we
GDPR & You: My Talk at Cloudera Sessions München Unless you have been avoiding all news, you have likely heard of the coming changes in European privacy regulations which go into effect in May 2018. The changes are covered under the General Data Protection Regulation (GDPR), whose final text was made available in
Algorithmic Art and "Künstliche Kunst": My Talk at 404 Dublin I was invited to give a talk at 404 Dublin, a really cool conference joining community groups w/ tech folks and art installations. When thinking of what topics might be of interest to the audience, I selfishly went to one of my (side) passions.
Comparing scikit-learn Text Classifiers on a Fake News Dataset Finding ways to distinguish fake news from real news is a challenge most Natural Language Processing folks I meet and chat with want to solve. There is significant difficulty in doing this properly and without penalizing real news sources. I was discussing this problem
Data Unit Testing: EuroPython Tutorial I gave a long and opinionated tutorial at EuroPython 2017 about how we should do unit testing and validation within a data science scope. The GitHub repository for the course (which is part of my O'Reilly Live Online training) is https://github.com/kjam/
if Ethics is not None This past Wednesday, I had the pleasure of giving a keynote at EuroPython 2017. I covered a historical view of ethics in computing. The slides are shared here, but it was also recorded so I will post a video when it is available. (Updated:
Practical Data Cleaning with Python Resources Practical Data Cleaning Resources (O'Reilly Live Online Training) This week I will be giving my first O'Reilly Live Online Training via the Safari platform. I'm pretty excited to share some of my favorite data cleaning libraries and tips for validating and testing your data
PyData Amsterdam Keynote on Ethical Machine Learning I was kindly asked by the PyData Amsterdam organizers to keynote the conference. As a passionate fan of ethical machine learning and the great research being done by data scientists and academics around the world, I am very enthused to present the topic
Ten Tips for First-Time Conference Speakers The saddest moment for me at conferences is when I'm in the middle of an interesting conversation with a bright person and I ask her when her talk is and she says, "Who me?" The number of folks I speak with every
The Practice of Programming: 18 Years Later Over the new year holiday time I had a chance to get away from it all, and snuck up to Finland to sit in a lodge on the Gulf of Finland, sip coffee, take saunas and read. I brought along a few books, the
New O'Reilly Video Training: Data Pipelines with Python I'm really excited to announce a new Python video course with O'Reilly on data pipelines. If you are interested in learning some of the popular options available for workflow automation and management in Python, take a look! In the course, I cover: Using Celery
Introduction to Data Wrangling @ PyConCZ PyConCZ 2016 was such a fun conference! First off, it was the first time I got to see Jackie Kazil since we started writing our O'Reilly book Data Wrangling with Python together, HOORAYYYY! OMG PYTHONISTAS! @JackieKazil & I are together for the first time
DAGs & Dask: How and When to Accelerate your Data Analysis I gave a talk about Directed Acyclic Graphs (DAGs) and Dask at PyConCZ 2016. It was super fun and I had a great time at the conference. If you want to read my slides below, here they are! There will be videos available later,
Europarl Scraper: 24 Languages of Politics, at your fingertips I participated in a two-day PyDataBerlin Hackathon event in early October and decided to build a scraper for the European Parliament. This was after I found the Europarl parallel corpus a bit underwhelming as it is messy and not tagged for party, speakers or topic (this
Chatbot Scraper: Using (today's) IRC logs as your NLP datasets I dunno about you, but I often find myself bored with NLP (natural language processing) datasets. Too often they are older, based around something that is not particularly interesting to me or something I've analyzed or used before. For me, IRC has often been
Automating your Data Cleanup with Python I gave a talk at PyCon UK 2016 on automating your data cleanup with Python. I want to again thank the organizers for having me and thank the folks who attended. If you have any questions or are interested in talking about data cleaning
Embedded *isms in Vector-Based Natural Language Processing You may have read recently about machine learning's bias problem particularly in word embeddings and vectors. It's a massive problem. If you are using word embeddings to generate associative words, phrases or to do comparisons, you should be aware of the biases you are
Obligatory Women In Tech Post Question: How does it feel to be a woman in tech? Answer: (GIF via GIPHY) See also: OG PyLadies Interview
I Hate You, NLP ;) I had a great time talking about Sentiment Analysis and Natural Language Processing at EuroPython 2016. Here are my slides for your review. Feel free to reach out on Twitter or email if you'd like to chat further about NLP, machine learning and sentiment.
Python Flight Search Like many people, I enjoy travel. With family and friends all across the United States and a home base in Berlin, it’s fairly easy to find a reason to travel -- either globally or within the EU. That said, what I find more
Learn Big Data Wrangling with Python I'll be in New York on July 13th and 14th, teaching how to "big data" with Python. We'll cover Pandas, Hadoop, PySpark and more on automation, acquisition and managing your data. Early sign-ups get a discount, so sign up below!
Data Wrangling with Python Course -- Sign Up! Sign up and learn Python with our intensive 2-day courses. Designed to take you from the basics to more advanced data wrangling, we focus on real data problems and examples so you can apply what you learn immediately. Next Course: New York City, July