<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel><title>kjam's blog</title><link>https://blog.kjamistan.com/</link><description></description><lastBuildDate>Fri, 06 Feb 2026 00:00:00 +0100</lastBuildDate><item><title>Differential Privacy Parameters, Accounting and Auditing in Deep Learning and AI</title><link>https://blog.kjamistan.com/differential-privacy-parameters-accounting-and-auditing-in-deep-learning-and-ai.html</link><description>&lt;p&gt;You've learned in the last few articles about &lt;a href="https://blog.kjamistan.com/differential-privacy-in-deep-learning.html"&gt;how differential privacy works&lt;/a&gt; and some of the &lt;a href="https://blog.kjamistan.com/differential-privacy-in-todays-ai-whats-so-hard.html"&gt;common pitfalls&lt;/a&gt; of actually using it in deep learning scenarios.&lt;/p&gt;
&lt;p&gt;In this article, you'll learn about tracking differential privacy through parameter choice, accounting and auditing. If done well, these choices and methods reduce memorization …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 06 Feb 2026 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2026-02-06:/differential-privacy-parameters-accounting-and-auditing-in-deep-learning-and-ai.html</guid><category>ml-memorization</category></item><item><title>Get your data local: Setting up Network Attached Storage (NAS) and your first steps in self-hosting</title><link>https://blog.kjamistan.com/get-your-data-local-setting-up-network-attached-storage-nas-and-your-first-steps-in-self-hosting.html</link><description>&lt;p&gt;If you're just getting started with local AI and local-first development, one of the initial hurdles will be getting your data local.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;More of an audio-visual person? Check out the &lt;a href="https://youtu.be/TwCdM7fKw0c"&gt;accompanying YouTube video on the Probably Private channel&lt;/a&gt; if you'd rather watch and listen.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Why should you store data locally …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 30 Jan 2026 09:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2026-01-30:/get-your-data-local-setting-up-network-attached-storage-nas-and-your-first-steps-in-self-hosting.html</guid><category>personal-ai</category></item><item><title>Building out my home AI Lab for private and local AI</title><link>https://blog.kjamistan.com/building-out-my-home-ai-lab-for-private-and-local-ai.html</link><description>&lt;p&gt;So, you wanna do at-home AI? Yes, you do!&lt;/p&gt;
&lt;p&gt;There's a bunch of great reasons to run your own AI including having more control over your data and models, learning more about how deep learning works, testing out new ideas without having to pay extra cloud or subscription costs and …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 15 Jan 2026 09:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2026-01-15:/building-out-my-home-ai-lab-for-private-and-local-ai.html</guid><category>personal-ai</category></item><item><title>Differential Privacy in Today's AI: What's so hard?</title><link>https://blog.kjamistan.com/differential-privacy-in-todays-ai-whats-so-hard.html</link><description>&lt;p&gt;In the last article in the &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;series on addressing the problems of memorization in deep learning and AI&lt;/a&gt;, you learned about differential privacy and how to apply it to deep learning/AI systems. In this article, you'll explore what can go wrong when using differential privacy training in deep learning …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Tue, 06 Jan 2026 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2026-01-06:/differential-privacy-in-todays-ai-whats-so-hard.html</guid><category>ml-memorization</category></item><item><title>Differential Privacy in Deep Learning</title><link>https://blog.kjamistan.com/differential-privacy-in-deep-learning.html</link><description>&lt;p&gt;Differential privacy influenced both privacy attacks and defenses you've investigated in this &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;series on AI/ML memorization&lt;/a&gt;. You might be wondering: what exactly is differential privacy when it's applied to deep learning? 
And can it address the problem of memorization?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Are you a visual learner? There's &lt;a href="https://youtu.be/p6p9i1Hbcns"&gt;a YouTube video on …&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 10 Nov 2025 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-11-10:/differential-privacy-in-deep-learning.html</guid><category>ml-memorization</category></item><item><title>Attacks on Machine Unlearning: How Unlearned Models Leak Information</title><link>https://blog.kjamistan.com/attacks-on-machine-unlearning-how-unlearned-models-leak-information.html</link><description>&lt;p&gt;In the past articles, you've been exploring the field of &lt;a href="https://blog.kjamistan.com/machine-unlearning-what-is-it.html"&gt;machine unlearning&lt;/a&gt;, investigating if you can surgically remove memorized or learned data from models without retraining them from scratch or from an earlier checkpoint.&lt;/p&gt;
&lt;p&gt;Unlearning is one proposed solution to the &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;AI/ML memorization problem explored in this multi-article series …&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 13 Oct 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-10-13:/attacks-on-machine-unlearning-how-unlearned-models-leak-information.html</guid><category>ml-memorization</category></item><item><title>Machine Unlearning: How today's Unlearning is done</title><link>https://blog.kjamistan.com/machine-unlearning-how-todays-unlearning-is-done.html</link><description>&lt;p&gt;Building on our understanding of machine unlearning and &lt;a href="https://blog.kjamistan.com/machine-unlearning-what-is-it.html"&gt;its varied definitions&lt;/a&gt;, in this article you'll learn common approaches to implementing unlearning. To effectively use these approaches, you'll first want to decide which unlearning definition and measurement fit your needs.&lt;/p&gt;
&lt;p&gt;In current unlearning research, there are three main categories of unlearning …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 19 Sep 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-09-19:/machine-unlearning-how-todays-unlearning-is-done.html</guid><category>ml-memorization</category></item><item><title>Machine unlearning: what is it?</title><link>https://blog.kjamistan.com/machine-unlearning-what-is-it.html</link><description>&lt;p&gt;Machine unlearning sounds pretty cool. It is the idea that you can remove information from a trained model at will. If this were possible, you'd be able to edit out things you don't want the model to know, from criminal behavior and racialized slurs to private information. It would solve many …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 13 Aug 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-08-13:/machine-unlearning-what-is-it.html</guid><category>ml-memorization</category></item><item><title>AI Risk and Threat Taxonomies</title><link>https://blog.kjamistan.com/ai-risk-and-threat-taxonomies.html</link><description>&lt;p&gt;It seems like every week &lt;a href="https://www.linkedin.com/in/katharinejarmul/"&gt;my LinkedIn&lt;/a&gt; feed is filled with new &lt;em&gt;just released&lt;/em&gt; AI risk taxonomies, threat models or AI governance handbooks. Usually these taxonomies come from governance consultants or standards authorities and are a great reference for understanding the wide variety of risks AI systems&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt; bring with …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Tue, 05 Aug 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-08-05:/ai-risk-and-threat-taxonomies.html</guid><category>security</category></item><item><title>Algorithmic-based Guardrails: External guardrail models and alignment methods</title><link>https://blog.kjamistan.com/algorithmic-based-guardrails-external-guardrail-models-and-alignment-methods.html</link><description>&lt;p&gt;You've probably at some point heard the term "guardrails" when talking about security or safety in AI systems like LLMs or multi-modal models (i.e. models that include and produce multiple modalities, like speech, images, video and text).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Are you a visual learner? There's &lt;a href="https://youtu.be/IeyB-2cS5lM"&gt;a YouTube video for …&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 28 Jul 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-07-28:/algorithmic-based-guardrails-external-guardrail-models-and-alignment-methods.html</guid><category>ml-memorization</category></item><item><title>Blocking AI/ML Memorization with Software Guardrails</title><link>https://blog.kjamistan.com/blocking-aiml-memorization-with-software-guardrails.html</link><description>&lt;p&gt;One common way to control memorization in today's deep learning systems is to fix the problem by building software around it. This software can also be used to deal with other undesired behavior, like producing hate speech or mentioning criminal activities.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Are you a visual learner? There's &lt;a href="https://youtu.be/IeyB-2cS5lM"&gt;a YouTube video …&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 11 Jul 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-07-11:/blocking-aiml-memorization-with-software-guardrails.html</guid><category>ml-memorization</category></item><item><title>Defining Privacy Attacks in AI and ML</title><link>https://blog.kjamistan.com/defining-privacy-attacks-in-ai-and-ml.html</link><description>&lt;p&gt;In &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;this article series&lt;/a&gt;, you've been able to investigate memorization in AI/deep learning systems -- often via interesting attack vectors. In security modeling, it's useful to explicitly define the threats you are defending against, so you can both discuss and address them and compare potential interventions.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Prefer to learn by …&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 12 Jun 2025 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-06-12:/defining-privacy-attacks-in-ai-and-ml.html</guid><category>ml-memorization</category></item><item><title>Priveedly: your private and personal content reader and recommender</title><link>https://blog.kjamistan.com/priveedly-your-private-and-personal-content-reader-and-recommender.html</link><description>&lt;p&gt;I'm excited to open-source a project that I've been using for the past 2 and a half years: a private/personal reader and recommender.&lt;/p&gt;
&lt;p&gt;It works with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;RSS feeds&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.reddit.com/"&gt;Reddit&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://news.ycombinator.com/"&gt;HackerNews&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://lobste.rs/"&gt;Lobste.rs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and comes with an example Jupyter Notebook for training your own text-based recommendation model once you have …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 23 Jan 2025 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-01-23:/priveedly-your-private-and-personal-content-reader-and-recommender.html</guid><category>personal-ai</category></item><item><title>Adversarial Examples Demonstrate Memorization Properties</title><link>https://blog.kjamistan.com/adversarial-examples-demonstrate-memorization-properties.html</link><description>&lt;p&gt;In this article, the last in the &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;problem exploration section of the series&lt;/a&gt;, you'll explore adversarial machine learning - or how to trick a deep learning system.&lt;/p&gt;
&lt;p&gt;Adversarial examples demonstrate a different way to look at deep learning memorization and generalization. They can show us how important the learned decision space …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 15 Jan 2025 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-01-15:/adversarial-examples-demonstrate-memorization-properties.html</guid><category>ml-memorization</category></item><item><title>Differential Privacy as a Counterexample to AI/ML Memorization</title><link>https://blog.kjamistan.com/differential-privacy-as-a-counterexample-to-aiml-memorization.html</link><description>&lt;p&gt;At this point in reading the &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;article series on AI/ML memorization&lt;/a&gt; you might be wondering, how did the field get so far without addressing the memorization problem? How did seminal papers like Zhang et al.'s &lt;a href="https://arxiv.org/abs/1611.03530"&gt;&lt;em&gt;Understanding Deep Learning Requires Rethinking Generalization&lt;/em&gt;&lt;/a&gt; not fundamentally change machine learning research? And maybe …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 02 Jan 2025 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2025-01-02:/differential-privacy-as-a-counterexample-to-aiml-memorization.html</guid><category>ml-memorization</category></item><item><title>How Memorization Happens: Overparametrized Models</title><link>https://blog.kjamistan.com/how-memorization-happens-overparametrized-models.html</link><description>&lt;p&gt;You've heard claims that we will "run out of data" to train AI systems. Why is that? In this article in the series on &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;machine learning memorization&lt;/a&gt;, you'll explore model size as a factor in memorization and the trend toward bigger models as a general problem in machine learning.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Prefer …&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 18 Dec 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-12-18:/how-memorization-happens-overparametrized-models.html</guid><category>ml-memorization</category></item><item><title>How memorization happens: Novelty</title><link>https://blog.kjamistan.com/how-memorization-happens-novelty.html</link><description>&lt;p&gt;So far in &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;this series on memorization in deep learning&lt;/a&gt;, you've learned how &lt;a href="https://blog.kjamistan.com/how-memorization-happens-repetition.html"&gt;massively repeated text and images incentivize training data memorization&lt;/a&gt;, but that's not the only training data that machine learning models memorize. Let's take a look at another proven driver of memorization: novel examples.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Prefer to learn by video? This …&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 09 Dec 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-12-09:/how-memorization-happens-novelty.html</guid><category>ml-memorization</category></item><item><title>How memorization happens: Repetition</title><link>https://blog.kjamistan.com/how-memorization-happens-repetition.html</link><description>&lt;p&gt;In this article in &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;the deep learning memorization series&lt;/a&gt;, you'll learn how one part of memorization happens -- highly repeated data from the "head" of the long-tailed distribution.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Prefer to learn by video? This post &lt;a href="https://youtu.be/rDgFIiRTAHE?si=omH4DxA5OqOkJS3y"&gt;is summarized on Probably Private's YouTube&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Recall from &lt;a href="https://blog.kjamistan.com/machine-learning-dataset-distributions-history-and-biases.html"&gt;the data collection article&lt;/a&gt; that some examples are …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Tue, 03 Dec 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-12-03:/how-memorization-happens-repetition.html</guid><category>ml-memorization</category></item><item><title>Gaming Evaluation - The evolution of deep learning training and evaluation</title><link>https://blog.kjamistan.com/gaming-evaluation-the-evolution-of-deep-learning-training-and-evaluation.html</link><description>&lt;p&gt;In this article in the &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;series on machine learning memorization&lt;/a&gt;, you'll dive deeper into how typical machine learning training and evaluation happens, a crucial step in ensuring the machine learning model actually "learns" something. Let's review the steps that lead up to training a deep learning model.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Two major steps are shown in rectangular boxes: Data Preparation and Preprocessing and Model Training and Evaluation. Above each of these major steps there are smaller boxes outlining substeps. The data preparation substeps are data collection, data cleaning and data labeling (if needed). The substeps for model training and evaluation are data encoding, model training and model evaluation." src="./images/2024/model_training_steps.png"&gt;
&lt;em&gt;High-level steps to …&lt;/em&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Tue, 26 Nov 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-11-26:/gaming-evaluation-the-evolution-of-deep-learning-training-and-evaluation.html</guid><category>ml-memorization</category></item><item><title>Exploring new meadows</title><link>https://blog.kjamistan.com/exploring-new-meadows.html</link><description>&lt;p&gt;Hello!&lt;/p&gt;
&lt;p&gt;We may not know each other, but here you are on my website -- perhaps because you saw a post or someone shared a link. I'm resourceful, determined, intelligent and looking for new challenges. Welcome!&lt;/p&gt;
&lt;p&gt;If German is easier for you, please write to me by email (katharine at kjamistan punkt com …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 20 Nov 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-11-20:/exploring-new-meadows.html</guid><category>misc</category></item><item><title>Private and Personalized AI</title><link>https://blog.kjamistan.com/private-and-personalized-ai.html</link><description>&lt;p&gt;I recently had the wonderful experience of &lt;a href="https://pydata.org/paris2024"&gt;keynoting PyData Paris&lt;/a&gt;, thanks again for the invite! When deciding on a topic, I was considering my &lt;a href="https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html"&gt;recent research about how AI/ML systems memorize data&lt;/a&gt;. As I've mentioned in a few talks, if we indeed embraced the fact that machine learning systems …&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;TLDR (too …&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;/table&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 18 Nov 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-11-18:/encodings-and-embeddings-how-does-data-get-into-machine-learning-systems.html</guid><category>ml-memorization</category></item><item><title>Machine Learning dataset distributions, history, and biases</title><link>https://blog.kjamistan.com/machine-learning-dataset-distributions-history-and-biases.html</link><description>&lt;p&gt;You probably are already aware that many machine learning datasets come from scraped internet data. Maybe you received the infamous GPT response: "Please note that my knowledge is limited to information available up until September 2021." You might have also read fear-mongering opinions and articles that companies will &lt;a href="https://theconversation.com/researchers-warn-we-could-run-out-of-data-to-train-ai-by-2026-what-then-216741"&gt;"run out …&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 13 Nov 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-11-13:/machine-learning-dataset-distributions-history-and-biases.html</guid><category>ml-memorization</category></item><item><title>Deep learning memorization, and why you should care</title><link>https://blog.kjamistan.com/deep-learning-memorization-and-why-you-should-care.html</link><description>&lt;p&gt;When's the last time that ChatGPT parroted someone else's words to you? Or the last time a diffusion model you used recreated someone's art, someone's photo, someone's face? Has Copilot &lt;a href="https://x.com/docsparse/status/1581461734665367554"&gt;given you someone else's code without permission or attribution&lt;/a&gt;? If this happened, how would you know for sure?&lt;/p&gt;
&lt;p&gt;In this …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 04 Nov 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-11-04:/deep-learning-memorization-and-why-you-should-care.html</guid><category>ml-memorization</category></item><item><title>A Deep Dive into Memorization in Deep Learning</title><link>https://blog.kjamistan.com/a-deep-dive-into-memorization-in-deep-learning.html</link><description>&lt;p&gt;Want to learn more about how, when and why machine learning systems, particularly deep learning systems, memorize data? By studying memorization, you'll learn more about how machine learning systems really function, along with how privacy works from a technical point-of-view. You'll also be better able to decide how, when and where …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sun, 03 Nov 2024 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2024-11-03:/a-deep-dive-into-memorization-in-deep-learning.html</guid><category>ml-memorization</category></item><item><title>Building a Privacy-First Newsletter</title><link>https://blog.kjamistan.com/building-a-privacy-first-newsletter.html</link><description>&lt;p&gt;Building a newsletter is a fairly common activity these days, with many creators, writers and thinkers making part of their living from subscribers willing to pay small amounts per month or year for exclusive access. Beyond the paid subscriptions, there's an increasing demand for free, or …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sun, 12 Mar 2023 09:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2023-03-12:/building-a-privacy-first-newsletter.html</guid><category>internet</category></item><item><title>Joining Dropout Labs!</title><link>https://blog.kjamistan.com/joining-dropout-labs.html</link><description>&lt;p&gt;After months of searching, lots of fun (and some less fun) interviews and hours of self-reflection, I am excited to announce I am the new Head of Product at &lt;a href="https://dropoutlabs.com/"&gt;Dropout Labs&lt;/a&gt;! 🎉&lt;/p&gt;
&lt;p&gt;The interview and decision process was quite iterative and disruptive! I am somewhat to blame for this as I …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sat, 23 Nov 2019 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2019-11-23:/joining-dropout-labs.html</guid><category>misc</category></item><item><title>Let's Get Together: More Details on Me, You and My Dream Gig</title><link>https://blog.kjamistan.com/lets-get-together-more-details-on-me-you-and-my-dream-gig.html</link><description>&lt;p&gt;Hello!&lt;/p&gt;
&lt;p&gt;We may not know each other, but here you are on my website -- perhaps because you saw a post or someone shared a link. I'm resourceful, determined, intelligent and looking for new challenges. Welcome!&lt;/p&gt;
&lt;p&gt;Here's more about me, in case it is news to you:&lt;/p&gt;
&lt;h4 id="about-me"&gt;[About Me]&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Co-founder of …&lt;/li&gt;&lt;/ul&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 06 Jun 2019 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2019-06-06:/lets-get-together-more-details-on-me-you-and-my-dream-gig.html</guid><category>misc</category></item><item><title>Adversarial Learning for Good: My Talk at #34c3 on Deep Learning Blindspots</title><link>https://blog.kjamistan.com/adversarial-learning-for-good-my-talk-at-34c3-on-deep-learning-blindspots.html</link><description>&lt;p&gt;When I first was introduced to the idea of adversarial learning for security purposes by &lt;a href="https://www.youtube.com/watch?v=JAGDpJFFM2A"&gt;Clarence Chio's 2016 DEF CON talk&lt;/a&gt; and his related &lt;a href="https://github.com/cchio/deep-pwning"&gt;open-source library deep-pwning&lt;/a&gt;, I immediately started wondering about applications of the field to both make robust and well-tested models, but also as a preventative measure against …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 28 Dec 2017 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-12-28:/adversarial-learning-for-good-my-talk-at-34c3-on-deep-learning-blindspots.html</guid><category>conferences</category></item><item><title>Towards Interpretable Reliable Models</title><link>https://blog.kjamistan.com/towards-interpretable-reliable-models.html</link><description>&lt;p&gt;I presented a keynote at &lt;a href="https://pydata.org/warsaw2017/"&gt;PyData Warsaw&lt;/a&gt; on moving toward interpretable reliable models. 
The talk was inspired by some of the work I admire in the field as well as a fear that if we do not address interpretable models as a community, we will be factors in our own …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sun, 29 Oct 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-10-29:/towards-interpretable-reliable-models.html</guid><category>conferences</category></item><item><title>GDPR &amp; You: My Talk at Cloudera Sessions München</title><link>https://blog.kjamistan.com/gdpr-you-my-talk-at-cloudera-sessions-munchen.html</link><description>&lt;p&gt;Unless you have been avoiding all news, you have likely heard of the coming changes in European privacy regulations, which go into effect in May 2018. The changes are covered under the General Data Protection Regulation (GDPR), whose final text was made available in May 2016.&lt;/p&gt;
&lt;p&gt;I presented a talk …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 11 Oct 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-10-11:/gdpr-you-my-talk-at-cloudera-sessions-munchen.html</guid><category>conferences</category></item><item><title>Algorithmic Art and "Künstliche Kunst"</title><link>https://blog.kjamistan.com/algorithmic-art-and-kunstliche-kunst.html</link><description>&lt;p&gt;I was invited to give a talk at &lt;a href="http://404.ie"&gt;404 Dublin&lt;/a&gt;, a really cool conference joining community groups w/ tech folks and art installations. When thinking of what topics might be of interest to the audience, I selfishly went to one of my (side) passions: following artists who are doing amazing …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sat, 07 Oct 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-10-07:/algorithmic-art-and-kunstliche-kunst.html</guid><category>conferences</category></item><item><title>Comparing scikit-learn Text Classifiers on a Fake News Dataset</title><link>https://blog.kjamistan.com/comparing-scikit-learn-text-classifiers-on-a-fake-news-dataset.html</link><description>&lt;p&gt;Finding ways to distinguish fake news from real news is a challenge most Natural Language Processing folks I meet and chat with want to solve. There is significant difficulty in doing this properly and without penalizing real news sources.&lt;/p&gt;
&lt;p&gt;I was discussing this problem with Miguel Martinez-Alvarez on my last …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 28 Aug 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-08-28:/comparing-scikit-learn-text-classifiers-on-a-fake-news-dataset.html</guid><category>research</category></item><item><title>Data Unit Testing: EuroPython Tutorial</title><link>https://blog.kjamistan.com/data-unit-testing-europython-tutorial.html</link><description>&lt;p&gt;I gave a long and opinionated tutorial at &lt;a href="https://ep2017.europython.eu/p3/schedule/ep2017/"&gt;EuroPython 2017&lt;/a&gt; about how we &lt;a href="https://ep2017.europython.eu/conference/talks/data-unit-testing-with-python"&gt;should do unit testing and validation within a data science scope&lt;/a&gt;. The GitHub repository for the course (which is part of my &lt;a href="https://blog.kjamistan.com/practical-data-cleaning-with-python-resources.html"&gt;O'Reilly Live Online training&lt;/a&gt;) is &lt;a href="https://github.com/kjam/data-cleaning-101"&gt;https://github.com/kjam/data-cleaning-101&lt;/a&gt;. I will continue editing and …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 14 Jul 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-07-14:/data-unit-testing-europython-tutorial.html</guid><category>trainings</category></item><item><title>if Ethics is not None</title><link>https://blog.kjamistan.com/if-ethics-is-not-none.html</link><description>&lt;p&gt;This past Wednesday, I had the pleasure of giving a keynote at &lt;a href="https://ep2017.europython.eu/en/"&gt;EuroPython 2017&lt;/a&gt;. I covered a historical view of ethics in computing. The slides are shared here, but it was also recorded so I will post a video when it is available. (Updated: video added!)&lt;/p&gt;
&lt;p&gt;In addition, a series …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 14 Jul 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-07-14:/if-ethics-is-not-none.html</guid><category>conferences</category></item><item><title>Practical Data Cleaning with Python Resources</title><link>https://blog.kjamistan.com/practical-data-cleaning-with-python-resources.html</link><description>&lt;h2 id="practical-data-cleaning-resources"&gt;Practical Data Cleaning Resources&lt;/h2&gt;
&lt;h4 id="oreilly-live-online-training"&gt;(O'Reilly Live Online Training)&lt;/h4&gt;
&lt;p&gt;This week I will be giving my first O'Reilly Live Online Training via the Safari platform. I'm pretty excited to share some of my favorite data cleaning libraries and tips for validating and testing your data workflows.&lt;/p&gt;
&lt;p&gt;This post hopes to be …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Wed, 03 May 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-05-03:/practical-data-cleaning-with-python-resources.html</guid><category>trainings</category></item><item><title>PyData Amsterdam Keynote on Ethical Machine Learning</title><link>https://blog.kjamistan.com/pydata-amsterdam-keynote-on-ethical-machine-learning.html</link><description>&lt;p&gt;I was kindly asked by the PyData Amsterdam organizers to keynote the conference. As a passionate fan of ethical machine learning and the great research being done by data scientists and academics around the world -- I am very enthused to present the topic to the conference.&lt;/p&gt;
&lt;p&gt;My slides are currently …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 07 Apr 2017 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-04-07:/pydata-amsterdam-keynote-on-ethical-machine-learning.html</guid><category>conferences</category></item><item><title>Ten Tips for First-Time Conference Speakers</title><link>https://blog.kjamistan.com/ten-tips-for-first-time-conference-speakers.html</link><description>&lt;p&gt;The saddest moment for me at conferences is when I'm in the middle of an interesting conversation with a bright person and I ask her when her talk is and she says, "Who me?"&lt;/p&gt;
&lt;p&gt;The number of folks I speak with every year at conferences who have amazing stories to …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sat, 11 Feb 2017 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-02-11:/ten-tips-for-first-time-conference-speakers.html</guid><category>conferences</category></item><item><title>The Practice of Programming: 18 Years Later</title><link>https://blog.kjamistan.com/the-practice-of-programming-18-years-later.html</link><description>&lt;p&gt;Over the new year holiday time I had a chance to get away from it all, and snuck up to Finland to sit in a lodge on the Gulf of Finland, sip coffee, take saunas and read. I brought along a few books, the only programming one being &lt;a href="http://www.cs.princeton.edu/~bwk/tpop.webpage/"&gt;Brian W …&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 20 Jan 2017 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2017-01-20:/the-practice-of-programming-18-years-later.html</guid><category>programming</category></item><item><title>New O'Reilly Video Training: Data Pipelines with Python</title><link>https://blog.kjamistan.com/new-oreilly-video-training-data-pipelines-with-python.html</link><description>&lt;p&gt;I'm really excited to announce a new &lt;a href="http://shop.oreilly.com/product/0636920055334.do"&gt;Python video course with O'Reilly on data pipelines&lt;/a&gt;. If you are interested in learning some of the popular options available for workflow automation and management in Python, take a look!&lt;/p&gt;
&lt;p&gt;In the course, I cover:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Using &lt;a href="http://www.celeryproject.org/"&gt;Celery&lt;/a&gt; for simple automation&lt;/li&gt;
&lt;li&gt;Setting up &lt;a href="https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html"&gt;Hadoop …&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Tue, 13 Dec 2016 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-12-13:/new-oreilly-video-training-data-pipelines-with-python.html</guid><category>trainings</category></item><item><title>DAGs &amp; Dask: How and When to Accelerate your Data Analysis</title><link>https://blog.kjamistan.com/dags-dask-how-and-when-to-accelerate-your-data-analysis.html</link><description>&lt;p&gt;I gave a talk about &lt;a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph"&gt;Directed Acyclic Graphs (DAGs)&lt;/a&gt; and &lt;a href="https://github.com/dask"&gt;Dask&lt;/a&gt; at &lt;a href="https://cz.pycon.org/2016/"&gt;PyConCZ 2016&lt;/a&gt;. It was super fun and I had a great time at the conference. If you want to read my slides below, here they are! There will be videos available later, so I'll post the link / video …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sat, 29 Oct 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-10-29:/dags-dask-how-and-when-to-accelerate-your-data-analysis.html</guid><category>conferences</category></item><item><title>Introduction to Data Wrangling @ PyConCZ</title><link>https://blog.kjamistan.com/introduction-to-data-wrangling-pyconcz.html</link><description>&lt;p&gt;&lt;a href="https://cz.pycon.org/2016/"&gt;PyConCZ 2016&lt;/a&gt; was such a fun conference! First off, it was the first time I got to see &lt;a href="https://twitter.com/JackieKazil"&gt;Jackie Kazil&lt;/a&gt; since we started writing our &lt;a href="http://shop.oreilly.com/product/0636920032861.do"&gt;O'Reilly book Data Wrangling with Python&lt;/a&gt; together, HOORAYYYY!&lt;/p&gt;
&lt;blockquote class="twitter-tweet" data-lang="en"&gt;&lt;p lang="en" dir="ltr"&gt;OMG PYTHONISTAS! &lt;a href="https://twitter.com/JackieKazil"&gt;@JackieKazil&lt;/a&gt; &amp;amp; I are together for the first time since we started the &lt;a href="https://twitter.com/OReillyMedia"&gt;@OReillyMedia&lt;/a&gt; Data Wrangling …&lt;/p&gt;&lt;/blockquote&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sat, 29 Oct 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-10-29:/introduction-to-data-wrangling-pyconcz.html</guid><category>conferences</category></item><item><title>Chatbot Scraper: Europarl Scraper: 24 Languages of Politics, at your fingertips</title><link>https://blog.kjamistan.com/chatbot-scraper-europarl-scraper-24-languages-of-politics-at-your-fingertips.html</link><description>&lt;p&gt;I participated in a two-day &lt;a href="http://www.meetup.com/PyData-Berlin/events/232774832/?eventId=232774832"&gt;PyDataBerlin Hackathon event&lt;/a&gt; in early-October and decided to build a scraper for European Parliament. This was after I found the &lt;a href="http://www.statmt.org/europarl/"&gt;Europarl parallel corpus&lt;/a&gt; a bit underwhelming as it is messy and not tagged for party, speakers or topic (this is understandable, as it is primarily …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 20 Oct 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-10-20:/chatbot-scraper-europarl-scraper-24-languages-of-politics-at-your-fingertips.html</guid><category>hacking</category></item><item><title>Chatbot Scraper: Using (today's) IRC logs as your NLP datasets</title><link>https://blog.kjamistan.com/chatbot-scraper-using-todays-irc-logs-as-your-nlp-datasets.html</link><description>&lt;p&gt;I dunno about you, but I often find myself bored with NLP (natural language processing) datasets. 
Too often they are older, based around something that is not particularly interesting to me or something I've analyzed or used before.&lt;/p&gt;
&lt;p&gt;For me, &lt;a href="https://wikipedia.org/wiki/Internet_Relay_Chat"&gt;IRC&lt;/a&gt; has often been a source of community, fun, sometimes …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 29 Sep 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-09-29:/chatbot-scraper-using-todays-irc-logs-as-your-nlp-datasets.html</guid><category>hacking</category></item><item><title>Automating your Data Cleanup with Python</title><link>https://blog.kjamistan.com/automating-your-data-cleanup-with-python.html</link><description>&lt;p&gt;I gave a talk at &lt;a href="http://2016.pyconuk.org/"&gt;PyCon UK 2016&lt;/a&gt; on automating your data cleanup with Python. I want to again thank the organizers for having me and thank the folks who attended. If you have any questions or are interested in talking about data cleaning problems, feel free to reach out …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sat, 17 Sep 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-09-17:/automating-your-data-cleanup-with-python.html</guid><category>conferences</category></item><item><title>Embedded *isms in Vector-Based Natural Language Processing</title><link>https://blog.kjamistan.com/embedded-isms-in-vector-based-natural-language-processing.html</link><description>&lt;p&gt;You may have read recently about &lt;a href="http://www.nytimes.com/2016/06/26/opinion/sunday/artificial-intelligences-white-guy-problem.html?_r=0"&gt;machine learning's&lt;/a&gt; &lt;a href="https://www.oreilly.com/learning/how-we-amplify-privilege-with-supervised-machine-learning"&gt;bias problem&lt;/a&gt; particularly in word &lt;a href="https://arxiv.org/abs/1606.06121"&gt;embeddings&lt;/a&gt; and &lt;a 
href="https://www.technologyreview.com/s/602025/how-vector-space-mathematics-reveals-the-hidden-sexism-in-language/"&gt;vectors&lt;/a&gt;. It's a massive problem. If you are using word embeddings to generate associative words, phrases or to do comparisons, you should be aware of the biases you are introducing into your work. In preparation …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 16 Sep 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-09-16:/embedded-isms-in-vector-based-natural-language-processing.html</guid><category>research</category></item><item><title>Obligatory Women In Tech Post</title><link>https://blog.kjamistan.com/obligatory-women-in-tech-post.html</link><description>&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; How does it feel to be a woman in tech?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Answer:&lt;/strong&gt;&lt;/p&gt;
&lt;iframe src="https://giphy.com/embed/13Xs7FQmAsqsHS" width="480" height="256" frameBorder="0" class="giphy-embed" allowFullScreen&gt;&lt;/iframe&gt;

&lt;p&gt;&lt;a href="https://giphy.com/gifs/hair-blow-dries-13Xs7FQmAsqsHS"&gt;via GIPHY&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;see also:&lt;/em&gt; &lt;a href="http://www.laweekly.com/arts/geek-chicks-pyladies-a-gang-of-female-computer-programmers-2373431"&gt;OG PyLadies Interview&lt;/a&gt;&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Fri, 16 Sep 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-09-16:/obligatory-women-in-tech-post.html</guid><category>life</category></item><item><title>I Hate You, NLP ;)</title><link>https://blog.kjamistan.com/i-hate-you-nlp.html</link><description>&lt;p&gt;"I had a great time talking about Sentiment Analysis and Natural Language processing at &lt;a href="https://ep2016.europython.eu/"&gt;EuroPython 2016&lt;/a&gt;. Here are my slides for your review, feel free to reach out &lt;a href="https://twitter.com/kjam"&gt;on Twitter&lt;/a&gt; or email if you'd like to chat further about NLP, machine learning and sentiment. I look forward to starting more …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 21 Jul 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-07-21:/i-hate-you-nlp.html</guid><category>conferences</category></item><item><title>Python Flight Search</title><link>https://blog.kjamistan.com/python-flight-search.html</link><description>&lt;p&gt;Like many people, I enjoy travel. With family and friends all across the United States and a home base in Berlin, it's fairly easy to find a reason to travel -- either globally or within the EU. 
That said, what I find more difficult is to determine what's the best way …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Tue, 29 Mar 2016 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-03-29:/python-flight-search.html</guid><category>hacking</category></item><item><title>Data Wrangling with Python Course</title><link>https://blog.kjamistan.com/data-wrangling-with-python-course.html</link><description>&lt;p&gt;I'll be in New York on July 13th and 14th, teaching how to "big data" with Python. We'll cover Pandas, Hadoop, PySpark and more on automation, acquisition and managing your data.&lt;/p&gt;
&lt;h3 id="next-course-new-york-city-july-13-14"&gt;Next Course: New York City, July 13-14&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://www.eventbrite.co.uk/e/learn-big-data-wrangling-with-python-tickets-24220425946"&gt;Tickets are available on Eventbrite&lt;/a&gt; with a special Early Bird and Student …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Mon, 29 Feb 2016 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2016-02-29:/data-wrangling-with-python-course.html</guid><category>trainings</category></item><item><title>Data Wrangling with Python</title><link>https://blog.kjamistan.com/data-wrangling-with-python.html</link><description>&lt;p&gt;Just a quick note that my book: Data Wrangling with Python is available for &lt;a href="http://www.amazon.com/Data-Wrangling-Python-Jacqueline-Kazil/dp/1491948817/ref=sr_1_1?s=books&amp;amp;ie=UTF8&amp;amp;qid=1445422551&amp;amp;sr=1-1&amp;amp;keywords=katharine+jarmul"&gt;prepurchase on Amazon&lt;/a&gt; as well as in &lt;a href="http://shop.oreilly.com/product/0636920032861.do"&gt;early release on O'Reilly's web site&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Data Wrangling with Python" src="http://ecx.images-amazon.com/images/I/51qWQ75%2BCXL._SX379_BO1,204,203,200_.jpg"&gt;&lt;/p&gt;
&lt;p&gt;Pick up a copy for less than the full price now. I'll be posting some examples of problems we work through in the book …&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Sun, 01 Nov 2015 00:00:00 +0100</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2015-11-01:/data-wrangling-with-python.html</guid><category>books</category></item><item><title>Europython 2015</title><link>https://blog.kjamistan.com/europython-2015.html</link><description>&lt;h3 id="introduction-to-data-analysis-tutorial"&gt;Introduction to Data Analysis Tutorial&lt;/h3&gt;
&lt;p&gt;Want to learn how to analyze data using Python? If you're at #europython you should drop by my course! If not, watch the video online later today (will post link!)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.haikudeck.com/p/b8T4gEIWvi/introduction-to-data-analysis---europython-2015"&gt;Slides&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/kjam/data-wrangling-pycon"&gt;Repo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ipynb.kjamistan.com:8888"&gt;Notebooks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://bit.ly/data-class-feedback"&gt;Feedback&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">katharine</dc:creator><pubDate>Thu, 23 Jul 2015 00:00:00 +0200</pubDate><guid isPermaLink="false">tag:blog.kjamistan.com,2015-07-23:/europython-2015.html</guid><category>conferences</category></item></channel></rss>