Exploring new meadows

Posted on Wed 20 November 2024 in misc

Hello!

We may not know each other, but here you are on my website -- perhaps because you saw a post or someone shared a link. I'm resourceful, determined, intelligent and looking for new challenges. Welcome!

If German is easier, please write to me by email (katharine at kjamistan dot com …


Private and Personalized AI

Posted on Tue 19 November 2024 in personal-ai

I recently had the wonderful experience of keynoting PyData Paris. Thanks again for the invite! When deciding on a topic, I was considering my recent research about how AI/ML systems memorize data. As I've mentioned in a few talks, if we indeed embraced the fact that machine learning systems …


Encodings and embeddings: How does data get into machine learning systems?

Posted on Mon 18 November 2024 in ml-memorization

In this series, you've learned a bit about how data is collected for machine learning, but what happens next? You need to turn the collected data -- images, text, video, audio or even just a spreadsheet -- into numbers that can be learned by a model. How does this happen?

TLDR (too …

Machine Learning dataset distributions, history, and biases

Posted on Wed 13 November 2024 in ml-memorization

You are probably already aware that many machine learning datasets come from scraped internet data. Maybe you received the infamous GPT response: "Please note that my knowledge is limited to information available up until September 2021." You might have also read fear-mongering opinions and articles claiming that companies will "run out …


Deep learning memorization, and why you should care

Posted on Mon 04 November 2024 in ml-memorization

When's the last time that ChatGPT parroted someone else's words to you? Or the last time a diffusion model you used recreated someone's art, someone's photo, someone's face? Has Copilot given you someone else's code without permission or attribution? If this happened, how would you know for sure?

In this …


A Deep Dive into Memorization in Deep Learning

Posted on Sun 03 November 2024 in ml-memorization

Want to learn more about how, when and why machine learning systems, particularly deep learning systems, memorize data? By studying memorization, you'll learn more about how machine learning systems really function, along with how privacy works from a technical point of view. You'll also be better able to decide how, when and where …


Building a Privacy-First Newsletter

Posted on Sun 12 March 2023 in internet

Building a newsletter is a fairly common activity these days, with many creators, writers and thinkers making part of their living via subscribers willing to pay a small amount per month or year for exclusive access. Beyond the paid subscriptions, there's an increasing demand for free, or …


Joining Dropout Labs!

Posted on Sat 23 November 2019 in misc

After months of searching, lots of fun (and some less fun) interviews and hours of self-reflection, I am excited to announce I am the new Head of Product at Dropout Labs! 🎉

The interview and decision process was quite iterative and disruptive! I am somewhat to blame for this as I …


Let's Get Together: More Details on Me, You and My Dream Gig

Posted on Thu 06 June 2019 in misc

Hello!

We may not know each other, but here you are on my website -- perhaps because you saw a post or someone shared a link. I'm resourceful, determined, intelligent and looking for new challenges. Welcome!

Here's more about me, in case it is news to you:

[About Me]

  • Co-founder of …

Adversarial Learning for Good: My Talk at #34c3 on Deep Learning Blindspots

Posted on Thu 28 December 2017 in conferences

When I was first introduced to the idea of adversarial learning for security purposes by Clarence Chio's 2016 DEF CON talk and his related open-source library deep-pwning, I immediately started wondering about applications of the field, both to build robust and well-tested models and as a preventative measure against …

