SXSW 2025
Break the Bot: Red-Teaming Large Language Models
Description:
Red-teaming has long been a crucial component of a robust security toolkit for software systems. Now, companies developing large language models (LLMs) and other GenAI products are increasingly applying the technique to model outputs, as a means of uncovering harmful content that generative models may produce. Thus, developers can identify and mitigate issues before they occur in production.
In this accessible, hands-on workshop, join Numa Dhamani and Maggie Engler, co-authors of Introduction to Generative AI, to learn a complete workflow and arsenal of strategies for red-teaming LLMs.
Related Media
Other Resources / Information
Takeaways
- LLMs are trained on human language and can inherit human vulnerabilities, so social engineering principles apply even though models aren't "social."
- Red-teaming doesn't need to be sophisticated to be effective; often, the best red-teaming inputs are simply those that are unusual or unexpected.
- Finding a successful prompt is great — but being able to understand the reproducibility and generalizability of the vulnerability is even better.
Speakers
- Numa Dhamani, Vice President of Science, Valkyrie Andromeda
- Maggie Engler, Member of Technical Staff, Microsoft AI
Organizer
Numa Dhamani, Vice President of Science, Valkyrie Andromeda
SXSW reserves the right to restrict access to or availability of comments related to PanelPicker proposals that it considers objectionable.
Add Comments