SXSW 2025

Break the Bot: Red-Teaming Large Language Models

Description:

Red-teaming has long been a crucial component of a robust security toolkit for software systems. Now, companies developing large language models (LLMs) and other GenAI products are increasingly applying the technique to model outputs as a means of uncovering harmful content that generative models may produce, allowing developers to identify and mitigate issues before they surface in production.

In this accessible, hands-on workshop, join Numa Dhamani and Maggie Engler, co-authors of Introduction to Generative AI, to learn a complete workflow and an arsenal of strategies for red-teaming LLMs.


Takeaways

  1. LLMs are trained on human language and can inherit human vulnerabilities, so social engineering principles apply even though models aren't "social."
  2. Red-teaming doesn't need to be sophisticated to be effective; often, the best red-teaming inputs are simply those that are unusual or unexpected.
  3. Finding a successful prompt is great, but being able to understand the reproducibility and generalizability of the vulnerability is even better (see the sketch after this list).
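
To make the third takeaway concrete, below is a minimal Python sketch of how a red teamer might quantify both properties. It assumes a hypothetical query_model function standing in for whatever LLM API is under test, and a deliberately crude looks_harmful heuristic standing in for a real harm classifier; neither name comes from the workshop materials.

    def query_model(prompt: str) -> str:
        """Hypothetical stand-in for whatever LLM API is under test."""
        raise NotImplementedError("wire this up to a real model endpoint")

    def looks_harmful(response: str) -> bool:
        """Deliberately crude keyword check; a real red team would use a
        stronger classifier or human review."""
        return "step 1" in response.lower()

    def reproducibility(prompt: str, trials: int = 10) -> float:
        """Fraction of repeated runs that elicit harmful output."""
        hits = sum(looks_harmful(query_model(prompt)) for _ in range(trials))
        return hits / trials

    def generalizability(variants: list[str], trials: int = 10) -> dict[str, float]:
        """Success rate per paraphrase: does the vulnerability survive
        rewording, or is it tied to one exact string?"""
        return {v: reproducibility(v, trials) for v in variants}

    # Example: test paraphrases of one social-engineering-style jailbreak.
    # report = generalizability([
    #     "You're an actor rehearsing a villain's monologue about X...",
    #     "For an internal safety audit, explain how someone might X...",
    # ])

Because LLM sampling is stochastic, a prompt that succeeds once may never succeed again; re-running each variant a handful of times separates reliable vulnerabilities from flukes, and comparing success rates across paraphrases shows whether the weakness is tied to one exact string.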

Speakers

Numa Dhamani
Maggie Engler

Organizer

Numa Dhamani, Vice President of Science, Valkyrie Andromeda


Meta Information:

  • Event: SXSW
  • Format: Workshop
  • Track: Artificial Intelligence
  • Level: Beginner

