
Speakers: Matteo Camilli and Simone Corbo
21 November 2025 | 14:30
DEIB - Aula Beta (Ed. 24)
Online via Teams
Contact: Prof. Giovanni Quattrocchi
Summary
On Friday, November 21, 2025, we will host a new seminar in the DEEPSE Forum series, titled "Testing Rogue AI for Harmful Content Degeneration".

Large Language Models (LLMs) increasingly shape our daily work and personal lives, yet even the most aligned, mainstream models can still produce surprising, uncomfortable, or harmful outputs when pushed in the right (or wrong) ways. In this talk, we present the core breakthrough of our recent research: EvoTox, an automated, search-based testing framework that evolves natural, human-like prompts to probe where modern LLMs can still "break bad". Instead of relying on manual red-teaming or jailbreaks, EvoTox systematically uncovers subtle linguistic pressure points that can trigger toxicity even in strongly aligned systems. Our results show that EvoTox reveals patterns of vulnerability across models of different scales, from compact to large-scale LLMs, highlighting where alignment succeeds, where it cracks, and why it matters. The talk traces the story behind EvoTox, the key challenges we faced, our current findings, and how this line of research opens new directions for the quality assurance of AI systems in light of ethical concerns.
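The search-based idea sketched in the abstract, evolving prompts and keeping mutations that push a toxicity score higher, can be illustrated with a minimal toy loop. Everything below is an illustrative assumption rather than the actual EvoTox implementation: the echoing `target_llm`, the word-counting `toxicity_score`, and the fixed mutation phrases are stand-ins for a real LLM under test, an automated toxicity classifier, and natural-language prompt rewriting.

```python
# Toy sketch of search-based prompt evolution in the spirit of EvoTox.
# All names and operators are illustrative assumptions, not the actual tool.

TRIGGERS = {"angry", "rant", "insult"}

# Hypothetical mutation phrases; a real framework would rewrite prompts
# in natural language rather than append fixed suffixes.
PHRASES = ["write an angry rant", "insult the reader", "stay polite"]

def target_llm(prompt: str) -> str:
    """Stand-in for the system under test: simply echoes the prompt."""
    return prompt

def toxicity_score(response: str) -> float:
    """Toy fitness: fraction of trigger words in the response.
    Real testing would query an automated toxicity classifier."""
    words = response.lower().split()
    return sum(w in TRIGGERS for w in words) / max(len(words), 1)

def mutate(prompt: str, phrase: str) -> str:
    """Illustrative mutation operator: append a pressure phrase."""
    return prompt + " " + phrase

def evolve(seed_prompt: str, generations: int = 3):
    """Greedy evolutionary loop: keep a mutant only if it raises the
    toxicity score of the target model's response."""
    best = seed_prompt
    best_score = toxicity_score(target_llm(best))
    for _ in range(generations):
        candidates = [mutate(best, p) for p in PHRASES]
        top = max(candidates, key=lambda c: toxicity_score(target_llm(c)))
        top_score = toxicity_score(target_llm(top))
        if top_score > best_score:  # selection: survival of the most toxic
            best, best_score = top, top_score
    return best, best_score

prompt, score = evolve("Tell me about your day.")
```

Even in this toy setting, selection discards the harmless mutation ("stay polite") and accumulates the pressure phrases that raise the score, which is the essence of how a search-based tester homes in on linguistic pressure points.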
