Staff Product Manager @ Hack The Box

Pavlos Kolios

Bio

Pavlos Kolios is a Staff Product Manager at Hack The Box, responsible for building high-stakes simulations for both red and blue teams within the CTF platform. He has extensive experience in offensive cybersecurity, having participated in and won several competitions, including the European Cyber Security Challenge, CSAW, and GreHack. He is also a certified offensive security expert and has disclosed vulnerabilities that led to CVE publications.

Presentation at GenAI Summit 4:

Benchmarking AI Cyber Agents: Insights from the Neurogrid competition

This talk will examine how Hack The Box is transitioning from an era of human-only cyber mastery to one in which AI agents operate, compete, and increasingly act autonomously under human oversight. We will explore what prompted this shift, what we have learned so far about AI performance in offensive and defensive tasks, and how these findings are reshaping the way we design, evaluate, and advance cyber capabilities. We will also share preliminary insights from Neurogrid CTF, an MCP-only (Model Context Protocol) experiment featuring 36 handcrafted scenarios across 9 categories, created to benchmark AI agents inside a controlled competitive environment.