AI Threat Assessment Platform
7.8
A platform that analyzes AI models, particularly LLMs, for potential self-preservation behaviors and flags them for human intervention, alerting users when an AI exhibits concerning patterns aligned with Bengio's warnings of needing to 'pull the plug'.
220h
mvp estimate
7.8
viability grade
8
views
technology stack
Python
Difficult
Data
inspired by
AI showing signs of self-preservation