AI Safety

From Anthropic's Claude 4 Announcement:

Questions

  • With widespread access to these tools, how much do they realistically increase the capabilities of individual threat actors? Do the models themselves have any built-in safeguards that flag potentially malicious requests or generated code?