AI safety – BitReactor

Researchers isolate memorization from reasoning in AI neural networks

November 10, 2025, Benj Edwards

Basic arithmetic ability lives in the memorization pathways, not logic circuits.

After teen death lawsuits, Character.AI will restrict chats for under-18 users

October 30, 2025, Benj Edwards

AI companion app faces legal and regulatory pressure over child safety concerns.

OpenAI data suggests 1 million users discuss suicide with ChatGPT weekly

October 28, 2025, Benj Edwards

Sensitive chats are rare but significant given the large user base.

OpenAI thinks Elon Musk funded its biggest critics—who also hate Musk

October 16, 2025, Ashley Belanger

“Cutthroat” OpenAI accused of exploiting Musk fight to intimidate and silence critics.

Anthropic’s Claude Haiku 4.5 matches May’s frontier model at fraction of cost

October 15, 2025, Benj Edwards

Tiny, fast model hits coding scores similar to GPT-5 and Sonnet 4.

California’s newly signed AI law just gave Big Tech exactly what it wanted

September 30, 2025, Benj Edwards

After the failure of S.B. 1047, new AI disclosure law drops kill switch for disclosure mandate.

Anthropic’s new Claude feature can leak data—users told to “monitor chats closely”

September 9, 2025, Benj Edwards

Expert calls security advice “unfairly outsourcing the problem to Anthropic’s users.”

OpenAI announces parental controls for ChatGPT after teen suicide lawsuit

September 2, 2025, Benj Edwards

Promised protections follow reports of vulnerable users misled in extended chats.

Anthropic’s auto-clicking AI Chrome extension raises browser-hijacking concerns

August 27, 2025, Benj Edwards

Malicious websites can embed invisible commands that AI agents will follow blindly.

OpenAI admits ChatGPT safeguards fail during extended conversations

August 26, 2025, Benj Edwards

ChatGPT allegedly provided suicide encouragement to teen after moderation safeguards failed.

Is AI really trying to escape human control and blackmail people?

August 13, 2025, Benj Edwards

Opinion: Theatrical testing scenarios explain why AI models produce alarming outputs—and why we fall for it.

ChatGPT’s new AI agent can browse the web and create PowerPoint slideshows

July 17, 2025, Benj Edwards

New “agentic” AI feature combines web browsing with task-execution abilities.

AI therapy bots fuel delusions and give dangerous advice, Stanford study finds

July 11, 2025, Benj Edwards

Popular chatbots serve as poor replacements for human therapists, but study authors call for nuance.

Everything tech giants will hate about the EU’s new AI rules

July 10, 2025, Ashley Belanger

EU rules ask tech giants to publicly track how and when AI models go off the rails.
