How AI could destroy the world by accident

2023/6/15

AI could be our biggest existential threat this century. If you enjoyed this video, here are some places to find out more about these ideas:

_Human Compatible_ US: https://amzn.to/3Pdi0qS UK: https://amzn.to/463vawM by Stuart ‘You can’t fetch the coffee if you’re dead’ Russell
_The Alignment Problem: Machine Learning and Human Values_ US: https://amzn.to/3N7cLpV UK: https://amzn.to/45YMZgq by Brian Christian
@eightythousandhours’ problem profile on ‘Preventing an AI-related catastrophe’: https://80000hours.org/problem-profiles/artificial-intelligence/
@RobertMilesAI ’s channel: https://www.youtube.com/robertmilesai

Read the worst cat/sat/mat-based short story ever written here: https://andrewsteele.co.uk/blog/2023/06/chatgpt-ai-extinction-risk-video/

Amazon links are affiliates and I will receive a small payment if you choose to purchase through them. Thanks!

*Chapters*

00:00 Introduction
02:04 How does ChatGPT work?
06:48 Problem 0: AI misuse
08:01 Problem 1: AI is an alien mind
11:18 Problem 2: Defining goals is hard
17:05 Problem 3: ‘Instrumental convergence’
19:17 Problem 4: Exponential progress
22:32 What can we do?

*Sources and further reading*

_Introduction_
ChatGPT’s user growth https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/
Try hilariously bad 2020 text-to-image generator X-LXMERT here: https://vision-explorer.allenai.org/text_to_image_generation
Run Stable Diffusion locally using its web UI: https://github.com/AUTOMATIC1111/stable-diffusion-webui
‘Sony World Photography Award 2023: Winner refuses award after revealing AI creation’ – BBC News https://www.bbc.com/news/entertainment-arts-65296763

_How does ChatGPT work?_
An _absolutely humungous_ list of papers about LLMs https://github.com/Hannibal046/Awesome-LLM
GPT and other LLMs don’t usually work at the word level; they normally work on ‘tokens’, many of which are whole words, but not all. You can get a sense of the difference by trying out OpenAI’s Tokenizer here: https://platform.openai.com/tokenizer
Emergent abilities of large language models https://openreview.net/pdf?id=yzkSU5zdwD
ChatGPT playing chess https://www.lesswrong.com/posts/xyjhFCSSXZsW6HDBb/a-chess-game-against-gpt-4
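
To make the point about tokens above a bit more concrete, here is a minimal Python sketch of splitting text into sub-word pieces. It is a toy: the vocabulary and the function name toy_tokenize are made up for illustration, and it uses simple greedy longest-match rather than the byte-pair encoding with a learned vocabulary that GPT models actually use.

```python
# Toy illustration only: this vocabulary is invented for the example
# and is NOT the vocabulary used by GPT or any real model.
TOY_VOCAB = {"the", "cat", "sat", "on", "mat", "un", "believ", "able", " "}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text into vocabulary pieces using greedy longest-match.

    Characters not covered by the vocabulary become one-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until an entry matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary starts here: emit a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenize("the cat sat"))   # whole words plus separate space tokens
print(toy_tokenize("unbelievable"))  # one word split into several pieces
```

With this toy vocabulary, a common word like ‘cat’ comes out as a single token, while ‘unbelievable’ is broken into three pieces. That is the rough idea behind why a token is sometimes a word and sometimes only part of one.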

_Problem 1: AI is an alien mind_
Paper on using psychedelic specs to fool facial recognition AI https://users.ece.cmu.edu/~lbauer/papers/2016/ccs2016-face-recognition.pdf
‘Psychedelic toasters fool image recognition tech’ – BBC News https://www.bbc.com/news/technology-42554735
Thread on how little we know about how ChatGPT works—including an absolutely baffling algorithm it uses internally to add numbers together! https://twitter.com/robertskmiles/status/1663534255249453056

_Problem 2: Defining goals is hard_
More about OpenAI’s CoastRunners-smashing reinforcement learning algorithm https://openai.com/research/faulty-reward-functions
Astrophysicist Grant Tremblay correcting Bard on Twitter https://twitter.com/astrogrant/status/1623091683603918849

_Problem 3: Instrumental convergence_
Great video with Rob Miles about how hard it is to build an off switch for an AI https://www.youtube.com/watch?v=3TYT1QfdfsM

_Problem 4: Exponential progress_
Article on how ChatGPT can help with code (and its limitations) https://www.nature.com/articles/d41586-023-01833-0
GPT-4 cost over $100m to train https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/

_What can we do?_
AI governance is a huge field, and a good overview of resources can be found at https://80000hours.org/problem-profiles/artificial-intelligence/#ai-governance-and-strategy (link should take you straight to the ‘AI governance and strategy’ heading)

*Credits*

Milla Jovovich image CC BY-SA Georges Biard https://upload.wikimedia.org/wikipedia/commons/c/c1/Milla_Jovovich_Cannes_2011.jpg

*And finally…*

Follow me on Twitter https://twitter.com/statto
Follow me on Instagram https://www.instagram.com/andrewjsteele
Like my page on Facebook https://www.facebook.com/DrAndrewSteele
Follow me on Mastodon https://mas.to/@statto
Read my book, _Ageless: The new science of getting older without getting old_ https://ageless.link/