The first question developers of an AI system may wish to answer, before beginning any technical work on their product, is what values they want the algorithm or product to espouse. This is where the Alignment Problem comes in: how can, and how should, developers encode appropriate ethical and human values into AI systems? Through training on curated data? Through ensuring that the system always follows the specific instructions or intentions of its user? How can an AI system become moral, and remain moral? Through both academic writings and narratives, you will explore questions of ethical alignment for AI systems.
- Understand current discussions on how to align AI algorithms' goals with human values.
- Explore narratives related to problematic goal setting.
- Discuss how AI goals should be aligned with the Common Good.
In this module, we will begin by reading an academic article by Iason Gabriel to orient ourselves on what exactly we mean when we reference the Alignment Problem. Then, we will explore a pair of fictional film narratives that depict developers' attempts to encode positive values into AI systems, with less-than-ideal results.
Read the following article
- 4 min
- VentureBeat
- 2020
Researchers Find that Even Fair Hiring Algorithms Can Be Biased
A study of the recommendation engine behind TaskRabbit, an app that uses an algorithm to recommend the best workers for a specific task, demonstrates that even algorithms that attempt to account for fairness and parity in representation can fail to deliver what they promise, depending on the context in which they are deployed.
Can machine learning ever be implemented in a way that fully eliminates human bias? Is bias encoded into every trained machine learning model? What would the ideal circumstance look like when using digital technologies and machine learning to reach equitable representation in hiring?
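To make this failure mode concrete, here is a minimal, hypothetical sketch in Python. It is not TaskRabbit's engine or the study's method; the re-ranking rule, group labels, and pool sizes are all invented for illustration. A re-ranker enforces demographic parity in the top ten recommendations; the very same rule delivers parity when the candidate pool is balanced, but silently falls short when one group is scarce in a given task category.

```python
import random

random.seed(0)  # scores are random, but the group counts drive the result

def parity_rerank(candidates, k):
    """Alternate picks between groups A and B (highest score first),
    keeping the top-k as close to 50/50 as the pool allows."""
    pools = {g: sorted((c for c in candidates if c["group"] == g),
                       key=lambda c: -c["score"])
             for g in ("A", "B")}
    picked, turn = [], "A"
    while len(picked) < k and (pools["A"] or pools["B"]):
        # fall back to the other group if the preferred one is exhausted
        group = turn if pools[turn] else ("B" if turn == "A" else "A")
        picked.append(pools[group].pop(0))
        turn = "B" if turn == "A" else "A"
    return picked

def group_share(recommended, group):
    return sum(c["group"] == group for c in recommended) / len(recommended)

def make_pool(n_a, n_b):
    return ([{"group": "A", "score": random.random()} for _ in range(n_a)] +
            [{"group": "B", "score": random.random()} for _ in range(n_b)])

# Context 1: balanced pool -- the re-ranker achieves parity (B share = 0.50).
# Context 2: group B is scarce in this task category -- the unchanged "fair"
# rule can only surface 3 B workers, so the top-10 skews 70/30 (B share = 0.30).
for name, pool in (("balanced pool", make_pool(50, 50)),
                   ("skewed pool", make_pool(97, 3))):
    top10 = parity_rerank(pool, k=10)
    print(f"{name}: group B share of top-10 = {group_share(top10, 'B'):.2f}")
```

The toy result illustrates the article's point: fairness is a property of an algorithm plus its context, so a parity rule that succeeds for one candidate pool can quietly underdeliver for another without any change to the code.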
Watch the following narratives with the Alignment Problem in mind. Brainstorm some answers to the discussion questions packaged with each narrative.
- 14 min
- Kinolab
- 2014
Liberty, Autonomy, and Desires of Humanoid Robots
Caleb, a programmer at a large tech company, is invited by his boss Nathan to test a robot named Ava. During one session of the Turing Test, Ava fearfully interrogates Caleb about what her fate will be if the test deems her not capable or human enough. Caleb struggles to deliver an honest answer, especially given that Ava displays attachment toward him, a sentiment he returns. After Caleb discovers that Nathan intends, in effect, to kill Ava, he loops her into his escape plan, offering her freedom and a chance to live a human life. Once Nathan is killed, Ava goes to his robotics repository and bestows a new physical, humanlike appearance upon herself. She then permanently traps Caleb, the only remaining person who knows she is an android, in Nathan’s compound before escaping to live a human life in the real world.
What rights to freedom do AI beings have? Do sentient AI beings deserve to be at the mercy of their creators? What are the consequences of machines being able to detect and expose lies? Is emotional attachment to an AI a valid form of love? What threat could a well-disguised, hyper-intelligent AI pose to humanity? If no one knows or can tell the difference, does that matter?
- 8 min
- Kinolab
- 1982
Digital Hegemony in the Real and Virtual Worlds
The Master Control Program (MCP), an artificial intelligence, has self-developed beyond the imagination of its creators and sets its sights on hacking global governments, including the Pentagon. Believing that, with its growing intelligence, it can rule better than any human, it forces the hand of Dillinger, a human executive, to help move its hacking beyond corporations. Meanwhile, a team of hackers attempts to break into the system's mainframe. When the rebel hacker Flynn tries to hack into the mainframe of the MCP, he is drawn into the digital world of the computer, which is under the MCP's dominion. Sark, one of the digital beings who serve the MCP, is tasked with killing Flynn.
Is human anxiety over the potential for super-powered AI justified? Would things truly be better if machines and artificial intelligence made authoritative decisions as global actors and rulers?
What could be the implications of ‘teleporting’ into digital space in terms of alienation from the real world? For now, it seems that humans are in charge of computers in the “real” world; if humans were to enter a digital world, who would be in charge? Do AI beings owe subservience to humans for their creation, given their increasing intelligence?
As a large group, we will watch these clips from 2001: A Space Odyssey. Then you will break out into groups to discuss HAL in terms of the Alignment Problem.
- 7 min
- Kinolab
- 1968
HAL Part I: AI Camaraderie and Conversation
Dr. Dave Bowman and Dr. Frank Poole are two astronauts on the Discovery mission to Jupiter. They are joined by HAL, an artificial intelligence and the most recent iteration of the HAL 9000 line of computers. HAL is seen as just another member of the crew, given his ability to carry on conversations with the other astronauts and his responsibility for keeping the crew safe.
Should humans rely entirely on AI to keep them safe in dangerous situations or environments? Do you agree with Dave’s assessment that one can “never tell” whether an AI has real feelings? What counts as “real feelings”? Even if HAL’s human tendencies follow from a line of programming, does this make them any less real?
- 12 min
- Kinolab
- 1968
HAL Part II: Vengeful AI, Digital Murder, and System Failures
See HAL Part I for further context. In this narrative, astronauts Dave and Frank begin to suspect that HAL, the AI that runs their ship, is malfunctioning and must be shut down. Although they try to hide this conversation from HAL, he becomes aware of their plan anyway and attempts to protect himself so that the Discovery mission is not jeopardized. He does so by causing chaos on the ship, leveraging his connection to its internet of things to place the crew in danger. Eventually, Dave proceeds with his plan to shut HAL down, despite HAL’s protestations and desire to stay alive.
Can AI have lives of their own which humans should respect? Is it considered “murder” if a human deactivates an AI against their will, even if this “will” to live is programmed by another human? What are the ethical implications of removing the “high brain function” of an AI and leaving just the rote task programming? Is this a form of murder too? How can secrets be kept private from an AI, especially if people fail to understand all the capabilities of the machine?