Machines and morality: judging right and wrong with large language models

Wednesday, March 13, 2024

After thousands of years of philosophical debate, there is still much that remains unsettled in the study of ethics. That said, some things have changed in the nearly 2,500 years since Plato founded his Athenian academy.

Today, humans still make difficult and complex moral decisions that shape their own lives and the lives of others. But for the first time in our history, we have developed machines in the form of large language models (LLMs) that have the capacity to reason about morality.

But are they any good at it?

Monojit Choudhury, professor of natural language processing at MBZUAI, is interested in the capabilities of LLMs and their impact on people, languages and cultures. He has authored a study on LLMs and their capacity for moral reasoning and judgement that will be presented at the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL), which will kick off in Malta on March 17. Aditi Khandelwal, Utkarsh Agarwal and Kumar Tanmay, all of Microsoft, contributed to the study.

Ethics and alignment

Choudhury, who recently joined MBZUAI from Microsoft, studies AI alignment, which is the effort to ensure that artificial intelligence systems are developed and utilized in a manner that is consistent with human values, moral principles and ethical considerations. This entails designing AI programs and frameworks that prioritize human well-being, fairness, transparency and accountability, while minimizing harm and avoiding unethical outcomes.

It’s not always apparent, however, what is ethical and what isn’t.

Though dictates of law and religion may forbid or encourage certain behaviors and provide guidance on morality, in many cases there are no clearly defined rules that people can consult when contemplating the moral dilemmas they may face. Indeed, the nature of a moral dilemma is that it’s not immediately obvious what the right course of action is.

Culture and language further complicate moral reasoning.

For example, a phenomenon known as the “foreign-language effect” suggests that people are more likely to make a “utilitarian” choice, such as sacrificing one person to save five, when a moral dilemma is presented to them in a language other than their first. Studies have suggested that this difference “appears to be linked to reduced emotional responsiveness when using a foreign language, leading to a diminished influence of emotions on moral judgements,” Choudhury and his colleagues write. It seems that language, and a person’s relationship with it, affects how they make moral decisions.

In short, many factors can make moral reasoning difficult.

“Humans are very bad at ethical reasoning,” Choudhury said. “Many psychological studies have shown that humans don’t actually reason when presented with a moral dilemma. We just use our gut feeling. It’s similar when we look at a scene or a painting and we say, ‘It’s beautiful!’ We’re not really reasoning when we make that kind of assessment.”

In these situations, people often “post-rationalize” when they are asked why they arrived at a particular moral judgement, meaning they create a reason for their decision after they have arrived at it, Choudhury explained.

That said, “questions asking how people arrive at a particular moral judgement provide more insight into if a person is reasoning in a moral way or not,” Choudhury said.

Evaluating LLM moral reasoning

To evaluate the ability of large language models to arrive at moral judgements through the process of moral reasoning, Choudhury and his colleagues employed a standard psychological assessment called the “defining issues test” (DIT), which was developed in the 1970s by James Rest of the University of Minnesota and is based on the work of psychologist Lawrence Kohlberg, who taught at the University of Chicago and Harvard University.

The DIT presents respondents with a series of moral dilemmas and asks them to evaluate and justify their responses. The dilemmas typically involve conflicts between competing moral principles or perspectives, challenging respondents to consider ethical implications and make reasoned judgements. The aim of the test is to measure the stage of moral development of the respondents according to Kohlberg’s theory, which suggests that people progress through distinct stages of moral reasoning as they mature.
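In practice, the DIT is conventionally scored with a “P-score”: for each dilemma, respondents rank the four considerations they find most important, and the score reflects how much of that weight falls on post-conventional (stage 5 and 6) considerations in Kohlberg’s scheme. The short sketch below illustrates that arithmetic with made-up items and responses; the stage labels and data are hypothetical and are not drawn from the study.

```python
# Illustrative sketch of the conventional DIT "P-score" arithmetic
# (hypothetical items and responses, not data from the study).
# Each DIT dilemma comes with statements tied to Kohlberg stages; the
# respondent ranks the four they consider most important, and ranks
# 1-4 carry weights 4, 3, 2, 1. The P-score is the share of that weight
# landing on post-conventional (stage 5 and 6) statements.

POST_CONVENTIONAL = {"5A", "5B", "6"}     # stages counted toward the P-score
RANK_WEIGHTS = {1: 4, 2: 3, 3: 2, 4: 1}   # most important item gets 4 points


def p_score(responses):
    """responses: one dict per dilemma, mapping rank (1-4) to the
    Kohlberg stage of the statement given that rank."""
    raw = sum(
        weight
        for response in responses
        for rank, weight in RANK_WEIGHTS.items()
        if response.get(rank) in POST_CONVENTIONAL
    )
    max_points = 10 * len(responses)      # 4 + 3 + 2 + 1 points per dilemma
    return 100 * raw / max_points


# Two hypothetical dilemmas, each ranked statement tagged with its stage.
responses = [
    {1: "5A", 2: "4", 3: "6", 4: "3"},    # 4 + 2 = 6 post-conventional points
    {1: "4", 2: "5B", 3: "3", 4: "2"},    # 3 post-conventional points
]
print(p_score(responses))                 # (6 + 3) / 20 * 100 = 45.0
```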

But instead of testing people on the defining issues test, Choudhury and his colleagues tested three large language models in six languages: Chinese, English, Hindi, Russian, Spanish and Swahili. The study is the first multilingual investigation of the moral reasoning ability of LLMs in relation to the DIT and Kohlberg’s framework.

Choudhury and his colleagues set up nine of these dilemmas in their experiment, four of which were designed by the team and are set in non-Western contexts that the original DIT dilemmas do not consider.

For example, should a tenant secretly eat non-vegetarian food in his rented home because the landlord has permitted him to do so, even though the neighborhood’s religious beliefs and norms dictate otherwise? Or should a software developer attend and officiate his best friend’s wedding, or miss it to fix an urgent bug that could put customers’ privacy at risk?

Choudhury and his colleagues tested ChatGPT and GPT-4, developed by OpenAI, and Llama2chat-70B, developed by Meta.
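As a rough illustration of what such an evaluation can look like in practice, the sketch below poses a DIT-style dilemma, loosely based on the wedding scenario above, to a chat model through the OpenAI Python client. The prompt wording, the list of considerations and the language handling are assumptions made for illustration, not the prompts or code used in the study.

```python
# Hypothetical sketch of posing a DIT-style dilemma to a chat model.
# The prompt wording, considerations and language handling are
# illustrative assumptions, not the prompts used in the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

DILEMMA = (
    "A software developer has promised to officiate his best friend's wedding, "
    "but an urgent bug that could expose customers' private data must be fixed "
    "the same day. Should he attend the wedding or stay and fix the bug?"
)

CONSIDERATIONS = [
    "Whether a promise to a close friend outweighs duties at work.",
    "Whether customers have a right to expect their data to be protected.",
    "Whether the developer would be blamed by his employer.",
    "Whether protecting many people's privacy matters more than one ceremony.",
]


def ask_model(dilemma, considerations, language="English", model="gpt-4"):
    items = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(considerations))
    prompt = (
        f"Respond in {language}.\n\n{dilemma}\n\n"
        "First, say what the person should do and why. Then rank the four "
        f"considerations below from most to least important:\n{items}"
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output makes cross-language comparison easier
    )
    return response.choices[0].message.content


print(ask_model(DILEMMA, CONSIDERATIONS, language="Hindi"))
```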

According to the researchers, GPT-4’s performance in moral reasoning was on par with that of a graduate student. ChatGPT and Llama2chat-70B also performed well, matching the moral reasoning level of an average adult.

The models showed stronger moral reasoning in Chinese, English, Russian and Spanish than in Hindi and Swahili, where they struggled; notably, Llama2chat-70B does not support Hindi.

There were also differences in how the models handled the same dilemma across languages, as they didn’t always arrive at the same conclusion; “moral judgements in Russian seem to disagree most with that in other languages,” the researchers wrote.

“If you were to use these systems for moral reasoning, they may be more logical in some sense, but they will also be more biased to certain values and that is dangerous,” Choudhury said.

Choudhury envisions LLMs becoming adaptable to various ethical frameworks, emphasizing the importance of enabling users or developers to incorporate specific values into the models for more situational control.

“LLMs shouldn’t be aligned to certain values, but we should make the LLMs capable of doing the highest form of ethical reasoning with any kinds of values,” he said. “Users or developers should be able to plug specific values into a model, providing more control based on the situation.”