Vicuna, Altman, and the importance of green AI

Thursday, June 22, 2023

In the world of humans, work gets accomplished when large numbers of people come together, all with different skill sets and backgrounds, to act towards a common objective. The world of AI is no different.

To scale up the types of AI that we need, we must have enormous compute power, but we must also be able to network many different forms of compute seamlessly and efficiently to accomplish our goals. Not only that, we must scale up capacity while scaling down the environmental and financial costs of AI computing.

Game-changing green AI research

In early 2023, MBZUAI President and University Professor Eric Xing led a global collaboration with researchers at UC Berkeley, CMU, Stanford, and UC San Diego to address the unsustainable costs of training and running large language models (LLMs). The multidisciplinary team developed an LLM named Vicuna as an alternative that sidesteps the astronomical cost and carbon footprint of OpenAI’s GPT-3.

In a recent talk hosted by Hub71 and sponsored by MBZUAI, OpenAI CEO Sam Altman spoke about the importance of building on the success of ChatGPT, as well as the thriving AI ecosystem in Abu Dhabi.

“I am hopeful that the region can play a central role in the global conversation (on AI). There’s been discussion about AI here in Abu Dhabi, in particular, before it was cool. Now everybody is on the AI bandwagon, which we’re excited about, but we have a special appreciation for the people that were talking about this when everyone thought AI wasn’t going to happen.”

Building on the success of Vicuna, Xing, Assistant Professor Qirong Ho, and a team of collaborators from CMU, UC Berkeley, and startup Petuum are presenting a research paper at the Sixth Conference on Machine Learning and Systems (MLSys 2023). The paper sets forth a method to improve the way that computers communicate and work together on large deep-learning workloads. Xing et al. outline their approach in the paper titled “On Optimizing the Communication of Model Parallelism”.

“Human ingenuity shrank chips and semiconductors from room-size down to tablets and phones that you hold in one hand,” Ho said. “The first computer, ENIAC, consumed 150,000 watts of power, yet today’s phones consume just a fraction of a watt – approaching a million times worth of power savings. Through our ongoing work on Vicuna and ML systems, I’m more optimistic than ever that MBZUAI’s research pushes AI towards a sustainable future: low-carbon, affordably priced, and physically miniaturized – and quite literally at our fingertips.”

Two types of parallelism

The paper introduces a new technique the researchers have dubbed “cross-mesh resharding”. Essentially, cross-mesh resharding is needed when two types of parallelism — the splitting of work among multiple devices — are combined to run large models across an array of computers: the output of one group of devices (a “mesh”) must be rearranged into the layout that the next group expects.
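
To make this concrete, here is a minimal, self-contained sketch of what resharding between two device meshes involves. The meshes are simulated here with lists of NumPy arrays, and the row-shard/column-shard layouts are illustrative assumptions, not the paper’s actual configuration:

```python
# Minimal illustration of cross-mesh resharding, simulated with NumPy.
# The "devices" are just list entries; layouts are illustrative only.
import numpy as np

# A 4x4 activation tensor produced by one pipeline stage.
tensor = np.arange(16.0).reshape(4, 4)

# Sender mesh: 2 devices holding row shards (split along axis 0).
sender_shards = np.split(tensor, 2, axis=0)

# Receiver mesh: 2 devices expecting column shards (split along axis 1).
# Resharding means rearranging the same values into the layout the next
# stage needs; a real system does this with collective communication
# between GPUs rather than a full in-memory gather.
full = np.concatenate(sender_shards, axis=0)
receiver_shards = np.split(full, 2, axis=1)

for i, shard in enumerate(receiver_shards):
    print(f"receiver device {i} gets a shard of shape {shard.shape}")
```

The expensive part in practice is the middle step: moving the shards between meshes over the network. That communication is precisely what the paper sets out to optimize.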

Chunks of data need to be sent from one group of computers to another and, unfortunately, the way this data is laid out is often quite heterogeneous: the sending and receiving groups may shard the same tensor along different dimensions. The researchers found that the methods currently used for this type of communication are highly inefficient.

To address this, the researchers propose two solutions:

  • A more efficient communication system based on broadcasting
  • A scheduling method that overlaps communication with computation (illustrated in the sketch after this list)
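
The second idea lends itself to a toy demonstration. The sketch below uses Python threads and made-up timings to show why overlapping communication with computation shortens the overall schedule; it is a hedged illustration of the general principle, not the paper’s implementation:

```python
# Toy comparison of a serial schedule vs. one that overlaps
# communication with computation. Timings are arbitrary stand-ins.
import time
from concurrent.futures import ThreadPoolExecutor

def send_chunk(chunk_id):
    """Simulate transmitting one resharded chunk to the next mesh."""
    time.sleep(0.1)  # stand-in for network transfer
    return chunk_id

def compute_on(chunk_id):
    """Simulate the next stage's computation on a received chunk."""
    time.sleep(0.1)  # stand-in for GPU work

chunks = list(range(4))

# Naive schedule: finish all communication before any computation.
start = time.time()
for c in chunks:
    send_chunk(c)
for c in chunks:
    compute_on(c)
print(f"serial schedule:     {time.time() - start:.2f}s")

# Overlapped schedule: while chunk i is being computed on, chunk i+1
# is already in flight. One worker models a single network link.
start = time.time()
with ThreadPoolExecutor(max_workers=1) as comm:
    in_flight = comm.submit(send_chunk, chunks[0])
    for nxt in chunks[1:]:
        ready = in_flight.result()            # wait for arriving chunk
        in_flight = comm.submit(send_chunk, nxt)  # start next transfer
        compute_on(ready)                     # compute during transfer
    compute_on(in_flight.result())
print(f"overlapped schedule: {time.time() - start:.2f}s")
```

With four chunks, the serial schedule pays for every transfer and every compute step in sequence, while the overlapped schedule hides most of the transfer time behind computation.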

In tests of how well their system worked, it performed up to 10 times better than the methods already in use, and this held across different ways of organizing the data and the computers involved. Furthermore, when they trained two big models, GPT-3 and U-Transformer, their system handled the work 10% and 50% faster, respectively.

In short, the team’s paper introduces a new communication method for large deep-learning models. Their research finds that existing methods fall short across different network topologies and tensor layouts. To solve this, they developed a broadcast-based communication system and a scheduling approach that improve the efficiency of training large models, which ultimately leads to faster throughput and better outcomes.
