OpenAI Thinks Superhuman AI Is Coming and Wants to Control It

Despite recent internal shake-ups, OpenAI remains steadfast in its commitment to addressing the profound challenges posed by superintelligent AI. The Superalignment team, led by OpenAI co-founder Ilya Sutskever, is at the forefront of efforts to develop tools for steering, regulating, and governing AI systems with intelligence surpassing that of humans.

Navigating the Challenge:

The primary focus of the Superalignment team is to tackle the intricate task of aligning AI models that exceed human intelligence. Collin Burns, Pavel Izmailov, and Leopold Aschenbrenner, members of the team, recently presented OpenAI’s innovative work at NeurIPS, emphasizing the urgency of addressing alignment concerns as AI progresses rapidly.

Defining the Challenge:

Aligning models that surpass human intelligence is a daunting challenge. Current alignment methods work well for models at or below human-level intelligence, but the Superalignment team wants to ensure that superintelligent AI systems can be aligned too. Some consider this task premature, and others see it as a potential distraction from more immediate AI regulatory issues.

Superalignment Framework:

The team is actively working on building governance and control frameworks applicable to future powerful AI systems. Given the ongoing debate about the definition of “superintelligence,” the team has adopted an approach that involves employing a less-sophisticated AI model, akin to GPT-2, to guide a more advanced model, such as GPT-4, toward desirable outcomes and away from undesirable ones.

AI Guiding AI:

Using the analogy of a sixth-grade student supervising a college student, the Superalignment team explains how the weak model (e.g., GPT-2) provides guidance to the strong model (e.g., GPT-4) on specific tasks. The weak model’s labels may contain errors and biases, but the team reports that the strong model can still generalize correctly, following the intent behind the guidance instead of copying the weak model’s mistakes.
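The idea of a weak supervisor guiding a stronger learner can be illustrated with a toy sketch. This is not OpenAI's code or experimental setup: the data, the `weak_supervisor` function, the 20% error rate, and the logistic-regression "student" are all hypothetical stand-ins. The sketch also assumes that the weak supervisor's mistakes are random, which is far friendlier than the systematic errors a real weak model would make.

```python
import math
import random


def true_label(x):
    # Hypothetical ground truth: a simple linear rule on two features.
    return 1 if x[0] + x[1] > 0 else 0


def weak_supervisor(x, rng, error_rate=0.2):
    # Stand-in for a weak model: usually right, wrong at random (an assumption).
    y = true_label(x)
    return 1 - y if rng.random() < error_rate else y


def train_student(xs, ys, epochs=200, lr=0.5):
    # Stand-in for a strong model: logistic regression fitted by
    # full-batch gradient descent on the weak supervisor's labels only.
    w, b, n = [0.0, 0.0], 0.0, len(xs)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            err = p - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b


def predict(model, x):
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def accuracy(predictions, targets):
    hits = sum(1 for p, t in zip(predictions, targets) if p == t)
    return hits / len(targets)


rng = random.Random(0)  # fixed seed so the run is repeatable


def sample(n):
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


# The student never sees the true labels during training.
train_x = sample(2000)
weak_labels = [weak_supervisor(x, rng) for x in train_x]
student = train_student(train_x, weak_labels)

# Both supervisor and student are scored against the true labels.
test_x = sample(1000)
test_y = [true_label(x) for x in test_x]
weak_acc = accuracy([weak_supervisor(x, rng) for x in test_x], test_y)
student_acc = accuracy([predict(student, x) for x in test_x], test_y)

print(f"weak supervisor accuracy: {weak_acc:.3f}")
print(f"student accuracy:         {student_acc:.3f}")
```

In this toy setting the student ends up more accurate than the supervisor that labeled its training data, because random label errors tend to cancel out when a single decision boundary is fitted to many examples. Whether anything similar holds for large language models, where the weak model's errors are not random, is the open question the team is studying.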

Addressing Hallucinations:

The team believes that the weak-to-strong approach could help reduce hallucinations in AI systems. A strong model often holds the knowledge needed to tell fact from fiction, and the research aims to draw out that knowledge so that AI-generated content becomes more reliable.

OpenAI’s Call for Collaboration:

In a bid to foster collaboration and gather diverse perspectives, OpenAI is launching a $10 million grant program for technical research on superintelligent alignment. This initiative is expected to include academic labs, nonprofits, individual researchers, and graduate students. OpenAI also plans to host an academic conference on superalignment in early 2025 to showcase and promote the work of grant recipients.

Eric Schmidt’s Involvement:

Former Google CEO and chairman Eric Schmidt is contributing to the grant program, raising questions about potential commercial interests. Schmidt, a proponent of proactive AI research, denies any ulterior motives, asserting his commitment to ensuring AI aligns with human values.

Commitment to Open Access:

Despite potential concerns, the Superalignment team affirms its commitment to transparency and open access. Both OpenAI’s research, including code, and the work of grant recipients will be shared publicly, aligning with the company’s mission of contributing to the safety of AI models across the research community.


OpenAI’s Superalignment initiative, though met with skepticism by some in the AI research community, signifies a proactive stance in addressing the challenges associated with superintelligent AI. The collaboration with external researchers and the commitment to open access underscore the company’s dedication to ensuring the responsible development of advanced AI systems for the benefit of humanity.

