I am Zahiruddin Tavargere (Zahere). A social-learner, here to learn, share and grow with the tech community.

Do you know the Supercomputer that powers ChatGPT?

Updated January 3, 2023

Zahiruddin Tavargere

I am a Journalist-turned-Software Engineer. I love coding and the associated grind of learning every day. A firm believer in social learning, I owe my dev career to all the tech content creators I have learned from. This is my contribution back to the community.

Part of series: Software Development and System Design

Please pay attention to the numbers that I am about to call out, because they are going to blow your mind.

The insanely popular ChatGPT is built on GPT-3.5, a descendant of GPT-3, which has a whopping 175B parameters. In fact, roughly 300 billion tokens (words and word fragments) were fed into the system during training.

In the context of machine learning and neural networks, "parameters" refer to the values that determine the behavior of a model. These values are often represented as weights or biases.
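To make "weights and biases" concrete, here is a minimal sketch that counts the parameters of a tiny fully connected network. The layer sizes and the function name `count_parameters` are made up for illustration; GPT-3 uses a transformer architecture, so this only shows the counting idea at a toy scale.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    """
    total = 0
    # Pair each layer with the next one: (inputs, outputs).
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per input-output connection
        biases = n_out          # one bias per output unit
        total += weights + biases
    return total


# Hypothetical toy network: 4 inputs, 8 hidden units, 2 outputs.
tiny_network = [4, 8, 2]
print(count_parameters(tiny_network))  # (4*8 + 8) + (8*2 + 2) = 58
```

A model with 175B parameters has the same kind of values, just about three billion times more of them than this toy example.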

Training a language model with 175B parameters requires a significant amount of energy and computing power - and this is where Microsoft’s Supercomputer comes in.

In 2020, Microsoft collaborated with OpenAI to create what they claim to be one of the most powerful supercomputers of all time.

It has 285,000 processor cores, 10,000 GPUs, and 400 gigabits per second of network connectivity for each GPU server.

To put things in perspective, the M1 Max chip in the MacBook delivers 10.40 teraflops, whereas Microsoft's supercomputer, on which GPT-3.5 was trained, delivers 100.7 petaflops of performance.

100.7 petaflops is equal to 100,700,000,000,000,000 floating point operations per second. This is equivalent to 100.7 x 1,000 teraflops, or 100,700 teraflops.

Dividing 100,700 teraflops by the 10.40 teraflops of a single M1 Max gives about 9,683, so this supercomputer is roughly equal to 9,700 MacBooks.
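As a sanity check on the comparison, the unit conversion and the ratio can be worked out in a few lines of Python. The two performance figures are the ones quoted in this article; the constant and function names are only illustrative, and peak teraflops is a rough yardstick rather than a real benchmark.

```python
# Figures quoted in the article.
M1_MAX_TFLOPS = 10.4          # teraflops for one M1 Max chip
SUPERCOMPUTER_PFLOPS = 100.7  # petaflops for the supercomputer

TFLOPS_PER_PFLOP = 1_000      # 1 petaflop = 1,000 teraflops
FLOPS_PER_TFLOP = 10**12      # 1 teraflop = 10^12 operations per second


def pflops_to_tflops(pflops):
    """Convert petaflops to teraflops."""
    return pflops * TFLOPS_PER_PFLOP


supercomputer_tflops = pflops_to_tflops(SUPERCOMPUTER_PFLOPS)
supercomputer_flops = supercomputer_tflops * FLOPS_PER_TFLOP
macbook_equivalents = supercomputer_tflops / M1_MAX_TFLOPS

print(f"{supercomputer_tflops:,.0f} teraflops")          # 100,700 teraflops
print(f"{supercomputer_flops:.3e} operations per second")  # 1.007e+17
print(f"about {macbook_equivalents:,.0f} M1 Max chips")   # about 9,683
```

The ratio depends entirely on the two quoted figures, so a different MacBook chip or a different precision (for example half precision instead of single precision) would change the answer.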

If you liked what you read, please subscribe for interesting articles on System Design, UI/UX, and engineering careers.

Related Articles: https://zahere.com/how-chatgpt-works-the-architectural-details-you-need-to-know

Source: https://venturebeat.com/ai/openai-microsoft-azure-supercomputer-ai-model-training/

#chatgpt #artificial-intelligence #system-architecture #system-design #server
