Uncomfortable Truth Behind LLMs

How LLMs work, their environmental impact, the $100 billion free AI trap, and what to do about it.

I created AI Engineer HQ, and I would like to share it with you in today’s edition. Today’s newsletter is a guide to help you increase your odds of winning with AI.

I added content, news, and resources about AI.

In today’s edition:

  • Dive Deep Drill— Uncomfortable Truth Behind LLMs

  • Build Together— Here’s How I Can Help You

The Elite [AI Leadership Accelerator] 

Build your second brain for Leading AI Products/Projects.

Available only in a 1:1 session.

Dive Deep Drill

I divided this deep dive into 4 chapters:

  1. How Do LLMs Really Work?

  2. The Environmental Time Bomb

  3. The $100 Billion Free AI Trap

  4. Fight Back - What Should You Do?

How Do LLMs Really Work?

The Origin

You have heard a lot of noise about large language models (LLMs), the technology behind ChatGPT.

The first transformer model was introduced in the paper "Attention Is All You Need" by Google researchers in 2017.

The original Transformer model was a general architecture, and BERT was introduced later in 2018 as one of its applications.

In 2020, GPT-3 was launched with significant advancements.

Other LLMs like Turing-NLG (by Microsoft) also existed, but GPT-3's strength was its scale (175 billion parameters).

Fast forward to 2025, and there is an industry trend toward efficiency rather than just increasing parameters.

New models are also moving toward Mixture of Experts (MoE) and retrieval-based architectures rather than just raw scaling.

Training these models has environmental effects as well.

A study found that training GPT-3 consumed 1,287 MWh, equivalent to the carbon dioxide emissions of 550 round-trip flights between New York and San Francisco.

You and I will dive deep and build our intuition around how large language models like ChatGPT work, the environmental effect, the free AI trap, and what you should do.

Let’s go!

What Exactly is GPT? (Generative, Pre-Trained, Transformer)

Let’s break everything down -

  • Generative

  • Pre-Trained

  • Transformer

Generative

The word “Generative” comes from statistics.

When I was doing my master’s, there was a subject called statistical modeling.

In statistical modeling, you will find a branch of generative modeling.

Generative modeling is a class of models that learn the joint probability distribution of data and generate new data samples.

It’s not just about predicting numbers but about capturing data distribution to generate new instances.

But it can also confidently generate false or biased information that was part of the training text data.

Whether you generate images or text, the machine is ultimately generating numbers.
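To make that concrete, here is a toy generative model: it learns which word follows which in a tiny corpus, then samples new sequences from that learned distribution. This is a deliberately simplified sketch, not how an LLM is actually built.

```python
import random
from collections import defaultdict

# Toy corpus: the "training data" for our generative model.
corpus = "the sky is blue . the sea is blue . the sun is bright .".split()

# Learn a conditional distribution: which words follow which.
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def generate(start, length=6, seed=0):
    """Sample a new word sequence from the learned distribution."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        followers = transitions.get(words[-1])
        if not followers:
            break
        words.append(rng.choice(followers))
    return " ".join(words)

print(generate("the"))
```

The model never memorizes sentences; it captures the distribution of the data and samples new instances from it, which is exactly the "generative" idea scaled down.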

Pre-Trained

As humans, how do you learn?

You generally don’t jump to complex topics before fundamentals.

In chess, you first learn the foundational moves, how the pieces move, you practice them, and only then do you compete with other players.

Pre-trained models work similarly.

Pre-trained models

You train these models on foundational tasks, then fine-tune them for complex, specific tasks.

These models build up a memory, called parameters, numeric values that are optimized based on what they learn from the data.

Instead of training a model again, you can use the already trained model and fine-tune it to your specific use case.
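A drastically simplified sketch of the pretrain-then-fine-tune idea, using word counts as stand-in "parameters" (real fine-tuning updates neural network weights with gradient descent, not counters):

```python
from collections import Counter

# "Pre-training": learn general word frequencies (our toy "parameters").
general_corpus = "the cat sat on the mat the dog sat on the rug".split()
params = Counter(general_corpus)

# "Fine-tuning": start from the pre-trained parameters and continue
# training on a smaller, domain-specific corpus instead of from scratch.
domain_corpus = "the model sat on the gpu".split()
params.update(domain_corpus)

# The fine-tuned model keeps its general knowledge ("the" is common)
# while absorbing domain-specific vocabulary ("gpu").
print(params["the"], params["gpu"])
```

The key point: fine-tuning reuses everything learned in pre-training and only adjusts it, which is far cheaper than training from scratch.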

To train these models, you need a huge amount of data, such as FineWeb on Hugging Face.

GPT-3 was trained on a huge corpus of text drawn from 5 datasets - Common Crawl, WebText2, Books1, Books2, and Wikipedia.

These datasets contain around half a trillion words, enough for the model to learn the relationships between words, grammar, sentence structure, and which word is likely to come next.

While these datasets are part of GPT-3’s training data, OpenAI has not disclosed the full dataset details.

The model was trained on a mixture of publicly available and licensed data, but the complete dataset remains proprietary.

Transformer

A transformer is a neural network architecture.

To communicate with a machine, you once had to speak binary.

Then came programming languages: learn Python, and you can give the machine instructions.

But now you can directly give instructions in the English language.

The gap between humans and machines is reduced; you don’t need to learn programming to interact with a machine.

The transformer was introduced in a research paper, Attention Is All You Need, in 2017 by Google researchers.

Transformer Architecture

What is a Prompt and Why Does it Matter?

“Who is the prime minister of India?” - is the prompt I passed to ChatGPT.

The input you give to LLMs is a prompt.

To get a better response, you should give a better prompt.

The more clearly you define the prompt, the better the response you get.

Because an LLM predicts the next word based on the previous words, the more relevant words you provide in the prompt, the easier it is for the model to find patterns between them and generate a better response.

So, you should adapt to the way LLM works, not the other way around.

Are you wondering how ChatGPT interprets the prompt?

Well, the answer is tokens.

What Are Tokens, and Why Are They Important?

Tokens can be individual or partial words, as seen in the image below.

Large Language Models use tokens to measure 3 things:

  • The size of the data they trained on

  • The input they can take

  • The output they can produce

OpenAI tokenizer

The tokens will be converted into numeric embeddings, as all types of models process numbers only.

Each token is associated with a unique integer ID, which is how the model understands the text.

Token IDs for “I love Himanshu’s AI Newsletter“:

[40, 3047, 24218, 616, 6916, 885, 20837, 27055]

Here, each token corresponds to a specific number that represents it in the model’s vocabulary.
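The mapping works like a lookup table. Here is a minimal sketch with an invented five-token vocabulary; the IDs below are made up for illustration and are not ChatGPT's real vocabulary:

```python
# A toy vocabulary; the IDs are invented, not ChatGPT's real ones.
vocab = {"I": 0, "love": 1, "AI": 2, "news": 3, "letters": 4}
inverse_vocab = {i: tok for tok, i in vocab.items()}

def encode(tokens):
    """Map each token to its integer ID (what the model actually sees)."""
    return [vocab[t] for t in tokens]

def decode(ids):
    """Map integer IDs back to tokens (how output becomes text)."""
    return [inverse_vocab[i] for i in ids]

ids = encode(["I", "love", "AI"])
print(ids)          # [0, 1, 2]
print(decode(ids))  # ['I', 'love', 'AI']
```

Real tokenizers work the same way in principle, just with vocabularies of tens of thousands of learned subword pieces.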

Types of Tokens

There are typically three types of tokens used in LLMs:

Word Tokens

These are individual words. For example, "apple," "runs," and "cat" are word tokens.

Word-based tokenization is simpler but may struggle with rare or compound words.

Subword Tokens

These represent parts of words, typically used when a word is too rare or complex for the model.

For example, "unhappiness" might be split into subword tokens like "un" and "happiness".

A common subword algorithm is Byte Pair Encoding (BPE), which is what ChatGPT uses.

Character Tokens

These are individual characters (letters, numbers, punctuation marks).

This type of tokenization is very fine-grained and is usually used for languages with complex scripts or specific tasks like spelling correction.

One last thing: this system favors English and European languages, so text in non-Western languages is often split into many more tokens and handled less well.
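The three granularities above can be sketched in a few lines. The greedy longest-match splitter below is a simplification of real BPE, and the vocabulary is invented:

```python
def word_tokens(text):
    # Word-level: split on whitespace.
    return text.split()

def subword_tokens(word, vocab):
    # Greedy longest-match subword split (a simplification of BPE).
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # fall back to a single character
            i += 1
    return pieces

def char_tokens(word):
    # Character-level: every character is a token.
    return list(word)

vocab = {"un", "happiness", "happy"}
print(word_tokens("I love unhappiness"))      # ['I', 'love', 'unhappiness']
print(subword_tokens("unhappiness", vocab))   # ['un', 'happiness']
print(char_tokens("cat"))                     # ['c', 'a', 't']
```

Notice how the subword splitter handles a word it has never seen whole, which is exactly why LLMs use subword tokenization instead of plain word tokens.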

How does ChatGPT remember what I said earlier in the chat window?

Attention Mechanism

LLM works on the attention mechanism. For example -

“Himanshu is an AI Engineer. He is going to solve your problems.”

In this sentence, “Himanshu” and “He” are related: as a human, you understand that “He” refers to “Himanshu”. That relationship is attention.

Transformers can carry this attention information across long text, which is why, within a chat, ChatGPT knows what you are talking about if you ask about something you mentioned earlier.
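To build intuition, here is a minimal sketch of attention as dot-product scores passed through a softmax. The 2-d token "embeddings" are invented values for illustration, not real model weights:

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Tiny invented 2-d "embeddings" for three tokens.
tokens = ["Himanshu", "is", "He"]
embeddings = {
    "Himanshu": [1.0, 0.1],
    "is":       [0.0, 1.0],
    "He":       [0.9, 0.2],  # deliberately close to "Himanshu"
}

# For the query "He", score every token in the context, then normalize.
query = embeddings["He"]
scores = [dot(query, embeddings[t]) for t in tokens]
weights = softmax(scores)

for t, w in zip(tokens, weights):
    print(f"{t}: {w:.2f}")
```

Because “He” and “Himanshu” point in similar directions, “He” attends most strongly to “Himanshu”, which is how the model resolves the pronoun.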

How Does ChatGPT Work?

Working flow of LLM predicting the next word | Image - NVIDIA

There are 171,476 words in the English language; you can assign each one a probability of being the next word in the sentence “The sky is ……“.

The word with the highest probability will win the spot, in this case, “blue“.
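A minimal sketch of that selection step, with invented probabilities (real models compute these over their whole vocabulary and often sample rather than always taking the maximum):

```python
# Hypothetical probabilities an LLM might assign to the next word
# after "The sky is ..." -- the numbers are invented for illustration.
next_word_probs = {
    "blue": 0.62,
    "clear": 0.18,
    "dark": 0.11,
    "falling": 0.04,
    "banana": 0.0001,
}

# Greedy decoding: pick the highest-probability word.
next_word = max(next_word_probs, key=next_word_probs.get)
print(next_word)  # blue
```

Settings like temperature control whether the model always takes the top word or occasionally samples a lower-probability one for variety.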

LLMs do not improve in real-time; they always start fresh.

While each ChatGPT session starts fresh (i.e., it does not remember previous interactions across different sessions), OpenAI does fine-tune models periodically, incorporating user feedback and reinforcement learning techniques to improve future versions.

How did we humans come up with the word “blue“?

We have been reading English for years; we don’t remember sentences word by word, but our understanding of phrases, the relationships between words, and our knowledge tells us the next word will be “blue”.

What's concerning is the probabilistic nature of "Intelligence"

You're not communicating with an intelligent entity when you interact with ChatGPT or any modern LLM.

You're engaging with a sophisticated statistical pattern matcher that predicts the next token based on probability distributions learned from massive text.

The Environmental Time Bomb

Training modern LLMs consumes huge amounts of energy and water.

One study found that training just one AI model can emit more than 626,000 pounds of carbon dioxide, nearly five times the lifetime emissions of an average American car.

Google reported that 60% of its ML energy use came from inference, and the remaining 40% from training.

Training a single trillion-parameter model consumes as much electricity as 300 homes use annually.

Despite efficiency claims, the carbon footprint of AI has doubled since 2022, with the industry now responsible for 2.5% of global emissions, surpassing aviation.


Data centers use 1.4 million gallons of water daily for cooling.
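The flight comparison can be sanity-checked with a quick back-of-the-envelope calculation. The grid carbon intensity below is an assumption for illustration, not a reported number:

```python
# Back-of-the-envelope check on the GPT-3 training figure.
training_energy_mwh = 1_287            # reported training energy
grid_intensity_t_per_mwh = 0.43        # ASSUMED average tCO2e per MWh

emissions_tonnes = training_energy_mwh * grid_intensity_t_per_mwh
print(f"~{emissions_tonnes:.0f} tCO2e")

# Spread across 550 NY-SF round trips:
per_flight = emissions_tonnes / 550
print(f"~{per_flight:.2f} tCO2e per round trip")
```

Roughly one tonne of CO2e per passenger round trip is in the ballpark of common estimates for a transcontinental flight, so the comparison holds up.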

Now the question is: if these AI tools are so costly to run, why are they FREE?

The answer is POWER.

The $100 Billion FREE AI Trap

Big tech is not in the charity business; it is in the business of power.

The Investment Breakdown

  • OpenAI: $58 billion (Microsoft, SoftBank)

  • Anthropic: $18 billion (Amazon)

  • Google DeepMind: $100 billion pipeline

The strategy of Silicon Valley is simple:

  1. Flood the market with free tools like ChatGPT to create dependency.

  2. Burn billions to undercut ethical competitors.

  3. Harvest user data (every prompt, interaction, and idea) to refine models and lock in dominance.

Hidden Costs of “Free” AI

  • Each GPT-4 query costs $0.40 in energy and infrastructure. Users pay not with cash but with their privacy and intellectual property.

  • Inputs into tools like GitHub Copilot become training data for future models, stripping creators of ownership.

  • Data labelers in Kenya and the Philippines earn $2/hour to filter graphic content, facing PTSD with no mental health support.

When something is free, you are the product.

Fight Back - What Should You Do?

You and I are engineers; we always find better ways to solve problems, and in this career of ours, we will equip ourselves with solid AI skills.

I created a roadmap for you to master AI without a PhD.

I am calling it Headquarters, as this is not just a course but a learning experience for you.

I started this in 2019 and kept on updating the curriculum as per the industry requirements.

Over the years, I found that the best way to be relevant is to adapt to change, as change is the only constant in tech (and in life).

Final Thought

Large Language Models will not change your life; they are just another technology that you learn and move on from.

It is an assistant to us.

LLMs are not magic; there is engineering behind them.

It is also not the answer to all the problems in your organization.

In some business scenarios, machine learning will work best.

You need to understand when to use LLMs.

LLMs are secure when implemented with the right safeguards, but users and organizations must follow data handling practices.

Transparency, robust security measures, and privacy-conscious deployment are essential for safe and ethical use.

The AI world is moving fast; there is no first-mover advantage, only a fast-mover advantage.

Learn fast, build fast, win fast, and move fast.

Happy AI

Start learning AI in 2025

Keeping up with AI is hard – we get it!

That’s why over 1M professionals read Superhuman AI to stay ahead.

  • Get daily AI news, tools, and tutorials

  • Learn new AI skills you can use at work in 3 mins a day

  • Become 10X more productive

Want to work together? Here’s How I Can Help You

I use BeeHiiv to send this newsletter.

How satisfied are you with today's Newsletter?

This will help me serve you better


PS: AI Engineer HQ is available with financial aid. Reply to this email for details.
