Tokens vs. Parameters in LLMs

What are tokens? What are parameters?

Tokens are often described as individual words or as chunks of 3 to 4 characters, but that is not quite accurate.

Tokens can be individual or partial words, as seen in the above image.
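
To make the idea of partial-word tokens concrete, here is a minimal sketch of a toy tokenizer. It uses greedy longest-match against a tiny made-up vocabulary, which is simpler than the byte pair encoding that GPT models use. The VOCAB set and the tokenize function are hypothetical names chosen for illustration only.

```python
# Toy greedy longest-match subword tokenizer.
# VOCAB is a made-up vocabulary for illustration, not any real model's vocabulary.
VOCAB = {"the", " ", "token", "ization", "un", "believ", "able"}


def tokenize(text, vocab=VOCAB):
    """Split text into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("unbelievable"))      # ['un', 'believ', 'able']
print(tokenize("the tokenization"))  # ['the', ' ', 'token', 'ization']
```

One word can become several tokens, so token counts and word counts rarely match.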

Large Language Models use tokens to measure 3 things →

  • the size of the data they trained on

  • the input they can take

  • the output they can produce
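
As a rough sketch of how the input and output limits interact, assume a model has one fixed context window that the prompt and the reply share. The function name and the 4,096-token window below are hypothetical values for illustration, not the limits of any specific model.

```python
def remaining_output_tokens(prompt_tokens, context_window):
    """Tokens left for the reply once the prompt is counted.

    Assumes input and output share a single context window.
    """
    return max(context_window - prompt_tokens, 0)


# Hypothetical numbers: a 4,096-token window and a 3,500-token prompt.
print(remaining_output_tokens(prompt_tokens=3500, context_window=4096))  # 596
```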

OpenAI tokenizer - Himanshu Ramchandani

Each token is mapped to an integer ID and then converted into a numeric embedding, since models can only process numbers.
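
The token-to-number step can be sketched as a simple table lookup. Everything here is an assumption for illustration: the three-entry vocabulary, the embedding size of 4, and the random values. In a trained model the vectors are learned rather than random.

```python
import random

# Hypothetical vocabulary mapping each token to an integer ID.
vocab = {"the": 0, "token": 1, "ization": 2}

EMBED_DIM = 4  # illustrative size; real models use hundreds or thousands
random.seed(0)

# One row of numbers per token ID. Random here, learned in a real model.
embedding_table = [
    [random.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in range(len(vocab))
]


def embed(tokens):
    """Map tokens to IDs, then look up one vector per ID."""
    ids = [vocab[t] for t in tokens]
    vectors = [embedding_table[i] for i in ids]
    return ids, vectors


ids, vectors = embed(["the", "token", "ization"])
print(ids)              # [0, 1, 2]
print(len(vectors[0]))  # 4
```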

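GPT-3 was trained on text drawn from a corpus of roughly 500 billion tokens.

GPT-3 has 175 billion parameters.

Both statements are true, but they measure different things: tokens count the data, parameters count the size of the model.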

Parameters are the model's memory: the weights it learns from the training data.

During training, GPT-3 adjusted a huge collection of numeric matrices to fit the data, and those numbers are what we call parameters.
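
To see where a figure like 175 billion comes from, here is a back-of-the-envelope sketch. It uses a common rule of thumb that each transformer layer holds about 12 × d_model² weights (attention plus feed-forward), and it ignores biases, layer norms, and position embeddings. The layer count, model width, and vocabulary size are the published GPT-3 configuration; the function name is hypothetical.

```python
def approx_gpt_params(n_layers, d_model, vocab_size):
    """Rough parameter count for a GPT-style transformer.

    Rule of thumb only: ignores biases, layer norms, and position embeddings.
    """
    per_layer = 12 * d_model ** 2        # attention (4*d^2) + feed-forward (8*d^2)
    embeddings = vocab_size * d_model    # one vector per vocabulary token
    return n_layers * per_layer + embeddings


total = approx_gpt_params(n_layers=96, d_model=12288, vocab_size=50257)
print(f"{total / 1e9:.1f} billion parameters")  # 174.6 billion parameters
```

The estimate lands close to 175 billion, and every one of those numbers is a weight set during training.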

Analogy →

When we humans learn something, we gather all the information (data) we can and break it down into small pieces (tokens). We then build our understanding and remember only the important things about it (parameters).

Happy AI.
