Introduction to Large Language Models (LLMs)

https://ubc-library-rc.github.io/llm/

Land Acknowledgement

UBC Vancouver is located on the traditional, ancestral, and unceded territory of the xʷməθkʷəy̓əm (Musqueam) people.

Use the Zoom toolbar to engage

Participants window

Learning Objectives

  • Understand the architecture and inner workings of LLMs
  • Fine-tune a pre-trained LLM on a sample dataset
  • Understand various aspects of using LLMs for research

Pre-workshop setup

Background

What are Large Language Models?

  • Large Language Models (LLMs) are artificial intelligence systems designed to understand and generate human-like language.
  • LLMs are fundamental to natural language processing, powering applications like chatbots, language translation, and content generation.

Let's pretend to be an LLM (...or just a human)

  • Exercise 1: The cat sat on a _____.
  • LLM says: "The cat sat on a sunny windowsill, basking in the warmth of the afternoon sun."
  • Exercise 2: Tell me a two-sentence story about a dog named Pluto.
  • LLM says: "Pluto, a spirited golden retriever with a heart full of curiosity, embarked on a solo adventure through the bustling city streets. With a wagging tail and a friendly demeanor, he charmed everyone..."

Let's pretend to be an LLM (...or a smart human)

  • Exercise 3: Write the HTML code for a .....
  • The LLM writes the whole code in 10 seconds.

Many such applications of LLMs

From Appypie

Architecture of a typical LLM


From https://magazine.sebastianraschka.com/p/understanding-encoder-and-decoder

Some popular LLMs

  • BERT (Bidirectional Encoder Representations from Transformers) (by Google)
  • GPT (Generative Pre-trained Transformer) (by OpenAI)
  • LLaMA (by Meta)
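All three families are built on the transformer architecture shown above. As a sketch (assuming the Hugging Face transformers library is installed; LLaMA weights additionally require access approval), the openly downloadable ones can be loaded in a few lines:

```python
from transformers import AutoModel

# Encoder-only model (BERT): turns text into contextual representations
bert = AutoModel.from_pretrained("bert-base-uncased")

# Decoder-only model (GPT-2): generates text one token at a time
gpt2 = AutoModel.from_pretrained("gpt2")
```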

Fine-tuning LLMs

From Medium

Open Jupyter Notebooks

Open In Colab

Tokenization: Types

From https://medium.com/@abdallahashraf90x/tokenization-in-nlp-all-you-need-to-know-45c00cfa2df7
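The main types are word-level, character-level, and subword tokenization. The first two can be illustrated in plain Python (subword tokenization needs a learned vocabulary; see the example that follows):

```python
sentence = "Tokenization splits text into smaller units."

# Word-level: split on whitespace
print(sentence.split())
# ['Tokenization', 'splits', 'text', 'into', 'smaller', 'units.']

# Character-level: every character becomes a token
print(list(sentence)[:10])
# ['T', 'o', 'k', 'e', 'n', 'i', 'z', 'a', 't', 'i']
```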

Tokenization: Example

From https://towardsdatascience.com/why-are-there-so-many-tokenization-methods-for-transformers-a340e493b3a8
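A subword tokenizer in action, sketched with the Hugging Face transformers library and BERT's WordPiece vocabulary (the exact output shown in the comments is indicative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

tokens = tokenizer.tokenize("Tokenization splits text into subword units.")
print(tokens)
# e.g. ['token', '##ization', 'splits', 'text', 'into', 'sub', '##word', 'units', '.']

# Each token maps to an integer id in the model's vocabulary
print(tokenizer.convert_tokens_to_ids(tokens))
```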

Embeddings

From https://medium.com/@hari4om/word-embedding-d816f643140
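An embedding layer maps each token id to a dense vector that the model can compute with. A toy PyTorch sketch (the sizes are illustrative, far smaller than in any real model):

```python
import torch
import torch.nn as nn

# Toy embedding table: vocabulary of 10 tokens, 4 dimensions per vector
embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)

token_ids = torch.tensor([1, 5, 7])  # ids produced by a tokenizer
vectors = embedding(token_ids)
print(vectors.shape)  # torch.Size([3, 4])
```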

Quantization

From https://www.tensorops.ai/post/what-are-quantized-llms
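Quantization stores weights in fewer bits (for example int8 instead of float32) so the model fits in less memory. A minimal NumPy sketch of affine int8 quantization; real toolkits apply this per layer with calibrated scales:

```python
import numpy as np

weights = np.random.randn(5).astype(np.float32)

# Affine int8 quantization: q = round(x / scale) + zero_point
scale = (weights.max() - weights.min()) / 255
zero_point = int(round(-weights.min() / scale)) - 128

q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)

# Dequantize to check the rounding error
dequantized = (q.astype(np.float32) - zero_point) * scale
print(weights)
print(dequantized)  # close to the original, at 1 byte per value instead of 4
```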

Fine-tuning LLMs

  • Full fine-tuning: training all of the model's parameters. It is not compute-efficient, but it usually gives slightly better results.
  • LoRA: a parameter-efficient fine-tuning (PEFT) technique based on low-rank adapters. Instead of training all the parameters, we train only these small adapter matrices (see the sketch below).
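A minimal LoRA sketch, assuming the Hugging Face peft library with GPT-2 as a small stand-in model (the hyperparameters are illustrative, not prescribed by the workshop):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,                        # rank of the low-rank adapter matrices
    lora_alpha=16,              # scaling factor for the adapter updates
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's attention projection layer
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable
```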

AI literacy is not just about understanding what AI can do and how to use it, but also:

  • Proper evaluation: generalizability and AI hallucination.
  • Ethical considerations: fairness, accountability, transparency, safety, etc.

Is it safe to use ChatGPT?

Image by Aleksandr Tiulkanov, which is licensed under CC BY.

Using LLMs for research

    • Advantages/uses
        • Covers multiple domains
        • Can be used for brainstorming (wording your thoughts)
        • Helps with sentence formation for papers
    • Disadvantages
        • Lacks specificity
        • Potential bias
        • Lacks sources

Ethics

Image from: Lepri, Bruno, Nuria Oliver, and Alex Pentland. "Ethical machines: The human-centric use of artificial intelligence." iScience 24.3 (2021): 102249.

Where to go from here?

Future workshops

  • Regression models: Tue, Mar 19, 2024 (1:00pm to 3:00pm)
  • Classification and clustering models: Tue, Mar 26, 2024 (1:00pm to 3:00pm)
  • Neural networks: Tue, Apr 2, 2024 (1:00pm to 3:00pm)

Register here

More from the Research Commons (UBC-V)

And from the Centre for Scholarly Communication (UBC-O)