The AI Glossary for Humans: 30+ Essential Terms

Introduction
The AI field is buzzing with activity, and it can be tough to keep track of all the acronyms and terms. This article serves as a glossary of AI terms, acronyms, and high-level definitions. I've also set up a GitHub repository, which I'll try to update regularly as the AI space evolves.
Feel free to ⭐ star and contribute to the repo. PRs are welcome!
🔠 Core Terms
Artificial Intelligence (AI) refers to the simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for using the information), reasoning (using rules to reach approximate or definite conclusions), and self-correction. AI applications include expert systems, natural language processing, speech recognition, and machine vision.
Machine Learning (ML) is a subset of artificial intelligence that involves the use of algorithms and statistical models to enable computers to improve their performance on a specific task through experience. It allows systems to learn from data, identify patterns, and make decisions with minimal human intervention.
Generative AI refers to a type of artificial intelligence that is capable of creating new content, such as images, music, text, or videos, by learning patterns from existing data. Unlike traditional AI, which focuses on recognizing patterns and making predictions, generative AI models can produce original outputs that mimic the style and structure of the input data.
Artificial General Intelligence (AGI) refers to a form of artificial intelligence that possesses the ability to understand, learn, and apply knowledge across a wide range of tasks at a level comparable to human intelligence. Unlike narrow AI, which is designed for specific tasks, AGI aims to perform any intellectual task that a human can do, with the ability to transfer knowledge and skills from one domain to another. AGI remains a theoretical concept and is a major goal in the field of AI research.
Large Language Model (LLM) is a type of artificial intelligence model designed to understand and generate human-like text. These models are trained on vast amounts of text data and use deep learning techniques to predict and produce coherent and contextually relevant language. LLMs are capable of performing a variety of language tasks, such as translation, summarization, and conversation, by leveraging their extensive training on diverse datasets.
Retrieval-Augmented Generation (RAG) is a technique in artificial intelligence that combines retrieval-based methods with generative models to enhance the quality and accuracy of generated text. In RAG, the system retrieves relevant information from a large dataset or knowledge base and uses this information to inform and improve the text generation process. This approach allows the model to produce more informed and contextually accurate responses by incorporating external data into the generation process. RAG is particularly useful in applications where up-to-date or domain-specific information is required.
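The retrieve-then-generate flow can be sketched in a few lines. This is a toy illustration: the documents, the keyword-overlap scoring, and the prompt wording are all made up for the example, and real systems use vector embeddings and a similarity index before handing the assembled prompt to an LLM.

```python
# Minimal RAG sketch: retrieve relevant text, then build a grounded prompt.
DOCUMENTS = [
    "RAG combines retrieval with text generation.",
    "Tokens are the smallest units of text a model processes.",
    "Streaming processes data as it arrives.",
]

def retrieve(query, docs, k=1):
    """Rank documents by how many query words they share; return the top k."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    """Prepend the retrieved context so the generator can ground its answer."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("How does RAG work", DOCUMENTS))
```

The final string is what would be sent to the model; the model never sees the documents that were not retrieved.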
Tokens are the smallest units of text that a model processes. A token can be a word, a part of a word, or even a character, depending on the tokenization method used. Tokenization is the process of breaking down text into these smaller units, which are then used by AI models to analyze and generate language.
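A toy tokenizer makes the idea concrete. This sketch just splits on words and punctuation; real LLM tokenizers (such as byte-pair encoding) typically break words into learned sub-word pieces instead.

```python
import re

def tokenize(text):
    """Split text into tokens: runs of letters/digits, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Tokenization isn't hard!"))
# ['Tokenization', 'isn', "'", 't', 'hard', '!']
```

Note how "isn't" becomes three tokens; sub-word splits like this are why a model's token count rarely matches the word count.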
Completions refer to the output generated by a model in response to a given prompt. When a user provides an input, the AI model predicts and generates a continuation or completion of that text. Completions rely on the model's understanding of language patterns and context to produce meaningful and appropriate responses.
Streaming is the process of handling data as it arrives continuously, rather than waiting for a complete dataset. This is particularly useful in applications requiring real-time insights or actions, such as live translations, real-time analytics, and interactive AI systems. Streaming ensures that data is processed with minimal latency, enabling timely and dynamic interactions.
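In code, streaming usually looks like consuming a generator chunk by chunk. The producer below is a stand-in for a model emitting tokens; the point is that the consumer handles each piece the moment it arrives instead of waiting for the full response.

```python
def generate_chunks(text, size=5):
    """Pretend model output arrives a few characters at a time."""
    for i in range(0, len(text), size):
        yield text[i:i + size]

received = []
for chunk in generate_chunks("Streaming keeps latency low."):
    received.append(chunk)            # handle each piece on arrival
    print(chunk, end="", flush=True)  # e.g. render it in a chat UI
print()
```

This is the same pattern chat APIs use when they stream tokens back to a client: the UI can start rendering after the first chunk.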
Model Training involves the process of teaching an AI system to perform specific tasks by exposing it to large amounts of data. During training, the model learns to recognize patterns and make predictions based on the input data. This involves adjusting the model's parameters to minimize errors and improve accuracy. Training is a fundamental step in developing AI systems, enabling them to understand and generate text, recognize images, or perform other complex tasks.
Hallucination refers to when an AI model creates content not based on input data or reality, often resulting in incorrect or made-up information. This can happen due to limitations in training data or when the model tries to fill gaps with plausible but wrong details.
Prompt Engineering is the practice of designing and refining the input prompts given to AI models, especially in natural language processing tasks. The goal of prompt engineering is to guide the model to produce the most accurate and relevant responses. By carefully crafting prompts, developers can influence the model's behavior and improve the quality of its outputs.
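In practice, a lot of prompt engineering is adding structure around the raw question. The template below is an illustrative example of that structure (a role, explicit constraints, then the question), not a prescribed format.

```python
def build_prompt(question, role="a concise technical assistant", max_words=50):
    """Wrap a raw question in a role, output constraints, and an honesty nudge."""
    return (
        f"You are {role}.\n"
        f"Answer in at most {max_words} words.\n"
        f"If you are unsure, say so rather than guessing.\n\n"
        f"Question: {question}"
    )

print(build_prompt("What is a token?"))
```

Small changes to this scaffolding, such as tightening the word limit or swapping the role, can noticeably change a model's output, which is why prompts are iterated on like any other code.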
Natural Language Processing (NLP) is a part of artificial intelligence that enables computers to understand and respond to human language effectively. It uses computational linguistics along with machine learning and deep learning models to handle and analyze large amounts of natural language data.
A Neural Network is a series of algorithms that aim to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates. It consists of layers of interconnected nodes, or neurons, where each connection can transmit a signal to another neuron. These networks are used in various applications, including image and speech recognition, by learning from large amounts of data to make predictions or decisions.
Chatbots are interfaces designed to simulate human conversation. They use artificial intelligence and natural language processing to understand and respond to user inputs, providing a conversational experience. Chatbots can be used in various applications, such as customer service, information retrieval, and personal assistance, to automate interactions and improve efficiency.
Agentic AI refers to artificial intelligence systems that can make decisions and take actions independently, similar to how an agent would act on behalf of a user or organization. These systems are designed to perform tasks autonomously, often using machine learning and other AI technologies to adapt and respond to new situations without direct human intervention.
🛠️ Tools
GitHub Copilot is an AI-powered code completion tool designed to assist developers by suggesting code snippets and entire functions in real time as they write code. By leveraging machine learning models trained on a vast amount of open-source code, GitHub Copilot aims to enhance productivity and streamline the coding process.
JetBrains AI is built into JetBrains' development tools to make coding easier. It offers smart code suggestions, finds errors, and helps with automated refactoring. Using AI, JetBrains wants to boost developer productivity and simplify software development. It provides features for code completion, debugging, and project management in different programming languages.
Cursor is an AI code editor designed to help users become highly productive and claims to be the “best way to code with AI.” It allows writing code using natural language, obtains context based on the codebase, and facilitates easy changes by predicting the next edit.
Windsurf (formerly Codeium) makes an AI code editor and a plugin that can be installed in a variety of editors. It includes Cascade, an agent designed to code, troubleshoot, and anticipate future steps. It helps maintain workflows by grasping intentions and managing intricate codebases.
Ollama is an open-source platform that allows users to run large language models (LLMs) on their own devices. It provides tools and infrastructure to easily deploy and manage these models locally, ensuring privacy and control over data. Ollama is designed to make it simple for developers to integrate AI capabilities into their applications without relying on external cloud services.
Hugging Face is a company known for its open-source platform that provides tools and models for natural language processing (NLP). It offers a library called “Transformers”, which includes pre-trained models for tasks like text classification, translation, and question answering. Hugging Face aims to make NLP accessible and user-friendly for developers and researchers.
🤖 Models
ChatGPT is an LLM developed by OpenAI, designed to engage in human-like conversations. It is based on the GPT (Generative Pre-trained Transformer) architecture and is capable of understanding and generating text in response to user inputs.
Claude is an LLM developed by Anthropic, designed to engage in natural language understanding and generation. Similar to other advanced AI models, Claude is built to assist in tasks such as answering questions, providing explanations, and generating text based on user input.
Gemini is an LLM developed by Google DeepMind. It is designed to perform a wide range of natural language processing tasks, including understanding and generating human-like text.
Midjourney is a popular AI-powered image generation tool that creates visually stunning artwork from text prompts. It's available as a Discord bot, allowing users to input text descriptions and receive generated images in various styles. Midjourney is developed by Midjourney, Inc.
Llama is a family of open-source LLMs developed by Meta. Because it is open source, Llama allows developers and researchers to access, modify, and improve the models, fostering innovation and collaboration within the AI community.
🏢 Companies & Labs
OpenAI is an AI research and deployment company whose mission is to ensure that artificial general intelligence benefits all of humanity. They popularized chatbots with the release of ChatGPT.
DeepSeek is a Chinese AI company developing and open-sourcing LLMs, including the DeepSeek-R1 model. It has gained attention for its claim of developing LLMs that match or surpass the performance of industry leaders like OpenAI, while reportedly costing less to create.
Anthropic is an AI safety and research company focused on developing reliable and interpretable AI systems. They are the creators of Claude.
Google DeepMind is a leading artificial intelligence research lab that aims to solve intelligence and use it to advance science and benefit humanity. They focus on developing advanced AI technologies and applying them to real-world problems, with a commitment to ethical and responsible AI development.
El Fin 👋🏽
If you enjoy what you read, feel free to like this article or subscribe to my newsletter, where I write about programming and productivity tips.
As always, thank you for reading, and happy coding!