The Ultimate Glossary of AI terms
An overview of terms related to artificial intelligence (AI) to help you better understand the field.
Term | Definition |
---|---|
Activation Function | A function that is used in artificial neural networks to calculate the neuron's output. Some common activation functions include the sigmoid function, tanh, ReLU, and softmax. |
Active Learning | A type of machine learning where the model selects which unlabeled examples should be labeled and added to its training data. The goal is to reach good results with as little labeled training data as possible. |
Artificial Intelligence (AI) | A branch of computer science focused on creating and improving machines and software that can think and learn. Examples include voice assistants, self-driving cars, and online recommendation systems. |
Attention Mechanism | A technique used in neural networks, originally alongside recurrent networks and now the core of transformers, that allows the model to focus on the parts of the input data that are most relevant at a given step. |
Autoencoder | A type of artificial neural network that learns efficient encodings of data through unsupervised learning. Autoencoders are often used for dimensionality reduction and anomaly detection. |
AutoML (Automated Machine Learning) | The process of automating the challenging aspects of machine learning development, including feature selection, hyperparameter setting, model creation and optimization. |
Backpropagation | An algorithm used in neural network training that computes how much each weight contributed to the error between the network's output and the expected output. The weights are then adjusted, typically by gradient descent, to minimize that error. |
Batch Normalization | A technique for deep neural networks that normalizes the activations passed between layers over each mini-batch, which stabilizes and speeds up the learning process. |
Bayesian Learning | A machine learning method based on Bayes' theorem. It is used to update the probability of a hypothesis when new data is available. |
Bayesian Networks | A statistical model that represents a set of random variables and their conditional dependencies using a directed acyclic graph (DAG). |
BERT (Bidirectional Encoder Representations from Transformers) | A natural language processing model that uses a transformer architecture and was designed to better understand the context of words in text. |
Bias-Variance Tradeoff | An important principle in machine learning that describes the tradeoff between fitting the model too closely to the training data (resulting in high variance) and oversimplifying the model (resulting in high bias). |
Capsule Networks (CapsNet) | An alternative to convolutional neural networks that tries to model hierarchical relationships between parts of objects and whole objects in an image. |
Chatbot | Software designed to hold conversations with people in natural language, using text or voice messages. Chatbots can be based on predefined responses or on advanced natural language processing and machine learning technologies, and are used for customer service, online shopping, and other purposes. |
Convolutional Neural Networks (CNNs) | A type of neural network designed to process visual data. CNNs use convolution operations to automatically extract features from an image. |
Cross-Validation | A technique used in machine learning to estimate how well a model will perform on unseen data. The data set is divided into several parts, and each part serves in turn as test data while the model is trained on the remaining parts. |
Data Augmentation | A technique used to increase the amount of training data by applying random transformations such as rotation, shifting, zooming in or out, and more. This provides the model with more diverse examples, improving its ability to generalize and reducing the risk of overfitting. The technique is commonly used in computer vision. |
Data Mining | Exploration and analysis of large data sets to discover patterns and information that are useful to users. |
Data Pipeline | A set of steps for manipulating data, from data acquisition and cleaning to analysis and visualization. |
Decision Tree | A supervised machine learning algorithm used for classification and regression. A decision tree uses a tree structure where each internal node tests an attribute (or property), each branch represents an outcome of that test, and each leaf represents a prediction. |
Deep Learning | An advanced machine learning method that uses artificial neural networks with many layers (called deep networks) to learn complex patterns in data. |
Discriminative Model | A type of statistical machine learning model that learns to distinguish between different classes of data. Examples of discriminative models include logistic regression and support vector machines (SVMs). |
Distributed Learning | A type of machine learning where the training of a model is distributed across multiple computers. This can increase training speed and enable working with large data sets. |
Dropout | A regularization technique used in neural network training that randomly "turns off" some neurons during training, reducing the risk of overfitting. |
Early Stopping | A technique for preventing overfitting where model training is stopped when performance on validation data begins to deteriorate. |
Ensemble Learning | A machine learning method that combines predictions from multiple models (called "base learners" or, when each is only slightly better than chance, "weak learners") to create a final model ("strong learner") that is often more accurate than any individual model. |
Evolutionary Algorithms | A group of optimization algorithms inspired by the natural evolutionary process, such as genetic algorithms, evolution strategies, genetic programming, and others. |
Explainable AI (XAI) | An area of artificial intelligence that focuses on creating AI systems whose decisions can be understood and explained by humans. |
Feature Engineering | The process of creating new input features for machine learning from the original data to improve model performance. |
Federated Learning | A machine learning method that trains a model across multiple decentralized devices or servers, each holding its own local data. The data stays where it is, and only model updates are sent to a central server, which combines them into a shared model. |
Few-shot Learning | A machine learning concept where the system learns from a very small number of example instances. |
Generative Adversarial Networks (GANs) | A type of artificial neural network that consists of two components: a generator that creates new data and a discriminator that learns to distinguish between real and generated data. |
Generative Model | A type of statistical machine learning model that is capable of generating new data that resembles the training data. An example of a generative model is a generative adversarial network (GAN). |
Gradient Boosting | An ensemble machine learning method that builds a model incrementally by adding simple models one at a time. Each new model is fitted to the gradient of the loss (the remaining errors) of the models built so far. |
Grid Search | A method for finding good hyperparameters, where every combination from a predefined grid of candidate values is tested and the one that produces the best results is selected. |
Hyperparameter Tuning | The process of selecting and tuning the hyperparameters that control the learning process of a machine learning model. |
k-Nearest Neighbors (k-NN) | A supervised algorithm that classifies a new example according to the k most similar examples in the training data, typically by majority vote. |
Knowledge Graph | A structured form of knowledge representation that illustrates facts about entities and the relationships between them in the form of a graph. An entity can be any object or concept, while relationships are connections between these entities. Knowledge graphs are widely used in semantic search, recommender systems, natural language processing, and other tasks where thorough and comprehensive domain knowledge is key. |
Long Short-Term Memory (LSTM) | A special kind of recurrent neural network capable of learning long-term dependencies. LSTMs are key components of deep networks that learn from sequences of data. |
Machine Learning (ML) | A subdiscipline of artificial intelligence that deals with the development of algorithms that enable computers to learn from experience (data) and adapt to new situations. |
Markov Decision Processes (MDP) | A mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of the decision maker. MDP is the basis of many algorithms in reinforcement learning. Each MDP is defined by a set of states, actions, transition probabilities, rewards, and a discount factor. |
Meta-Learning | A machine learning method where a model learns to optimize its performance over many tasks. The goal is for the model to be able to quickly adapt to new tasks. |
Multilabel Classification | A type of classification task where each instance can belong to multiple classes at the same time. |
Named Entity Recognition (NER) | A process in natural language processing that identifies and classifies named entities in text, such as names of people, organizations, places, time expressions, and numerical values. |
Natural Language Processing (NLP) | An area of artificial intelligence that deals with the interaction between computers and human (natural) languages. This includes understanding, generating, and translating language, as well as processing and analyzing large amounts of natural language text. |
Neural Network | A mathematical model, loosely inspired by the workings of the human brain, that is built from layers of connected artificial neurons and allows computers to learn from data. |
One-shot Learning | A machine learning concept where the system learns from a single example (or a very small number of examples). |
Online Learning | A type of machine learning where the model is gradually updated with new data as it becomes available, instead of training on the entire data set at once. |
Outlier Detection | The process of identifying data points that are significantly different from the rest of the data. It is a common technique in fraud detection, system health monitoring, and error detection. |
Overfitting | A problem that occurs when a machine learning model fits the training data too well, at the expense of its ability to predict new data. |
Policy Gradient Methods | A group of reinforcement learning algorithms that optimize policy parameters directly using gradient ascent. |
Principal Component Analysis (PCA) | A technique used to reduce the dimensionality of data by projecting it onto the directions of greatest variance, retaining as much information as possible. |
Q-learning | A reinforcement learning method where an agent learns the value of each action in a given state and uses those values to decide its next action. |
Random Forest | An ensemble machine learning method that uses many decision trees for prediction. The goal is to improve accuracy and reduce the overfitting that can occur with individual decision trees. |
Recommender System | A system that predicts user preferences and recommends products or services based on those preferences. |
Recurrent Neural Networks (RNNs) | A special type of neural network designed to handle sequences of data. RNNs are often used in natural language and time series processing tasks. |
Reinforcement Learning | A type of machine learning where an intelligent agent learns to optimize its behavior over time by taking certain actions in the environment to maximize some form of reward. |
Self-Supervised Learning | A machine learning method where a model generates labels from input data and then learns on that automatically labeled data. |
Semantic Segmentation | The process of dividing an image into different segments so that each pixel of the image is assigned to a specific class. This is often used in the field of computer vision. |
Semi-supervised Learning | A machine learning method that combines supervised and unsupervised techniques. It is used when a large amount of unlabeled data and a small amount of labeled data are available. |
Sentiment Analysis | The process of using text analytics to identify and extract subjective information from a text source. It is often used to identify the author's tone, opinion, or attitude. |
Supervised Learning | A type of machine learning where a model is trained on labeled data, that is, data that contains not only the input information but also the correct answers. |
Support Vector Machine (SVM) | A supervised machine learning algorithm used for classification and regression. For classification, an SVM works by finding the hyperplane (a line in 2D, a plane in 3D, etc.) that separates the classes with the largest margin. |
Swarm Intelligence | A concept that is inspired by nature and is based on the collective behavior of decentralized, self-organized systems. Swarm intelligence is often used in optimization tasks. |
Tokenization | The process of dividing text into smaller units (tokens), which can be words, subwords, sentences, or single characters. This is a common step in data preprocessing for natural language processing tasks. |
Transfer Learning | A method of machine learning where a model that was originally trained on one task or one type of data is used and modified for another task. |
Transformers | A type of deep learning model that uses attention mechanisms to efficiently process sequences of data. Transformers are often used in natural language processing tasks. |
Underfitting | A problem that occurs when a machine learning model is unable to adequately capture patterns in the training data, leading to poor results on both the training and test data. |
Unsupervised Learning | A type of machine learning where a model is trained on unlabeled data. The model tries to find patterns and structures in the data without prior knowledge of the correct answers. |
Word Embedding | A representation of words as vectors in an n-dimensional space that preserves the semantic and syntactic relationships between words. |
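
To make one of the entries more concrete, the four activation functions named under Activation Function can be sketched in a few lines of plain Python. This is only an illustrative sketch, not how any particular library implements them: the function names and the scalar and list interface are assumptions chosen for readability, and real frameworks operate on whole arrays at once.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Squash a real number into the range 0 to 1."""
    # Two branches so that exp() is only ever called on a non-positive value,
    # which avoids overflow for very negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Squash a real number into the range -1 to 1."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Pass positive values through unchanged and clip negatives to zero."""
    return max(0.0, x)


def softmax(scores: List[float]) -> List[float]:
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))   # 0.5
print(relu(-3.0))     # 0.0
print(relu(2.5))      # 2.5
print(softmax([1.0, 2.0, 3.0]))  # three probabilities, largest for the last score
```

Sigmoid, tanh, and ReLU act on one neuron's value at a time, while softmax acts on a whole layer of scores together, which is why it is typically used on the output layer of a classifier.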