Machine Learning Techniques

Expert-defined terms from the Professional Certificate in Data Science in E-commerce course at Greenwich School of Business and Finance. Free to read, free to share, paired with a globally recognised certification pathway.


Machine learning techniques are algorithms and statistical models that computer systems use to perform tasks without explicit instructions, relying instead on patterns learned from data.

These techniques enable computers to learn and improve from experience, making them more efficient at predicting outcomes and making decisions based on data. In the Professional Certificate in Data Science in E-commerce, machine learning techniques play a crucial role in analyzing customer behavior, predicting sales trends, and optimizing marketing strategies.

Supervised Learning

Supervised learning is a type of machine learning technique where the algorithm learns from labeled training data.

The algorithm is trained on input-output pairs, enabling it to make predictions or decisions when new data is presented. Examples of supervised learning algorithms include linear regression, decision trees, and support vector machines.
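As a minimal sketch (not course code), the snippet below fits a one-feature linear regression to a handful of made-up labeled pairs and then predicts on an unseen input; `fit_linear` and the data are invented for this illustration.

```python
# Supervised learning in miniature: ordinary least squares on labeled
# (x, y) pairs, then prediction on new data. Data is illustrative only.

def fit_linear(xs, ys):
    """Closed-form least squares for a single feature: y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b

# Training pairs follow y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = fit_linear(xs, ys)
print(round(w, 6), round(b, 6))   # slope 2.0, intercept 1.0
print(round(w * 5.0 + b, 6))      # prediction for x = 5 is 11.0
```

The "labeled" aspect is that each training input `x` comes with its correct output `y`; the fitted parameters then generalize to inputs the model has never seen.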

Unsupervised Learning

Unsupervised learning is a machine learning technique where the algorithm learns structure from unlabeled data.

The algorithm identifies patterns and relationships in the data without explicit instructions, making it useful for clustering, dimensionality reduction, and anomaly detection tasks. Examples of unsupervised learning algorithms include k-means clustering, principal component analysis, and autoencoders.

Reinforcement Learning

Reinforcement learning is a type of machine learning technique where an agent learns by interacting with an environment.

The agent receives feedback in the form of rewards or penalties based on its actions, allowing it to learn the optimal strategy to maximize its cumulative reward over time. Examples of reinforcement learning algorithms include Q-learning, deep Q-networks, and policy gradients.
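To make the reward-feedback loop concrete, here is a hedged sketch of tabular Q-learning on an invented toy environment: a 4-state corridor where the agent starts at state 0, action 1 moves right, action 0 moves left, and reaching state 3 yields reward 1 and ends the episode. The environment, constants, and variable names are all assumptions for this example.

```python
import random

# Tabular Q-learning on a toy 4-state corridor (illustrative only).
random.seed(0)
n_states, n_actions = 4, 2
alpha, gamma, eps = 0.5, 0.9, 0.2          # learning rate, discount, exploration
Q = [[0.0] * n_actions for _ in range(n_states)]

for _ in range(200):                        # episodes
    s = 0
    while s != 3:
        # Epsilon-greedy action selection: mostly exploit, sometimes explore.
        a = (random.randrange(n_actions) if random.random() < eps
             else max(range(n_actions), key=lambda a: Q[s][a]))
        s2 = max(0, s - 1) if a == 0 else min(3, s + 1)
        r = 1.0 if s2 == 3 else 0.0
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Greedy policy per non-terminal state: 1 means "move right".
policy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(3)]
print(policy)
```

After training, the greedy policy chooses "right" in every state, since cumulative discounted reward is maximized by heading straight for the rewarding terminal state.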

Deep Learning

Deep learning is a subset of machine learning techniques that use artificial neural networks with many layers to learn representations of data.

Deep learning algorithms have revolutionized fields such as image recognition, natural language processing, and speech recognition. Examples of deep learning architectures include convolutional neural networks, recurrent neural networks, and transformer models.

Neural Networks

Neural networks are computational models inspired by the structure and function of the human brain.

These networks consist of interconnected nodes (neurons) organized in layers, where each neuron processes input data and passes its output to the next layer. Neural networks are used in deep learning to solve complex problems such as image classification, language translation, and game playing.
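The layer-by-layer flow can be sketched with a tiny two-layer network that computes XOR. The weights here are hand-picked for illustration rather than learned; a real network would obtain them via training (e.g. gradient descent).

```python
import math

# A 2-layer feed-forward network computing XOR with hand-picked weights,
# showing how each neuron's output feeds the next layer.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2):
    # Hidden layer: two neurons, each a weighted sum + bias + activation.
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # behaves like OR(x1, x2)
    h2 = sigmoid(20 * x1 + 20 * x2 - 30)    # behaves like AND(x1, x2)
    # Output neuron combines hidden activations: OR and not AND = XOR.
    return sigmoid(20 * h1 - 20 * h2 - 10)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(forward(a, b)))       # rounds to 0, 1, 1, 0
```

XOR is a classic example because no single-layer network can compute it; the hidden layer is what makes the problem separable.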

Convolutional Neural Networks (CNNs)

Convolutional neural networks are deep learning architectures designed for processing grid-structured data such as images.

CNNs use convolutional layers to extract features from input data and pooling layers to reduce spatial dimensions. CNNs have achieved state-of-the-art performance in tasks like image recognition, object detection, and facial recognition.
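The feature-extraction step can be shown in isolation: the sketch below slides a 2x2 kernel over a 3x3 "image" and takes a dot product at each position. In a real CNN the kernel values are learned; here they are fixed for illustration.

```python
# The convolution operation at the heart of a CNN, on toy data.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1          # output height (no padding, stride 1)
    ow = len(image[0]) - kw + 1       # output width
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # Dot product of the kernel with the image patch at (i, j).
            out[i][j] = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw)
            )
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
edge = [[1, -1],
        [1, -1]]   # responds to left-to-right intensity changes
print(conv2d(image, edge))   # → [[-2.0, -2.0], [-2.0, -2.0]]
```

The constant output reflects that the image brightens by exactly 1 per column everywhere, so the horizontal-gradient detector fires uniformly.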

Recurrent Neural Networks (RNNs)

Recurrent neural networks are deep learning architectures designed for processing sequential data.

RNNs have recurrent connections that allow information to persist across time steps, making them suitable for tasks like speech recognition, machine translation, and sentiment analysis. However, RNNs suffer from vanishing or exploding gradient problems.

Long Short-Term Memory (LSTM)

Long Short-Term Memory is a type of recurrent neural network architecture designed to address the vanishing gradient problem in traditional RNNs. LSTMs have memory cells that can store information for long periods, enabling them to learn dependencies in sequential data more effectively. LSTMs are widely used in applications like speech recognition, text generation, and time series forecasting.

Generative Adversarial Networks (GANs)

Generative adversarial networks are deep learning architectures that consist of two competing neural networks: a generator and a discriminator.

The generator generates synthetic data samples, while the discriminator distinguishes between real and fake samples. GANs are used for tasks like image generation, data augmentation, and style transfer. However, training GANs can be challenging due to mode collapse and instability issues.

Autoencoders

Autoencoders are neural network architectures designed for unsupervised learning of compressed data representations.

An autoencoder consists of an encoder that compresses input data into a latent representation and a decoder that reconstructs the original data from the latent representation. Autoencoders are used for tasks like image denoising, anomaly detection, and feature extraction.

Decision Trees

Decision trees are a type of supervised learning algorithm that uses a tree-like graph of decisions and their possible consequences. Each internal node represents a decision based on a feature, each branch represents the outcome of the decision, and each leaf node represents a class label. Decision trees are interpretable and used for tasks like classification, regression, and feature selection.

Random Forest

Random forest is an ensemble learning technique that combines multiple decision trees to improve prediction accuracy and reduce overfitting.

Each decision tree in the random forest is trained on a random subset of the training data and features. Random forest is used for classification, regression, and outlier detection tasks in areas like e-commerce, finance, and healthcare.

Support Vector Machines (SVM)

Support vector machines are a type of supervised learning algorithm that finds the optimal hyperplane separating data points of different classes.

SVMs maximize the margin between classes while minimizing classification errors, making them effective for binary classification tasks. SVMs are used in areas like image recognition, text classification, and bioinformatics.

K-Nearest Neighbors (KNN)

K-nearest neighbors is a simple and intuitive supervised learning algorithm that classifies a data point by the majority label among its k closest training examples.

KNN is a non-parametric algorithm that does not make assumptions about the underlying data distribution. KNN is used for classification, regression, and anomaly detection tasks in various domains.
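Because KNN has no training phase beyond storing the data, it fits in a few lines. The sketch below uses invented 2-D points and labels; `knn_predict` is named for this example only.

```python
from collections import Counter

# Minimal KNN classifier: find the k closest training points to the
# query and take a majority vote over their labels. Data is made up.

def knn_predict(train, query, k=3):
    """train: list of ((x, y), label) pairs."""
    # Sort training points by squared Euclidean distance to the query.
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - query[0]) ** 2 + (p[0][1] - query[1]) ** 2,
    )
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
print(knn_predict(train, (1, 1)))   # → "A"
print(knn_predict(train, (5, 4)))   # → "B"
```

Note the non-parametric character: no model parameters are fit, so all the work happens at prediction time, which is why KNN scales poorly to very large training sets.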

Principal Component Analysis (PCA)

Principal component analysis is an unsupervised learning technique used for dimensionality reduction.

PCA transforms high-dimensional data into a lower-dimensional space while preserving the most important information. PCA is used to identify patterns, outliers, and relationships in data and is commonly applied in fields like finance, marketing, and biology.
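A hedged sketch of the mechanics, using NumPy and synthetic data that is nearly one-dimensional (points close to the line y = x): center the data, form the covariance matrix, take its eigendecomposition, and project onto the top eigenvector. The data generation is an assumption for illustration.

```python
import numpy as np

# PCA via the covariance matrix on nearly 1-D synthetic 2-D data.
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, t + 0.05 * rng.normal(size=(100, 1))])  # y ≈ x plus noise

Xc = X - X.mean(axis=0)                  # 1. center the data
cov = Xc.T @ Xc / (len(X) - 1)           # 2. covariance matrix
vals, vecs = np.linalg.eigh(cov)         # 3. eigendecomposition (ascending order)
top = vecs[:, -1]                        # 4. direction of maximum variance
Z = Xc @ top                             # 5. 1-D projection (first principal component)

explained = float(vals[-1] / vals.sum())
print(round(explained, 3))               # close to 1: one component keeps nearly all variance
```

The explained-variance ratio is the usual criterion for deciding how many components to keep; here a single component suffices because the second dimension is almost redundant.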

K-Means Clustering

K-means clustering is an unsupervised learning algorithm that partitions data into k distinct clusters.

The algorithm assigns data points to clusters by minimizing the sum of squared distances between data points and their cluster centroids. K-means clustering is used for customer segmentation, anomaly detection, and image compression tasks in e-commerce and other industries.
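The assignment/update loop can be sketched on toy one-dimensional "customer spend" values with two obvious groups. Real use would involve multiple features and better initialization (e.g. k-means++); the data and function name are invented here.

```python
# Minimal k-means (k = 2) on toy 1-D data: alternate between assigning
# points to their nearest centroid and moving centroids to cluster means.

def kmeans_1d(points, centroids, iters=10):
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[i].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

spend = [10, 12, 11, 95, 102, 99]            # two clear customer groups
centroids, clusters = kmeans_1d(spend, [0.0, 50.0])
print(sorted(round(c) for c in centroids))   # → [11, 99]
```

Each iteration can only decrease the sum of squared distances the paragraph above describes, which is why the loop converges quickly on well-separated data.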

Association Rule Mining

Association rule mining is a data mining technique that discovers interesting relationships between items in large datasets.

The Apriori algorithm is a popular method for finding frequent itemsets and generating association rules based on support and confidence measures. Association rule mining is used for market basket analysis, recommendation systems, and personalized marketing campaigns.
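The support and confidence measures are simple to compute directly; the sketch below does so on made-up basket data (it shows the counting core, not the full Apriori candidate-generation and pruning machinery).

```python
# Market-basket support and confidence on toy transactions.

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
    {"bread", "milk"},
]

def support(itemset):
    # Fraction of transactions containing every item in the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    # P(consequent | antecedent) estimated from the transactions.
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "milk"}))        # 3 of 5 baskets → 0.6
print(confidence({"bread"}, {"milk"}))   # 0.6 / 0.8 → 0.75
```

A rule like {bread} → {milk} would be kept only if both numbers clear user-chosen minimum-support and minimum-confidence thresholds.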

Gradient Descent

Gradient descent is an optimization algorithm used to minimize the cost function of a machine learning model.

The algorithm iteratively updates the model parameters in the direction of the steepest descent of the cost function gradient. Gradient descent is used in training neural networks, linear regression, and logistic regression models to find the optimal set of parameters that minimize prediction errors.
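The update rule is easiest to see on a one-parameter cost function: below, gradient descent minimizes f(w) = (w - 3)^2, whose gradient is f'(w) = 2(w - 3). The function names and constants are chosen for this illustration.

```python
# Gradient descent on a simple quadratic cost function.

def gradient_descent(grad, w0, lr=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)    # step against the gradient (steepest descent)
    return w

# Minimize f(w) = (w - 3)^2; the gradient is 2 * (w - 3).
w = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w, 4))           # → 3.0, the minimizer
```

Training a neural network follows the same pattern, just with millions of parameters and a gradient computed by backpropagation; the learning rate `lr` is itself a hyperparameter.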

Hyperparameter Tuning

Hyperparameter tuning is the process of selecting the best set of hyperparameters for a machine learning model.

Hyperparameters are parameters that are set before the learning process begins and affect the learning process itself. Techniques like grid search, random search, and Bayesian optimization are used to search for the optimal hyperparameters in models like neural networks, support vector machines, and gradient boosting machines.
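Grid search, the simplest of these techniques, just evaluates every combination. In the sketch below, `score` is a made-up stand-in for a validation score; in practice it would train a model with those hyperparameters and evaluate it on held-out data.

```python
from itertools import product

# Grid search over two hypothetical hyperparameters (lr and depth).

def score(lr, depth):
    # Stand-in "validation accuracy", deliberately peaking at lr=0.1, depth=5.
    return 1.0 - abs(lr - 0.1) - 0.01 * abs(depth - 5)

grid = {"lr": [0.01, 0.1, 1.0], "depth": [3, 5, 7]}
best = max(product(grid["lr"], grid["depth"]),
           key=lambda combo: score(*combo))
print(best)    # → (0.1, 5)
```

The cost grows multiplicatively with each added hyperparameter, which is what motivates random search and Bayesian optimization for larger search spaces.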

Overfitting

Overfitting is a common problem in machine learning where a model performs well on training data but poorly on unseen data.

Overfitting occurs when a model is too complex and captures noise in the training data rather than the underlying patterns. Techniques like cross-validation, regularization, and early stopping are used to prevent overfitting and improve the generalization ability of machine learning models.

Underfitting

Underfitting is the opposite of overfitting, where a model is too simple to capture the underlying patterns in the data.

Underfitting occurs when a model is not able to learn from the training data effectively, resulting in high bias and low variance. Techniques like increasing model complexity, adding more features, and using more powerful algorithms are used to address underfitting and improve model performance.

Cross-Validation

Cross-validation is a technique used to evaluate the performance of machine learning models by splitting the data into multiple subsets for training and testing. Common cross-validation methods include k-fold cross-validation, leave-one-out cross-validation, and stratified cross-validation. Cross-validation helps assess the model's generalization ability and reduce the risk of overfitting on the training data.
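The splitting logic of k-fold cross-validation can be shown on its own; the sketch below partitions n data indices into k folds, each serving once as the test set. Model training is omitted, and `kfold_indices` is named for this example.

```python
# K-fold cross-validation index splitting: every index appears in
# exactly one test fold, and trains on the remaining folds.

def kfold_indices(n, k):
    folds = []
    # Distribute any remainder across the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds

for train, test in kfold_indices(n=6, k=3):
    print("train:", train, "test:", test)
```

Averaging the model's score across the k test folds gives a less optimistic estimate of generalization than a single train/test split.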

Feature Engineering

Feature engineering is the process of creating new features or transforming existing ones to improve model performance.

Feature engineering involves tasks like encoding categorical variables, scaling numerical features, handling missing data, and creating interaction terms. Effective feature engineering can enhance model accuracy, reduce training time, and improve interpretability.

Natural Language Processing (NLP)

Natural language processing is a subfield of artificial intelligence that focuses on the interaction between computers and human language.

NLP techniques are used for tasks like sentiment analysis, text classification, machine translation, and speech recognition. Common NLP tools and libraries include NLTK, spaCy, Transformers, and BERT.

Deep Reinforcement Learning

Deep reinforcement learning combines deep learning with reinforcement learning to let agents learn directly from high-dimensional inputs.

Deep reinforcement learning algorithms, such as deep Q-networks and policy gradients, have achieved human-level performance in games like Go, chess, and video games. Deep reinforcement learning is used in robotics, autonomous vehicles, and recommendation systems.

Transfer Learning

Transfer learning is a machine learning technique where a model trained on one task is reused as the starting point for a related task.

Transfer learning leverages the knowledge learned from the source task to improve performance on the target task with limited labeled data. Transfer learning is used in areas like image recognition, natural language processing, and speech synthesis to accelerate model training and improve generalization.

Bayesian Optimization

Bayesian optimization is a global optimization technique that uses probabilistic surrogate models to guide the search for an optimum.

Bayesian optimization balances exploration and exploitation to efficiently search for the best hyperparameters in a high-dimensional space. Bayesian optimization is used to tune hyperparameters in models like neural networks, support vector machines, and gradient boosting machines.

Ensemble Learning

Ensemble learning is a machine learning technique that combines multiple models to produce better predictions than any single model.

Ensemble methods like bagging, boosting, and stacking leverage the wisdom of crowds to make better predictions than individual models. Ensemble learning is used in tasks like classification, regression, and anomaly detection to reduce variance, bias, and overfitting.

Time Series Analysis

Time series analysis is a statistical technique used to analyze and forecast time-ordered data.

Time series data exhibits temporal dependencies and trends that can be captured using methods like autoregressive integrated moving average (ARIMA), exponential smoothing, and long short-term memory (LSTM) networks. Time series analysis is used in e-commerce for sales forecasting, inventory management, and anomaly detection.
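Exponential smoothing, the simplest of the methods named above, can be sketched on a toy daily-sales series (the numbers are invented): each smoothed value blends the latest observation with the running history.

```python
# Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

def exp_smooth(series, alpha=0.5):
    s = series[0]
    out = [s]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s   # blend new point with history
        out.append(s)
    return out

sales = [100.0, 120.0, 110.0, 130.0, 125.0]   # toy daily sales
smoothed = exp_smooth(sales)
print(smoothed)   # → [100.0, 110.0, 110.0, 120.0, 122.5]
```

A larger `alpha` tracks recent changes more aggressively, while a smaller one damps noise; the final smoothed value also serves as a one-step-ahead forecast.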

Feature Selection

Feature selection is the process of identifying the most relevant features in the data for training a model.

Feature selection methods like filter, wrapper, and embedded techniques help reduce dimensionality, improve model interpretability, and speed up training. Feature selection is essential for building efficient and accurate machine learning models in e-commerce and other domains.

Anomaly Detection

Anomaly detection is a machine learning technique that identifies data points that deviate significantly from the expected pattern.

Anomaly detection algorithms like isolation forests, one-class SVM, and autoencoders are used to detect outliers, fraud, and unusual patterns in e-commerce transactions, network traffic, and sensor data. Anomaly detection helps improve security, quality control, and risk management in various industries.

Model Evaluation Metrics

Model evaluation metrics are quantitative measures used to assess the performance of machine learning models.

Common evaluation metrics include accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC-ROC), and mean squared error (MSE). Model evaluation metrics help compare different models, optimize hyperparameters, and make informed decisions in e-commerce applications.
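Precision, recall, and F1 follow directly from the confusion-matrix counts; the sketch below computes them for invented binary labels (1 = positive class).

```python
# Precision, recall, and F1 from true vs. predicted binary labels.

def prf1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    precision = tp / (tp + fp)    # of predicted positives, how many were right
    recall = tp / (tp + fn)       # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
p, r, f = prf1(y_true, y_pred)
print(p, r, f)   # → 0.75 0.75 0.75
```

Precision and recall matter most on imbalanced problems (e.g. fraud detection), where plain accuracy can look high even for a useless model.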

Hyperparameter Optimization

Hyperparameter optimization is the process of searching for the best hyperparameter values for a machine learning model.

Hyperparameter optimization techniques like grid search, random search, Bayesian optimization, and genetic algorithms help fine-tune model parameters and improve prediction accuracy. Hyperparameter optimization is crucial for building robust and efficient machine learning models in e-commerce and other domains.

Model Interpretability

Model interpretability is the ability to explain how a machine learning model makes its predictions.

Interpretable models like decision trees, linear regression, and logistic regression provide insights into feature importance, model behavior, and prediction rationale. Model interpretability is essential for building trust, gaining insights, and meeting regulatory requirements in e-commerce and other industries.
