Linear Discriminant Analysis

By: Gokula Nandhini K May 08, 2023 | 03:15 PM Technology

Linear Discriminant Analysis (LDA) is a supervised learning algorithm used for classification tasks in machine learning. It is a technique used to find a linear combination of features that best separates the classes in a dataset.

LDA works by projecting the data onto a lower-dimensional space that maximizes the separation between the classes. It does this by finding a set of linear discriminants that maximize the ratio of between-class variance to within-class variance. In other words, it finds the directions in the feature space that best separate the different classes of data.
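The projection described above can be sketched for the simplest case: two classes and two features. This is a minimal illustration, not a full LDA implementation. It uses the standard two-class Fisher result that the best direction is proportional to the inverse of the within-class scatter matrix times the difference of the class means. The data points and the function names (`fisher_direction`, `project`) are made up for this example.

```python
# Minimal two-class, two-feature Fisher discriminant in pure Python.
# The data points below are invented for illustration only.

def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter(points, m):
    """2x2 scatter matrix: sum of outer products of the centered points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Return w proportional to S_W^{-1} (m_b - m_a)."""
    m_a, m_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m_b[0] - m_a[0], m_b[1] - m_a[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(points, w):
    """Project each 2-D point onto the direction w (a 1-D value per point)."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]

class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]

w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)
```

After projection, each point is a single number, and for this toy data every value in `proj_a` lies below every value in `proj_b`, so one threshold on the projected axis separates the two classes.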

LDA assumes that the data in each class follows a Gaussian distribution and that the covariance matrices of the different classes are equal. It works best when the classes are approximately linearly separable, meaning that a linear decision boundary can classify them accurately. [1]
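To see why these assumptions lead to a linear decision boundary, consider the two-class case. The notation here is introduced only for this sketch and does not come from the article: class means \mu_0 and \mu_1, a shared covariance matrix \Sigma, and class priors \pi_0 and \pi_1. With Gaussian class densities and equal covariances, the quadratic terms in x cancel and the log-odds become linear in x:

```latex
\log \frac{P(y = 1 \mid x)}{P(y = 0 \mid x)}
  = x^{\top} \Sigma^{-1} (\mu_1 - \mu_0)
  - \tfrac{1}{2} (\mu_1 + \mu_0)^{\top} \Sigma^{-1} (\mu_1 - \mu_0)
  + \log \frac{\pi_1}{\pi_0}
```

The decision boundary is the set of points where this expression equals zero, which is a hyperplane. If the class covariances differ, the quadratic terms no longer cancel and the boundary is generally curved.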

Figure 1. Linear Discriminant Analysis

Discriminant analysis is shown in Figure 1. Linear Discriminant Analysis is one of the most popular dimensionality reduction techniques for supervised classification problems in machine learning. It is also used as a pre-processing step for modeling differences between classes in machine learning and pattern classification applications.

Drawbacks of Linear Discriminant Analysis (LDA)

LDA handles supervised classification problems with two or more classes, which standard binary logistic regression does not handle directly. However, LDA fails when the class distributions share the same mean. In that case there is no difference between the class means to exploit, so LDA cannot find a new axis that makes the classes linearly separable.
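The shared-mean failure can be illustrated with a small made-up dataset: an inner cluster and an outer ring, both centered on the origin. The points and the helper names (`mean`, `radius_sq`) are hypothetical. Because the discriminant direction is built from the difference of the class means, a zero difference leaves LDA with nothing to work with.

```python
# Two invented classes that share the same mean (the origin):
# an inner cluster and an outer ring around it.
inner = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
outer = [(4.0, 0.0), (-4.0, 0.0), (0.0, 4.0), (0.0, -4.0)]

def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

mean_inner = mean(inner)
mean_outer = mean(outer)

# The difference of class means is the zero vector, so any direction
# derived from it is also zero and every projected point collapses to 0.
mean_diff = tuple(b - a for a, b in zip(mean_inner, mean_outer))

def radius_sq(p):
    """A simple non-linear feature: squared distance from the origin."""
    return p[0] ** 2 + p[1] ** 2

r_inner = [radius_sq(p) for p in inner]
r_outer = [radius_sq(p) for p in outer]
```

Here `mean_diff` is the zero vector, while the squared-radius feature cleanly splits the two groups. The squared radius is only one hand-picked non-linear feature, used here to show why a non-linear approach can succeed where a linear projection cannot.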

To overcome such problems, we use non-linear discriminant analysis in machine learning. [2]

Example of LDA

Consider a simple example of dimensionality reduction and feature extraction: you want to check the quality of a soap based on information about it, including features such as weight, volume, people's preference score, odor, color, and contrast.

A small scenario makes the problem clearer:

  1. Object to be tested: soap.
  2. Quality of the product: the class category, ‘good’ or ‘bad’ (dependent variable; categorical, measured on a nominal scale).
  3. Features describing the product: the various parameters that describe the soap (independent variables; measured on nominal, ordinal, or interval scales).

Once the target (dependent) variable is decided, other related information can be extracted from existing datasets to check the effect of the features on the target variable. [3]
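One simple way to check how strongly each feature relates to the target is a per-feature Fisher score: the squared gap between the class means divided by the sum of the class variances. The sketch below applies it to the soap scenario. Every measurement, the field names (`weight_g`, `preference`, `quality`), and the `fisher_score` helper are invented for illustration, and the function assumes exactly two classes.

```python
# Hypothetical soap measurements; every number here is invented.
soaps = [
    {"weight_g": 100.0, "preference": 8.0, "quality": "good"},
    {"weight_g": 102.0, "preference": 9.0, "quality": "good"},
    {"weight_g": 98.0,  "preference": 7.0, "quality": "good"},
    {"weight_g": 101.0, "preference": 3.0, "quality": "bad"},
    {"weight_g": 99.0,  "preference": 2.0, "quality": "bad"},
    {"weight_g": 100.0, "preference": 4.0, "quality": "bad"},
]

def fisher_score(rows, feature, target="quality"):
    """Squared gap between class means divided by summed class variances."""
    groups = {}
    for row in rows:
        groups.setdefault(row[target], []).append(row[feature])
    a, b = groups.values()  # this sketch assumes exactly two classes
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / len(a)
    var_b = sum((x - mean_b) ** 2 for x in b) / len(b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b)

scores = {f: fisher_score(soaps, f) for f in ("weight_g", "preference")}
```

In this toy data the preference score separates ‘good’ from ‘bad’ soaps far better than weight does, so it receives the higher score. LDA itself goes further by weighing all features jointly instead of one at a time.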

LDA is a useful algorithm for classification tasks, particularly when dealing with high-dimensional datasets. However, its assumptions about the data can limit its applicability in certain scenarios. It is always important to carefully evaluate the assumptions of any machine learning algorithm before using it for a particular problem.

References:

  1. https://www.geeksforgeeks.org/ml-linear-discriminant-analysis//
  2. https://www.javatpoint.com/linear-discriminant-analysis-in-machine-learning
  3. https://www.analyticssteps.com/blogs/introduction-linear-discriminant-analysis-supervised-learning

Cite this article:

Gokula Nandhini K (2023) Linear Discriminant analysis, Anatechmaz, pp.74
