Named Entity Recognition (NER)

By: Janani R | May 4, 2023 | 10:30 AM | Technology

Named Entity Recognition (NER) is a Natural Language Processing (NLP) task that identifies named entities in a text and classifies them into predefined categories such as person names, organizations, and locations. The goal of NER is to extract structured information from unstructured text and represent it in a machine-readable format. Approaches typically use BIO notation, which marks the beginning (B) and the inside (I) of each entity; O marks tokens that are outside any entity.[1]
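To make the BIO notation concrete, here is a minimal Python sketch. The helper names `spans_to_bio` and `bio_to_spans`, the example sentence, and the `PER`/`LOC` label names are illustrative assumptions, not part of any particular library.

```python
def spans_to_bio(tokens, spans):
    """Turn entity spans into one BIO tag per token.

    spans: list of (start, end, label) token indices, end exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


def bio_to_spans(tags):
    """Recover (start, end, label) spans from a BIO tag sequence."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        # Close the open entity unless this tag continues it.
        if start is not None and tag != f"I-{label}":
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Hypothetical example sentence and annotations.
tokens = ["Barack", "Obama", "visited", "New", "York", "."]
spans = [(0, 2, "PER"), (3, 5, "LOC")]

tags = spans_to_bio(tokens, spans)
print(list(zip(tokens, tags)))
# tags == ['B-PER', 'I-PER', 'O', 'B-LOC', 'I-LOC', 'O']
print(bio_to_spans(tags))
# [(0, 2, 'PER'), (3, 5, 'LOC')]
```

The B tag is what lets two adjacent entities of the same type be told apart: a new B always starts a new entity, even directly after an I.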

Data classification and annotation are important for a wide range of applications such as autonomous vehicles, recommendation systems, and more. However, extracting and classifying information from unstructured data is difficult for most traditional processing algorithms. NER addresses this limitation by scanning unstructured text to locate and classify entity mentions. Besides person names, organizations, and brands, NER can classify dates and times, email addresses, and numerical measurements such as money and weight. NER models thus facilitate data extraction workflows across industries.[2]

More generally, named-entity recognition (also known as entity identification or entity extraction) is a subtask of information extraction (text analytics) that aims to find and categorize specific entities in text, e.g., nouns. The phrase “Named Entity” was coined in 1996 at the 6th Message Understanding Conference (MUC), when extracting information from unstructured text became an important problem (Nadeau and Sekine, 2007). In the linguistic domain, NER involves automatically scanning unstructured text to locate “entities” for term normalization and classification into categories such as person names, organizations (e.g., companies, government organizations, committees), locations (e.g., cities, countries, rivers), or date and time expressions (Mansouri et al., 2008).[3]

Figure 1. Named Entity Recognition (NER)

Figure 1 illustrates NER: named entities in a text are identified and classified into predefined categories such as people, organizations, locations, and dates.

NER can be useful in many applications, such as information retrieval, text summarization, sentiment analysis, and question answering. For example, NER can identify the names of people, organizations, and locations mentioned in a news article, and these entities can then be used to extract relevant information and generate a summary of the article.

There are several approaches to NER, including rule-based systems, statistical models, and deep learning models. Rule-based systems use handcrafted rules and heuristics to identify named entities, while statistical models and deep learning models learn to identify named entities automatically from annotated training data.
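As a rough illustration of the rule-based approach, the sketch below uses a few handcrafted regular expressions to tag email addresses, dates, and money amounts. The patterns, label names, and the `rule_based_ner` function are hypothetical and deliberately simplistic; a real rule-based system would need far more rules plus gazetteers for names and places.

```python
import re

# Hypothetical handcrafted rules: (label, compiled pattern).
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),   # e.g. $1,250.50
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),         # ISO dates only
]


def rule_based_ner(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    entities = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            entities.append((match.start(), match.end(), label, match.group()))
    return sorted(entities)


text = "Invoice sent to ana@example.com on 2023-05-04 for $1,250.50."
for start, end, label, value in rule_based_ner(text):
    print(f"{label:5} {value!r} at [{start}, {end})")
```

Rules like these are precise on the formats they were written for but miss anything unanticipated (for instance, "May 4, 2023" is not matched here), which is the main motivation for models that learn from annotated data.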

One of the most popular deep learning models for NER is the Bidirectional LSTM-CRF model. It consists of a bidirectional LSTM (Long Short-Term Memory) layer that captures the context on both sides of each token, followed by a CRF (Conditional Random Field) layer that makes the final NER predictions while taking into account the dependencies between the labels of neighboring tokens.
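The role of the CRF layer can be sketched without any deep learning library. In the simplified example below, the per-token scores (`emissions`) stand in for what a BiLSTM would produce, and `transition_score` stands in for the transition scores a CRF would learn; both are made-up values chosen for illustration. Viterbi decoding then picks the label sequence with the best total score, rather than the best label for each token in isolation.

```python
LABELS = ["O", "B-PER", "I-PER"]


def transition_score(prev, curr):
    """Hypothetical transition scores: I-PER may only follow B-PER or I-PER."""
    if curr == "I-PER" and prev not in ("B-PER", "I-PER"):
        return -1e4  # effectively forbids the transition
    return 0.0


def viterbi(emissions):
    """emissions: one {label: score} dict per token. Returns the best label path."""
    # best[i][label] = (score of the best path ending in `label` at token i,
    #                   label chosen for token i - 1 on that path)
    # Assumption: the sentence start is treated as if it followed an O tag.
    best = [{lab: (emissions[0][lab] + transition_score("O", lab), None)
             for lab in LABELS}]
    for scores in emissions[1:]:
        row = {}
        for curr in LABELS:
            prev = max(LABELS,
                       key=lambda p: best[-1][p][0] + transition_score(p, curr))
            total = best[-1][prev][0] + transition_score(prev, curr) + scores[curr]
            row[curr] = (total, prev)
        best.append(row)
    # Backtrack from the best final label.
    label = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [label]
    for row in reversed(best[1:]):
        label = row[label][1]
        path.append(label)
    return path[::-1]


# Made-up scores for the tokens ["Marie", "Curie", "smiled"].
emissions = [
    {"O": 2.0, "B-PER": 1.8, "I-PER": 0.0},
    {"O": 0.5, "B-PER": 0.2, "I-PER": 3.0},
    {"O": 3.0, "B-PER": 0.0, "I-PER": 0.1},
]

greedy = [max(scores, key=scores.get) for scores in emissions]
print(greedy)              # ['O', 'I-PER', 'O']  (invalid: I-PER follows O)
print(viterbi(emissions))  # ['B-PER', 'I-PER', 'O']
```

Choosing each label independently yields an I-PER with no preceding B-PER, while decoding over the whole sequence repairs the first label. In an actual BiLSTM-CRF, both the emission and the transition scores are learned jointly from annotated data rather than written by hand.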

NER is a challenging task, as named entities can have complex and varying structures, and can be ambiguous or overlapping. However, with the advances in deep learning and the availability of large annotated datasets, NER has become a widely studied and useful tool in natural language processing.

References:

  1. https://paperswithcode.com/task/named-entity-recognition-ner
  2. https://www.startus-insights.com/innovators-guide/natural-language-processing-trends/ - named-entity-recognition
  3. https://www.frontiersin.org/articles/10.3389/fcell.2020.00673/full

Cite this article:

Janani R (2023), Named Entity Recognition (NER), Anatechmaz, pp. 234
