Types and Benefits of Data Cataloging

Thanusri swetha J October 17, 2021 | 10:45 AM Technology

A data catalog is a detailed inventory of all data assets in an organization, designed to help data professionals quickly find the most appropriate data for any analytical or business purpose. [1]

Figure 1. The Types and Benefits of Data Cataloging

As Figure 1 shows, a data catalog links data with the assets that make it meaningful, such as documentation, queries, history, and glossaries. By combining metadata with data management, governance, and search capabilities, a data catalog helps a company organize its data, discover the right data assets, and evaluate whether an asset is right for a specific use case. [3]
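To make the idea of combining metadata with search more concrete, here is a minimal Python sketch of an in-memory catalog. It is only an illustration, not how any particular catalog product works: the class names (CatalogEntry, DataCatalog), the metadata fields, and the sample assets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata describing one data asset (all field names are illustrative)."""
    name: str
    description: str
    owner: str
    tags: set[str] = field(default_factory=set)
    glossary_terms: set[str] = field(default_factory=set)


class DataCatalog:
    """A toy in-memory catalog: register assets, then search their metadata."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Later registrations under the same name overwrite earlier ones.
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> list[str]:
        """Return sorted names of assets whose metadata mentions the keyword."""
        needle = keyword.lower()
        hits = []
        for entry in self._entries.values():
            # Search the name, description, tags, and glossary terms together.
            haystack = " ".join(
                [entry.name, entry.description, *entry.tags, *entry.glossary_terms]
            ).lower()
            if needle in haystack:
                hits.append(entry.name)
        return sorted(hits)


# Hypothetical sample assets, invented for this example.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="sales_orders",
    description="One row per customer order, refreshed nightly.",
    owner="finance-data",
    tags={"sales", "orders"},
    glossary_terms={"revenue"},
))
catalog.register(CatalogEntry(
    name="web_sessions",
    description="Clickstream sessions from the public website.",
    owner="marketing-data",
    tags={"web", "traffic"},
))

print(catalog.search("revenue"))  # ['sales_orders']
```

The point of the sketch is that the search runs over the descriptive metadata rather than the data itself, which is what lets an analyst find an asset by a business term such as "revenue".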

Data catalog benefits:

  • Better understanding of data through improved context:
    Analysts can find detailed descriptions of data, including comments from other data citizens, and better understand how data is relevant to the business.
  • Increased operational efficiency:
    A data catalog creates an optimal division of labor between users and IT—data citizens can access and analyze data faster, and IT staff can spend more time focusing on high-priority tasks.
  • Reduced risk:
    Analysts have greater confidence that they’re working with data they’re authorized to use for a given purpose, in compliance with industry and data privacy regulations. They can also quickly review annotations and metadata to spot null fields or incorrect values that can impact analysis.[4]
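The last point, reviewing metadata to spot null fields, can be sketched with a simple column profile. The Python example below is a rough illustration under stated assumptions: the null_rates helper, the sample rows, and the 50% threshold are all made up for this example, and real catalogs typically compute such profiles automatically.

```python
def null_rates(rows: list[dict[str, object]]) -> dict[str, float]:
    """Return the fraction of missing (None or absent) values per column."""
    if not rows:
        return {}
    columns = sorted({key for row in rows for key in row})
    return {
        col: sum(1 for row in rows if row.get(col) is None) / len(rows)
        for col in columns
    }


# Hypothetical sample rows, invented for this example.
sample = [
    {"order_id": 1, "amount": 19.99, "region": "EU"},
    {"order_id": 2, "amount": None, "region": "US"},
    {"order_id": 3, "amount": 5.00, "region": None},
    {"order_id": 4, "amount": None, "region": "EU"},
]

profile = null_rates(sample)

# Flag columns where at least half the values are missing (arbitrary threshold).
flagged = [col for col, rate in profile.items() if rate >= 0.5]

print(profile)  # {'amount': 0.5, 'order_id': 0.0, 'region': 0.25}
print(flagged)  # ['amount']
```

Storing a profile like this alongside each asset's description lets an analyst see at a glance that a column such as amount may be too sparse for a given analysis.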

Types of data catalogs:

When it comes to organizing big data, there’s no such thing as a one-size-fits-all approach. Gartner identifies three distinct subcategories of data catalogs, so you can determine which type is right for your company’s situation:

  • Tool-specific or vendor data catalogs
    These data catalogs may be delivered as part of a cloud-based data lake, data preparation tool, or Hadoop distribution. This approach requires little effort from the organization, but it has limits: as your list of vendors grows, you may end up with multiple data catalogs. That fragmentation makes it harder to plug in a BI solution and establish a single source of truth.
  • Data catalogs specifically meant for data lakes
    This type of data catalog is used primarily by data scientists and data engineers. While thorough, it adapts poorly to the rest of the organization and does not make it easy for business users to access the data and use it for their own digital initiatives.
  • Enterprise data catalogs for analysis and teamwork
    Gartner defines these as “generalist, business-oriented data catalogs for broader use in information governance and infonomics – targeted at the Chief Data Officer (CDO).” [2]

References:
  1. https://www.ibm.com/topics/data-catalog
  2. https://www.sisense.com/glossary/data-cataloging/
  3. https://atlan.com/what-is-a-data-catalog/
  4. https://www.ibm.com/topics/data-catalog
Cite this article:

Thanusri swetha J (2021), Types and Benefits of Data Cataloging, Anatechmaz, pp. 36
