By Jason Rushin
Published on July 19, 2021
Metadata is information about data. A clothing catalog and a dictionary are both examples of metadata repositories. Indeed, a popular online catalog, like Amazon, offers rich metadata around products to guide shoppers: ratings, reviews, and product details are all examples of metadata.
Folks who work closely with data, like analysts, data scientists, and IT teams, rely on metadata to give them crucial context for how to use a given asset. Today, metadata is extremely helpful in classifying, describing, and providing critical information about digital data.
Yet not all forms of metadata are created equal. For metadata to be useful, organizations need to understand how to best identify, capture, and share metadata with their workers. That’s where metadata management best practices can help.
Co-author of Knowledge and Dignity in the Era of “Big Data”
Every organization is swimming in data, which makes finding the right data a challenge. But there is a mind-blowing way to catalog and classify data: it’s data…about data!
Yes, we’re talking about metadata, or information that describes other data. For enterprise data, metadata, together with effective metadata management, is a critical component of a good data management strategy. By providing information on the underlying data, metadata enables organizations to manage, govern, and utilize the data in effective, appropriate ways.
Metadata management best practices ensure that accurate metadata is accessible to everyone who needs it across the enterprise. This requires a metadata management solution that supports data search and discovery as well as data governance, which together give the right people access to both the metadata and the underlying data. In today’s world, metadata management best practices call for a data catalog.
Metadata helps users find the data they need, see an inventory of available data, and evaluate the data’s fitness for intended uses. Metadata includes things like:
Descriptive information. This includes information on title, purpose, creation date, and creator. It’s useful for cataloging and discovering data, and identifying which data is most appropriate for use.
Structural information. This includes how data is formatted, with information on tables, pages, types, and relationships. It’s useful for understanding how data is organized, determining if it can be combined with other data, and enabling relevant data discovery.
Administrative information. This includes access permissions, locations, file size, and ownership information. It’s useful for data governance, compliance, controls, and data management.
Reference information. This includes information on quality, sources, processes used, schemas, and formulas. It’s useful in determining how data can be utilized.
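To make these four categories concrete, here is a minimal Python sketch of how a single metadata record might be organized. The class and field names (`DatasetMetadata`, `allowed_roles`, and so on) are hypothetical, chosen only for illustration, and do not come from any particular catalog product.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetMetadata:
    """Hypothetical record grouping the four kinds of metadata."""

    # Descriptive: what the data is and who created it
    title: str
    purpose: str
    created_on: date
    creator: str

    # Structural: how the data is formatted (column name -> type)
    columns: dict[str, str] = field(default_factory=dict)

    # Administrative: ownership, location, and access permissions
    owner: str = ""
    location: str = ""
    allowed_roles: set[str] = field(default_factory=set)

    # Reference: provenance and quality
    source_system: str = ""
    quality_notes: str = ""

    def can_access(self, role: str) -> bool:
        """Return True if the given role is permitted to use the dataset."""
        return role in self.allowed_roles


# Example values are invented for illustration.
orders = DatasetMetadata(
    title="Online orders",
    purpose="Daily revenue reporting",
    created_on=date(2021, 7, 19),
    creator="sales-ops",
    columns={"order_id": "INTEGER", "total": "REAL"},
    owner="sales-ops",
    location="warehouse.sales.orders",
    allowed_roles={"analyst", "data_scientist"},
    source_system="web checkout",
    quality_notes="Totals validated nightly",
)
```

Keeping the administrative fields next to the descriptive ones means a simple check such as `orders.can_access("analyst")` can answer a governance question from the same record that powers discovery.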
High volumes of data and metadata are a growing problem. To manage the sheer volume of metadata, a new category has emerged called active metadata. Guido De Simoni, senior director at Gartner, a global research and advisory firm, states, “The metadata management market made a dramatic shift beginning in 2020, and its primary focus is now active metadata.”
Active metadata signals a shift from manual processes to automated ones. Artificial intelligence and machine learning (AI and ML) are removing some of the burden of manual metadata management, which has grown too cumbersome for people to manage alone. Data intelligence integrates intelligence derived from active metadata into categories like data quality, governance, and profiling.
For metadata to be useful, it needs to be accessible, searchable, and usable. This requires a metadata management process. But where to begin? Here are five metadata management best practices that can enable the effective, sustainable, and beneficial use of metadata across the enterprise.
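As a rough illustration of automated capture, the sketch below harvests structural metadata directly from a database instead of asking a person to document it. This is far simpler than the AI- and ML-driven automation described above, and it assumes a SQLite database purely for convenience; the function name `harvest_table_metadata` and the sample table are hypothetical.

```python
import sqlite3


def harvest_table_metadata(conn: sqlite3.Connection) -> dict[str, dict]:
    """Collect column types and row counts for every user table."""
    metadata = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        if table.startswith("sqlite_"):
            continue  # skip SQLite's internal bookkeeping tables
        # PRAGMA table_info yields one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        metadata[table] = {
            "columns": {name: col_type for _, name, col_type, *_ in columns},
            "row_count": row_count,
        }
    return metadata


# Demo against an in-memory database with one invented table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO customers (email) VALUES ('a@example.com')")
harvested = harvest_table_metadata(conn)
```

Running a harvester like this on a schedule keeps the recorded structure in step with the actual tables, which is the basic idea behind replacing manual upkeep with automation.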
The success of any initiative requires a dedicated team, and metadata management is no different. An administration team will develop an organization’s metadata management process and metadata strategy, coordinate the rollout of metadata management processes and policies, and guide the selection of a metadata management tool.
The metadata administration team should have experience in data management, data governance, and the organization’s overall data landscape. They should also have the business acumen to connect a metadata management strategy with the organization’s data and business strategies.
Defining a metadata strategy requires an organization to consider its data goals. Leadership may wish to gain control of mountains of data, instill a data culture, enable faster, more agile, and more accurate decision-making, or something else entirely.
The metadata strategy should take those goals into account and provide direction so they are reached effectively. The metadata strategy should also consider:
The metadata required
Where it’s located
Any technical or infrastructure hurdles to overcome
How the metadata will be acquired and accessed later
Where it will be stored
Who will be responsible for its ongoing maintenance
Every organization should adopt a set of metadata standards to ensure uniformity. Such standardization of the metadata will serve as the basis for a metadata management process. There are commonly accepted metadata standards, such as the Dublin Core Metadata Element Set and the related ISO 15836 standard, which establish core properties for describing resources.
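One way to see what a standard buys you is to check records against it. The sketch below lists the 15 elements of the Dublin Core Metadata Element Set and flags any field that falls outside them. The helper `non_standard_fields` and the sample record, including its in-house `owner` field, are hypothetical.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def non_standard_fields(record: dict[str, str]) -> set[str]:
    """Return the field names in a record that are not Dublin Core elements."""
    return set(record) - DUBLIN_CORE_ELEMENTS


# Sample record with invented values.
record = {
    "title": "Quarterly sales report",
    "creator": "Finance team",
    "date": "2021-07-19",
    "format": "text/csv",
    "owner": "finance-data",  # hypothetical in-house field, not in Dublin Core
}
unexpected = non_standard_fields(record)
```

An organization would not necessarily reject fields like `owner`; a check of this kind simply makes extensions to the standard visible so they can be agreed on deliberately.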
A dedicated metadata management tool allows organizations to collect and utilize metadata. Such a tool typically takes the form of a data catalog, which enables easy storage and searching of metadata. It can even leverage artificial intelligence and machine learning to automatically capture and categorize the metadata. Advanced metadata management tools have capabilities covering metadata management processes, policies, and data governance needs.
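The storage-and-search idea behind a catalog can be sketched in a few lines of Python. This toy `MiniCatalog` class is hypothetical and keeps everything in memory; a real data catalog adds persistence, permissions, lineage, and much more.

```python
class MiniCatalog:
    """Toy in-memory catalog: stores metadata entries and searches them by keyword."""

    def __init__(self) -> None:
        self._entries: dict[str, dict[str, str]] = {}

    def register(self, name: str, **metadata: str) -> None:
        """Store (or overwrite) the metadata for a named data asset."""
        self._entries[name] = metadata

    def search(self, keyword: str) -> list[str]:
        """Return sorted asset names whose name or metadata values contain the keyword."""
        needle = keyword.lower()
        return sorted(
            name
            for name, metadata in self._entries.items()
            if needle in name.lower()
            or any(needle in str(value).lower() for value in metadata.values())
        )


# Asset names and metadata values are invented for illustration.
catalog = MiniCatalog()
catalog.register("orders", description="Online orders by day", owner="sales-ops")
catalog.register("customers", description="Customer contact details", owner="crm-team")
catalog.register("returns", description="Refunded orders", owner="sales-ops")

matches = catalog.search("orders")
```

Because the search looks at metadata values as well as asset names, a query for "orders" also surfaces the `returns` asset, whose description mentions orders. That is the kind of discovery that metadata makes possible and that a name-only lookup would miss.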
Once the above metadata management best practices are in place, it’s time to roll the metadata management strategy out to the entire organization. This can take a phased approach, covering specific organizational departments or types of data.
A common metadata management best practice is to bring industry experts into the process early to help ensure the underlying metadata management strategy is well designed and sustainable. It’s also important to continuously improve, adjust, and update the metadata management strategy, metadata management processes, and related policies, standards, and more.
One of the top sources for information on metadata management best practices is the Gartner Magic Quadrant for Metadata Management Solutions. This document describes the market and provides deep details on the strengths and cautions of nearly 20 data catalog and metadata management solution vendors.
Gartner Magic Quadrants place solution vendors in four categories: Leaders, Challengers, Visionaries, and Niche Players. Alation has been recognized as a Leader in the Gartner Magic Quadrant for Metadata Management Solutions for, among other things, its market visibility and traction, innovation with machine learning, and focus on active metadata and collaboration. This is the fourth consecutive year Alation has been recognized as a Leader.
The Alation Data Catalog provides a platform for metadata management best practices. By using its repository of metadata on information sources from across the enterprise, including data sets, business intelligence reports, visualizations, and conversations, the catalog helps people quickly find and understand data to improve analytics, data governance, privacy, cloud transformation, and more.
It dramatically improves the productivity of analysts, increases the accuracy of analytics, and enables confident data-driven decision making while empowering workers to find, understand, and govern data.
The metadata management best practices include assigning a metadata administration team, defining a metadata strategy, adopting metadata standards, deploying a metadata management tool, and expanding the metadata management strategy across the organization.