As data itself has become currency, the metadata describing it, and what happens to it, has also emerged as a core asset of modern business. Metadata is interwoven throughout all information; like DNA, it serves as the genetic makeup of data. So even though metadata may not be the most obvious data created, it is key to unlocking and exploiting the value of enterprise information.
To that end, an organized process of enterprise metadata management (EMM) can offer benefits beyond the structured world of information. It can tap into unstructured data, and it can bring to light valuable information from unconventional data sources, like social media and streaming data, too.
EMM is the process of organizing and harnessing this asset to optimize business operations. As Gartner research director Guido De Simoni notes, “[metadata management] is about a distinct organizational discipline leveraging metadata across programs.”
Key areas in which EMM plays a role are:
- Metadata Repositories
- Business Glossary
- Data Lineage
- Impact Analysis
- Rule Management
- Metadata Ingestion and Transformation
Metadata repositories are containers for metadata and are critical for maximizing its potential. Profiling data (structured and unstructured, big data and dark data) extracts metadata, which must then be stored in one of the three metadata repository architectures described by TechTarget: centralized, distributed, and federated.
Centralized repositories offer a single data store for metadata culled from profiled data. Distributed repositories retrieve metadata from its source systems in real time. Federated repositories attempt to combine the strengths of both approaches while minimizing their weaknesses: they access metadata sources in real time but centralize metadata definitions and locations to increase system efficiency.
Regardless of their type, metadata repositories provide a vital infrastructure for compiling metadata and creating a comprehensive business glossary. Information leaders should evaluate these architectures, along with their strengths and weaknesses, to find which structures best suit their business environments.
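To make the federated approach more concrete, the following is a minimal sketch, assuming a repository that keeps definitions and locations centrally while fetching the metadata itself from the owning source at lookup time. The class name `FederatedRepository`, its methods, and the example asset names are hypothetical and do not correspond to any particular product.

```python
from typing import Callable, Dict


class FederatedRepository:
    """Central index of definitions and locations (hypothetical sketch).

    The metadata itself is fetched from the owning source on each lookup.
    """

    def __init__(self) -> None:
        self._definitions: Dict[str, str] = {}
        self._fetchers: Dict[str, Callable[[], dict]] = {}

    def register(self, asset: str, definition: str,
                 fetch: Callable[[], dict]) -> None:
        """Record what an asset means and how to reach its live metadata."""
        self._definitions[asset] = definition
        self._fetchers[asset] = fetch

    def lookup(self, asset: str) -> dict:
        """Combine the central definition with metadata fetched right now."""
        live = self._fetchers[asset]()  # queried on every call, never cached
        return {"asset": asset, "definition": self._definitions[asset], **live}
```

A centralized repository would instead store the fetched values themselves, and a distributed one would hold no central definitions at all; this sketch sits between the two.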
“Without relationships, the data is only of limited use,” writes De Simoni, on the topic of big data. A definitive business glossary is an indispensable resource for establishing the relationships between vast amounts of information from disparate sources. Such a glossary can tie information to specific databases, graphs, models, etc., and easily distinguish relationships between sets of information.
Further, enterprise-established definitions help to avoid ambiguity when discussing business-sensitive information by clearly defining terms in the context of the business lexicon. Word specificity mitigates the risk of costly human errors caused by misaligned views on vocabulary. Set definitions can identify the meaning of “client,” compare and contrast it with “customer,” and create a mutual understanding among departments to help avoid vague, unintentional, or complicated communication.
Metadata management makes the process of building the business lexicon simpler. Although a glossary can be compiled manually, maintaining it becomes arduous when the number of definitions grows into the hundreds or thousands. Analyzing metadata enables accurate, holistic descriptions of terms through immediate sourcing and identification.
“Business metadata is all about adding context to data,” writes Bonnie O’Neil, president of Westridge Consulting and an internationally recognized expert on data warehousing and business rules. “A Dictionary or Glossary is a part of business metadata, and it is all about making meaning explicit and providing definitions to business terms, data elements, and abbreviations.”
Data lineage traces the origins of data and its movement through a life cycle. However, the term may “also describe what happens to data as it goes through diverse processes,” De Simoni remarks. Capturing metadata allows this information to be consolidated and analyzed more efficiently, so that critical intelligence can be tracked with precision.
More specifically, lineage metadata reveals data formats, locations, users, and stewards. It shows the path information took to reach its current state, and who accessed and manipulated it along the way. This is particularly valuable in data reuse and auditing contexts.
Thus EMM in this context is not only about having clean repositories of information, but also exploiting the repositories to catalogue information pedigree and augment business processes.
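The lineage metadata described here can be pictured as a chain of recorded events that is walked backwards from a dataset to its origin. The following is a rough sketch under stated assumptions: the `LineageEvent` fields, the `trace_lineage` helper, and the idea that each dataset has at most one recorded parent are all illustrative choices, not an established schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageEvent:
    """One step in a dataset's history (field names are illustrative)."""
    dataset: str  # dataset produced by this step
    source: str   # dataset it was derived from
    actor: str    # user or process that performed the step
    action: str   # e.g. "extract", "transform", "aggregate"


def trace_lineage(events: list, dataset: str) -> list:
    """Return the events leading to `dataset`, most recent step first.

    Assumes at most one recorded parent per dataset; stops on a cycle.
    """
    parent_of = {e.dataset: e for e in events}
    steps, seen, current = [], {dataset}, dataset
    while current in parent_of:
        event = parent_of[current]
        steps.append(event)
        if event.source in seen:  # guard against cyclic records
            break
        seen.add(event.source)
        current = event.source
    return steps
```

Reading the `source` and `actor` fields off the returned events answers the two audit questions raised above: where the data came from and who handled it along the way.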
Impact analysis is a form of business premonition. It is a technique used to assess the impacts of possible changes to an existing business structure, and identify weaknesses, threats, or problems before they manifest. By anticipating possible outcomes, information managers can develop contingency plans to avert implementation problems. Performing an impact analysis may also expose unapparent cross-dependencies of information, thus allowing for more accurate predictions and courses of action.
Metadata management supports impact analysis by maintaining an account of company information and facilitating data visibility. Having the maximum amount of information about their data allows managers to make the best decisions during the impact analysis process, creating more reliable plans and improving business outcomes.
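One way to picture metadata-backed impact analysis is as a traversal of dependency metadata: given a proposed change to one asset, find everything downstream of it. The sketch below assumes the dependencies are available as a simple mapping from each asset to its direct consumers. The function `downstream_impact` and the asset names in `DEPENDENTS` are hypothetical.

```python
from collections import deque


def downstream_impact(dependents: dict, changed: str) -> list:
    """Return every asset reachable from `changed`, in breadth-first order.

    `dependents` maps an asset name to the assets that consume it directly.
    """
    seen = {changed}
    queue = deque([changed])
    impacted = []
    while queue:
        current = queue.popleft()
        for consumer in dependents.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                impacted.append(consumer)
                queue.append(consumer)
    return impacted


# Hypothetical dependency metadata: asset -> direct consumers.
DEPENDENTS = {
    "crm.customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["reports.churn", "reports.revenue"],
    "reports.churn": ["dashboards.executive"],
}
```

Calling `downstream_impact(DEPENDENTS, "crm.customers")` lists the warehouse table, both reports, and the executive dashboard, which surfaces the kind of cross-dependency that is easy to miss when only direct consumers are considered.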
Business rules are also forms of metadata. They provide context for business operations and describe how processes should execute, using basic if-then statements whose conditions evaluate to true or false. For example, if a customer has patronized a business for more than a year, then give that customer a $20 credit on their first transaction after the one-year mark. Business rules, however, have their own accompanying metadata that further describes them.
Specific terms used in the rules can be imported to the business glossary. Outcomes of potential rule changes can be evaluated through impact analyses, while metadata lineage tracks how the rules have developed and who governs them. These processes provide critical information toward achieving overall business goals.
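The loyalty-credit example can be sketched as a small if-then function paired with a record of its own metadata. This is only an illustration: the names `RuleMetadata`, `LOYALTY_RULE`, and `loyalty_credit` are hypothetical, "more than a year" is approximated as more than 365 days, and the sketch does not track whether the credit has already been applied to an earlier transaction.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RuleMetadata:
    """Metadata describing a business rule (all fields are illustrative)."""
    name: str
    owner: str             # who governs the rule
    glossary_terms: tuple  # terms to import into the business glossary


# Hypothetical metadata record for the loyalty-credit rule.
LOYALTY_RULE = RuleMetadata(
    name="loyalty_credit",
    owner="customer-success",
    glossary_terms=("customer", "credit"),
)


def loyalty_credit(first_purchase: date, today: date) -> int:
    """Return the credit in dollars for a transaction made on `today`.

    If the customer relationship is older than 365 days, the credit is 20;
    otherwise it is 0.
    """
    if (today - first_purchase).days > 365:
        return 20
    return 0
```

Keeping the rule's owner and glossary terms alongside the logic is what lets the glossary, impact analysis, and lineage processes refer back to the rule.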
Metadata Ingestion and Transformation
“Data ingestion is the process of obtaining, importing, and processing data for later use or storage in a database,” as defined by TechTarget. Transformation is the process of converting the original data format into one that is easily stored and accessed in databases. Metadata improves these actions by consolidating heterogeneous data sources and acting as a liaison between them. However, a company must first have clean and accurate metadata enterprise-wide to benefit from these operations. Once that is accomplished, an ingestion solution that transforms metadata into database-friendly formats can unlock the potential of information.
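As a rough illustration of the transformation step, the sketch below maps column metadata from differently shaped sources onto one shared schema. It assumes each source describes a column with a name and a type under varying keys. The function `normalize_column_metadata` and the key aliases it recognizes are hypothetical; real systems expose different fields.

```python
def normalize_column_metadata(record: dict, source: str) -> dict:
    """Map a source-specific metadata record onto one shared schema.

    The aliases below are illustrative guesses at how sources name fields.
    """
    aliases = {
        "name": ("name", "column_name", "field"),
        "type": ("type", "data_type", "dtype"),
    }
    normalized = {"source": source}
    for target, candidates in aliases.items():
        # Take the first alias present in the record, if any.
        value = next((record[k] for k in candidates if k in record), None)
        if value is None:
            raise ValueError(f"{source}: missing '{target}' in {record!r}")
        normalized[target] = str(value).strip().lower()
    return normalized
```

Rejecting records that lack a required field, rather than guessing, reflects the point above that ingestion only pays off when the underlying metadata is clean and accurate.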