By Joe Hilleary
Published on April 12, 2022
In 2002, Capital One became the first company to appoint a Chief Data Officer (CDO). Since then, many other companies have added a CDO to their C-suite, and the responsibilities of the role have grown.
Early CDOs largely sought to ensure compliance with regulations around financial data, taking a defensive posture to guard company and customer information. Two decades on, the role has expanded to include responsibility for analytics and even data monetization. As the role’s obligations have increased, so too has its prevalence. Between 2010 and 2018, the number of CDOs in Fortune 1500 companies increased nearly eightfold.
Today, the modern CDO drives the data strategy for the entire organization. The individual initiatives that make up a data strategy may, at times, seem at odds with one another, but tools such as the enterprise data catalog can help CDOs strike the right balance between facilitating data access and enforcing data governance.
A comprehensive data strategy must balance two core impulses: making data useful and keeping it safe. On the one hand, facilitating access to data through data democratization allows a company to become more data-driven and derive more value from its data. More people can connect to useful data faster, and IT doesn’t have to play data middleman to the masses.
On the other hand, a data free-for-all creates risk, not only of data leaks that carry regulatory penalties, but also of inefficiency, as the same work and assets may be duplicated across the organization. So how do you strike the right balance between playing offense and defense with data? As with most things in life, the answer is somewhere in the middle. A modern data catalog, which promotes both governance and access, can aid in many of the tasks required of today’s CDO.
Although the exact list of CDO responsibilities varies dramatically between organizations and industries, the five duties highlighted below are common across the board:
Many businesses view data as a cost center. It’s expensive to collect and maintain, so unless it drives business actions that generate revenue, it drags on the bottom line. Data creates value when it informs decisions by businesspeople. Because of their domain knowledge and business expertise, these individuals understand what data is important and how it impacts the business. Unfortunately, connecting businesspeople’s expertise with data has historically been difficult because they are not involved in the data management process.
A data catalog enables these individuals to answer questions and provide commentary on the data assets with which they’re most familiar. Their input can help data teams prioritize which assets to develop further and prevent a mismatch between actual business data needs and what data teams assume the business needs. By bringing domain experts directly to the data, a catalog reduces the cycle time for data projects and enables businesspeople to identify new opportunities from the data more rapidly, spurring innovation and creating more business value.
Many companies have tasked their CDOs with enabling business users to perform their own analytics. Even when users are well versed in their preferred business intelligence (BI) tool, however, finding and accessing the right data assets continues to represent a key hurdle.
CDOs can leverage data catalogs to connect people to the right data for a given task. In the same way a card catalog indexes all books and periodicals in a library, a data catalog indexes all data throughout the enterprise. This makes it easier for businesspeople to find the right data, understand its meaning, send questions to the owner, and read comments and observations about the data from people who previously used it.
This ability to “browse” for data and gauge its fitness empowers users to truly serve themselves. Viewed from that perspective, the most important feature of a data catalog is its search bar, which lets business users look for data assets using keywords and plain-language queries. A search bar is an ideal interface for business users with limited or no technical knowledge. It makes the process of looking for data assets as familiar as shopping on Amazon.
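To make the search-bar idea concrete, here is a minimal sketch of keyword search over catalog metadata. The `CatalogEntry` fields, the sample entries, and the simple word-count scoring are all assumptions made for illustration; commercial catalogs use far richer metadata and ranking.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    # Hypothetical catalog record, invented for this sketch.
    name: str
    description: str
    owner: str
    tags: tuple = ()


def search_catalog(entries, query):
    """Return entries ranked by how many query words appear in their metadata."""
    words = query.lower().split()
    scored = []
    for entry in entries:
        # Search across the name, description, and tags as one lowercase string.
        text = " ".join([entry.name, entry.description, *entry.tags]).lower()
        hits = sum(1 for word in words if word in text)
        if hits:
            scored.append((hits, entry))
    # Highest hit count first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]


entries = [
    CatalogEntry("sales_orders", "Daily customer orders by region", "finance-team", ("sales", "revenue")),
    CatalogEntry("hr_headcount", "Monthly employee headcount", "hr-team", ("employees",)),
    CatalogEntry("web_sessions", "Website visits by customer segment", "marketing-team", ("traffic",)),
]

results = search_catalog(entries, "customer revenue")
```

In this sketch, a business user typing "customer revenue" gets `sales_orders` first because it matches both words, followed by `web_sessions`, which matches one. Each result carries an owner, so the user knows whom to ask about the asset.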
A perennial headache for CDOs is the continued proliferation of data silos. Different lines of business or functions within the organization often maintain their own data environments, creating insular bubbles. In a siloed data landscape, incongruent terms and definitions proliferate. The same business metrics may have different values depending on which team you ask. Furthermore, groups operating in a silo are unable to benefit from the data, insights, and expertise of other teams. Without interaction, no opportunity exists for synergy.
An enterprise data catalog contains listings of data assets from across the entire organization. Depending on data access controls, business users can discover tables from multiple lines of business. In addition, data curators can certify data sets and metrics within the data catalog so all business users use data in a consistent manner.
Even as CDOs have taken on other tasks, maintaining the privacy of sensitive data has remained a key responsibility. A large part of protecting data involves regulating access. Not everyone should be able to see or use all of an organization’s data. As an example, there are legitimate reasons that particular roles might need to view employee social security numbers, but in general, the fewer people with that access, the better.
By serving as a centralized conduit for discovering and requesting access to data, a data catalog provides CDOs and their data governance teams with the information and metadata they need to determine who should be able to see and access which data. Furthermore, many data catalogs can automatically flag sensitive data during their initial survey of company data assets, allowing governance teams to proactively create access protocols for those tables.
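As a rough illustration of how automatic flagging might work, the sketch below scans column names and sample values for signs of sensitive data. The patterns, the `flag_sensitive_columns` function, and the sample table are hypothetical; they do not represent any particular catalog's detection rules, which are typically far more extensive.

```python
import re

# Illustrative patterns only; a production scanner would use many more rules.
SSN_VALUE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
SENSITIVE_NAME = re.compile(r"ssn|social_security|date_of_birth|salary", re.IGNORECASE)


def flag_sensitive_columns(table):
    """Return the sorted names of columns whose name or sample values look sensitive.

    `table` maps each column name to a list of sample string values.
    """
    flagged = set()
    for column, samples in table.items():
        if SENSITIVE_NAME.search(column):
            # The column name itself suggests sensitive content.
            flagged.add(column)
        elif samples and all(SSN_VALUE.match(value) for value in samples):
            # Every sample value is shaped like a social security number.
            flagged.add(column)
    return sorted(flagged)


employees = {
    "employee_id": ["1001", "1002"],
    "SSN": ["123-45-6789", "987-65-4321"],
    "tax_ref": ["111-22-3333", "444-55-6666"],
    "department": ["Finance", "Marketing"],
}

flagged = flag_sensitive_columns(employees)
```

Here `SSN` is caught by its name, while `tax_ref` is caught only because its values match the SSN pattern. Checking values as well as names matters because sensitive columns are not always labeled clearly.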
Finally, CDOs are often accountable for maintaining and improving data quality. As the volume of data assets under management has increased, data teams have struggled to keep up. This has led to mistrust of data as businesspeople find errors in their reports and dashboards over and over again.
A data catalog, though not itself a data quality tool, helps in two ways. Integrations with dedicated data quality platforms allow data catalog search functions to surface higher quality data first. Data catalogs can promote those assets with high scores for key data quality indicators. Even without integration, usage metrics and company certifications can guide catalog users to the highest quality versions of different data assets.
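One way to picture this ranking is as a sort over search results. The sketch below is an assumption about how such ordering could be expressed, not a description of any vendor's algorithm: it places certified assets first, then orders by a quality score supplied by a hypothetical data quality integration, and finally by usage. All field names are invented for the example.

```python
def rank_assets(assets):
    """Order assets so certified, high-quality, frequently used ones come first."""
    def sort_key(asset):
        quality = asset.get("quality_score")
        return (
            asset["certified"],                        # True sorts above False
            quality if quality is not None else -1.0,  # unscored assets rank last
            asset["monthly_queries"],                  # usage breaks remaining ties
        )
    return sorted(assets, key=sort_key, reverse=True)


assets = [
    {"name": "revenue_v1", "certified": False, "quality_score": 0.62, "monthly_queries": 40},
    {"name": "revenue_v2", "certified": True, "quality_score": 0.95, "monthly_queries": 310},
    {"name": "revenue_scratch", "certified": False, "quality_score": None, "monthly_queries": 5},
]

ranked = rank_assets(assets)
```

When no quality integration exists, every `quality_score` would be `None`, and the ordering falls back to certification and usage alone, which mirrors the second scenario described above.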
Moreover, analysis of usage metrics can help beleaguered data teams prioritize their efforts, showing them which tables to focus their data quality efforts on.
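A simple way to turn usage metrics into a work queue is to weight each table's usage by how far it falls short on quality. The formula below, `monthly_queries * (1 - quality_score)`, is an illustrative assumption rather than an established metric, and the table names and numbers are made up.

```python
def prioritize_for_cleanup(tables):
    """Return table names ordered by cleanup priority, highest first.

    `tables` is a list of (name, monthly_queries, quality_score) tuples,
    where quality_score is assumed to lie between 0.0 and 1.0.
    """
    def priority(table):
        _, monthly_queries, quality_score = table
        # Heavily used tables with poor quality hurt the most people.
        return monthly_queries * (1.0 - quality_score)

    return [name for name, _, _ in sorted(tables, key=priority, reverse=True)]


tables = [
    ("orders", 900, 0.9),
    ("clickstream", 1200, 0.5),
    ("legacy_export", 15, 0.2),
]

queue = prioritize_for_cleanup(tables)
```

In this example `clickstream` lands at the top of the queue: it is both heavily queried and of middling quality. `legacy_export` has the worst quality score but so little usage that fixing it would help few people.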
Beyond the concrete implementation of a data strategy, CDOs often have to foster a data culture. In practice, this means increasing the reliance on and understanding of data at every level of the organization. This begins with education, knowledge sharing, and promoting data literacy. A data catalog that captures tribal knowledge about specific assets will empower data newcomers with crucial context about an asset’s origin, use cases, and best practices. Wiki-like articles, in which subject-matter experts add their advice on how best to leverage an asset, will help less technical users use data wisely.
At the same time, a data catalog brings a wide swath of stakeholders together. Businesspeople, data analysts, data stewards, data scientists, and IT debuggers, along with many other personas, can all use the same enterprise data catalog. Although many of these roles might never otherwise interact, within a catalog platform they can learn from one another and build a data culture.
Ultimately, however, the success or failure of a CDO depends not on how well they execute a data strategy or create a data culture, but rather on the impact those interventions have on the business as a whole.
CDOs must demonstrate a return on investment in data. CDOs who can prove their value to the business are more likely to outlast the short tenures that have historically plagued the role. To truly succeed (and increase investment in the data department), CDOs must link their own efforts to revenue opportunities, loss avoidance, and threat elimination. Those who can tie their work to both top-line and bottom-line results not only prove their impact but also build credibility for the entire data operation.