By Aaron Kalb
Published on February 13, 2020
Meet TrustCheck: Your Spell Check for SQL or BI. Just like Spell Check prompts you as you’re writing—or like Waze nudges you as you’re driving—TrustCheck guides you to desired behavior while ensuring accuracy as you’re creating a query or a dashboard.
With TrustCheck, data analysts see color-coded visual cues whenever they use a questionable source, right in their natural workflow in real time, whether they’re working in Alation Compose, in Tableau or in Salesforce Einstein Analytics. As a result, self-service analytics users instantly know whether the data asset or query logic they’re looking at is trustworthy or not. Alation’s TrustCheck technology enables a new and modern approach to agile data governance. TrustCheck is helping customers improve analytics behavior and ensure compliance, without restricting analytical agility. And, it recently received the 2018 Digital Innovation Award for Big Data from Ventana Research.
I had the pleasure of chatting with Peter Burris of theCUBE about all things TrustCheck. Tune in to our conversation to learn about our award-winning technology. A transcript is also available below.
Peter: Hi, I’m Peter Burris, and welcome to another CUBE Conversation from theCUBE Studios in beautiful Palo Alto, California. Got a great conversation today. We’re going to be talking about some of the new advances that are associated with big data analytics and improving the rate at which human beings, people who actually work with data, can get more out of their data, be more certain about their data, and improve the social system that actually is dependent upon data. To do that, we’ve got Aaron Kalb of Alation here with us. Aaron is the co-founder and is VP of design and strategic initiatives. Aaron, welcome back to theCUBE.
Aaron: Thanks so much for having me, Peter.
Peter: So, then, let’s start this off. The concern that a lot of folks have when they think about analytics, big data, and the promise of some of these new advanced technologies is they see how they could be generating significant business value, but they observe that it often falls short. It falls short for technological reasons, you know, setting up the infrastructure is very, very, difficult. But we’ve started solving that by moving a lot of these workloads to the cloud. They also are discovering that the toolchains can be very complex, but they’re starting to solve that by working with companies with vision, like Alation, about how you can bring these things together more easily.
There are some good things happening within the analytics space, but one of the biggest challenges is, even if you set up your pipelines and your analytics systems and applications right, you still encounter resistance inside the business, because human beings don’t necessarily have a natural affinity for data. Data is not something that’s easy to consume, it’s not something easy to recognize. People just haven’t been trained in it. We need more that makes it easy to identify data quality, data issues, et cetera. Tell us a little bit about what Alation’s doing to solve that human side, the adoption side of the challenge.
Aaron: That’s a great point and a great question, Peter. Fundamentally, what we see is it used to be a problem of quantity. There wasn’t enough ability to generate data assets, and to distribute them, and to get to them. Now, there’s just an overwhelming amount of places to gather data. The problem becomes finding the relevant data for your need, understanding it and putting it into context, and most fundamentally, trusting that it’s actually telling you a true story about the world. You know, what we find now is, as there’s been more self-service analytics, there’s more and more dashboards and queries and content being generated, and often an executive will look at two different answers to the same question that are trending in totally different directions. They’ll say, “I can’t trust any of this. On paper, I want to be data-driven but in actuality, I’m just going to go back to my gut, ’cause the data is not always trustworthy, and it’s hard to tell what’s trustworthy and what’s not.”
Peter: This is, even after they’ve found the data and enough people have been working on it to say, to put it in context to say, “Yes, this data is being used in marketing,” or, “This data has been used in operations production.” There’s another layer of branding or what not that we can put on data that says, “This data is appropriate for use in this way.” Is that what we’re talking about here?
Aaron: Absolutely right. To help with finding and understanding data, you can group it and make it browsable by topic. You can enable keyword search over it in natural language. That’s stuff that Alation has done in the past. What we’re excited to unveil now is this idea of TrustCheck, which is all about saying, wherever you’re at in that data value chain of taking raw data and schematizing it and eventually producing pretty dashboards and visualizations, that at every step, we can ensure that only the most trustworthy data sets are being used, because any problem upstream flows downstream.
Peter: TrustCheck, it’s something that comes out of Alation. Is it also being used with other visualization tools or other sources or other applications?
Aaron: That’s a great question. It’s all of the above. TrustCheck starts with saying, if I’m an analyst who wants to create a dashboard or a visualization, I’m going to have to write some SQL query to do that. What we’ve done in that context with Alation Compose, which is our home-grown SQL tool, is provide a tool, and TrustCheck kind of gets its name from spell check. It used to be there was a dictionary, and you could look it up by hand, and you could look it up online, but that’s a lot of work for every single word to check it. And then, you know, Microsoft, I think, was the first innovative company saying, “Oh, let’s put a little red squiggle that you can’t miss right in your workflow as you’re writing, so you don’t have to go to it, it comes to you.”
We do the exact same thing. I’m about to query a table that is deprecated or has a data quality issue. I immediately see bright red on my screen, can’t miss it, and I can fix my behavior. That’s as I’m creating a data asset. We’re also working, through our partnerships with Salesforce and with Tableau, each of whom have very popular visualization tools, to say if people are consuming a dashboard, not a SQL query, but looking at a Tableau dashboard or a visualization in Salesforce Einstein Analytics, what would it mean to badge right there and then, put a stamp of approval on the most trustworthy sources and a warning or caveat on things that might have an upstream data quality problem.
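To make the spell-check analogy concrete, here is a minimal Python sketch of the idea Aaron describes: scan a SQL query for table references and surface any curator-applied flags. It is an illustration only, not Alation's implementation. The `CATALOG_FLAGS` dictionary, the table names, and the `check_query` function are all hypothetical, and the regular expression is a deliberately naive stand-in for real SQL parsing.

```python
import re

# Hypothetical catalog of curator-applied flags (table names and reasons are invented).
CATALOG_FLAGS = {
    "sales.orders_old": ("deprecated", "Replaced by sales.orders"),
    "marketing.leads": ("warning", "Known duplicate rows upstream"),
}

# Naive pattern: captures the identifier that follows FROM or JOIN.
# A real tool would use a proper SQL parser instead.
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def check_query(sql, flags=CATALOG_FLAGS):
    """Return (table, level, reason) for each flagged table referenced in sql."""
    findings = []
    for table in TABLE_REF.findall(sql):
        entry = flags.get(table.lower())
        if entry is not None:
            findings.append((table, entry[0], entry[1]))
    return findings


query = (
    "SELECT o.id FROM sales.orders_old o "
    "JOIN marketing.leads l ON o.lead_id = l.id"
)
for table, level, reason in check_query(query):
    print(f"[{level.upper()}] {table}: {reason}")
```

In an editor, findings like these would drive the red or yellow highlighting on the offending table name, so the analyst sees the problem without leaving the query.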
Peter: So, when you say warning or caveat, you’re saying literally that there are exceptions or there are other concerns associated with the data, and reviewing that as part of the analytic process.
Aaron: That’s exactly right. Much like, again, spell check underlines, or looking at, if you think about if I’m driving in my car with Waze, and it says, “Oh, traffic up ahead, reroute this way.” What does it mean to get in the user interface where people live, whether they’re a business user in Salesforce or Tableau, or a data analyst in a query tool, right there in their flow having onscreen indications of everything happening below the tip of the iceberg that affects their work and the trustworthiness of the data sets they’re using.
Peter: So that’s what it is. I’ll tell you a quick story about spell check.
Aaron: Please.
Peter: Many years ago, I’m old enough that I was one of the first users of some of these tools. When you typed in IBM, Microsoft Word would often change it to DUM, which was kind of interesting, given the things that were going on between them. But it leads you to ask questions. How does this work? I mean, how does spell check work? Well, how does TrustCheck work, because that’s going to have an enormous implication. People have to trust how TrustCheck works. Tell us a little bit about how TrustCheck works.
Aaron: Absolutely. How do you trust TrustCheck? The little red or yellow or bright, salient indicators we’ve designed are just to get your attention. Then, as a user, you can click into those indicators and see why is this appearing. The biggest reason that an indicator will appear in a TrustCheck context is that a person, a data curator or data steward, has put a warning or a deprecation on the data set. It’s not, you know, oh, IBM doesn’t like Microsoft, or vice versa.
You know, you can see the sourcing. It isn’t just, oh, because Merriam-Webster says so. It emerges from the logic of your own organization. But now Alation has this entire catalog backing TrustCheck where it gives a bunch of signals that can help those curators and stewards to decide what indicators to put on what objects. For example, we might observe, this table used to be refreshed frequently. It hasn’t in a while. Does that mean it’s ripe for getting a bit of a warning on it? Or, people aren’t really using this data set. Is there a reason for that? Or, something upstream was just flagged as having a data quality issue. That data quality issue might flow downstream like pollution in a creek, and that can be an indication of another reason why you might want to label data as not trustworthy.
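The "pollution in a creek" signal can be sketched as a walk over a lineage graph: given one flagged asset, find everything built from it, directly or indirectly. The Python below is a hedged illustration under invented assumptions. The `LINEAGE` mapping, the asset names, and the `downstream_of` function are hypothetical, and the conversation does not say how Alation actually computes lineage.

```python
from collections import deque

# Hypothetical lineage: each asset maps to the assets built directly from it.
LINEAGE = {
    "raw.events": ["staging.sessions"],
    "staging.sessions": ["mart.daily_active_users", "mart.retention"],
    "mart.daily_active_users": ["dashboard.exec_kpis"],
    "mart.retention": [],
    "dashboard.exec_kpis": [],
}


def downstream_of(flagged, lineage=LINEAGE):
    """Breadth-first walk returning every asset that depends on `flagged`."""
    seen = set()
    queue = deque([flagged])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # guards against revisiting shared or cyclic paths
                seen.add(child)
                queue.append(child)
    return seen


# Assets a steward might want to review after flagging staging.sessions.
print(sorted(downstream_of("staging.sessions")))
```

The result is a candidate list for a curator or steward to review, which matches the point above that a person, not the algorithm, decides which indicator goes on which object.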
Peter: In Alation context with Salesforce and Tableau partners, and perhaps some others, this TrustCheck ends up being a social moniker for what constitutes good data that is branded as a consequence of both technological as well as social activities around that data captured by Alation. I got that right?
Aaron: That’s exactly right. We’re taking technical signals and social signals, because what happens in our customers today before we launched TrustCheck, what they would do is, if you had the time, you would phone a friend. You’d say, “Hey, you seem to be data-savvy. Does this number look weird to you? Do you know what’s going on? Is something wrong with the table that it’s sourced from?” The problem is, that person’s on vacation, and you’re out of luck. This is saying, let’s push everything we know across that entire chain, from the rawest data to the most polished asset and have all that information pushed up to where you live in the moment you’re making a decision, should I trust this data, how should I use it.
Peter: Going back to this whole world of big data and analytics, we’re moving more of the workloads to the cloud to get rid of the infrastructure problems. We’re utilizing more integrated toolchains to get rid of the complexity associated with a lot of the analytic pipelines. How does TrustCheck, when applied, address this notion of human beings not being willing to accept somebody else’s data? Give us that use case of how someone’s going to sit down in a boardroom or at a strategic meeting or whatever else it is, see TrustCheck, and go, “I get it.”
Aaron: Absolutely, that’s a fantastic question. There’s two reasons why, even though all organizations, or 80% according to Gartner, claim they’re committed to being data-driven, you still have these moments, people say, “Yeah, I see the numbers but I’m going to ignore them, or discount them, or be very skeptical of them.”
One issue is just how much of the data that gets to you in the boardroom or the exec team meeting is wrong. We had an incredibly successful data-driven customer who did an internal audit and found that 1/3 of the numbers that appeared in the PowerPoint presentations on which major business decisions were being made, a full 1/3 of them were off by an extraordinary amount, an amount so big that the decision would’ve cut the other way had the number been accurate. The sheer volume of bad data coming in undermines trust.
The second is, even if only 5% of the data were untrustworthy, if you don’t know which is which, the 95% that’s trustworthy and the 5% that’s not, you still might not be able to use it with confidence. We believe that having TrustCheck be at every stage in this data value chain will solve, actually, both problems by having that spell-check-like experience in the query tool, which is where most analytics projects start.
We can reduce the amount of garbage going into the meeting rooms where business choices are being made. And by putting that badge saying “This is certified,” or, “Take this with a grain of salt,” or, “No, this is totally wrong,” that putting that badge on the visualizations that business leaders are looking at in Salesforce and Tableau, and over time, in ideally every tool that anybody would use in an enterprise, we can also help distinguish the wheat from the chaff in that context as well. We think we’re attacking both parts of this problem, and that will really drive a data-driven culture truly being adoptable in an organization.
Peter: I want to tie a couple things that you said here. You mentioned the word design a couple times. You’re the VP of design at Alation. It also sounds like when you’re talking about design, you’re not just talking about design of the interface or the software. You’re talking about design of how people are going to use the software. What is the extent to which design, what’s the scope of design as you see it in this context of advanced analytics, and is TrustCheck just a first step that you’re taking? Tell us a little bit about that.
Aaron: Yeah, that’s a great set of questions, Peter. Design for us means really looking at humans, and starting by listening and watching. You know, a lot of people in the cataloging space and the governance space, they list a lot of should statements. “People should adopt this process, because otherwise, mistakes will be made.”
Peter: Because Gartner said 80% of you have!
Aaron: Right, exactly. We think the shoulds only get you so far. We want to really understand the human psychology. How do people actually behave when they’re under pressure to move quickly in a rapidly changing environment, when they’re afraid of being caught having made a mistake? There’s all these pressures people are under. And so, it’s not realistic to say, again, you could imagine saying, “Oh, every time before you go out the door, go to MapQuest or some sort of traffic website and look up the route and print it out, so you make sure you plot correctly.” No one has time for that, just like no one has time to look up every single word in their essay or their memo or their email and look it up in the dictionary to see if it’s right.
But when you have an intervention that comes into somebody’s flow and is impossible to miss, and is an angel on your shoulder keeping you from making a mistake, or, you know, in-car navigation that tells you in real time, “Here’s how you should route.” Those sort of things fit into somebody’s lifestyle and actually move impact. Our idea is, let’s meet people where they are. Acknowledge the challenges that humans face and make technology that really helps them and comes to them instead of scolding them and saying, “Oh, you should change your flow in this uncomfortable way and come to us, and that’s the only way you’ll achieve the outcome you want.”
Peter: So, invest the tool into the process and into the activity, as opposed to forcing people to alter the activity around the limitations or capabilities of the tool.
Aaron: Exactly right. And so, while design is optimizing the exact color and size and UI/UX both in our own tools and working with our partners to optimize that, it’s starting at an even bigger level of saying, “How do we design the entire workflow so humans can do what they do best and the computer just gives them what they need in real time?”
Peter: And as something as important, and this kind of takes it full circle, something as important and potentially strategic as advanced analytics, having that holistic view is really going to determine success or failure in a lot of businesses.
Aaron: That is absolutely right, Peter, and you asked earlier, “Is this just the beginning?” That’s absolutely true. Our goal is to say, whatever part of the analytics process you are in, that you get these real-time interventions to help you get the information that’s relevant to you, understand what it means in the context you’re in, and make sure that it’s trustworthy and reliable so people can be truly data-driven.
Peter: Well, there’s a lot of invention going on, but what we’re really seeking here is changes in social behavior that lead to consequential improvements in business. Aaron Kalb, VP of design and strategic initiatives at Alation, thanks very much for talking about this important advance in how we think about analytics.
Aaron: Thank you so much for having me, Peter.
Peter: This is, again, Peter Burris. This has been a CUBE Conversation. Until next time.