A journalist with a particular interest in data-driven stories, Guy Scriven is The Economist’s U.S. Technology Editor. He initially joined the magazine as a researcher in 2010, and was previously climate risk correspondent, writing about the intersection of climate change and business. Before that he was the South-East Asia correspondent, worked on the Britain desk, and has written for Business, International, Europe, Asia and Finance sections.
As the Co-founder and CEO of Alation, Satyen lives his passion of empowering a curious and rational world by fundamentally improving the way data consumers, creators, and stewards find, understand, and trust data. Industry insiders call him a visionary entrepreneur. Those who meet him call him warm and down-to-earth. His kids call him “Dad.”
Producer 1: (00:01) Hello and welcome to Data Radicals. On today's episode, Satyen sits down with Guy Scriven. Guy is a journalist for The Economist covering technology in the US. In his tenure at the publication, he has served as a researcher and climate risk correspondent and has grown his affinity for telling data-driven stories. In this episode, Satyen and Guy discuss the role of data in journalism, instilling a culture of debate, and the unsexy but critical side of AI.
Producer 2: (00:28) This podcast is brought to you by Alation. Successful companies make data-driven decisions at the right time quickly by combining the brilliance of their people with the power of their data. See why thousands of business and data leaders embrace Alation at alation.com.
Satyen Sangani: (00:53) Today on Data Radicals, we have Guy Scriven, U.S. technology editor at The Economist. Guy joined the publication in 2010 as a researcher and has written for the Britain, Business, International, Europe, Asia, and Finance sections. Prior to his current role, Guy served as the climate risk correspondent, Southeast Asia correspondent in Singapore and worked on the Britain Desk. Guy, welcome to Data Radicals.
Guy Scriven: (01:14) Thank you so much for having me.
Satyen Sangani: (01:16) So for those that are not familiar, maybe start by telling us about The Economist. What is it and how does it differ from other periodicals?
Guy Scriven: (01:23) The Economist is a kind of media organization, the kind of core of it is a weekly magazine. I guess some of the ways in which it's pretty different is that it's pretty old. I think we've been running now for 185 years. The thing that most people notice first is that we don't print bylines. With most daily newspapers, you'll have a little bit under an article that tells you who wrote it. We very rarely print that, basically. The thinking behind that is that it's a way for the kind of magazine to speak as one, and so that's kind of the external reason. The internal reason is it removes a lot of the politics around whose name comes first in the byline and who comes after the article in the little bit that says "additional reporting from." Colleagues who have worked at kind of daily publications and The Economist do tend to say that they kind of prefer it. They say it creates a bit more of a sense of camaraderie.
Satyen Sangani: (02:22) What I find interesting about the periodical in that sense is that it's, in some sense, it's like sort of has an... It's an editorial-first mindset. There's an economic perspective on the news. And so less than reporting the news, it's commenting on the news. Yet when you look at your traditional periodicals, whether it's Newsweek or the New York Times, the byline is heavily featured, where the idea is that we're reporting on the news and it's just the facts. Here you've got an editorial magazine where it's literally the opposite that's true. Which is that the names are hidden, but it's the opinion that shines through as a single voice. I would imagine forming that single voice is pretty hard in a pretty large institution where things are constantly changing. What is the process that you go through to get to that single voice and are there design principles or publishing principles that are held by every single one of the writers?
Guy Scriven: (03:13) So the process is, essentially, it's lots of meetings. So in a sense, almost the most important part of the weekly magazine is what we call the Leaders, which are the kind of editorials and it's five articles, which are the kind of what we think about the world as a magazine. The process for getting to that is there's a kind of handful of meetings at the end and the beginning of the week, where there's a kind of live and open debate about all these topics. So that, we had a discussion about, well, how to think about OpenAI. We obviously — lots of discussions about Trump recently. Every one of those articles behind that is not only a kind of whole bunch of reporting, but also a kind of internal debate about how The Economist should think about this. The Economist was founded on essentially kind of free trade principles.
Guy Scriven: (03:57) And that is basically at the heart of a lot of the way that we view the world. So we are in favor of kind of globalization and free trade and, further than that, of competition. A lot of the ways in which we view the world and the opinions that we form about what's happening today are based on those liberal principles. It's also kind of not the case that necessarily everybody in The Economist agrees with every single opinion that we have, but it's up to the editor-in-chief to decide what our line on day-to-day events is. And that line basically comes from a kind of long debate about the leaders and the line we should take.
Satyen Sangani: (04:37) It's funny because for me, I mean, you mentioned that it's based on free trade principles and it was super formative for me. I mean, my initial experience with The Economist was as a high school debater and I did this thing called Lincoln–Douglas Debate and, on the side of that, we would do this thing called Extemp, where you would sort of be asked some arbitrary question, like today's version would be, what are the implications of Sam Altman's proposed ouster on AI development going forward? And you'd have to prepare for 30 minutes to give a seven-minute speech. The Economist would be the one publication that would actually tell you what to think. In some ways, it started me down the path of then majoring in economics and becoming an analyst. And so in some ways if journalism is the first cut of history, it almost feels like The Economist is a free market version of a second cut.
Satyen Sangani: (05:23) I've always appreciated that there's a clear perspective, and yet if you buy into that perspective, or even if you don't, it gives you a way to think about current events or a framework to think about current events, which is always quite helpful. It takes you through the economic lens, which is obviously, for many of us, perhaps the most important one. How did you get there? I mean, did you always know that you wanted to be at The Economist as a journalist? Is this a dream job or did you happen upon it after some consideration?
Guy Scriven: (05:50) Yeah, it is a wonderful job in many ways. I get to speak to loads of interesting people and have time to kind of think and write about things and to kind of try to come up with new ideas and frame debates. It wasn't something I've necessarily always wanted to do. I think out of university, I knew I wanted to kind of write. I've basically worked for The Economist all my life except for one job I did straight out of university, which was writing kind of market industry reports for a small company in London. It was essentially kind of a bit of a sham in that their business model was to get loads of graduates to write these basically pretty bland reports based usually off lots of looking at Wikipedia and the internet. And that they had an enormous sales team that would push these reports onto people. It's one of those, you get an executive summary and you don't actually get to see what the kind of actual detail is and the level of expertise of the author until you actually pay for the product.
Guy Scriven: (06:45) I did that for a few months and in lots of journalism, you get a point in an article or a business article and it'll say, according to whatever organization, the oil and gas pipeline industry is worth however many billion per year. The main kind of output of these reports was that figure, but the way in which you got to it was not nearly as rigorous as someone like McKinsey or Bain would do it. I did that for about nine months and then, basically, an internship in the research department in The Economist came up and I didn't get it, but a year later they called me up and said, "Look, there's another opening, come and work for a few months." After that, I essentially managed to stick around and managed to go and slowly write more and more. At some point after that, we created a data team and I worked for that for a bit because I was quite good at numbers and then from there I worked on the Britain desk and then I was the Southeast Asia correspondent for a bit and then I wrote about climate finance and now I write about tech. I'm based in San Francisco and I write about U.S. technology trends, which is completely fascinating.
Satyen Sangani: (07:48) And you're a technology editor, which means you're writing and you're also editing other stories.
Guy Scriven: (07:54) That's slightly misleading actually. So it's... This is unfortunately part of the title-inflation trend, which has affected all industries. I don't actually do any sub-editing. I don't look over other people's copy and tweak it and fiddle with it, I'm just a reporter. But increasingly in journalism, and it's not just The Economist that does this, people who are basically reporters are given kind of slightly more flattering titles. So you know how years ago that this was happening a lot in banks. And so there was that thing where like, if, at Goldman Sachs, a third of all employees, I don't know if that was quite the right number…
Satyen Sangani: (08:28) …are vice presidents. [chuckle]
Guy Scriven: (08:30) Yeah, exactly, exactly. If you're not a vice president at Goldman, you're no one.
Satyen Sangani: (08:33) It's so very British journalist of you to refer to your title as inflated. I think that's great. I think lots of people would give their left arm to work at The Economist. It sounds like a lot of fun. So you mentioned that you had a stop in data. And of course, this podcast is all about data. And I love having journalists talk because I find that, at some level, analytics is an internal journalism job. You're sort of on the quest for some version of the truth inside of a company. What was your stop in data like and what did you do there? And how did you get into it?
Guy Scriven: (09:04) It was completely fascinating. I genuinely think it's kind of shaped the way I think about journalism now. When I started work, doing kind of data journalism rather than “journalism journalism,” we just started, created a kind of data team and the idea was to write more stories where the first, the kind of thing that you've identified, which is new and interesting, originates from a database rather than from a conversation. And a lot of my reporting now is, I go out and speak to a whole bunch of VCs and people in tech and then out of that an idea will form. And more or less the kind of way we approach data journalism, at least back then when I was doing more of it, was that the idea is that you either have a question that you answer with data or you discover a trend that no one has reported about using data. And so it was a kind of slightly distinct thing. Even at The Economist, where we're kind of a number heavy newspaper and a lot of our reporting is based on data, this was trying to be a distinct thing where the idea or the answer to a question kind of comes originally from a massive spreadsheet rather than a chat with someone.
Satyen Sangani: (10:12) And what kind of datasets were you dealing with when you were doing this kind of research?
Guy Scriven: (10:15) Yeah, all sorts back then. There was a lot of trying to get more out of national databases, basically. The UK publishes reams and reams of data, as does the US, Europe does loads. At that point, I was doing lots of stuff with the Europe section. So I spent loads of time just trying to work out the best way to kind of get numbers out of Eurostat in an interesting way. I spent loads on doing that. That was a really obvious first port of call. What are government datasets telling us? And then after that, there was a whole bunch of other stories that we tried to do, which we were basically trying to find more novel datasets around interesting topics. I did a long piece about the dark web, which someone else had scraped the dark web — basically, this independent researcher and dumped loads of the internet web page files in a massive database somewhere, or in kind of BitTorrent.
Guy Scriven: (11:07) I downloaded that and then I scraped his scraping of the dark web. And the final piece was basically drug dealers on the dark web are getting ever more sophisticated piece. You could tell from the piece how sophisticated they were getting. And it was stuff like they were offering specials, Black Friday sales and stuff like that, specials. And yeah, [chuckle] which is great.
Satyen Sangani: (11:29) Let's make your Black Friday even blacker.
Guy Scriven: (11:31) They're doing stuff like lots of quite clever marketing. And you could also get a sense of who was more successful. So the whole of the dark web at that point was a bunch of websites. So there was a bit like Amazon for drugs. So you'd go on and you'd buy your drugs, it would get delivered in the mail and then you'd write a review. And so you knew that the drugs got there. You could kind of confirm by seeing that there were X number of reviews. And so one of the things that we kind of looked at was the extent to which good reviews help business for these guys online basically. It followed the normal e-commerce pattern. But anyway, we had loads of data on it and there was lots of fun. I got in touch with some of the people on the dark web who were selling these drugs. And they told me these strange stories about what they were doing to promote their product. So that was lots of fun.
Guy Scriven: (12:20) A colleague of mine wrote this great story about prison tattoos. So there's a prison tattoo database for U.S. criminals. I think the idea is it helps you identify people, I assume that's the idea. But he went into all this detail about the extent to which you can predict an inmate's sentence based on the kind of tattoos they have. It told you a lot about the kind of underworld of crime and the kind of importance of, I guess, signaling in that kind of environment. There's this kind of whole group of standard stories where you go through government datasets and try to find something new that's happening, or other kinds of datasets. And then there's a whole subset of stories which are, you found this really interesting dataset, the topic of which is just inherently interesting. And then you do some analysis on that and try to come out of that with something interesting to say about tattoos or the dark web or some other kind of fun topic. And I think those were the two main buckets of data stories for us.
Satyen Sangani: (13:21) Those having national macro or micro datasets that would somehow describe the broader society, and then really random datasets that essentially were a story unto themselves. So we at Alation talk a lot about people building a data culture, and part of that is having people ubiquitously be able to find data and understand the data that they find, and then trust the data that they find, and ultimately use the data that they find. It sounds like you're kind of going through that life cycle as well. How did you discover these datasets? I mean, there's no real great data search utility out there, certainly for public data, and I can't imagine even for some of this really arbitrary random data. What was your process in discovery for finding these datasets?
Guy Scriven: (14:01) I guess internally what really helped is just that we had loads of reporters who used data. Although they weren't data journalists, they would really understand the data of a certain topic. At that point, I think one of the really big stories in the world was the European migration crisis. And we have got some fantastic reporters who constantly focused on this story. If I came across an interesting dataset about migration, or the kind of state of refugees across Europe, I'd drop an email to our in-house expert, and they would give me a good sense as to whether it's a credible source. If they came across a big dataset which they thought was interesting, or we might like to explore, they would send it onto the data team, and we would take a look and see if there's anything interesting to discover from that. So it's kind of from having internal people who know the subject really well was a really useful way to kind of check the credibility of data sources.
Guy Scriven: (15:00) In terms of discovering new ones, I think it's kind of just what everyone does. So it's lots of time on Reddit. There's a whole bunch of newsletters that are just about data. I spend loads of time going over kind of Data Is Plural, or Data Are Plural, that kind of fantastic newsletter. There's lots of time talking to bank analysts and things like that, to ask what they look into when they try to get this kind of sense of an industry. And in a sense, it is slightly odd, I feel like there've been loads of attempts to make search engines for publicly available datasets. Every time I've tried to use one, it's not really been satisfactory at all, which feels like something that ought to be there, but isn't. It feels like a really obvious useful thing to have, but I've never really seen a successful attempt at it.
Satyen Sangani: (15:43) Yeah, I think it is an interesting problem, and I've given a lot of thought to it too. And there are a whole host of sub-problems which make it hard. In some ways, I've always thought about data, or at least I started thinking about it, more like content, like an article in The Economist as an example. And there's sort of a life to the data, and that life may be long or short depending on when the dataset was captured. And it'll be really interesting to see how that world ultimately evolves. As you'd find these datasets, how would you know that they were true? Did you ever find a dataset that you saw and at first thought was remarkable, but then realized was absolute bunk?
Guy Scriven: (16:19) Sometimes when you get into consultancy estimates of stuff, it starts to get a little iffy.
Satyen Sangani: (16:27) Because they're trying to tell a story, or because it's assumptions built on assumptions built on assumptions built on assumptions?
Guy Scriven: (16:33) Yeah, because it's basically a black box is why I feel uncomfortable with it. So some large consultancy has given you a number for the total of Nvidia's supply chain, or something like that, and there's just so little. You can call them up and ask them how they've derived this number, but you never really get that much detail. Their job is to make assumptions and estimates, but I feel a little bit uncomfortable about that. What you do find, particularly with government datasets, is that it's based on a survey, and the survey doesn't have a very large sample size. At some point, I was looking at lots of UK crime data, and there are two ways to capture that. So you can go out and you can ask loads of people, have you experienced crime? And then the other way is to look at police records. And I think the UK government publishes records using both these methodologies. But if you're doing the survey method, say you want to look at whether young people are experiencing more car break-ins than old people. You're getting basically into the kind of cross tabs and the demographics.
Guy Scriven: (17:38) They may have started by asking 1,000 people, and they do that every year. But then by the time you're looking at 17- to 19-year-olds, you're actually only talking about 25 people. You're trying to kind of extrapolate a trend from that over a bunch of different datasets. You kind of get to a point where if the trend looks really weird, and is something that you'd kind of not expect, you kind of have to really go back and look at the methodology and think through, what am I actually counting here, if you see what I mean?
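Editor's note: the sample-size point above can be made concrete with a rough calculation. The sketch below is not from the conversation; it assumes a simple random sample, a 95% confidence level, the normal approximation for a proportion, and a worst-case proportion of 0.5. The function name `margin_of_error` is hypothetical, and real surveys are weighted, so published margins will differ.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a survey proportion.

    Assumes a simple random sample and the normal approximation;
    p=0.5 is the worst case and z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes taken from the example in the conversation.
for label, n in [("full survey", 1000), ("17- to 19-year-olds", 25)]:
    moe_points = margin_of_error(n) * 100
    print(f"{label}: n={n}, margin of error is about +/-{moe_points:.1f} points")
```

Under these assumptions the full sample of 1,000 carries a margin of roughly 3 percentage points, while the 25-person subgroup carries a margin of nearly 20 points, so an apparent year-on-year change in the subgroup can easily be noise.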
Satyen Sangani: (18:08) And with the credibility of The Economist who have a high standard, which is great, because of course, that's what you'd want. But it does tell the story where we think of data as an objective thing. But in fact, on some level, it's just as malleable as any other words might be.
Guy Scriven: (18:22) Completely. Even things like surveys, which you would think are kind of pretty solid. You get lots of surveys now which are essentially produced by PR agencies, that are there to promote a company or an idea, which is completely their job, but you just don't know how well a survey has been done or re-weighted and how much thought has gone into the back of it. But you're completely right. I mean, I think data is kind of subjective in that sense and, particularly if you are trying to create a narrative around data, you just have to be pretty confident that what you're saying is right. I mean, often what I try to do, if I'm not sure about a trend or if I've got questions about what a piece of data is telling me, is just find some corroborating evidence: try to find another dataset, which would give you a similar picture, maybe from a slightly different angle, and see if that matches whatever your findings are.
Satyen Sangani: (19:19) Yeah, absolutely. I mean, being able to describe the truth in different formats and to look for some corroborating evidence is, I think, the job of an analyst. So within The Economist, it sounds like you actually have a pretty strong data culture. I mean, this is sort of a podcast that's all about helping organizations evolve and build a data culture. Everybody is running around looking for the truth. There isn't really sort of a bias as it were toward any particular perspective outside of just telling a good story and obviously being aligned with free market principles. Tell us about that culture. You mentioned there's a lot of meetings, but tell us a little bit about the habits of the institution and what strikes you as remarkable. I mean, it sounds like you obviously spent a lot of your career there. So hard to contrast with other places, but you see a lot of other places. I mean, you see and study a lot of different companies. Tell us about The Economist, because I think it's a fascinating place.
Guy Scriven: (20:09) The committee is slightly flippant, but you know, there is a level of bureaucracy as there is with any company. One of the nice things about the culture is, we are kind of open to debate and like to discuss kind of ideas. I think lots of people internally enjoy that. And I think that helps kind of promote thinking about data in a healthy way in that the most convincing way to kind of win a debate and to be right in a debate is basically to have really strong data and evidence for whatever you're trying to argue. And so having a culture that is friendly to debate and where people enjoy debating, in and of itself, I think helps people or promotes a culture where people are kind of pretty serious about understanding statistics and getting data right. Another thing I think, which is kind of unusual and kind of unique to The Economist, so it's not unique to The Economist, but it's something we take pretty seriously, is fact checking and having the facts correct, basically.
Guy Scriven: (21:05) My first job at The Economist was in the research department, and a lot of that department's time is about doing fact checking. And so I think everything that we produce from podcasts, to films, to print pieces, to online pieces, is kind of fact-checked by a team. If you're even slightly fuzzy about a statistic in your copy, they would ask you for clarity and make sure you're kind of sourcing that correctly, which is a delightful privilege to have as a journalist. There's a kind of safety net, in a sense. But then you also know that if you're not clear and precise about what you write and what you produce, you're going to get questions from the fact-checking team.
Satyen Sangani: (21:46) There's a team that's devoted within The Economist to basically taking a statistic. Like you say, 89% of people who are in profession Y have this habit or whatever the fact happens to be. And some of those facts must be tremendously difficult to verify.
Guy Scriven: (22:00) Yes. So you send them the source notes. But if they think the kind of survey isn't credible, or if they think that you are exaggerating, or they think you've kind of misinterpreted a piece of data, they will tell you and they send a note copying you in to the section editor, and then it's the editor's call on interpretation, basically. So that process kind of backstops all of our work and causes people, from the start of their reporting, to just be very upfront about the kind of facts that they have and what they're able to demonstrate and what they're not.
Satyen Sangani: (22:34) So this is obviously a pretty interesting territory because in a world of generative AI, you get these manufactured or machine-authored snippets of text. And I find that ChatGPT is useful. But funnily enough, over Thanksgiving, one of my friends who's a chef took a whole bunch of recipes and gave the ChatGPT the links to it and said, "Give me a list of ingredients that I should go buy from the store." And it only got it like 70% or 80% correct. I feel like that's kind of true for all of what it does. But that fact-checking capability obviously is very sophisticated inside of The Economist. So there's a high bulls**t factor, as it were. But that doesn't quite exist in the world at large. And so how do you see that evolving? How do you see this sort of interplay between technology and data, technology and facts? Where's your head at as far as where the world is and where it needs to go? Because I mean, you've obviously seen the world of tech and AI. You're seeing all of this data. You sort of spend your career trying to find the truth, whatever that may be.
Guy Scriven: (23:34) Yeah, I completely agree. The hallucination problem is kind of very much real in ChatGPT. It makes the kind of technology, as things stand today, useful for a first draft of something like marketing copy or something like that, but not fully useful for anything that requires even kind of high to medium levels of accuracy. One thing I think is that the hallucination rate, so the kind of number of inaccuracies per piece of AI-generated text, I think that will basically improve over time. I don't really know whether it will completely get wiped out, but I think my sense is that lots of different technology firms are working on this in lots of different ways and that they have ways to kind of improve that. I mean, what it means for questions about misinformation, I think is still really up in the air. One of the things I'm interested in, I guess at the moment, is what this means for cybercrime. I was speaking to a source the other day, and he's a guy who works in cybercrime prevention. He used to work at the FBI and CIA and stuff like that.
Guy Scriven: (24:41) He was telling me that he and his family have now come up with internal code words that they use or can use when they're on the phone to each other in order to prevent scams from voice cloning technology, which kind of struck me as completely, slightly terrifying. And, you know, so he works for a cybersecurity company, so as ever in the technology world, he has skin in the game and all these companies to some extent have to talk their own book. But I mean, he was arguing that we might be heading to a world where basically you have less and less trust in lots and lots of the types of communications that we're so used to having now. So people, for instance, answer their phone a lot less than they used to, is one of the things that people in cybercrime note, because there's now so many scam calls. I don't know about you, Satyen, but I feel like I get kind of four or five "Scam Likely" calls every day.
Satyen Sangani: (25:38) Easily. I mean... And yeah, and I don't answer my phone.
Guy Scriven: (25:41) The point this guy was making was, well, that kind of social norm has actually changed in the last few years. That's partly because the advancement of technology makes these scam calls so much easier to make. The worry is that generative AI kind of turbocharges that, I guess. And that we get to a place where not only do we not trust random calls we get, but calls that we get from loved ones who sound like loved ones are actually also scams, which is a kind of slightly terrifying where-does-society-go thought.
Satyen Sangani: (26:10) You started with a relatively optimistic narrative, but closed with a pretty depressing example. Since you see all this stuff every day and you're talking to all of these folks, where do you end up? Are you on the side of optimism or skepticism or pessimism? I mean, I guess the natural place for any journalist to be would be skepticism, but...
Guy Scriven: (26:30) I kind of veer from one to the other day-to-day depending on who I've just spoken to, I guess. I don't know if that's a useful place to be. I think it's basically just impossible to tell at the moment. With any new technology, there's the kind of doomers, and there are optimists. The optimists like to say, "Oh, there's always doomers." Even when there was the Sony Walkman, people were terrified that that would change society for the worse. And the doomers like to say, "Yes, but this time it's different." And the optimists said, "Well, you said that last time." So I don't know. I don't really know how to think about it. As I veer from side to side, day-to-day, there are certainly elements of generative AI that do make it feel different from previous technologies. I guess for me, the kind of fundamental thing I come back to is like, basically all software for years and years and years was like, there's a big database and an interface and you interact with the interface and then the interface goes and gets information from the database and gives it to you.
Guy Scriven: (27:26) And that's like Facebook and Google and Amazon, like that's the whole kind of basis of most software to this point. And then with generative AI, obviously, there's a database, the machine has read the database and maybe has understood it. And then when you interact with the interface, you get a prediction of what could be in that database. And so that does seem to be like a fundamental difference between the way software has been previously and the way that generative AI software is. And because of that, basically, I do kind of feel like this time is different, both in the sense that the risks are high and in the sense that it could also be a step change productivity boost for businesses. But I think it's going to be years and maybe decades before we have enough evidence for that to be fully known.
Satyen Sangani: (28:18) What stories are you seeing in terms of actual commercial adoption and implementation? I mean, we all obviously know what's out there with OpenAI and ChatGPT, but what are companies doing and what have you seen? I mean, there's these terrifying stories of like AI-driven drones that the Pentagon seems to want to release out into the wild in the name of...
Guy Scriven: (28:38) There's kind of loads of stuff, isn't there? What I think about most is the extent to which the enterprise is absorbing or adopting this. We've had this kind of period since the release of ChatGPT, so about a year, almost exactly a year now, at date of recording. We've had this long period of experimentation and excitement. That's been basically marked by the supply side of AI just really ramping up. So you've had loads of model makers releasing new models. You've had the cloud players buying enormous amounts of specialized AI chips. You've had thousands of AI application startups who are going to build on top of the model makers who then use the AI chips from the cloud providers. So you've had this boom in the supply side of AI. Now the big question is whether the enterprise demand meets that and what shape it takes. I think we won't really have a good sense of that until at least the first couple of quarters of next year. In terms of use cases, you have the kind of generalized use cases. For me, the kind of most interesting one of those seems to be Microsoft's Copilot, kind of a fleet of Copilots where they're pushing kind of really hard to get these out into the open. They obviously have this incredible distribution network that they sell into.
Guy Scriven: (30:05) That's going to be really interesting to see: how useful is it to have a kind of Copilot in Teams that records all your meetings and then summarizes them, and which you can then look up whilst you're in Outlook and use to help make PowerPoint presentations? How useful is all of that kind of stuff? Is that a really big change? Is that a kind of very small incremental change? That will be interesting to see. And then you have the kind of very specific vertical use cases. Harvey is obviously the kind of legal AI tool that people are quite excited about, but that's obviously being trained on kind of quite specific legal data. And that helps you kind of do things like write up contracts. Another corner, and I haven't looked into this in detail, but someone was telling me that there's a bunch of GenAI startups that basically are going after the procurement market, in that if you want to win a government contract, it's an incredibly tedious process and there's loads of paperwork and bureaucracy.
Guy Scriven: (31:00) And the idea is that these startups basically kind of make that really easy for you. And they just write up a contract bid in seconds. If that's true and gets kind of widely adopted, that's a massive change for the procurement market. Which is, I don't know, it's like 40% of GDP and tends to be dominated by companies that have built up a core competency of winning government contracts. That's what they do, as well as provide the software and stuff. But a lot of that kind of moat is being able to win government contracts. And if a bunch of AI companies turn up and kind of remove that moat, then that could be a kind of fascinating battle to watch.
Satyen Sangani: (31:36) It feels like a lot of these use cases are low risk and maybe low excitement factor. I mean, they're not like you go to a doctor and the robot tells you what medicines to take or diagnoses you for surgery. The early ones feel like they're ones where you can get a lot of productivity gains out of jobs that might otherwise be rote. Labeling data for us is kind of the same thing. There are some cases where it could be risky, but there's not a lot of risk in labeling data, so you can go ahead and do that. You mentioned that you wouldn't know about the enterprise adoption cycle for at least the first couple of quarters. What data will you be looking for in order to validate that enterprises are adopting? I mean, where will you be looking for evidence around adoption cycles and the demand side around this AI stuff?
Guy Scriven: (32:23) Yeah, it's a good question. It's essentially a case of just monitoring earnings calls and seeing what the really big publicly listed companies are saying about the demand side. I think at the moment, you can't really do much more than that. So Microsoft, Amazon, and Google obviously just have these kind of enormous cloud operations, and they are kind of increasingly trying to sell AI products. What they say about the demand for their AI products is probably a really good first place to look. After that, I mean, there's a bunch of weaker indicators you can start to think through. So I wrote a piece earlier this year where I looked at the S&P 500 non-tech firms and tried to look at a whole bunch of different indicators, just to kind of give a sense of AI adoption. And one of the things I was looking at was the number of job listings these companies are posting that mention AI skills. That's quite interesting because you could kind of go a bit deeper on that and you can look at exactly what they're asking for and try to get a sense of their sophistication.
Guy Scriven: (33:35) If they're asking for people with skills in PyTorch, which is a kind of programming language, or CUDA, which is another kind of programming language people use for AI, then that kind of implies that whichever company — like GSK or whichever company — is quite sophisticated. But if they're asking for people who just understand OpenAI's API, then it kind of implies that they're buying, not building. So you can kind of get a glimpse in that way. We looked at patent data, which is kind of sometimes helpful, sometimes not. Sometimes really important technologies get patented and sometimes they don't. But it can give you, I think, a sense of what companies are thinking internally. You could also look at, I guess, companies that they invest in as well. I think that's quite a useful indicator. So a whole bunch of really big corporations have gone out and bought stakes in smaller AI startups. And I think that's probably a useful way to try to sense how serious a company is about AI and kind of over the next couple of years, what their AI strategy might be.
Satyen Sangani: (34:38) As the tech editor, what percentage of your beat is AI now today versus everything else? I mean, I would imagine four years ago, crypto was top of mind and maybe the cloud and, what is it today?
Guy Scriven: (34:53) Yeah. So it's maybe 80% AI. In the last year, I think of the big stories I've written, almost all of them have been AI stories. It's such an incredible moment for the industry. And there's so much kind of excitement and activity. It kind of seems strange or foolish not to be thinking about AI quite a lot. And also like the story, I mean, I don't know your sense of the subject, but for me, it feels like the story has moved so quickly. In one year, it's kind of changed everything. I guess both the kind of level of interest in it, its potential changes to the economy and the speed at which it's all moved, have meant that it has just been a continuous focus over the last year or so. And so it's been completely kind of fascinating to watch.
Satyen Sangani: (35:40) Yeah, it is fascinating. I mean, I do think it's different from some of these other trends, because you have innovation simultaneously at almost every level of the stack. I mean, there's just all the way from the chips on up to the various use cases where there are seven different versions of companies that are trying to chase use cases around medical interaction, for example, like in some cases trying to supplement nurses. That's amazing to me. And it does feel like there's a whole bunch of other trends like visual recognition around documents seems to be there. So like you can take a highly unstructured document, all of the text would be recognized, you can then process that. It just feels like a moment where lots is changing and it almost feels like so much change is happening that you're almost reacting to the change. And so it's hard to keep up. I mean, even for those of us that are in it day-to-day.
Guy Scriven: (36:31) Yeah, completely. Yeah, it's hard to keep up. At some point you kind of, you get a bit of AI fatigue. Having kind of thought about it solidly for seven months or something, you kind of feel like you need a break, but it moves so fast. I think, my sense is that it has slowed a bit now, basically. I mean, there was that period between February and May where every day there was an enormous announcement about AI: a new model was released, a new tool was announced or something like that. And now, I mean, saving the kind of debacle at OpenAI very recently, I feel that that pace of things has slowed a little and we're kind of waiting to see what the enterprise does next year in terms of the demand side.
Satyen Sangani: (37:14) Yeah, that seems about right to me. I mean, do you think the OpenAI story is massively consequential or a blip? Do you think it'll have implications for the industry at large?
Guy Scriven: (37:24) I don't think it's massively consequential. It could have gone a lot of different ways and if the core of OpenAI had ended up at Microsoft, I think that would be quite a big change. The way it kind of stands at time of recording is that everything went full circle and we came back to a position where OpenAI's senior management team is the same and it is gonna get a kind of new upgraded board. Which I think is good for OpenAI and probably reassuring for its shareholders and the kind of armies of companies that rely on it for their data. I mean, I think probably one other thing you will get more of now, which was already happening, but that will probably become more prominent, is that companies will build in more options for their models into their software. I think that was already happening. So if you were a company that was making an app that summarized doctors' audio notes or something like that, I think already companies were designing their software so that they could switch out whatever GPT model they were using and switch in Claude or one of Cohere’s models instead.
Guy Scriven: (38:35) You'll probably get more and more of that. People in tech like to describe it as “optionality,” [chuckle] which is a kind of slightly horrendous word, but I think you'll get more of that. I think you'll get it both at the startup level and at a CIO level, where bigger companies are building their own software.
Satyen Sangani: (38:51) As we close out, what are the trends that people ought to be thinking about that they probably aren't? What are the blind spots within all of this? Where should people be giving more thought to the world of AI and what are the stories that you feel like aren't getting told well enough?
Guy Scriven: (39:06) I think more broadly in the general economy, one of the things people probably don't appreciate about AI is that, if you're a company and you want to be serious about AI, you need to (it will seem like I'm saying the obvious) be very serious about data, and you need to have your kind of data structure and governance and management all in place. And you need to be doing that in a professional way. Because otherwise you basically run into immediate problems. I spoke to one company that was playing with one of these AI tools, it's a kind of internal AI tool, and they basically use it for advanced search within their own company's databases. They kind of quite quickly discovered that because they weren't that proficient at data governance and data permissions and things like that, they had some people randomly able to view emails of colleagues and stuff like that, which they definitely weren't supposed to be able to, or view kind of senior management documents, which they shouldn't have had access to. And so there are clearly risks, I think, for companies using AI, of just thinking of it as a kind of silver bullet technology that will help. But actually you need to get the kind of slightly less sexy data governance stuff right first in order to be able to apply these tools.
Satyen Sangani: (40:19) By the way, obviously, I live that, because my audience, like, lives exactly this question. It's a very unsexy topic. It's like telling people that they need to have a card catalog at the library. Do you think that that's making its way into sort of the recognition of the senior executives or where do you think we are in the evolutionary cycle of that story?
Guy Scriven: (40:44) I think more and more people are slowly understanding that. If you were to kind of really get to grips with that question (how much are CEOs and senior management understanding how strong the link is between kind of getting data governance right and being able to use AI properly or to its greatest benefit), I think you have to ask Snowflake and Databricks and co. what they're seeing in that area. Every time I've asked them about this, I think that their sense is that people are getting the message. Some industries are obviously very sophisticated and good at this partly for regulatory reasons. Finance and healthcare have traditionally been very good at this and have always thought very seriously about data governance. Then you've got a whole bunch of industries that are kind of catching up to that mark. So yeah, it varies from industry to industry and I think it is kind of slowly getting there. It's probably easier now than it was a few years ago. I think more and more people have data on the cloud and that's obviously more helpful than having kind of on-prem data structures just on servers.
Satyen Sangani: (41:48) Yeah, it certainly makes for more “optionality” as you described it, because you can manipulate the technology much more quickly in the infrastructure. The onboarding cost of the infrastructure is much lower. So, Guy, maybe to close, I'd love to just get your sort of predictive juices flowing, which obviously is a dangerous place to be, but where do you see us being in a year? Given everything you know at this moment, what's the story that we're gonna print in 12 months about AI and tech? Obviously, of course, you don't know, so this is all speculative, but you see a lot.
Guy Scriven: (42:15) There's probably one scenario. I think in a year's time we're more likely to have seen the kind of negative side of AI and data and tech more prominently than the kind of positive side. The reason I say that is that I think it's not very hard to imagine that a year from now there would've been a story about a very big hack or some cyber criminal turning off the national grid or something like that with the help of Llama 2 or whatever. I think that's not really likely, but quite possible. That's something that would be immediately all over the press, huge amounts of attention. At the same time, the kind of positive side of AI is a kind of slow march towards higher productivity amongst really big companies. I think that will basically take place mostly behind closed doors, with some promotion of "don't we have this whizzy AI tool doing this?" But I think the really kind of cool, sophisticated stuff we basically won't know about for a while, because it's not in any company's interest to publicize it. Say they've built a kind of very clever way to boost their productivity through whatever it is: maybe they've managed to cut out 80% of their meetings because they can just summarize stuff much more easily and their work is a lot more productive.
Guy Scriven: (43:38) If you can do that, then I don't think you are really gonna show off about it, because then all your rivals will do exactly the same thing and your advantage then becomes null. My best guess is that you'll probably hear a lot more about the kind of negative side of AI than you will the positive side. The other thing to think about for next year is that we have an enormous number of elections coming up, obviously including a presidential election in America. I think the stat is that next year, for the first time, more than half the world will be eligible to vote in an election. You have this at the kind of same point at which the cost of producing misinformation has kind of plummeted. And so that is something that we'll probably hear a lot more about next year. I don't really have a sense as to whether AI will actually have a big effect on what voters think. I think it basically probably doesn't have that big an effect, but I think you'll hear a lot about misinformation as well as the other bad stuff that you'll hear about AI.
Satyen Sangani: (44:44) Yeah. So be on the lookout for catastrophic stories because in some sense, if it bleeds, it leads, and there may be some positive stuff, but we probably won't hear about very much of it. So keep your head up and soldier on, as it were. [laughter]
Guy Scriven: (44:58) Yeah, sorry. That's incredibly depressing.
Satyen Sangani: (45:01) A friend of mine wrote a blog and basically it was like, don't let pessimism get in the way of the truth. Which is, just like there's so much bad news out there and I think we get media all the time and so there's all this anxiety and it mostly is the bad stuff that gets covered and therefore that's what people think about. On the other hand, there is a lot of good stuff happening and we don't hear about that as often, and I think there's a bias there just through kind of how news is constructed. So it makes sense.
Guy Scriven: (45:26) I think that's completely right. Yeah. Yeah.
Satyen Sangani: (45:29) Well, thank you for taking the time. This was as expected a fabulous conversation and I was very hesitant about coming on because, just didn't think it would be that interesting. I think we've now proven that to be completely unfounded. So great to have you on the call, on the podcast and thank you for taking the time and we'll need to get you back on in about a year to see if any of this came to fruition.
Guy Scriven: (45:51) Well, thank you so much for having me on, and yeah, I'd be delighted to come back and [chuckle] test predictions in a year's time to see how it's all evolved, and I'd love to do that. Satyen, thank you so much, it's been a really fun conversation for me, so I very much appreciate it.
Satyen Sangani: (46:04) Awesome. Till then.
[music]
Satyen Sangani: (46:10) It's an exciting time for generative AI. We have a surplus of new AI tools, models, and capabilities, but there's still so much that we don't know about how they will be used and what their impact will be. With all of this investment, traditional skills like fact-checking are more critical than ever to ensure that the predictions and the technology are correct. People like Guy are on the front lines of telling this story. He notes that while AI affords immeasurable opportunities, there are areas that are at risk for misuse like cybercrime and the erosion of trust in communication channels that we've previously used every single day. Guy also stresses that AI shouldn't be used as a silver bullet, but rather companies need to focus on the unsexy parts like data governance and structure when adopting AI. So when implementing new tech into your strategies, the routine bits are just as important as the exciting features. Thanks for listening and thanks to Guy for joining today. I'm Satyen Sangani, CEO of Alation. Data Radicals: keep learning and sharing. Until next time.
Producer: (47:08) This podcast is brought to you by Alation. Your entire business community uses data, not just your data experts. Learn how to build a village of stakeholders throughout your organization. To launch a data governance program and find out how a data catalog can accelerate adoption, watch the on-demand webinar titled Data Governance Takes a Village at alation.com/village.