AA: I would like to discuss data strategies. I would love to know what you think are the key aspects of a data strategy. If you were asked by a company to develop a data strategy for them, what would you be thinking about? What aspects are front-of-mind for you, and how would you look at developing a roadmap for them to actually implement a strategy?
CA: I feel like I’m somewhat of a broken record on this, but I would always start with data, their business strategy, and then whichever group inside the company has the most data. Maybe they have a big external project that they want to take on and they’re going to need external data, but most companies start with some internal data that they can ratchet up to produce things.
It’s hard to talk about a data strategy if you can’t talk about the actual use case, what the strategy of the company is, or the industry of the company. You need to know what the market levers are and what data is available, but first and foremost, what the actual business strategy is and who has it, and how much money they have to go after what they’re trying to accomplish. You also need to know about all the people inside the company that need to be at the table. I think the key aspect is having a business strategy and a data strategy that supports it, rather than a data strategy that’s separate and distinct, which I’ve seen happen, and which is definitely a mistake.
AA: When organizations are going from proof of concept to suddenly having to productionize that model and capability, I often see them hit roadblocks. They either don’t have the right processes in place or don’t have the right people. I’m thinking of MLOps and things like that to scale that capability. Have you seen some of these issues, and how do some of these organizations overcome this? What do you advocate for when helping them go from dev to prod at a large scale?
CA: The first part of this is understanding the executive sponsor and what they’re really trying to do, because somebody somewhere in that company is paying for the development of this AI or ML product or service. It’s usually a one- or two-person type of situation.
They may come with a whole team that thinks that they’re the ones who are dictating the project, but in reality, it’s one or two major executives who will be the ones that set the tone for the entire development project.
There are many reasons why data scientists can’t get past the proof-of-concept stage. One is that they don’t communicate thoroughly with the actual executive sponsor. Say you produce a proof of concept of some sort (maybe an MVP), and the sponsor says, “Hmm,” or “I’m not sure.” That means communication about what the executive sponsor is trying to accomplish broke down from the get-go.
The second thing that I see happen is the executive sponsor may not understand what the heck the data scientists have done. Data scientists need to clearly answer questions such as, “What did you do? What data is in there?” Explaining things in technical jargon is not helpful to a business sponsor. That is not communicating well.
Data scientists have to find a way to actually communicate, which means that the person on the other side of the table understands what they say.
Throwing in all the different models, all the different mathematical processes, and all the things that were thought through but didn’t go into the model will just make the business sponsor not trust the data science team. Instead, they will feel like the team is trying to snow them. The best thing the data science team can do is spell things out. If there’s something that the executive sponsor says that the data science team doesn’t understand, then the data scientists have to get under the hood of that. They need to keep asking questions of the sponsor until they think they have a good hold on the request or concerns, and then ask the sponsor, “OK, could you write that down?”
When people have to actually write things down, I find that things get a lot clearer for both the person writing and the person receiving the writing.
Those are some of the reasons. The biggest one, though, in my mind, is still trust: not having trust about what you’ve done and what went into the process, or whether you begged, borrowed, and stole to get there. A lot of teams feel like the overarching thing is that they’ve got to meet that MVP stage, which has to happen in anywhere from four weeks to eight weeks. That’s the typical thing, but that doesn’t usually give you enough time to truly do what you need to do with the data. I think that’s where things fail because you do all kinds of horrible things in this rush to get to the MVP, and then you think, “OK, I’m going to scrap all that and start over because now I’m going to have the funding that I really need to move forward.”
However, what happens is that once the sponsor or funder sees the MVP, they say, “Oh, no. We’re just going forward from here.” You can’t say it out loud, but you want to say, “Wait a minute, no. We built all this stuff based on chewing gum that we just stuck together. This isn’t going to scale.” They continue, “Yeah, this is great! I’m going to schedule a meeting. You’re going to show this to the CEO next week.” And before you know it, it just keeps getting bigger as more people get involved. You can’t go back, and then you’re stuck with what you did in the first place, and now you don’t get the proper investment to do it the “right” way or the way you had hoped.
You really have to set the expectations and have that first sit-down with the executive when you’re planning these things out. You need to say, “Look, I’m just going to be honest with you. The reason none of these things have scaled for you in the past is that you didn’t give the team enough time to build these things the right way and to truly invest in it.” People just don’t want to be honest about that. They’re afraid that the first answer will be, “No,” but I actually think it’s different if you’re honest and forthright and you show a legitimate plan. A lot of data scientists don’t want to do that; they don’t want to show a legitimate plan because it’ll show their steps. It’ll document things, and that causes accountability. But if you’re honest from the get-go, you might actually be able to scale your thing, make it outside of the proof-of-concept stage, and reduce the failure rate.
AA: One thing you mentioned really resonated with me when you talked about trust for data science teams. Something I always advocate for in my teams and mentees is that your goal is not to be seen only as a technical expert but as a trusted partner to the business. I think that is a really important differentiation for people to make in their minds: that they need to be trusted by their peers and senior executives, and not just be seen as someone who speaks technical jargon. I completely agree with you there. They have to speak and understand the language of business and translate their findings into actual outcomes: profit increase, efficiency gains, and so on. That is pivotal in building their careers, especially as they progress up the ladder, but we’ll get to some of that in a moment.
You talked about time pressures, which I think is a really big one. Sometimes, they’re very unrealistic, for various reasons. What are some of the best ways to help teams prioritize their demands, as a leader of a data science team? If you’re inundated with requests from senior executives, how do you prioritize what to take on? Do you follow any particular processes or methodologies around trying to figure out how to triage all of this, continue to add value, but still manage expectations?
CA: It’s always going to depend on the specifics; every business case is different. It depends on what you’re trying to accomplish, the relationships that you have, and how much trust you have between yourself and your team. That’s hard because a lot of us deal with lots of contractors and vendors, too. You’re not always going to have access to 100 data scientists just sitting there all the time. Sometimes, you have to say, “OK, let’s pull from outside resources,” and those people are hard to trust because you don’t really know how well they understand your operating environment. When you use internal data and you’re working with vendors, there are a lot more checks that come with that.
That being said, the relationship that the data science team needs to have with sponsors and champions is everything, and it’s everything when it comes to prioritization too. Let’s say you’re working with the chief marketing officer and they have been going to the news and publicly touting this strategic initiative that they’re about to do. Let’s say the feature that they keep mentioning – that you haven’t even built yet – is at the top of their list. You talk to their people and they say, “Oh, yeah! They’re so excited about this, this, and this.” You might have something that’s not on that list that you’re working on, because we all get scope creep, and nobody’s worse for scope creep than inquisitive data scientists. We’re always saying, “Wow! Why did that finding come up?” You have to stop and say, “OK, wait a minute. Is this actually contributing to all of those things that this sponsor keeps mentioning?”
There are things that people tell you are important to your face, and then there are things that people will reinforce are important by the things they say to the public, investors, or other executives. There are also things that people don’t want to admit they actually find important because they’re things that are – for lack of a better term – cosmetic.
They’re not deep and meaningful and purposeful to the business, but they’re things that might give extra oomph to their campaign. You can get little wins like that with your sponsors. That’s how you build trust: by prioritizing the things that matter the most to sponsors and the business.
The last thing you want to happen is that you put all your eggs in one executive sponsor basket and then that person walks out the door. Who’s left as your big sponsor within the company that knows what you can do?
So, a lot of it is politics. Who are the sponsors that you need to attach to, and what are their priorities? What is it within a project that will get the response, “Ah! Let’s keep going so that we can build out the other features”? You have to keep them going. If the excitement fades to nothing because you’re producing things that are a cool feature but not something that was ever asked for, then stop.
AA: Yes. That’s great advice on the practicalities of tying your mission to that of the key stakeholders and decision-makers and their priorities. I think that is vital and may not be what a lot of people want to hear, but that’s the reality of it, isn’t it? If you’re not going to get funding, you’re not going to survive.
CA: Yes.
You’ve got to act like you’re a start-up.
AA: Often, I see issues around how success is measured. What metrics should we be using? How do we know what impact we’re creating with our models? Do you tend to think about any specific metrics? I know it’s a general question because it depends on the organization, but are there any particular types of metrics you tend to use to measure the impact of the solutions they’re developing?
CA: What was the strategic goal the project was designed to impact? What business metrics are measuring progress against the goal?
Let’s go back to the example of the insurance company from earlier in the chapter. The executive sponsor, let’s say it’s the Chief Legal Officer (CLO), is concerned that the costs of all the different legal firms they’re doing business with are eating them alive. It is becoming a detriment to the company. You build a model – and an automation capability – with the sole purpose of flagging and refuting legal fee overcharges. To understand whether your model is successful in this example, the question you should ask is, “How much money did my model save the company in refuted legal fees?” You should be able to go back to the original goal of the business sponsor (in other words, the CLO) and measure against that. You shouldn’t have a separate metric for your model. You have business metrics. Period. Did it have an impact or did it not, and by how much? If it wasn’t that much of a reduction (for instance, half a percent, equating to a few dollars, when you’re spending 100 million dollars on data sources and all kinds of other things), it’s not worth it. You need to stop the project.
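As a rough sketch of the measurement described above, the business metric for the legal-fee example comes down to simple arithmetic: money saved in refuted fees, set against what the project cost. The function names, the "savings must exceed cost" stop rule, and the dollar figures below are illustrative assumptions, not anything specified in the interview.

```python
def net_business_impact(refuted_fees: float, project_cost: float) -> float:
    """Dollars saved in refuted legal fees minus what the project cost."""
    return refuted_fees - project_cost


def should_continue(refuted_fees: float, project_cost: float) -> bool:
    """Hypothetical stop rule: keep going only if savings exceed cost."""
    return net_business_impact(refuted_fees, project_cost) > 0


if __name__ == "__main__":
    # Made-up figures for illustration only; none come from the interview.
    refuted_fees = 200_000
    project_cost = 1_000_000
    print("Net impact:", net_business_impact(refuted_fees, project_cost))
    print("Continue project:", should_continue(refuted_fees, project_cost))
```

A real sponsor would likely set their own threshold, but the point stands: the number being tracked is the sponsor's business metric, not a separate model metric.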
AA: Something we touched on earlier that I’d like to delve a little bit deeper into is data literacy. It’s something that I see as a big blocker in many organizations, especially at senior executive levels: their lack of data literacy can make or break a project. Is this something you’re seeing? How much do you believe in enterprise-wide data literacy and development? How do you normally try and increase data literacy? Through training programs, such as off-the-shelf training programs or bespoke training programs?
CA: Data literacy is paramount.
I think data literacy is one of the most important things a company can work on to be competitive. Companies that aren’t data literate – if they’re not using data to drive every business discussion or decide how they’re going to strategically fund themselves – are probably not going to be in business for a long time. They probably have a very limited future versus companies that are data literate. I would say the same is true of senior-level executives who are data literate.
They will adapt their strategies more quickly to marketplace uncertainty – which is something we have seen a lot of since the pandemic and we will continue to see a lot of in the future.
Data literacy comes in degrees. Some senior-level executives may not understand the inner workings of a data science team, but they most likely have an idea of the data they would want to see to understand financial metrics associated with the performance of the company. If you want to help your team by helping senior-level executives become more data literate, then forget about training programs.
Instead, while in the midst of tackling a big data project for the senior executive, take extra care to communicate about the processes, tools, and key members of the team. You are the trainer of the senior-level executive that sponsors your major AI initiative. Educate them.
Assign one person from your team whose sole focus is to invest their time in educating that sponsor about how the data science team works, what data they use and how they get it, and the process they undertake. That is the best, and only, way to train senior-level executives, because it is the only way they will be vested enough in a data-related outcome to care. Hopefully, that doesn’t come off as harsh. But senior-level executives have a lot to do in a day. Being data savvy only for data savvy’s sake isn’t going to be on the agenda. But being data savvy so they can understand the risks (meaning, risks of a project’s failure to launch or no ROI) associated with an AI project they are investing in…well, that’s worth it from their perspective.
I don’t encounter a lot of non-data-literate companies anymore – meaning companies where the entire workforce has no idea how to use data. If you’re in a high-impact industry and you don’t know data, I don’t think you’re going to be in business very long, and we have honestly probably seen that weed-out happen quite extensively during the pandemic. Especially in a world where everything is digital, you need customer data, market data, connected-device data, and data of all kinds. This is an essential part of doing business in the world these days.
AA: Yes, I think the definition of data literacy has expanded somewhat. It’s also about having a conceptual understanding of analytics, machine learning, and AI, at least at a very high level. I think that’s become important because some organizations, as I’m sure you see, get swayed by a lot of vendors who are still on the Kool-Aid.
CA: Right.
You’re talking about degrees of data literacy. There are the basics, and then there’s how not to get snowed by a vendor.
I think the procurement part of that is definitely a problem. It’s so much of a problem that I helped the World Economic Forum put out a piece for senior Human Resources (HR) leaders to help them identify the things they should be asking vendors who come in and try to sell them an AI hiring capability of some sort. I think those are good ways to tackle that level of data literacy. If people are having problems understanding what they should be asking vendors, I think that’s a whole different thing. It’s a different level of data literacy, but it’s definitely a problem.
AA: Yes. A great point.
I’d love to hear your thoughts on what makes a great data leader: someone at the senior level – say, the CDO level or the data team leadership level. What are some of the aspects that you see in good, strong data leaders?
CA: I had to do this back in 2013. I was asked by IBM to put together the very first CDO community to promote getting a centralized data analytics function up and running inside of the organizations we did business with – we didn’t even know what to call that role at the time. Was it a CDO, was it a chief analytics officer, or was it a chief data and analytics officer? There was no real precedent yet, as there were only three CDOs that existed in the marketplace at the time.
Think about the Fortune 500 companies that you know out there. For any given company, IBM was selling data and analytics capabilities to the finance, marketing, operations, and IT departments of these companies. The IBM sales reps saw some overlap in what departments were purchasing. They finally just said, “We need one person. We’re doing some of the same types of sales over and over, but we have little bits here and little bits there in different departments. We need to create a role that can oversee larger, strategic data and analytics implementations.”
I thought this was massive hubris on IBM’s part at the time. I said, “What? You don’t just go create a role in the marketplace!” They replied, “Oh, yeah! We’ve done it before. The CIO, back in the mainframe days: we had to create an information officer to run all of these systems and mainframes within the financial institutions and NASA.” I said, “OK. I guess we can do that, then.” They said, “OK. I want you to treat the CDO – the chief data and analytics officer, whatever you’re going to call it – like your product.”
At that time, I thought about all the different data leaders that I knew from working at Verizon, at Citi, and across all the different functions within a company, from marketing to finance. I really had to look at what makes a great data leader, because I knew I needed to find those who were doing the best possible things and hold them up as examples to the marketplace. Then, I’d have to get the CEOs of all the biggest companies, pull them together into round tables, and show them the great things that these people were doing that would actually make them want to consolidate data and analytics inside their company and not just have 50 other people trying to access data. Otherwise, we’d reinvent the wheel for data access for marketing, finance, and every other department inside their companies. That’s nuts! They all want the same types of stuff. There should be one person in a company doing this stuff.
The answer to your question is that the best data leaders are the ones that are asking the highest-level questions in a company and are attached to the incentives and strategy of the company as a whole. They think strategically at the CEO level about how they use data to reach the company’s goals. That’s what they do. Plus, they’re politicians.
The worst data leaders are really good at their actual data job, but terrible at managing the minefields of relationships with CIOs or CEOs.
That or they’ll drop the data package at the doorstep – real or virtual – of senior executives and run in the other direction, as opposed to having full conversations and welcoming and educating people, almost as business counselors.
You know school guidance counselors? Well, a cross between a business guidance counselor and a politician is what I would say most good data leaders are. They just happen to also know how to wield data to meet strategic ends. The best CDOs manage to carefully navigate political minefields, form relationships with all the different C-suite roles, and then come away with sponsors and champions to undertake strategic data initiatives – because usually, they’re not the ones with all of the money to do all the projects. They make friends and influence people so their data analytics teams – federated with other departments or centralized under them – can shine.
AA: What’s your transition into being one of the global data leaders been like? Have you found that challenging? I’d just like to get a better feel for what it’s been like for you.
CA: I was incredibly frustrated because I had a vision. Since CDOs were my product, I had a vision for what CDOs needed to be and when they needed to be that. I started with two data officers in the beginning. They would have a career progression that started with data governance. They’d pull all the data together inside companies (this was back in 2013) and try to make all of that different data work. That was the reason why we were even making the role in the first place. It was why we were trying to make this role in the market and trying to get the CEOs to demand the role. We were putting out job descriptions around what CDOs should be doing, so some hiring firms could go out, hire the people, and start them in an organizational structure at the company.
With that in mind, the maturity progression was supposed to be from data governance to business optimization and then to market innovations. For instance, business optimization would be looking across the business and doing more operationally sound analytics to help businesses cross-sell to customers and things of that nature. A lot of companies do that now. Back in 2013, that wasn’t necessarily happening at the ubiquitous level that we see now. Finally, you’d get to the level of market innovation where the company’s data could become its own source of revenue through the release of data products and related services. You get to a point where your product is internal information that can then be wielded as some sort of a monetization plan for your internal data and practices. That market innovation part of the career progression was, and still is, an area that many data analytics executives just can’t reach; or, by the time they finally get there, it has been usurped by a different C-level role, such as the chief digital officer, chief innovation officer, or chief data scientist.
On a second level, I feel like there’s been a split of duties. There’s a group that stays focused on data. Sometimes they stay in the IT department, and sometimes they’re separate. Then, we’ve got chief analytics officers doing more on the business optimization front for the business. Market innovation got taken over somewhere along the path by data science pods. That’s what I call the little teams that work underneath each sponsor as needed on different initiatives. They can often work for digital officers. Sometimes, they’re straight-up working for IT, but a lot of times, they’re vendors being contracted from outside the company to come in and work for specific sponsors: for example, a chief marketing officer running a specific project such as an AI social media command center.
I thought that this would be a good progression, but now I see my peers are stuck with data governance or internal business analytics. I’ve moved into AI and I want them to come with me. I’m trying to bridge that gap currently to show them why they’re relevant to AI, because a lot of data and analytics professionals are still not willing to move into that market innovation space.
AA: That’s a great point. What advice would you give to someone who’s transitioning from a technical role to a senior data leadership role? Say that their career aspiration is to go from being a hands-on data scientist to eventually becoming a CDO. What advice can be given on that trajectory with regard to skills, development, or education?
CA: Networking is vital.
Go out and start networking and politicking with as many of the senior leaders within your company as possible. Start forming relationships and understanding what drives them and what’s most important to them.
Also, understand where the landmines are as far as data goes inside the company. What are the types of things that senior executives will shut down? What will cause people to start pulling into themselves when you push too far? It means being a politician, having data charisma, knowing where you can help and where you can’t help, and knowing the CEO and their strategies. Every single one of those C-suite members is going to interpret that main corporate strategy into a strategy for their individual departments, and then you’ve got to figure out how you’re going to be relevant to those departments and recommend projects based on their goals.
AA: As for formal studies, you have an MBA in strategy and business development. Do you think extra, formal studies are important for career progression to leadership roles, or can a lot of it be done through intuition, hands-on practical learning, and maturity in terms of skills development?
CA: That’s a tough one because I think everybody learns differently. What comes naturally to me may not come naturally to others. I’ve definitely learned that. I’m more of an extrovert. I think CDO roles are made 10 times harder if you’re an introvert and you do just enjoy working with the data. I will also say that maybe you don’t always have to be a CDO to enjoy and be fulfilled by your work. For some people, it’ll stress them out too much to try and move into that type of role. If it’s stressful like that, maybe the role’s not for you. One of the hardest things to do is to figure out what makes you happy, because you may get to that chief data analytics officer spot after learning all these things and doing the activities and just think to yourself, “I am so freaking miserable. Just get me back on the team. Plug me back into the matrix.”
To answer the question more directly, if you are intent on becoming an executive in the data, analytics, and innovation field, you will definitely need to understand how business leaders think. I do not think you need to go to business school if you have a computer science or other degree…but you will definitely want to go and talk directly with business leaders in your company to understand their goals and any political struggles they are having within the company. You will have to become highly adept at building relationships, so the development of interpersonal skills will be key. The book How to Win Friends and Influence People by Dale Carnegie will probably be something you will want to read if you feel you have no natural intuition or proclivity when it comes to building relationships inside organizations. I would also rely on employees who have a high emotional IQ and a lengthy tenure inside the company to help you navigate the political landmines. These don’t have to be your direct employees, but set up a regular lunchtime or something of this nature with them to understand more about what they are seeing.
AA: Final question: when you’re hiring (although you’re probably too senior to be involved with data scientists and data engineers at the hiring stage), what skills and attributes do you normally look for in people? What do you expect them to excel at in terms of the technical stuff, such as coding skills, curiosity, and attention to detail? Is there a list you go through in your mind of what you look for in people?
CA: I think this is going to be consistent with what I’ve been saying, but I can’t teach you the personality aspects that I’m looking for. I can just teach you a skill – Python skills, for instance.
The main thing is having people who are flexible and adaptable in their mindset, because some data engineers and data scientists can be very inflexible.
What I mean is that once they are told what they will be working on, then if something changes, they will become frustrated if they cannot work on the project as outlined at the beginning. Data science is highly experimental, and internal and external clients of data science projects change their minds constantly about various aspects of what you might be building. You need people who are comfortable with being flexible and adaptable and also willing to roll up their sleeves and pitch in with every other kind of job, such as data sourcing and pipelines, investigating data, cleaning up the data, and putting the data into various formats. The more data and analytics skills they have, the easier it is for them to be flexible.
Data scientists need a positive, experimental mindset.
In data science projects, you’re going to be flinging things at the wall. Much of what you do will fail as you experiment. You’re going to be going back and forth, trying different things until they produce outcomes the team likes. You have to be comfortable with this sort of frenzied, iterative style of working. Even when you’re frustrated, you have to be able to articulate what you’re frustrated about. A data scientist can’t just stew because they’re not going to get very far and they’re probably going to burn out on the project.
Data scientists have to have a team player mindset, but they also can’t be pushovers in the process.
They are going to be in what I call the “AI pod” environment. This is where everyone’s work, from the time the data is sourced and extracted to the point where the model is developed and released through an app or an API (data stream), can be thought of as a circular process. The work you do in the model testing part of the process may depend on the work of the person whose part came before yours, such as data engineering. To go fast as a team member, you will have to work iteratively with the people both directly in front of you and behind you in the process. They will need your help and you will need their help as you iterate to bring the project to successful outcomes. You’ll be saying, “That didn’t work. Let’s try again on this training data. Can you reset this such and such for me?” Then, someone may come to you about your work and say, “The model you built isn’t working with the application environment; could you try such and such?”
I also wouldn’t want to work with data scientists who don’t stand up for themselves. If they truly believe that they are doing something the correct way, I don’t want them to back down when challenged just to save hassle or time or because they do not want to be confrontational to a peer or team leader. This is especially true if there are ethical concerns that could put the validity of a project at risk. You have to be able to stand up for yourself.
So, it’s all personality stuff. Very rarely would I turn someone away on skills. If you have the personality and the skills, that’s best. I can toss you right in and you can get going. But I would expect that people can learn the math, Python or R, and data science skills, all of which can be taught. At IBM, we did a nine-month program with Galvanize.
But what you want and can’t teach is a personality of being inquisitive, constantly learning, being able to work in teams, being interested, and connecting the dots constantly.
I can’t teach data scientists to want to investigate data, people, or the places they come from. I can’t teach that.