This post includes the full Transcript of the Ashling Partners & Indico Fireside Chat on March 10, 2021 and the YouTube video of the entire educational interview: Ashling Partners & Indico Fireside Chat: Strategies for Automating Unstructured Content
The following is an interview hosted by Marie DiTipani of CPO Select, a conversation between Tom Wilde of Indico and Ashling Partners co-founder Don Sweeney on Strategies for Automating Unstructured Content.
Transcript: Introduction
Marie DiTipani
Great to be here. Tom, great to chat with you today. Let’s continue the discussion and see where we go.
Tom Wilde
Excellent. So we want to chat today about an emerging category of automation solutions, very applicable in the back office and around procurement, which is dealing with unstructured content, a problem that is not new but has remained a big challenge for really capturing the full spectrum of ROI automation can bring to the table. Don, maybe before we get started, if you could introduce yourself and a bit more on Ashling as well. I’ll do the same, and then we could get into this interesting topic.
Don Sweeney
Yes, sir. Thanks. Again, my name is Don Sweeney. I’m one of the co-founders of Ashling Partners. We work with organizations to help them improve their efficiency across business processes, primarily by leveraging Intelligent Automation.
Tom Wilde
Excellent. And Indico is an automation solution designed to solve unstructured, using what some of our customers call a template-less approach to automation, meaning for the first time we’re able to bring an approach that doesn’t rely on rules or OCR, templating and things like that, which have long been a big impediment to really scaling out unstructured automation solutions. So we recently collaborated with Ashling on a project with Cushman & Wakefield, and thought it would be really interesting as both a general discussion around automating unstructured content and maybe some specifics around how these projects come to life and what customers are really looking for when they’re tackling these kinds of things.
Maybe to start, we’ll go from the general to the specific. What have you found at Ashling Partners, as a service provider in the space? What are some of the bigger challenges customers face, generally, when thinking about automation, before we even get into the specifics? What are some of the hurdles they’re running into when they step into the automation approach?
Don Sweeney
Sure. Thanks, Tom. So I would say the first challenge that companies have is really understanding their process, so having a process mature enough that they understand exactly what people truly do. People think they know what it is, but what people are actually doing at the detail level is not always aligned with that. And then being able to measure what the business value would be around automating that process also tends to be a challenge.
Tom Wilde
Now, deployment partners like Ashling have seen very rapid growth in the last, you know, two to three years. Why has this been a successful approach for customers? Why is it not just the software? Why services as well? And what do you attribute your rapid growth to as a service provider in the automation space?
Don Sweeney
Yeah, it certainly is a hot growth area, as you know as well. But I think the biggest challenge that companies have, and maybe one of the reasons why they would reach out to a firm like ours: traditionally that was to help them understand that business value, to help them really understand the process. It’s now started to pivot more towards helping them understand all of the various technologies that people are bringing into this. And it kind of makes sense. With the technology companies like yourself, the R&D and the technologies are moving so fast that people can’t really keep up with, you know, what couldn’t be done a year ago can now be done, or vice versa.
Tom Wilde
When you think about it, you know, automation has become sort of a spectrum of technologies now. You know, it’s called different things; Gartner more recently began to describe the bigger tent over the space as hyperautomation. In the UK, there’s other terminology being used to describe the various components of automation. When you think about unstructured, we describe it as intelligent process automation, as opposed to robotic process automation, because of the cognitive ability that’s brought to the table when dealing with unstructured. Is IPA really different from RPA? Why not just apply, you know, existing RPA technologies to the unstructured problem, to document automation?
Don Sweeney
So it’s a great question, and I won’t spend a ton of time going back and doing definitions here. But you know, by definition, RPA is really task automation. It’s kind of a misleading term. I’m sure you’ve seen that many, many times. It’s really taking very specific tasks that people do, those mundane repeatable tasks, and automating them. Now, RPA suites are starting to expand their capabilities, but intelligent process automation, again by definition, has some form of machine learning or artificial intelligence that’s making that smarter and starting to broaden what automation can truly do. And so as you start to think about the difference between IPA and RPA, you’re talking about the difference between automating a task and starting to be intelligent enough to automate a broader, full end-to-end process, especially things that may have unstructured data, like you mentioned before, or, you know, some form of document or email that you have to read and include as part of that process.
Tom Wilde
What changes have you seen in this category, of customers really viewing this as something they want to, you know, shop for and evaluate separately, because the requirements are so different? What kind of trends have you seen in the last 18 months around, you know, IPA as it’s evolved here?
Don Sweeney
Because it’s evolving so fast, it’s really a challenge for clients. You know, people need to understand what is the core competency of an organization, and then what is a differentiator for a different organization or software product. And so when you start thinking about intelligent process automation, and you’ve used the term multiple times now about unstructured data: there’s unstructured data, there’s semi-structured data, and there’s structured data. So without taking too much time here, you think about structured data. That’s something that you’re pulling out of an application. It’s a field on a page within an application, or you’re grabbing a cell within Excel or something like that. It’s a very, very specific spot that you can automate.
That’s very simple to automate, and that’s where RPA does really, really well. Now, with this being, you know, kind of a CPO type group, you probably deal with a lot of purchase orders. That’s kind of a semi-structured environment. You know, the same kind of 15 fields are on a purchase order, but they’re going to be in a slightly different format from one vendor to another. So that’s, in essence, semi-structured. When you get to unstructured, you’re talking about reading a contract or reading a page that might have images on it, or, you know, you have to actually get the context of the language around it, not just a keyword you’re looking for. And so the further you get from structured to unstructured, you’re really starting to look now into organizations that have specialization in that space, versus the ones that are on that more structured side.
Tom Wilde
It’s not a new problem. Why has this historically been so difficult to solve, especially semi-structured to unstructured? You know, what has been the big impediment to trying to solve this historically?
Don Sweeney
So the majority of business processes require some form of manipulation of data. And if you think of any kind of process, it’s not just take data from X and put it in Y. I mean, that’s a lot of data interface type stuff that’s probably already been solved. I think, as you start to look at any process that the people on this call would have, it’s going to be reading emails, reading contracts, reading, you know, maybe even some of the semi-structured data like vendor onboarding forms or something like that. And as you start to take a consolidation of multiple of these, that becomes more challenging, at least historically, to automate. And so you’re looking at really bringing in that AI and machine learning component, to be able to not just identify the data that you need, but also potentially even the context of that data in which you need…
Tom Wilde
A customer once said this to me, which I thought was just a perfect way to summarize it. They said, “You know what you want, you’re not sure what you’re going to get,” as a way to describe why this is so frustrating at times as a problem. You listed off some of the use cases through the procurement lens. What have you seen as some of the top three use cases around unstructured that customers wish they could solve, you know, and are looking to solve?
Don Sweeney
Yeah, really a lot of the AP processing. I mean, it’s not just simple invoice processing. It comes to: are they sending an invoice? Are they sending a credit memo? Are they sending maybe a debit memo payback? You know, there’s various things that are all part of that conversation. Are they sending the statement? There’s all kinds of different things that they may be requesting. So really automating that whole conversation aspect starts to expand beyond a structured or semi-structured environment into a much more unstructured environment. You start thinking about the vendor onboarding and a lot of the compliance components that you need. All of those areas, really kind of end-to-end procure-to-pay, are typically something that you look at around efficiency and effectiveness. It needs to be done well, but it needs to be done with a minimal total cost of ownership. And so you’re really trying to drive a process of moving people away from just doing the data manipulation and data gathering and more towards data analysis. We really start to pivot the discussion to doing more around vendor scorecards and vendor management, true vendor rating and strategic sourcing, versus just spending 90% of your day collecting forms that, you know, you need to get so the vendors can be compliant in your system.
Tom Wilde
Yep. You know, I often get the question of why web forms and things like that won’t negate the need, why they wouldn’t eliminate this problem. And I think that the biggest source of this challenge is when you’re dealing with third-party documents, right? When you’re exchanging documents with third parties, nobody has enough market power to dictate the format, you know, or say, “Hey, everyone has to use this same document type.” And those types of use cases are everywhere, and aren’t going away anytime soon. And that’s really the crux of this, because that’s where the unstructured shows up, right? Or semi-structured, when you’ve got the same type of document, a purchase order or an invoice, but with many variants. Invoices I always think of as like snowflakes, right? There are just a million different ways to construct an invoice. And yet every company, you know, has to deal with that as a primary use case, or purchase orders, or other supply chain type documentation.
Don Sweeney
Yeah, Tom, maybe one more example would be things like customs. So for anyone who’s dealing with international procurement, you say, “Oh, well, customs forms are standard. Isn’t that kind of structured data? You know exactly the form the US government wants.” Sure, that one form is, but think of all the supporting documentation that comes with the customs forms. All of that can be a myriad of different items; it’s not even the same things every single time. So that’s where it starts to become much more context-related and unstructured than, you know, the actual structured form of the US Customs form.
Tom Wilde
That is a great example, too, of a common misunderstanding about what unstructured really means. You know, I’ll give you an example from the finance world, in mortgage titles and deeds. You would think that, well, a mortgage title is a mortgage title, but then you realize, well, there’s 50 states, and in those 50 states, there are 15,000 counties. And so, you know, each of them puts their own little spin on what a title is, or a deed. And each of those has, you know, 50 supporting document types that go with it. So before you know it, you’re in the 100,000 variants of what you thought was a very structured document that any of us could look at and say, “Oh, that’s a title,” or “That’s a deed.” But to the machine, you know, those variances are what create such intense challenges, historically, for the software to figure it out. When you engage with a customer on these kinds of use cases, how do you try to ensure success? What do you do at the outset to try to set it up for success, given that, you know, every customer brings a slightly different flavor of the problem to the table? What are some things you guys have done there that you’ve seen work?
Don Sweeney
So we always start with the business objectives, right? What are we trying to achieve? We’re not implementing technology for the sake of technology, at least hopefully not. We’re trying to achieve some form of business outcome. And so we really want to make sure that we’re aligning to a business outcome. I previously mentioned the effectiveness and efficiency of things around procure-to-pay or source-to-pay. And so typically, that’s around continual process improvement metrics, or we try to work with organizations to come up with those continual process improvement metrics, where it could be cost per purchase order or cost per invoice. So a CPI metric is something where, quarter over quarter, you should be doing better. As you’re looking at it from a total cost of ownership, you should be doing it right the first time, so there’s not fees in redoing it. But also, even doing it effectively the first time, you want to bring that cost down over time by adding more automation, making it more touchless. You can even provide a better experience by having automated responses and letting people know where they are throughout that process. So we typically try to really focus on that business outcome approach and a continuous process improvement approach.
Tom Wilde
If we use this sort of Gartner hype cycle metaphor, you know, two years ago you saw a lot of major companies creating these Centers of Excellence around artificial intelligence. And we’ve seen that kind of wane and go through its trough of disillusionment, I think is the way Gartner describes it.
Why is that? What is it about AI that created this sort of trough of disillusionment? And what do you need to do to overcome that and capture the value of using approaches like Indico that use artificial intelligence to solve the problem?
Don Sweeney
Yeah, I think too many times people try to jump to the result, right? And so people jumped into machine learning, or they jumped into artificial intelligence, and a lot of times they didn’t have good data. And so you can’t automate something without necessarily having the data for it, especially around machine learning, where it’s learning every time it’s processed on an ongoing basis.
So, you know, you and I have worked together on things like resume processing, and with the machine learning model, as it’s figuring out, “Oh, you’re accepting these people versus these people,” well, there’s positives and negatives to that, you know, those concerns around what it’s actually learning from the resume and maybe the bias, if you will, in some of those areas. So you really want to start to first look at what your objectives are, and then add the right tool into that objective. I think the other challenge is, people walked in with a hammer looking for a nail. In some of these, they just said, “I need to do machine learning,” because they came back from a Gartner conference or, you know, whatever. And they maybe weren’t necessarily prepared to do that, or they weren’t doing it for the right reasons. Whereas, you know, when we talk about using that unstructured data now, you’re bringing in the right tools to process unstructured data and, in essence, make it structured, right? So you’re then able to manipulate that data or automate that or, you know, carry it further along the process. So I think people are starting to figure that out. But that’s certainly been the challenge that we’ve seen to date.
Tom Wilde
Yeah, I think this is where RPA and IPA are very, very compatible, where if you can turn unstructured into structured, then you have platforms surrounding it that are already optimized for structured data and are really good at it, like RPA, like BPM, like business intelligence, like CRM, where implicitly those platforms assume you’re going to feed them structure. And if you do, they work brilliantly, right? They’re designed to do that. And so that’s sort of the crux of it: where you can turn unstructured into structured, you kind of unlock its value.
Don Sweeney
Absolutely, absolutely.
Tom Wilde
What are some of the things you look forward to, from what you hear from customers and where you guys are trying to push the thinking with your customers? What are some of the key trends that you’re seeing, you know, whether it’s things like citizen data science or low code? What kinds of things are high on your radar that you’re keeping careful track of?
Don Sweeney
Yeah, it’s really all of what you just said, Tom. So we’re certainly seeing people wanting to bring this in house and empower their users. So, you know, think about, and I’m dating myself here a little bit, but think about work pre-Excel, right? Pre-Microsoft Office, you had to actually have people who could go do spreadsheets before they were true spreadsheets. And, you know, Excel empowered the actual user to now be able to do a lot of their own data manipulation and own calculations, and everybody knows how to use Excel. 99% of people know how to use Excel. You’re starting to see that a little bit on the low code side and on that citizen data scientist side. You’re starting to see, really, that empowerment of the end user, in what we see as people having these Lego blocks, if you will, so kind of pre-built components.
So not everybody is going to start with a clean sheet of paper and write their own code, nor should they. But as things are pre-built, you’ll be able to grab those and assemble them together for the automated process that you’re looking for. So Indico may provide resume processing, or it may provide a reading of contracts or, you know, you talked about even lease extraction and some of those other items, loans and other financial services items that are contract reads. So those might be, in essence, a component in a larger automation, where my automation might be slightly different than yours and might be slightly different than somebody else’s on the call here. But I’ve got these components that I can then use and personalize. We really think that’s where the puck is heading, so to speak.
Tom Wilde
I see a question from the audience that is relevant here. The question is around, maybe on a specific customer experience you’ve had, how do they try to calculate the ROI for this kind of investment, you know, as specifically as you can answer? How do customers think about it? If they’re going to turn around and ask management to spend this money on services and software, how have you seen ROI be characterized?
Don Sweeney
So I’ll give you the very fast version here. We really try to take processes and put them into two buckets. For the sake of this, we’re gonna call them core and differentiated. So a differentiated process is typically one where you are measuring more about experience and other value-added metrics. So employee engagement might be something on the experience side, or customer experience might be something. So you’re measuring ROI very differently in those than you would in core. So core, you know, is procurement, things like that. We talked about efficiency and effectiveness. That’s typically measured by hours back to the business. So what are we freeing up to allow them to go do other things? And so that’s a hard dollar ROI, typically hours back to the business based on cost of the individual. And you’re literally measuring it by, okay, how many hours do I no longer have to spend reading contracts, extrapolating the data and entering that data into, you know, a contracts module, or reading resumes and sorting the resumes to get the top 10 that I’m going to now go give to another person, or, you know, doing 3D modeling, or whatever the case may be, instead of these more significant, more meaningful items. From a core side, you’re typically talking about hours saved or hours back to the business.
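As a rough illustration of the "hours back to the business" arithmetic described above, here is a minimal sketch. The function name and all figures (hours saved, hourly cost, automation cost) are hypothetical and were not given in the conversation; a real business case would use the client's own numbers.

```python
def hours_back_roi(hours_saved_per_month, hourly_cost, monthly_automation_cost):
    """Estimate hard-dollar ROI for a core process.

    Savings are the hours returned to the business multiplied by the
    cost of the individual; ROI is net savings relative to what the
    automation costs to run. All inputs are illustrative assumptions.
    """
    if monthly_automation_cost <= 0:
        raise ValueError("monthly_automation_cost must be positive")
    monthly_savings = hours_saved_per_month * hourly_cost
    roi = (monthly_savings - monthly_automation_cost) / monthly_automation_cost
    return monthly_savings, roi


# Hypothetical example: 400 hours of contract reading saved per month,
# at a loaded cost of 45.00 per hour, against 6,000 per month for
# software and services.
savings, roi = hours_back_roi(400, 45.0, 6000.0)
print(f"Monthly savings: {savings:.2f}, ROI: {roi:.0%}")
```

This only covers the core bucket. Differentiated processes, as noted above, are measured on experience metrics and would need a different model.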
Tom Wilde
And this one I see is a little bit related. How do you set expectations around what’s a realistic time to first value when engaging on an unstructured use case? How do you think about that with customers, when they’re trying to dimension how long it will take to see that first value?
Don Sweeney
Yeah, and that’s something that we certainly work on with every client. Not only on their first engagement, but on all engagements, we’re really sitting down and building that business case of what is the value. What’s that business outcome that I mentioned? And then, on the flip side of that, what’s the feasibility or effort that it’s going to take to automate? How much of this can you truly automate? Is it 70%, 40%, 90%? And you start to actually build a business case. I think, a lot of times, you’re really trying to sit down and figure out what is that business case? And then, you know, part of that, how long is it through? There’s not a right or wrong answer. Some try to have a really big business case first. Others look for really small business cases, and they want that frequency, that kind of momentum. Again, not a right or wrong model. But you know, certainly people look at it from different approaches.
Tom Wilde
This question is a little bit more technical; let me paraphrase a bit for our discussion here. The question is around challenges with training data as it relates to artificial intelligence, and I think the questioner has, you know, seen projects fail on that basis in the past. How have you guys approached this sort of training data problem as it relates to AI? And I can certainly weigh in on that as well.
Don Sweeney
I’m sure you’re gonna weigh in on this one, Tom. So you absolutely need to make sure you have a good data set that you are using for training purposes. And you want to make sure that it is as broad as possible. When you first mentioned Indico, you were talking about kind of a template-less driven model. There are certainly organizations where the old school way of doing some of this is, “I want 10 copies from each vendor, and I’m going to train it by vendor,” and you’re, in essence, training it in a template-based model. Now, you really want as many diverse options as possible, because you’re now training it for the fields and extrapolating it more from a field-based model. So you really want a high volume. I mean, call it 500. I’m sure you’re gonna say it could be significantly more than that, depending on the use case, of course, but you do want something like 500 items that are pretty diverse that you can use to train that model.
Tom Wilde
Yeah, I think, you know, in the case of Indico, one of the big breakthroughs we think we brought to the market is this: historically with machine learning, you might need 100,000 labeled samples to get any kind of efficacy. With Indico, you’re in that sort of 200 to 500 labeled samples range to build a custom model to interpret a particular type of document, be it, you know, a type of purchase order or a type of contract, with a specific sort of structure that you’re hoping to extract from it. I think the corollary to that also is, we’ve seen a fair amount of questions more recently around bias when it comes to AI. And I don’t mean specifically just bias in terms of social bias, but a model biased to make one type of decision versus another type of decision. And so I think what we always preach is, you know, AI is still very much an intelligent parrot, is the way I like to describe it, right? It’ll parrot back to you what it is… And so if you train it with biased data, for example, biased towards a certain flavor of document types versus another, it will tend to, you know, predict and lean towards the majority of the training data you’ve given it. And so you need two things there. You need a diversity of training data, and you need the ability to explain how these models are working, you know, with an interface that gives you that level of detail, which again is something that we are strong believers in. This notion that AI has to be a black box is something that we kind of reject, and we don’t think, ultimately, the enterprise can stomach that. I don’t know if you’ve experienced some of that more recently with AI, if you’ve had customers ask questions around governance and explainability?
Don Sweeney
Yeah, absolutely. No question. Like you said, AI is really a parrot. It’s going to do exactly what you asked it to do, and it’s going to do it over and over and over again, and, you know, kind of learn based on what you’re telling it are good results. So you really want to make sure you’ve got a diverse set of data there.
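The point about diverse training data can be sketched as a simple pre-training check. This is an illustrative sketch only, not Indico's or Ashling's actual method: the function names and the 50% majority threshold are assumptions, and the 200-sample floor merely echoes the low end of the "200 to 500" range mentioned above.

```python
from collections import Counter


def majority_share(source_labels):
    """Return the most common source (e.g. vendor) and its share of the set."""
    if not source_labels:
        raise ValueError("source_labels must not be empty")
    label, count = Counter(source_labels).most_common(1)[0]
    return label, count / len(source_labels)


def looks_diverse(source_labels, max_share=0.5, min_samples=200):
    """Flag training sets that are too small or dominated by one source.

    max_share and min_samples are assumed thresholds for illustration;
    the right values depend on the use case.
    """
    _, share = majority_share(source_labels)
    return len(source_labels) >= min_samples and share <= max_share


# Hypothetical set: one vendor supplies 150 of 200 labeled invoices,
# so a model trained on it would lean towards that vendor's layout.
skewed = ["vendor_a"] * 150 + ["vendor_b"] * 30 + ["vendor_c"] * 20
print(majority_share(skewed), looks_diverse(skewed))
```

A check like this only measures balance across known sources. It says nothing about explainability, which still needs tooling that shows how the model reached a given prediction.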
Tom Wilde
Yeah, I think that AI governance, it’s called different things, is one of the bigger trends for, you know, 2021 and going forward. What you’re seeing is the general public as well asking, you know, things like, how does YouTube make recommendations around videos? How does a Tesla make its decisions around self-driving? You know, as a society, understanding AI explainability is critical, and certainly in regulated industries like finance and accounting and insurance, that level of transparency is required, right, because of the regulatory impacts that come into play when you’re dealing with automation. Well, I think we just have one minute left here.
So maybe lastly, any broad lessons learned from these kinds of technologies? If you’re going to give, you know, your wrap-up, what are your three bullets in terms of takeaways, in terms of setting yourself up for success, that you would give folks on the call, based on your experience with the unstructured category?
Don Sweeney
Yeah, so certainly understand what the business value is, hopefully, that came across pretty loud and clear in this conversation, make sure you have a business objective before you do it just for the sake of doing it. Make sure you think about specialists when it comes to the areas that you need specialists in that technology or toolset. And unstructured is certainly an example of that. And then make sure you’ve got you know, an understanding of a good use case, good test data, and kind of really think through the process of knowing what success is. Right. So I think that’ll be the wrap here for me.
Tom Wilde
From our perspective as a vendor, I think, you know, working with Ashling Partners, because they’re seeing these challenges over and over again, your learning curve gets to basically jump to the very bottom, because you get to take advantage of all the great learnings that Ashling has had in the market on how to deploy these things and how to be successful. Well, it’s been great, Don. I really appreciate the chat here. Great level of detail on tackling this emerging category, which I think offers such a tremendous amount of potential value in the procurement arena.
Don Sweeney
Yeah, always great speaking with you Tom.
Tom Wilde
Great. All right, Marie, we’ll turn it back to you.
Marie DiTipani
Thank you guys so much for sharing that. I think, you know, a lot of it has flowed into the conversations we’ve had today and actually our session up next, where we’re looking at, you know, what does the technology-centric procurement team look like? We think data is going to be, you know, a big leading effort forward in making procurement a catalyst for change and innovation, and it will continue to help us be strategic sourcers. So Don, Tom, thank you so much for taking the time to have that chat with us.
Don Sweeney
Thanks for having us.