

Getting Your Data AI-Ready: Preparing JD Edwards Master Data for Success

September 18th, 2025

39 min read

By Nate Bushfield


This session, led by Manuel Neyra and Mo Shujaat from ERP Suites, focuses on the critical role of data readiness in AI projects. The discussion emphasizes that AI success relies not only on clean data but also on having the right kind of data, covering good and bad scenarios alike. Key challenges include stale, incomplete, inconsistent, biased, sparse, or sensitive data, and the importance of ongoing governance. This session highlights practical strategies such as orchestrations, third-party tools, and synthetic data creation to improve data quality. They also outline ERP Suites’ structured AI journey framework, guiding organizations through alignment, assessment, data cleanup, governance, and implementation to ensure sustainable value from AI investments.


Table of Contents

  1. Welcome
  2. Why Data Readiness Matters for AI
  3. Challenges When Data Isn’t Prepared
  4. Common Data Issues
  5. Data Biases and Diversity
  6. Addressing Data Challenges
  7. Preparing Data for AI Projects
  8. Training AI Models and Iterative Data Refinement
  9. Strategies for Data Readiness and Cleanup
  10. Handling Inconsistent and Sensitive Data
  11. Data Biases, Diversity, and Sparsity
  12. Ongoing Data Maintenance and Governance
  13. ERP Suites AI Journey
  14. Closing Remarks and Q&A

Transcript

Welcome

Hello everyone, and good afternoon, good evening, or, if you're somewhere on the West Coast, good morning. Welcome to the second day of AI Week, hosted by ERP Suites. We had a fantastic day yesterday, and this morning has been full of great content, great discussions, and presentations. I'm happy to be here with my colleague Mo Shujaat. We're going to talk about data. We've had a lot of discussions around AI yesterday and today, with more to come this afternoon and tomorrow, and there have been discussions around making sure AI delivers value: that if you implement an AI solution, you're able to realize its value through the insights it gives you, the better decision making it enables, the efficiencies it drives, and the innovation it supports. And the fundamental component of AI, beyond the technology, beyond all the techy topics that Mo and I talk about all the time, the neural networks and the algorithms being built and the large language models that need to get trained, is not to be forgotten: data. It's fundamental. If you're using AI for any of those purposes, or others I didn't mention, then the quality of your data, the amount of data you have, the state of the data, the aging of the data, to mention just a few of the parameters, all of it is relevant. So you've got to make sure that data is up to snuff, if I may say it that way, so that you get the expected results and the corresponding expected benefits.

That's what today's session is about. I think you're really going to like it. Mo has put together a great presentation covering a variety of angles that need to be considered, some of which you may have thought about, some of which you may not have. I'll hop in periodically during the presentation, because this is an area near and dear to Mo's heart and mine; we talk about this topic with many customers. So without further ado, I'd like to hand the baton over to my friend and colleague Mo Shujaat, VP of Advisory and Applications at ERP Suites. Over to you, sir.

Awesome. Can you guys hear me and see me okay? Yes, we can. Great.

So first of all, thank you, everybody, for joining us, and thanks, Manuel, for that great introduction. I'll also apologize in advance: this is not sponsored by Chick-fil-A for any reason, but my wife was kind enough to get me some this morning, so I will be sipping on my Chick-fil-A drink throughout the session. Before we start, I've also created a poll for everybody; please answer it if you wouldn't mind.

I really love this topic, and I'm really excited about it, because with AI we tend to get lost in the technology and the glamour and the glitz. Manuel and I talk about this often, usually when I'm lifting in the gym and calling him at 7 PM to rant about the random things on my mind, and one of those things always ends up being that we don't talk enough about the readiness of data for AI. It's so critical. So, many of the things we're going to talk about today come down to this: we're not just talking about clean data. A lot of people say, "Oh, our data is really clean," or, "Oh man, our data is really dirty." Getting your data ready for AI doesn't just mean having clean or dirty data. If you think about it, clean is a very subjective word; what does it really mean? Everyone is going to have a slightly different definition. So we're going to get really, really specific about your data. Oftentimes people forget this element: it's not just whether you have clean or unclean data, it's whether you have the right kind of data. Often it's not just about having data for the right scenarios; you also need data about the bad scenarios, especially if you're going to try to predict them. You can't teach AI about all the good scenarios and expect it to know about the bad scenarios you wanted it to predict. And the last thing, probably the key thing I want you all to take away from this, and why I really wanted people to answer the poll (I'll leave it up for a little while), is that you're not alone. A lot of people go through a very similar data journey, and a lot of people are still in that boat where they're just not sure what's going on with their data. So we'll talk about a lot of specifics. That being said, as you get time, please do answer the poll.


Why Data Readiness Matters for AI

So we're going to talk about some very specific challenges that come with the data today. Again, I'll emphasize: we're not just talking about getting your data cleaned up. We're going to talk about a lot of different elements of it. So let's dive in. Why does it really matter? It matters because when you put data inside these AI models and AI skills, those models and skills are going to learn, extrapolate, and predict based on the data you provide. I'll often refer to AI as an intern rather than an assistant, because if you think about it, the way we teach interns, and the way we teach people who are younger, is by putting them through experiences and scenarios that teach them things. It's the same with AI. If you don't feed it the right kind of data, number one, it's garbage in, garbage out. The data you train and prep AI on is the data the model will base its results on. So where you're providing data that's incomplete, or data that has biases in it, or data that is stale, you're going to get outputs and outcomes that are exactly that: biased, or stale, or incomplete. Without the right kind of data, and again, we're not talking about just clean data, we're talking about all the right kinds of data, you're going to create errors in your model. You're also more likely to amplify any biases that may be in your model. Think about it: say you're trying to predict orders at risk, or customers at risk. If you're not providing the kinds of data from which AI can pull out insights and extrapolate to recognize the relevant patterns, you're not going to be able to find those biases when it goes into production. So there are a lot of challenges that come with this. And the biggest thing, as we always say about MRP or anything else: garbage in, garbage out. If you don't feed the right kinds of data to these AI models, you're not going to get the right kinds of outputs from them.


Challenges When Data Isn’t Prepared

Manuel, anything you'd like to add about the challenges that come when customers don't have their data prepped? Oh, if you're speaking, you're on mute, man.

Thank you, sir. Yeah, one topic that I'll tease right now, because I think you're going to cover it later in the presentation, but it's an important one: specific data. You talked about having specific data, the right amount of data. If we think about it, you might immediately say, "No, we have sufficient data. What do you mean?" But do we have diverse data? What I was thinking about is this: you have a set of data, and people at the company have built up tribal knowledge that a particular record configured a certain way can mean one of three, four, five, X number of things, and they just know, by some dynamic, that later downstream they're going to treat it differently. Well, that's a problem for AI, because it's going to look through the data, and if the distinguishing characteristic that differentiates what those X number of records represent is far enough down the line, there's a chance that, just as for a human who hasn't been trained and brought into the inner circle to understand why it's handled that way, it will look homogeneous, when really you should have distinct records to differentiate things that are truly different. I've spoken to customers in those situations who say, "Well, we just handle it manually." Those kinds of things will bite you, because if you run an AI model over that data, you may get unexpected results. Does that make sense?

Yeah. And I can't stress this enough; I think this is why it's so important to me. We talk about the value of AI, and we have to be really careful here, because it's still a new technology and leaders are often a little skeptical. Every time I talk to a customer, they're always like, "Hey, we want to make the investment, but we have to be able to demonstrate value quickly and get a quick win." And I can tell you that the number one reason those quick wins don't demonstrate the value we're expecting is that the data going into them isn't the right kind. That is why we stress the importance of this. Later on, we'll talk about our AI journey and what we do, and sometimes there's a little bit of a disconnect when we're talking to customers. They're like, "Hey, I thought we were talking about AI. Why are you focusing so much on our data?" It's because data is the foundation. It's the bedrock of it all. So, with that being said, we're going to get a little more specific.

I apologize here: with some of the terminology I use, I try to focus more on the business side, but we have to get a little specific when it comes to data. And it's really key that we get specific, because, like I said earlier, clean and dirty are subjective words, and we can't focus on clean and dirty. We have to focus on very specific issues that happen with the data. So let's talk about unclean, unprepped data. I like to categorize the problems into about eight categories where the problems really happen, though number eight is really a category of its own, so we'll go quickly down the list. The thing is, oftentimes people will think, "Oh, the problem is our data is bad." And often we don't find that your data is bad, that your data is incorrect. It's not often that we find a customer labeled as being in Ohio that's really meant to be in Georgia. I'm not saying it doesn't happen; I'm saying it isn't the most common occurrence. The rest of the issues we're about to talk about happen a lot more often.


Common Data Issues

Especially my favorite one: our data is stale. What does that mean? This is, I think, the most common culprit, our usual suspect. It's lead times that haven't been updated in three years. It's reorder points that haven't been updated. It's "oh, we really haven't recategorized our customers in the last five years; our category codes have been the same for 15 years." That ends up being a really, really big problem. Anything you want to add on that, Manuel?

No, I think that's spot on. There's a plethora of examples of how data can quickly get stale if you don't keep an eye on it. One thing I'll point out here, and not to be on my soapbox too much, is segmentation: customer and item segmentation. I know we have a limit of 30 category codes, and, don't worry, I'll add more category codes to my many wants for the Oracle product team, but maintaining and updating the strategy behind those category codes is very, very important. You have to make sure those items and those customers are staying up to date.

The other one we see is that data is incomplete. Oftentimes we'll see some parts of the data there but not others. For example, we'll have lead times but missing suppliers, or we'll have suppliers for some items and not for others, or we'll have lead times for only some items.
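
As a concrete illustration of this kind of assessment, here is a minimal sketch that flags stale and incomplete records in an item master extract. The file name and column names are hypothetical assumptions; adjust them to whatever your actual extract looks like:

```python
# Minimal sketch: flag stale and incomplete item master records.
# Assumes a hypothetical CSV extract with columns item_id, supplier_id,
# lead_time_days, and last_updated; adjust names to your actual extract.
from datetime import datetime, timedelta

import pandas as pd

items = pd.read_csv("item_master_extract.csv", parse_dates=["last_updated"])

stale_cutoff = datetime.now() - timedelta(days=3 * 365)  # "not touched in 3 years"

stale = items[items["last_updated"] < stale_cutoff]
incomplete = items[items["supplier_id"].isna() | items["lead_time_days"].isna()]

print(f"{len(stale)} stale records, {len(incomplete)} incomplete records")
stale.to_csv("stale_items.csv", index=False)          # review and update these first
incomplete.to_csv("incomplete_items.csv", index=False)
```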

We often also find that data is inconsistent. In some places "Road" is "RD," in some places it's "R O A D," in some places it's "Rd." We find those inconsistencies in data a lot. The typical culprits here are UDC values, where the UDC is sometimes doubled up in certain places. Units of measure are another culprit: some systems will store the unit of measure as "feet" while others store it as "FT," and some customers will use linear feet while others just use feet. And addresses: addresses are a really big area for inconsistent data.

Another issue is that data can be sensitive. Sometimes the data we have to use contains private information, and we have to make sure that information doesn't leak into these models. This is a big network of code and intelligence and probabilistic models all coming together to make these AI tools work, and you don't want to introduce something sensitive, because once it's in there, it's hard to really get it out. You want to make sure that private information is scrubbed, removed, and not part of it. And ideally, you anonymize it if you can.
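
As a small illustration of that kind of scrubbing, here is a minimal sketch that masks phone numbers and email addresses in free-text fields before they go anywhere near a model. The patterns are simple assumptions, not a complete PII solution:

```python
# Minimal sketch: mask obvious PII (emails, US-style phone numbers) in
# free-text fields before data leaves your environment. These regexes are
# simple assumptions, not a complete PII solution.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b")

def scrub(text: str) -> str:
    """Replace emails and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

note = "Call Jane at 513-555-0142 or jane.doe@example.com about the order."
print(scrub(note))  # -> "Call Jane at [PHONE] or [EMAIL] about the order."
```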


Data Biases and Diversity

The next two we're going to talk about are a little more unique. These come into play when we're pulling very scenario-based, outcome-based insights, and this is where data biases and data diversity really start to matter. These are concepts we don't typically think about in our day-to-day; most people don't think about data biases and data diversity outside of social issues and things like that. But they're very important, in public AI systems as well as in business AI systems.

You don't want data that's specifically biased toward one outcome or the other. You want to provide a healthy mix of data that's representative of the reality you want to pull the insights out of. What does that mean? If you're trying to create a model that figures out which orders are at risk of not shipping on time, you don't want to provide the model only with data on orders that were always delivered on time and in full. You want to provide data on both.

Now, it also skews the other way: you don't want to provide only orders that were delivered late and partially, because that will skew the bias in the other direction. So there's a balance here. The other challenge we see is that data is not diverse: you're not taking a wide breadth of the data you need and introducing all of those scenarios. Sometimes people tend to get a little too specific.

One of my examples: when you're looking at pricing, to do analysis around pricing optimization, pricing segmentation, or pricing adherence, you don't want to introduce pricing for just one type of customer or one region. You want pricing for a variety of different customers, so the model can really see what criteria it should be looking for, and which points of context in your pricing model really matter.

And the last problem, which I think is really key, is that your data is just sparse: you simply don't have enough of it. That happens sometimes with certain types of data, usually around orders and things like that. If you're trying to be more predictive about specific types of purchase orders, or specific types of orders, or specific types of pricing, and you don't have data around those specific things, it's hard. We'll talk about how we address each of these and what we really do when we encounter these problems.


Addressing Data Challenges

But before I move forward, Manuel, I want to open it up to you in case you want to add anything here.

No, all good points. Sparse data, of course, is a challenge sometimes, because if you have a small pool, you might not have a statistically representative amount of data, and that doesn't feel good: you'd be making decisions on a partial view of the world. The private information one also hits home for everyone, because we've probably all had examples, or heard about them, in our lives: in a note that wasn't intended to be private, you include someone's phone number. That's a problem; now that's exposed.

And the last thing I want to say: what Mo has talked about is great, and it applies to JD Edwards, but it's not limited to JD Edwards. As Mo and I have spoken to customers, it's not just customers saying, "Hey, we're looking at AI with JD Edwards." They're looking at AI for different systems, as we know: whether it's CRM, or security systems, or others, there are AI components being incorporated into those solutions too, where you have to take these same considerations into account.

Yeah. I think about some of our agentic solutions that we talk about with customers and build for customers. It's never just "can you train it on JD Edwards data?" It's always: can you train it on JD Edwards data, on our product data, on our customers' other data, on CRM data? That was a big learning point for us too, because we're JD Edwards nerds. We set out to do this with JD Edwards and realized, wow, we're going to have to learn how to incorporate a very large universe into this. And it's been a really cool experience for us as we continue to go through it with customers.

So what does this all mean? What do you actually do with all this? Typically, there are certain types of data issues you want to make sure you're addressing before you even start building out the AI models. I'm not saying nothing happens with AI before you do this, but there's really not much precluding you from doing this type of data cleanup even now. We have some customers we work with where we're actually helping them develop a master data governance model right now, so they can get prepared to do things like AI, and things like more accurate supply chain planning.


Preparing Data for AI Projects

Typically, one of the things you want to start doing is identifying the problems and gaps in your data today. You want to find where your data is inconsistent. You want to find where your data is stale. And you want to start the journey of: can we remove any incorrect data? Can we start to fill in gaps that are missing? Can we do an initiative around looking at supplier lead times? Typically, when we start these AI projects, this is what we focus on first. If you want to start an AI project around building materials, doing something with BOMs and routings, one of our first questions is going to be: okay, how accurate are your BOMs and routings?

And oftentimes that five Ws type of analysis leads us down a bit of a rabbit hole, where we say, "Okay, there's an upstream data problem we have to address first, because if we don't, you're not going to get the value out of that AI investment that you're hoping for." And that's the last thing you want: some data issue that prevents you from being able to go back and show, "Hey, here's the ROI from this AI investment we made," so that you can go and do more AI investments.

So really, any sort of data cleanliness activities: standardizing your data, normalizing it, updating stale data, filling in missing data. These are the kinds of things you can start doing now. There's data cleanup we'll emphasize once we start doing AI model building and AI tool building as well, but that typically requires some feedback, so it's a little harder to do in advance. It's hard to address biases in your data until you start actually looking at your model and what you want it to do. But it's pretty easy to determine, say, whether you have any customers with missing category codes that shouldn't have any.

We'll talk through how you actually address some of these issues specifically as we go, and most of my answers for addressing data cleanup problems, as always, are orchestrations. So, spoiler alert for everybody out there.

Manuel, anything you want to add here before we talk about data cleanliness during the training of your AI model?

No, the only thing I'll say is that I was waiting for the O word: orchestration. This may be a record. Yeah, this is almost a record for the longest webinar presentation I've given without having said orchestration. We're about 30 minutes in. I just want you all to know it took a lot. It's a milestone. Here we go.


Training AI Models and Iterative Data Refinement

So there are going to be some types of data gaps that you have to address as you're training and building the model. You can start on these earlier, but these are data gaps that require refinement and evolution. These are things like biases and diversity. Do I have enough data? Am I getting the right level of generalization? Am I getting overfitting or underfitting? Am I getting misclassifications? This is the iterative part of training your model. Don't think of AI projects or AI journeys as one linear thing.

I came up in the ERP world, having done both waterfall and agile, and you want to think of this as what I like to call a "waggle" project. PMs make fun of me for using that, but it's a mixed methodology. We do a lot of assessing and identifying initially. We do a big iterative cleanup phase. We do a big iterative training phase. And then we do more of a go-live, cutover type of activity. The building and training phase is very iterative, because when you first build out that model, it's like a little baby AI: it has to learn, it has to grow, it has to understand, and you have to refine it. You have to ask: how do I tweak the data? Do I introduce some oversampling or undersampling to address these things? Do I have to dummy up some data?

We'll talk through all of those in specifics, but a lot of those insights start coming as you're building and training your models, because it's hard for humans to really judge: have I removed the biases? Do I have enough diversity? Do I have enough data? Our brains are amazing pattern recognition machines; they extrapolate patterns really, really quickly. And that's what we're building: a pattern recognition machine. Unfortunately, it's not quite as smart as a brain. Your brain has something like 4,000 Nvidia GPUs behind it, and I don't even know if our AI has that. Don't quote me on those numbers. But the point is that we have to teach it the pattern recognition that a lot of us understand intuitively. It's very iterative. It's not going to happen in one go.

Anything to add on that one?

Just one thing. Like you said, it's like a baby, or a kid growing up. If you have children, you know we expose them to certain situations, and they get exposed to positive situations, to your point. But they also need to experience some of the "nos," the things that are off limits. It's the same kind of thing for an AI model. If you just show it the positive side, it's not going to readily know about the consequences. If we're looking for an AI model to make suggestions to our business, say in the sales process, and it's just getting fed all the wins and none of the losses, it's not going to be able to factor those in to make a better recommendation. So that's super important.

Somewhere, my wife is going to listen to me delivering this webinar and say, "How come, when it comes to recognizing the pattern that the laundry needs to be done, you have zero Nvidia GPUs?" I'm just waiting for that one to come. Yeah, that's going to be an interesting one. Mo, you'll have to tell me. Hold on.


Strategies for Data Readiness and Cleanup

So, what do you do about all this? What do you actually do when you have to address this situation? You have to get your data ready. And one of my favorite things: sometimes there's just no substitute for rolling up your sleeves and doing the hard work. That is sometimes what it comes down to with data readiness. As we go through these, we're going to talk about the data prep and cleanliness exercises you can do now. We're going to talk about my favorite thing, which is how you're going to use orchestrations to help with your data problems. And lastly, and this is very meta of me, how you're going to use AI to help solve your data problems so your data is ready for AI. Wow.

Got it, got it. And please, at any point, if you have questions or commentary, or would like to get in on the conversation, drop your comments below. Our poll is also still open, so I'd still love to hear responses from everybody. I really want to know what your data is like right now.

So how do we actually do this? The thing is, it takes a good amount of assessing and understanding where your work is. We're going to talk about the AI journey in a little bit, but one of the things we do is a data readiness assessment. It's actually one of the first things we do, right, Manuel? Almost right at the beginning, right after alignment day.

Yep, and then the use cases as well. Yeah. So that's the first thing we do: we want you to identify your use case, what it is you're going to do with AI, and then we start working with you to identify where your data gaps and challenges are, so you can get that use case out of AI.

If your data is incorrect, this is one where there really is just no substitute for going and cleaning it up. There's a combination of third-party tools; there are a lot of great APIs nowadays. And AI and orchestrations can help here. For example, if you're missing customer revenue data, or data around products, or if there's master data that's incorrect, you have to use a combination of orchestrations, AI, and manual effort.

One of my favorite examples is ZIP codes. There's a really great ZIP code cleanup tool that's not that expensive, and you can pair it with a very simple orchestration. What it will do is clean up all of your ZIP codes, and then, not only that, it will ensure that going forward, every time you enter an address, the ZIP code gets fetched automatically; you won't have to type it in. You can also use it to continuously monitor ZIP codes and keep them clean. That's an example of where you can use third-party tools and APIs to keep this data really clean going forward and make sure there's no incorrect data in there. This is really helpful for addresses and product information, and a lot of times for product dimension data. If, for example, you're having a hard time keeping product dimension data up to date, there are AI-enabled tools you can buy now that are getting very accurate at product dimensioning, and there are also older, more legacy-type tools, like a Cubiscan, that help keep your data up to date.
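
To make the pattern concrete, here is a minimal sketch of the kind of lookup such an orchestration would wrap. The endpoint, API key, and response fields are hypothetical stand-ins for whichever address-validation service you license:

```python
# Minimal sketch of a ZIP code lookup an orchestration could wrap.
# The endpoint, API key, and field names below are hypothetical; substitute
# your licensed address-validation service.
import requests

API_URL = "https://api.example-zip-service.com/v1/lookup"  # hypothetical
API_KEY = "your-api-key"

def fetch_zip(street: str, city: str, state: str) -> str | None:
    """Return the validated ZIP code for an address, or None if not found."""
    resp = requests.get(
        API_URL,
        params={"street": street, "city": city, "state": state},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("zip")  # hypothetical response field

# An orchestration would call this on address entry, or on a scheduled
# sweep over Address Book records, writing corrections back.
print(fetch_zip("5445 Creek Rd", "Cincinnati", "OH"))
```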

If you have specs or drawings in a PLM or CAD system, you can integrate those with JD Edwards and bring that data in. There are a lot of tools here that allow you to do that. And don't mistake any of this for "oh, I have to build all these expensive integrations." If you need to, and there's value to your business, you should. But you could also just export the data, do mass data updates in Excel, and bring them back in with orchestrations. Orchestrations can very much help with mass and targeted data cleanup.

Lastly, if your data is stale or incomplete, the first thing you want to do is assess it to find where the data is stale or incomplete. Usually this is a combination of conversations as well as system reports and system data we can pull. Same thing here: orchestrations can really help fill in a lot of the blanks in that data.

Typically, when I have data that's stale or incomplete, a couple of hours in an Excel sheet, and then an orchestration to take all that data and ingest it back, usually gets the data pretty well cleaned up and updated. It may take a round or two, but we've gone through this with many a customer and helped get that data cleaned up with them.
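
As an illustration of that round trip, here is a minimal sketch that reads a corrected spreadsheet export and posts each row back through a JD Edwards orchestration over the AIS server's REST interface. The orchestration name, its inputs, and the CSV columns are hypothetical, and the exact endpoint path and authentication can vary by tools release:

```python
# Minimal sketch: push corrected lead times back into JDE via an orchestration.
# The orchestration name ("UpdateItemLeadTime"), its inputs, and the CSV
# columns are hypothetical; the AIS endpoint path and auth scheme may vary
# by tools release, so check your AIS server documentation.
import csv

import requests

AIS_URL = "https://ais.example.com/jderest/v3/orchestrator/UpdateItemLeadTime"
AUTH = ("JDE_USER", "JDE_PASSWORD")  # or token-based, per your AIS config

with open("corrected_lead_times.csv", newline="") as f:
    for row in csv.DictReader(f):  # columns: item_number, branch, lead_time
        payload = {
            "ItemNumber": row["item_number"],
            "BranchPlant": row["branch"],
            "LeadTimeDays": row["lead_time"],
        }
        resp = requests.post(AIS_URL, json=payload, auth=AUTH, timeout=30)
        resp.raise_for_status()
        print(f"Updated {row['item_number']}: {resp.status_code}")
```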


Handling Inconsistent and Sensitive Data

When data is inconsistent, one of the challenges is that, historically, we would use things like fuzzy matching to find data that's inconsistent but similar. Now AI tools are actually getting very good at more advanced fuzzy matching; they're very good at identifying similar patterns because of the way the technology works. So utilizing AI tools is actually one of the best ways to find inconsistent data.
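
For smaller data sets, classic fuzzy matching is still easy to run yourself before reaching for an AI tool. Here is a minimal sketch using only Python's standard library, with made-up address strings:

```python
# Minimal sketch: classic fuzzy matching to surface inconsistent values.
# Uses only the standard library; the sample strings are made up.
from difflib import SequenceMatcher
from itertools import combinations

addresses = [
    "123 Creek Road",
    "123 Creek Rd.",
    "123 CREEK RD",
    "500 Main Street",
]

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Flag pairs that are probably the same value written inconsistently.
for a, b in combinations(addresses, 2):
    score = similarity(a, b)
    if score > 0.8:
        print(f"Possible duplicates ({score:.2f}): {a!r} <-> {b!r}")
```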

Now, again, you want to be careful here, because you don't want to introduce sensitive data, or data that's specific to your business, to these AI tools in the course of doing data cleanup. So make sure you're always following your AI policies when you do that. But you would be shocked at how good many of these large language models are now at cleaning up addresses, ZIP codes, street names, these kinds of things.

Private data, I typically always recommend, is handled manually: either systemically, or through a sysadmin, or through a manual effort, because this is typically very sensitive data. You don't want data dumps of it floating around. You don't want to use an AI tool on it. You don't want that kind of data parsed out and stored in log files. You want to make sure it's treated as very sensitive and handled that way. That's also because many industries have regulatory or compliance requirements around it, especially if you're operating in European countries, with GDPR and the like.


Data Biases, Diversity, and Sparsity

For our last two problems, biases and diversity: you have to do a fairly in-depth assessment here if you want to tackle them beforehand. This is why it's typically easier to discover and work on these as you're building the model. There is some initial pre-work you can do, some assessments and some analysis, but what I've always found is that that assessment is itself going to be biased, because we're only looking for the things we know to look for.

Once you put the data in the model, the probabilistic engines will really find all the different things you don't. So there's a balance between how much legwork you do beforehand and how much you do as you're building the models themselves. To address these, there are some algorithms that will help, but typically you have to address them very strategically. You sometimes have to over-represent or under-represent scenarios to provide enough data, or to address a lack of diversity in the data. You sometimes have to weight certain scenarios lower and others higher, to ensure that biases are not impacting the outcomes you want.
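
To make those two levers concrete, here is a minimal sketch, assuming a hypothetical order-history extract with numeric feature columns and a 0/1 "late" label:

```python
# Minimal sketch: two ways to counter class imbalance when training a
# "late order" classifier. Assumes a hypothetical CSV extract with numeric
# feature columns and a 0/1 "late" label.
import pandas as pd
from sklearn.linear_model import LogisticRegression

orders = pd.read_csv("order_history.csv")  # hypothetical extract
X, y = orders.drop(columns=["late"]), orders["late"]

# Lever 1: weight the rare class higher instead of touching the data.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X, y)

# Lever 2: oversample the rare class so both outcomes are well represented.
late = orders[orders["late"] == 1]
on_time = orders[orders["late"] == 0]
balanced = pd.concat(
    [on_time, late.sample(len(on_time), replace=True, random_state=0)]
)
print(balanced["late"].value_counts())  # now roughly 50/50
```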

And lastly, if your problem is not having enough data, one of the most helpful tools here is using orchestrations to create test data where enough doesn't exist. The nice thing is, because you have the ability to use orchestrations, you can take real data you already have and create iterative copies of it; you don't have to create new data from scratch. If you're thinking about orders, it's very easy to take an order, copy it multiple times, and introduce logic schemes where you change out items, change out quantities, things like that. Or you can take historical data, apply certain patterns, change patterns on dates, and get that data in there, as in the sketch below.
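
Here is a minimal sketch of that copy-and-vary idea in plain Python, with made-up order fields; an orchestration could apply the same scheme directly against real order tables:

```python
# Minimal sketch: synthesize extra order records by copying real ones and
# varying items, quantities, and dates. Field names are made up; an
# orchestration could apply the same scheme against real order tables.
import random
from copy import deepcopy
from datetime import date, timedelta

real_order = {
    "order_id": "SO-1001",
    "item": "WIDGET-A",
    "quantity": 10,
    "request_date": date(2025, 6, 1),
}

items = ["WIDGET-A", "WIDGET-B", "WIDGET-C"]

def synthesize(base: dict, n: int) -> list[dict]:
    """Create n synthetic variants of a real order."""
    variants = []
    for i in range(n):
        order = deepcopy(base)
        order["order_id"] = f"{base['order_id']}-SYN{i}"
        order["item"] = random.choice(items)
        order["quantity"] = max(1, base["quantity"] + random.randint(-5, 5))
        order["request_date"] = base["request_date"] + timedelta(
            days=random.randint(-30, 30)
        )
        variants.append(order)
    return variants

for o in synthesize(real_order, 3):
    print(o)
```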

So if you do need additional data, orchestrations are a big help. One of the ways we address this is called creating synthetic data: that's us making the data. Sometimes, instead, you have to take data from outside sources. You may have to get data from an agency or a company that maintains or manages that data set and has it available for sale. You can get that data from government agencies, from other places, and sometimes from companies themselves. We've talked a lot about sentiment analysis this week; I mention it in a lot of my use cases. If you start to think about sentiment analysis, that's data you can go out, get, and build on.

So those are the different ways we mitigate and help address these data problems. But going back to what was on my previous slide: oftentimes, with getting your data ready, there really is no substitute for the hard work. It just takes diving into it and cleaning it up. There's no silver bullet, no magic AI tool that's going to clean up your data. We have AI tools that will help clean up your data, and AI tools that will help keep your data cleaner. But as Manuel and I will say throughout this week, AI is not meant to be a human replacement; it's meant to be a human augmentation. You are still needed to help and aid in all those things.

Manuel, anything you'd like to add? Spot on, sir. I'm good.


Ongoing Data Maintenance and Governance

So lastly, the main thing we want to say is: once you have this data in the state it needs to be in, unfortunately, you don't just get to rest. You have to keep maintaining it. You have to keep cleaning it up. It's going to constantly change and evolve. Your data will evolve and change, as it should. And as it changes and evolves, your AI models and AI tools also have to grow and adapt. That's where we have capabilities such as RAG, capabilities such as generative AI, and many of the things that Manuel and his team have covered yesterday and will cover through the rest of this week:

how AI learns from that data, how it grows, how it gets better. That's a very important concept, because what it also means is that these challenges don't go away. It's not like once you get it all cleaned up, your data just stays clean. It's not like once you get rid of all the stale data and update the lead times, you're done; two years later, you're still going to have stale lead times. So the best thing, as you're evolving with AI, is to also start thinking about implementing master data governance principles and frameworks in your business, and implementing best business practices around master data management.

And best business practices around master data governance: making sure you have the right controls in place, the right access controls, so the right people are making changes. All of those things matter. You want to make sure that, as you're thinking about and doing this data cleanup, the effort doesn't go to waste. Think about it comprehensively, because that ends up ensuring success long term. You're not going to clean it up once and forget about it. You have to keep maintaining it. That's absolutely key.

Very similar to us working out, right, Mo? We work hard, we make some gains, and if we don't stay disciplined and keep working out, you know what happens: we start from scratch. You could be in that same situation if you don't do the upkeep and maintenance Mo talked about with your data. You're going to put that effort in, and it will be worthwhile, just like working out is worthwhile to keep you healthy. The time you invest to maintain your data will be worthwhile, because it will feed you insights and information that AI can harvest to drive efficiencies elsewhere.

Yeah. And the thing is, the longer you let those problems fester, so if you do go through a cleanup effort and then aren't able to maintain your data, you naturally start introducing drift and all those challenges back into the AI, and you start getting inconsistent results out of it over time. So in order to really maximize that ROI, to really get that value out of it, you have to make sure this foundation is set correctly. That's why, going back to what I said in the beginning, this is a very important concept: it really is the bedrock of whether you're going to realize the value of your AI investment or not.

So, that being said, I'm going to give a little bit of time to Manuel to talk about how we at ERP Suites help customers with this. This is our AI journey. Like I said, one of the first things we do here is a data workshop. Manuel will talk a little bit about how the AI journey helps customers navigate the various pitfalls that come with an AI project, or with just investing in AI, and how we've helped customers navigate through that, because it is a lot. There's no "click a subscribe button and now you get an AI tool that fixes all your problems." It's not that easy. It's not that simple. And we want to do you all the service of walking you through the foundational things that really make AI successful.


ERP Suites AI Journey

Absolutely. Thank you, Mo. I touched upon this briefly yesterday in the keynote; Mo and I talked about it right after Julie Holmes' keynote. The point is that adopting AI is very similar to adopting or implementing an ERP system or any other system: you need a methodology, a proven process, that allows you to get a successful implementation and realize the value you're looking for. AI is no different. As we've worked internally at ERP Suites and spoken to customers, all of the components we have here are relevant. It's a little bit like a polynomial equation: each term has its contribution to making sure that what you get out of the equation is a positive benefit for your business. So I'll go through this and highlight some of the key components, because it's important.

One is alignment day. Alignment day is really about understanding the company's objectives. What are the leaders of the company looking at in terms of direction and vision? What are they looking to accomplish? Where are there friction points where removing them could help achieve their goals better, or where they're not reaching their goals and need some assistance, with AI in the mix? That's a little bit of what alignment day is about. Then we go into a variety of different areas that make up the assessment phase. We get some understanding of the company, and then we do assessment and roadmapping: design thinking, where we have conversations about potential opportunities in the business where things could be better.

We do a design thinking workshop to really assess that. On the use case side, the goal is to find not just any use case, but the ones that are impactful. Like Mo mentioned, there's the relevancy of a use case and there's its impact, and there's a third component I would add: is it a fit for the company? And when I say fit, there's a cultural component to that. Will the organization culturally adopt that solution, or will users not really use it? That's what the use case roadmap is about: coming up with a list of use cases that will really help and that are aligned to your objectives.

The security workshop, summarized in a very short phrase, is: what is the CISO at your organization, or your security officer, looking for in terms of protections? Are they going to allow any data in? I would assume that most customers, if not all, are going to block data from getting out. But do you want ancillary information that will enrich the AI's capabilities, so it can provide better forecasts, for example financial forecasts or sales forecasts?

The data and infrastructure workshop is a little bit of what Mo just talked about in this session: looking at the data, seeing where the opportunities are, where assessments need to be done, and how to address them. And last but not least is coming up with a roadmap. We have the use cases, we have everything else we've produced, and we deliver a roadmap so you can understand all the different pieces.

One thing I failed to mention: what we're going through right now is our detailed AI journey. We have an abbreviated version of this for quick-win scenarios, where we still go through all of these steps, but instead of going broad across all areas, the customer tells us on alignment day, "Mo and Manuel, we want you to look at sales; we have a lot of friction there. Our quote-to-order signing process takes X number of days, or X number of hours, and we want to streamline that. We believe we can do that." Then the focus of all the areas I just discussed will be on those areas.

So just bear in mind, this is the detailed one. There is a streamlined one that is really focused but still covers these essential elements. Phase two, once you've come up with the use cases and the roadmap and identified the opportunities in the data, is addressing those components: the data cleansing Mo just talked about, but also security. Does security need to be tweaked to handle those scenarios? Have the policies been established? Because remember, it's not just JD Edwards security; it's AI security, and I would say more specifically, enterprise AI security constructs that need to be set up so that you're protected: you get the benefits of bringing data in without leaking data out.

And then, of course, process improvements. The third phase, once you have that foundation solid, is implementing and delivering the AI solution. That's the methodology. At each step of each phase you get deliverables: you get our findings, you get our recommendations, and then we work together on executing. With this, we've found that we're able to get customers not only a successful implementation, but one that's maintainable, and it mitigates the potential landmines people sometimes talk about with AI.


Closing Remarks and Q&A

Okay. So with that, go ahead, Mo. Do you have any follow-up comments? You know, it's always amazing to me: whenever we talk about the journey, if I talk about it at a conference or a user group, inevitably somebody will come up to me and say, "Can you talk to my CIO, or CFO, or C-something-O about that?" Because they're under the impression that AI is going to be really easy: "I can have it done by next week." And I always find that funny, because part of it is true. In some cases it is like that; Copilot is a magical tool, and it's really amazing what it can do for productivity in Word and Excel and those things.

But then there are some very specific use cases where you can't just implement it on day one. You have to go through and really understand the context of what's happening. What does a late order mean to your business? How does it impact your business? What constitutes whether an order was late, or whether somebody changed something, or whether the customer requested something else? All of those things make a big difference in how you actually get the value you're looking for.

So one of the main things I always talk about with the AI journey, and I'll say this if you attend any of my use case sessions, is that AI is not the solution to every problem, but it is a solution, a part of a solution, to many problems. And that's something very important for us to learn as solutioners: we've got a really great tool in our tool belt, and we need to learn how to use it appropriately.

So, that being said, I want to make sure we have some time for questions; we have about five minutes left. Thank you all for your time. Hopefully you're enjoying your sessions at AI Week. Please make sure you're providing feedback to our team on how you're liking these sessions and what kind of content we can provide to make them more enjoyable and engaging for you. And if you have any questions, drop them in the chat or the Q&A window below.

And lastly, if you are going to Blueprint, I will be presenting there with a customer, so please stop by our booth and attend our sessions. There are also some really great user group meetings coming up. If you haven't gone to a user group meeting, please go; it's a lot of fun, both during the day and in the evening. And please check out our new podcast, Not Your Grandpa's JD Edwards.

I'll just add: you can go to our website and find pretty much all of that information there, the AI journey, all of it, and how to get in touch with us as well. So, I don't know if there were any other questions.

All right. I know we had one other question that you were addressing, Mo. Are you taking that offline, or? No. Susan, that is exactly right: just throw data at it and see what happens. Yes, that's a good way to approach it. I like it. I just published the poll results, if you're interested.

And I saw a question, Will, around illegal characters. It really depends. If you could send me or chat me some information about the type of data you're trying to clean up: what is it like? Is it addresses? Are you talking about illegal characters like forward slashes and backslashes? Some systems don't like question marks; for example, in an API payload you can't just send a question mark, because a question mark is a very specific thing in a URL parameter. So you have to be very careful about where you're using question marks, backslashes, and forward slashes. I'm not sure if you mean those as illegal characters or if there's something else you're referring to, but I'm happy to jump into a quick Teams chat, or chat on the event platform, to walk through it.

I just need to understand a little more about what kind of data you're referring to. It says purchase orders. Oh, is that in the chat? Purchase orders and street addresses. So in street addresses, you have to watch out for things like periods and commas; certain systems don't like periods and commas. Cleaning these up is going to be use case specific. If we're going to do a total normalization exercise where we say, "Hey, we want to get rid of all sorts of illegal characters in PO items and descriptions," so any backslashes, any question marks, any periods, maybe some carriage returns, maybe some pipes or greater-than and less-than signs, then what you'd want to do is dump the data and run through it.
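
For the brute-force normalization case, here is a minimal sketch; the character set is an assumption, so tailor the pattern to what your downstream systems actually reject:

```python
# Minimal sketch: strip "illegal" characters from free-text fields such as
# PO descriptions or street addresses. The character set below is an
# assumption; tailor it to what your downstream systems actually reject.
import re

ILLEGAL = re.compile(r'[\\/?|<>\r\n]')  # backslash, slash, ?, pipe, <, >, CR/LF

def sanitize(text: str) -> str:
    """Remove illegal characters and collapse the leftover whitespace."""
    cleaned = ILLEGAL.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()

print(sanitize("123 Creek Rd.\\Suite 4?"))  # -> "123 Creek Rd. Suite 4"
```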

Normally, there used to be tools that would help do this; there were tools you could purchase that did this stuff really well. But honestly, now I'd tell you there are really great large language models that will do this really well for you, and that would probably be one of the best ways to approach it. If I were doing this right now, I would probably set up a company-specific large language model, Cohere or Meta's Llama models are probably going to be best for that, feed it the street addresses, provide it the normalization parameters you want, and the LLM will take care of it for you. LLMs don't have as much of a problem with illegal characters, because of the way they process text, but APIs and other systems do. So that's how I would approach it.

But what I'd like to do first is take a look at which purchase orders you're talking about, which street addresses, what's going on, what's in there, and give it a little bit of a skim. Because if it's really just 20 street addresses out of 200,000 that have something, it's probably easier to just do a quick find-and-replace in Excel.

Well, I'm good. I'll stick around for another two minutes in case anybody has any other questions. Otherwise, thank you all for joining, and please enjoy the rest of your sessions. I have two use case sessions later today, so please join them: one in an hour and one at 4:30 p.m.


Nate Bushfield

Video Strategist at ERP Suites