Realize Live 2023 – Wednesday
Last night I had a discussion with a friend about data, collecting data, and deriving meaning from data. Does data have any value if you can’t measure how much it improves a specific parameter? I’m personally of the view that you can collect as much data as you want, and you’re going to find out exactly what you knew through experience and instinct, well, almost always. Except that the act of collecting data has a tendency to annoy a lot of people. (Do we have stats on how annoyed people get by having their data collected?) And now you’ve made us remember endless user names and passwords, which are endlessly forgotten. Do we have stats on what percentage of the time some logins are forgotten? I would say if you’ve forgotten a login, it probably didn’t have any actual value in the first place. After all, logins aren’t made for my convenience, they are usually for some washed-up SEO marketing person somewhere to collect useless piles of information. If you’re using piles of collected data to sell me something, let me tell you now, you’re wasting your time, money, server space, bandwidth, and more importantly, my time and patience.
There are even some of us who will go out of our way to make sure the data you collect is fake or made up. If the data collection is for you rather than for me, I can make up names and interests all day long. I even have a list of emails specially created to make sure I know where you got the email address from. In the end, you don’t have a right to my info, and I don’t have a requirement to play along nicely. Do you know how much of your data is bogus?
Training AI on garbage data is like Wall-E building a new world from blocks of pressed trash. There are better things to do with your time, in fact just about anything is better. The end product is truly trash.
Now that you have all of this bogus data, what are you going to do with it? What CAN you do with it? Modern data science has invented a lot of things you can do with data. The question is, are the results valid? I think people collect it to justify some overhead job that doesn’t accomplish anything, or maybe they collect it because that’s what Google does, and of course Google is wildly successful, so I must also collect data.
I talked to someone a lot of you know a few years ago when he started working on trying to suss out trends regarding serious illnesses from piles of medical patient data. This was a topic where it mattered whether they came to a good conclusion or not. Developments in cancer treatments have come from using computers to mine piles of patient and drug data for trends humans can’t see with their eyes. They’ve learned better diagnosis techniques, better treatments, better outcomes for patients from all of that. So clearly not all data collection is trash. I don’t mean to insinuate that. There are certainly valid uses of piles of data, but collection for the sake of collection is not valid, nor is the use of data for trying to sell me stuff. I can find stuff I want to buy on my own, you don’t have to hound me on every platform imaginable. In fact, I will often rebel against advertising, and buy something other than what is advertised.
Back to Siemens Realize Live 2023 for a moment. One of the speakers at the general session this morning did make the link between a big pile of collected data and better end user results. That theme of Users First crops up again. And although skeptical, I started to see that there are ways of handling the right kinds of data to drive good decision making that improves the situation of someone somewhere. So there is a valid process out there somewhere. Just like many people make good use of owning a computer, and others just find ways to waste time or hurt other people. Seeing ideas misused so often breeds cynicism.
I wish we could hear from someone on the indiscriminate collection of data, and how the very act of collecting data can affect outcomes negatively. Like a twisted Heisenberg principle. It just seems like every time you’re forced to use a password or click a salted link it’s just so someone can associate a bunch of button click data with your account. By now it’s just habitual and indiscriminate data collection because that’s what people do. And yet I still get phone calls from Siemens inside sales people treating me like a raw sales lead. They’ve got a lot of data on me, and yet they get it massively wrong. I doubt I’m the only one in that boat. This to me says it all. If they have all this data on me, and still cold call me like a first-time CAD user, there is something more than a little bit wrong with the process.
Every time I sign in on a website to download some piece of information, I get a hopeful call from an internal sales person. Half the time they are calling for some bogus name I’ve invented to sign in with so I can kind of track what it was that I did that triggered this particular contact. To quote another popular saying “it’s a waste of your time and it annoys the pig”. It’s a shame too, because while it seems that absolutely no thought has gone into how they handle my data, they appear to have indeed put a lot of effort and money into it, all of that just to come not only to the wrong conclusion, but also a useless one even if it had been right. It’s as if all of that effort has not had any actual intentional outcome.
And now we learn that they (not meaning Siemens directly, but the proverbial “they”) are training pet monster AI tools on all of this fake data I have entered every time I’m given the opportunity. It becomes difficult to take AI or data hoarders seriously. Yes, there are some people who are more careful, more selective, more discriminating and get real results that aren’t accidental or statistical anomalies, but marketing folks have proven several times over that they are not in that group.