Epod Episode 1: Barry Van Veen on Machine Learning

On this Episode:

On our premiere episode, Susan Ottmann talks to Dr. Barry Van Veen from UW-Madison’s Electrical and Computer Engineering Department about machine learning and why it’s important for engineers focused on data. Barry also discusses the engineering perspective on data science and data analytics versus a computer science view, as well as why signal processing is such a hot area of study today.

Our Guest:

Barry D. Van Veen received his BS degree in Electrical Engineering from Michigan Technological University in Houghton, Michigan, in 1983, and his PhD in electrical engineering from the University of Colorado-Denver in 1986. He has been with the Department of Electrical and Computer Engineering at the University of Wisconsin–Madison since 1987, where he is currently the associate chair for graduate and online studies, and the Lynn H. Matthias Professor of Electrical and Computer Engineering. He has co-authored the book Signals and Systems with S. Haykin. His current research interests include signal processing and its applications—including the development of algorithms for biomedical signal processing problems. 

Transcript:

Justin Kyle Bush  0:00   

Welcome to Epod, a podcast from UW-Madison’s College of Engineering Office of Interdisciplinary Professional Programs. These podcasts are focused on big ideas in engineering and the people behind them. My name is Justin Kyle Bush, and I am your host.

On today’s episode, Susan Ottmann talks to Dr. Barry Van Veen from UW-Madison’s Electrical and Computer Engineering Department about machine learning and why it’s important for engineers focused on data. Barry also discusses the engineering perspective on data science and data analytics versus the computer science view, as well as why signal processing is such a hot area of study today. Take it away, Susan.

Susan Ottmann  0:54   

Barry, welcome. 

Barry Van Veen  0:56   

Good morning, Susan. It’s great to be here today. 

Susan Ottmann  1:00   

Before we start, Barry, please give us your view on the data analytics program. 

Barry Van Veen  1:08   

Well, data has come to be such a central issue in our world today, from a decision-making standpoint, as well as through all aspects of engineering. Just, you know, trying to understand how systems are behaving, how processes are performing, and so on. So, it has emerged with time over the last decade or so, and it’s certainly a skill that people who earned degrees a number of years ago may not have had the opportunity to develop in their studies. So, the program gives students a great opportunity to catch up on some of the latest tools and techniques for working with data, drawing inferences from data, understanding it and being savvy about modern tools, and computer methods for assessing data. So, I think that it plays an important role given just the way the field has evolved, and the importance of data and engineering today. 

Susan Ottmann  2:22   

Let’s dive a bit deeper and talk about machine learning and why that’s important for engineers who are focused on data today. 

Barry Van Veen  2:31   

Well, first of all, let’s talk about what machine learning is. Because it’s a buzzword, sometimes it may not have a lot of meaning to folks, but we can teach a computer to recognize patterns and to learn in the same way that humans do. Right? So how do we learn? We learn by seeing lots of examples. And, you know, we kind of do this instinctively, when we’re children. You know, we see patterns on a page and we learn what words mean. We learn grammar, we see pictures, and we can tell the difference between a cat and a dog. And those are all things that we can have machines learn as well. And so the thing is, data has such a huge volume that we are able to collect and store now and so on. It’s just impossible for any one person to be able to carefully analyze all that data using the power of their human brain, right? And I mean, we’ve all as engineers, we’ve been using calculators for a really long time, because it’s so much more effective than computing by hand, right? So, machine learning allows us to automate the handling of data. Much the same way that we can use robots, for example, to automate manufacturing. So, it really is enabling things like self-driving cars, and all sorts of systems like that, where we want to be able to analyze vast volumes of data by a computer and make good decisions in a short period of time.

Susan Ottmann  4:26   

That’s a great description and really helpful. What’s unique about your machine learning course and the reason that you focus on the fundamentals? 

Barry Van Veen  4:37   

Well, so the course that I teach a lot and that I’ve worked on in terms of creating the online curriculum is – the title is somewhat indicative of the topic, right? It’s called Matrix Methods in Machine Learning, and there’s a lot of math behind the way that we teach machines to learn and the way that we, you know, develop algorithms for that process. So, matrices, if you recall, and don’t have, you know, nightmares thinking, hearing the word, right. But matrices are just a collection of numbers in a rectangular format, right? So, rows and columns. We’re used to tables, you can think of tables as matrices. And there’s a lot of math on studying how to manipulate and what you can learn about ordered arrangements of data.  

A big part of machine learning is recognizing patterns in data. Okay. We want to understand, for example, what kind of patterns are there across the rows of this data. So, we might have a case where I mean, the classic example is the Netflix problem, where we have a whole bunch of movies. And then we have a whole bunch of individuals that have watched and rated those movies. And so, we have a matrix of ratings, right? And we’d like to understand what the patterns are, so that we can make predictions of what movie you might like. Okay, you’ve liked these other three movies and we’ve seen that there’s all these other people that have liked these same three movies and have also liked a fourth one, so maybe you’d like that fourth one, okay.  
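Editor’s note: the “people who liked these three movies also liked a fourth” idea can be sketched in a few lines of Python. This is only a minimal illustration under made-up assumptions, not the method Netflix uses or the one taught in the course. The ratings matrix, the movie indices, and the helper name recommend are all hypothetical. Rows are users, columns are movies, and a 1 means the user liked the movie.

```python
# Hypothetical like/no-like matrix: rows are users, columns are movies 0..3.
# A 1 means the user liked the movie; a 0 means no like was recorded.
ratings = [
    [1, 1, 1, 1],  # user 0
    [1, 1, 1, 1],  # user 1
    [1, 1, 1, 0],  # user 2
    [0, 1, 0, 0],  # user 3
    [1, 1, 1, 0],  # user 4: liked movies 0, 1 and 2
]


def recommend(ratings, target):
    """Return the index of the movie that users similar to `target` liked
    most and that `target` has not liked yet, or None if there is none."""
    target_row = ratings[target]
    scores = {}
    for user, row in enumerate(ratings):
        if user == target:
            continue
        # Similarity = number of movies both users liked
        # (the dot product of the two rows of the matrix).
        overlap = sum(a * b for a, b in zip(target_row, row))
        for movie, liked in enumerate(row):
            if liked and not target_row[movie]:
                # Each similar user "votes" with a weight equal to the overlap.
                scores[movie] = scores.get(movie, 0) + overlap
    if not scores:
        return None
    return max(scores, key=scores.get)


print(recommend(ratings, target=4))
```

The similarity between two users here is just a dot product of two rows, which is one small way that matrix operations show up in this kind of problem.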

That’s a simplified example, but there’s patterns in this data and we want to be able to extract those patterns. And matrix math, linear algebra, is the tool of choice for extracting those patterns from the data. So a lot of times students, and there’s a number of different branches of engineering where this is the case, a lot of times students take a linear algebra class from a math department. And it just seems like totally, you know, theoretical ideas and abstract and spend all this time proving that something is a vector space. Well, when you take a class like the one we teach on Matrix Methods in Machine Learning, you’re going to be learning that same kind of math stuff. But it’s applied to actual real problems and showing how you can use these concepts like eigendecomposition, singular value decomposition, solving least squares problems, rank and subspaces, and how those actually have meaning in real problems.

What students find is that a really useful way to learn the underlying mathematics is to do so in the context of a current application. They also find that they’re much better prepared, then, when they go to use these tools for machine learning that are out there, having an understanding of what’s sort of under the hood, and what are the fundamental principles that allow these things to work, allows you to make, you know, to intelligently use the tools and allows you to, you know, hopefully make decisions like, “Hey, this doesn’t quite make sense. I wonder if this situation is applying what we talked about where I can’t trust the result?” Okay, so understanding the fundamentals puts you in a great position to be a very effective user of these tools, and not just treat it as a magic black box.

Susan Ottmann  8:54   

We definitely hear that from our students that the applied nature of the course, both understanding the fundamentals, but then applying it, allows them to learn. And I think as adults, that’s how we learn, we learn by applying. There is some confusion out there between the engineering perspective on data science and data analytics versus the computer science view. What do you see as the difference between these two? 

Barry Van Veen  9:19   

Well, it’s an interesting question. It’s one that I get a lot from students and I have both kinds of students. I have engineering students taking my classes. I have a lot of computer science students taking these classes. And generally the engineering students or the engineering perspective on data science is driven a lot more by the fundamentals and by understanding, you know, why things work the way they do, right? Engineers are often asking why, so that we can be more effective at creating new things, right? Why does this work the way it does? Why is this part failing? What’s wrong with it? You know, what, where’s the issue in the manufacturing line? We’re in the process of problem solving, we need to understand why.

So, I think that one of the things you’ll see is that, in general, is that an engineering perspective on data science is more driven from the fundamentals of understanding why. And it’s less driven by things like, you know, by stereotype. This is a stereotype, okay, I’m going to admit it right up front that this is a stereotype because I’ve had some, I’ve had some computer science students that are, I guess, I’ll call them secret engineers. But they’re just, you know, outstanding students, right. But the classic thing I could see and can characterize in class kind of gives an illustration of the difference is that we’ll have something and we’ll be working on a problem in class, and the engineering students are over here. And they’re trying to like, sort of understand why it works that way and what’s a special case and all that. And the computer science students have gone out, and they’ve found a GitHub someplace, and they’ve downloaded this software, they’ve installed it, they’ve got it running, and they’re feeding data into it. And, you know, just making it, using it. Um, so the, I see that that’s kind of an exaggeration of sorts, but of the different scales, engineers being much more focused on the why.  

The other thing that helps here is to understand the history. Engineers have been interested, gosh, forever, in building things, and designing better instruments, better machines, better processes, and so on. And in the process, in order to do that, we build sensors, right? In electrical engineering, we talk a lot about, I mean, we’ll have these, right, your digital camera, there’s a sensor, we’re sensing the environment, even radar, going back to the 40s. And before even processing, you know, communicating information by radio, all those things, we’re building environmental sensors. You know, measuring RPM on an engine, right? You can tell I’m an electrical engineer, my examples outside of that field are a little primitive, right? But engineers have been building systems that collect data and have been trying to use data to understand how things work for a really long time.

And on the computer science side, the explosion in data analysis and data science has been fueled a lot by recent technology. You know, things like Google and the amount of data that Google collects, and Amazon, and those kinds of more modern things. And so, I think the engineering perspective really is tied a lot to the physical world and is tied to the why. And that really distinguishes the engineering perspective that I see. Right, is we’re really tied to the physical world and to the why things happen and how, you know, things work.

Susan Ottmann  13:54   

Thanks for that explanation. I think we are problem solvers as engineers, and the why is so important. Another area we find our students interested in is learning about signal processing. Now, I’m a mechanical engineer. So sometimes your work is a bit foreign to me. From a non-electrical engineer’s perspective, what are the key components of signal processing and why is it important?

Barry Van Veen  14:21   

Yeah, I sometimes joke that signal processing is what we used to call machine learning before machine learning was invented. And a lot of the mathematical tools that are used in signal processing are extremely similar to the tools that we use in machine learning. Now, when we talk about a signal, we generally refer to some data that depends on some other, on an underlying physical construct.

Okay, so like, if you’re collecting data in time, in other words, you’re collecting samples of data and processing those samples sequentially. Or like my voice that’s being recorded right now, right, there’s a bunch of samples of that sound that are being stored. And they’re stored as a function of time. You think of a picture, the values of those pixels occur as a function of position. Okay, so there’s some underlying physical aspect to the world that’s meaningful to that data. And that’s where the term signal comes in. Okay, I think about that as distinct from the kind of ratings data that I mentioned earlier, where, you know, users are rating products or movies or something. You know, there’s no underlying physical notion of something like time, or space, that is associated with that data. It’s just data unconnected from a physical construct. So that’s really where the whole idea of a signal comes from.  
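Editor’s note: one way to see what “a function of time” adds is to compare a statistic that ignores sample order with one that depends on it. The sketch below is only an illustration with made-up numbers; the helper names mean and lag1_autocorr and the two sample lists are hypothetical and are not taken from the conversation or the course.

```python
def mean(xs):
    """Average of the values; the order of the samples does not matter."""
    return sum(xs) / len(xs)


def lag1_autocorr(xs):
    """Correlation between each sample and the one right after it.
    This depends on the order of the samples, so it only makes sense
    when the order carries meaning (for example, time)."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


# A slowly rising "signal": samples listed in time order.
signal = [0, 1, 2, 3, 4, 5, 6, 7]
# The same values in an order that ignores time.
reordered = [0, 7, 1, 6, 2, 5, 3, 4]

print(mean(signal), mean(reordered))
print(lag1_autocorr(signal), lag1_autocorr(reordered))
```

Both lists have the same mean, but the neighbor-to-neighbor correlation is positive for the time-ordered samples and negative for the reordered ones. For something like a table of ratings, the order of the rows is arbitrary, so an order-dependent statistic like this one would carry no meaning.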

So, signal processing refers to a set of tools that can be used to analyze data where the data is collected over time, for example, or over space. And most commonly, it’s those two cases. There are some special tools that apply when you have this dependence on time or on space. But the tools that apply more broadly to data, which is devoid of that construct of any link to that sort of physical connection, also apply to data that is a function of time. So, the signal processing realm generally includes an additional set of tools and ideas that allow you to exploit either say, the time dependence, or the spatial dependence of those signals. And, or data as I should say, with the spatial or time dependence of that data.

So, I don’t know if that helps, Susan, make that a little clearer, or more clear, but it’s basically, signal tends to imply there’s some underlying physical quantity that is linking the various data that you collect.

Susan Ottmann  17:50   

I find it very clear, I find it fascinating as well. So, each time I do one of these podcasts, I learn quite a bit. But before we go, is there anything else you’d like to tell our prospective students listening to this podcast or others just interested in learning more about machine learning, or data analytics, and its application in industry?

Barry Van Veen  18:11   

Well, you know, being a professor, I’m used to talking 75 minutes at a time, so I could go on for a long time about this. But a couple things, um, one thing that comes to mind, that was sort of a eureka moment for me, as someone who’s been, you know, working on this stuff since the 80s, was in, I think it was fall of 2019, when Apple announced their iPhone 11. It was actually interesting to me, because one of my hobbies is photography. So I was pretty interested in the camera and the technology that they were using in the iPhone 11. And what was absolutely fascinating is they were touting their technology in a public advertisement as being based on the phone having a neural engine using machine learning. And they were talking, again – this is a public-facing ad – okay, for the general public, they’re talking about how many matrix computations the chipset in this iPhone 11 could pull off in a certain amount of time. And I never thought I would see the day where matrix computations would show up in a public-facing advertisement. So, I guess you could maybe, you know, maybe the thing is, if I’m really brutally honest, maybe I’ve just been insecure all these years and Apple finally validated my existence. But I’m joking about that. But that was pretty remarkable.

And I think, you know, you don’t have to look far to see the pervasive nature of just being able to analyze data. I mean, you know, the whole medical realm, which is an area that I’ve been emphasizing in my career since at least 2000. You know, how do we like, like looking for cancer in radiological images? I mean, that’s something that’s just taking off from a machine learning standpoint. And you know, one of the, you mentioned earlier, my interest in developing algorithms for better understanding how the brain functions. Well, one of the things we do is we try to build network models for the human brain. Like so how do different areas of the brain interact with one another to accomplish certain cognitive tasks and other things? Like how is the network connectivity of the brain disrupted when you lose consciousness, such as when you go to sleep or experience anesthesia? And, you know, if you look at the things we can measure today, with things like fMRI or even older technologies like electroencephalography, things we can measure about the brain are really quite astounding. And you quickly generate data that overwhelms you in terms of any sort of, you know, human interpretation.

So, these kinds of techniques are just everywhere, right? You don’t have to look far. Think about self-driving cars. Think about intelligent inventory management, you know, in a factory, all this is data driven. And I think that we are going to be in for, the age of data is going to be here for quite a while. So, I see it as a really rich opportunity. I personally, my entire career, I’ve gotten to work on all sorts of different problem application areas. From acoustics to wireless communication, to medical imaging, and so on. And it’s just been an incredibly rich career, and what has linked all those things are the tools that I have that allow me to analyze data. Okay, so that’s been super fun. And I really, I think that that’s likely to be the case, for the foreseeable future. We’re gonna have, you know, the power of these tools is growing and the breadth of their applications is growing, it’s just going to create a lot of opportunities. Okay, that wasn’t quite 75 minutes, but I think I came close, so I better quit.

Susan Ottmann  23:13   

I find your view of the world of data stimulating and know that you’re helping our students, and so many companies in this very complex arena. Thank you so much for your time today. We appreciate you talking to us and informing us on these very intriguing topics.

Justin Kyle Bush  23:32   

Thank you for listening to Epod. For more episodes, visit interpro.wisc.edu/podcasts. And if you enjoyed this, don’t forget to subscribe, rate, review and share.